Ride-Sharing Platform System Design (Uber Clone)

Type: Software Reference Confidence: 0.91 Sources: 7 Verified: 2026-02-23 Freshness: quarterly

TL;DR

Constraints

Quick Reference

Component | Role | Technology Options | Scaling Strategy
API Gateway | Rate limiting, auth, request routing | Kong, AWS API Gateway, Envoy | Horizontal + edge caching
Rider Service | Ride requests, fare estimates, trip history | Node.js, Go, Java Spring | Stateless horizontal pods
Driver Service | Driver onboarding, availability, earnings | Go, Java, Kotlin | Stateless horizontal pods
Location Service | Ingest GPS pings, maintain driver positions | Go + Redis GEO, custom H3 grid | Sharded by geo-region
Matching Service | Pair riders with nearest available drivers | Go, Rust, Java | Sharded by H3 cell region
Trip Service | Trip state machine (requested → matched → in_progress → completed) | Node.js, Go | Event-sourced with Kafka
Pricing Service | Fare calculation, surge pricing, promotions | Python, Go | Stateless; reads supply/demand from cache
Payment Service | Charge riders, pay drivers, handle refunds | Java, Node.js + Stripe/Braintree | Async processing with idempotency
Notification Service | Push notifications, SMS, email | Node.js, Go + Firebase/APNs/SNS | Fan-out via message queue
ETA Service | Estimated time of arrival, route optimization | Python, C++ + OSRM/Valhalla | Precomputed graph + ML model
WebSocket Gateway | Persistent connections for real-time updates | Node.js (Socket.io), Go (gorilla/websocket) | Sticky sessions + connection registry
Message Queue | Async event bus for all services | Apache Kafka, Apache Pulsar | Partitioned by driver_id or trip_id
Geospatial Index | Fast nearest-neighbor driver lookup | Redis GEO, H3 in-memory grid, PostGIS | Sharded by H3 resolution-3 cells
Analytics Pipeline | Trip data, driver metrics, business intelligence | Kafka → Spark/Flink → data warehouse | Batch + real-time (Lambda architecture)

Decision Tree

START
├── Expected scale?
│   ├── <1K concurrent users (MVP)
│   │   ├── Use monolith with PostGIS for location queries
│   │   ├── Simple distance-based matching (SQL query)
│   │   └── Fixed pricing — skip surge entirely
│   ├── 1K-100K concurrent users (city-level)
│   │   ├── Split into 5-8 microservices
│   │   ├── Redis GEO for driver locations
│   │   ├── Kafka for event streaming
│   │   └── Basic supply/demand surge pricing
│   ├── 100K-1M concurrent users (regional)
│   │   ├── Full microservices (12+ services)
│   │   ├── H3 hexagonal index, sharded by region
│   │   ├── Dedicated matching service with scoring algorithm
│   │   └── ML-based surge pricing + ETA prediction
│   └── >1M concurrent users (global, Uber-scale)
│       ├── Geo-sharded infrastructure (multi-region)
│       ├── Custom geospatial engine (Ringpop for consistent hashing)
│       ├── Real-time ML pipeline for matching/pricing/ETA
│       └── CQRS + event sourcing for trip state
├── Matching priority?
│   ├── Lowest wait time → Nearest-driver with availability check
│   ├── Cost optimization → Factor in driver heading direction + ETA
│   └── Quality → Weighted scoring: distance (40%) + rating (30%) + acceptance rate (30%)
└── Pricing model?
    ├── Fixed fare → Precomputed zone-to-zone matrix
    ├── Metered → Distance (GPS trace) + time (wall clock) + base fare
    └── Dynamic surge → Supply/demand ratio per H3 cell, updated every 30-60s
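The metered branch above can be sketched as follows; the rate constants are illustrative assumptions, and the surge multiplier from the dynamic branch is applied to the full metered amount:

```python
def metered_fare_cents(distance_km, duration_min, surge=1.0,
                       base_cents=250, per_km_cents=120, per_min_cents=30):
    # Rates are illustrative placeholders, not real tariffs.
    # Surge multiplies the whole metered fare, not just one component.
    raw = base_cents + distance_km * per_km_cents + duration_min * per_min_cents
    return int(round(raw * surge))
```

For example, a 5 km, 12-minute trip at these rates comes to 250 + 600 + 360 = 1210 cents at surge 1.0.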

Step-by-Step Guide

1. Define the data model and core entities

Design your database schema around four core entities: Users (riders + drivers), Vehicles, Trips, and Payments. Use PostgreSQL for transactional data and Redis for real-time state. [src2]

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    role VARCHAR(10) NOT NULL CHECK (role IN ('rider', 'driver')),
    name VARCHAR(255) NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    phone VARCHAR(20) UNIQUE NOT NULL,
    rating DECIMAL(3,2) DEFAULT 5.00,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE trips (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    rider_id UUID REFERENCES users(id),
    driver_id UUID REFERENCES users(id),
    status VARCHAR(20) NOT NULL DEFAULT 'requested',
    pickup_lat DOUBLE PRECISION NOT NULL,
    pickup_lng DOUBLE PRECISION NOT NULL,
    dropoff_lat DOUBLE PRECISION NOT NULL,
    dropoff_lng DOUBLE PRECISION NOT NULL,
    fare_cents INT,
    surge_multiplier DECIMAL(3,2) DEFAULT 1.00,
    requested_at TIMESTAMPTZ DEFAULT NOW(),
    completed_at TIMESTAMPTZ
);

Verify: SELECT count(*) FROM information_schema.tables WHERE table_name IN ('users', 'trips'); → expected: 2

2. Implement the location ingestion pipeline

Drivers send GPS pings every 3-4 seconds over WebSocket. These flow through Kafka to the Location Service, which updates an in-memory geospatial index for real-time matching. [src3]

Driver App --[WebSocket]--> WebSocket Gateway
    --[produce]--> Kafka (topic: driver.location)
    --[consume]--> Location Service
        ├── Hot path: Update Redis GEO / H3 in-memory index
        └── Cold path: Write to TimescaleDB for trip trace history
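The consumer's hot/cold split can be sketched in a few lines. This is a minimal in-memory sketch: a rounded-coordinate grid key stands in for an H3 cell, a dict-of-sets stands in for the Redis GEO index, and a plain list stands in for the TimescaleDB write:

```python
from collections import defaultdict

def grid_key(lat, lng, precision=2):
    # Stand-in for an H3 cell id: round coordinates to a coarse grid.
    return (round(lat, precision), round(lng, precision))

def handle_location_ping(ping, geo_index, driver_cells, history):
    """Hot path: move the driver into its current grid cell (stand-in for
    the Redis GEO / H3 index). Cold path: append the raw ping to `history`
    (stand-in for the TimescaleDB trip-trace write)."""
    cell = grid_key(ping["lat"], ping["lng"])
    old = driver_cells.get(ping["driver_id"])
    if old is not None and old != cell:
        geo_index[old].discard(ping["driver_id"])  # driver moved cells
    geo_index[cell].add(ping["driver_id"])
    driver_cells[ping["driver_id"]] = cell
    history.append(ping)

geo_index = defaultdict(set)
driver_cells = {}
history = []
handle_location_ping({"driver_id": "d1", "lat": 40.7128, "lng": -74.006, "ts": 0},
                     geo_index, driver_cells, history)
```

In production the same handler would run inside a Kafka consumer loop on the driver.location topic, keyed by driver_id so each driver's pings stay ordered within one partition.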

Verify: Send a test location ping and confirm Redis GEOSEARCH returns the driver within the expected radius.

3. Build the geospatial matching engine

Use Uber's H3 hexagonal grid to partition the map. Convert the pickup location to an H3 cell ID, then search that cell and its k-ring neighbors for available drivers. [src1]

import h3

def find_nearby_drivers(pickup_lat, pickup_lng, driver_index, k=1):
    pickup_cell = h3.latlng_to_cell(pickup_lat, pickup_lng, 9)
    search_cells = h3.grid_disk(pickup_cell, k)
    nearby_drivers = []
    for cell in search_cells:
        if cell in driver_index:
            nearby_drivers.extend(driver_index[cell])
    return nearby_drivers

Verify: h3.latlng_to_cell(40.7128, -74.0060, 9) returns a valid 15-character hex string.

4. Implement the ride matching algorithm

Score candidate drivers using a weighted function of distance, rating, acceptance rate, and heading direction. Send the offer to the top-ranked driver; cascade to the next-best candidate on decline or timeout. [src2]

def score_driver(driver, pickup_lat, pickup_lng):
    dist = haversine(pickup_lat, pickup_lng, driver['lat'], driver['lng'])
    distance_score = max(0, (5.0 - dist) / 5.0) * 40
    rating_score = (driver['rating'] / 5.0) * 30
    accept_score = driver['acceptance_rate'] * 20
    heading_bonus = 10 if driver.get('heading_toward') else 0
    return distance_score + rating_score + accept_score + heading_bonus
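The snippet above assumes a haversine helper that isn't shown; a standard great-circle implementation returning kilometers would look like:

```python
import math

def haversine(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two WGS84 coordinates."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))
```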

Verify: A driver 0.1km away with 4.8 rating and 92% acceptance scores > 80.
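The decline/timeout cascade can be sketched as a ranked offer loop; `offer_fn` here is a hypothetical callback (e.g. a push offer that waits ~15 s for acceptance), not part of any real API:

```python
def dispatch(candidates, offer_fn, score_fn, max_offers=5):
    """Offer the trip to drivers in descending score order.
    `offer_fn(driver)` returns True on acceptance, False on
    decline or timeout; cascade to the next candidate on False."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    for driver in ranked[:max_offers]:
        if offer_fn(driver):
            return driver
    return None  # nobody accepted: surface "no drivers available" to the rider
```

Capping `max_offers` bounds worst-case matching latency instead of walking the whole candidate list.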

5. Build the surge pricing engine

Calculate surge multipliers per H3 cell by comparing demand to supply. Update every 30-60 seconds. [src6]

def calculate_surge(cell_id, request_count, available_drivers,
                    base_threshold=0.7, max_multiplier=3.0):
    if available_drivers == 0:
        return max_multiplier
    ratio = request_count / available_drivers
    if ratio <= base_threshold:
        return 1.0
    surge = 1.0 + (ratio - base_threshold) * 1.5
    return min(round(surge, 2), max_multiplier)

Verify: calculate_surge("cell", 10, 20) → 1.0; calculate_surge("cell", 50, 10) → 3.0

6. Implement the trip state machine

Model each trip as a finite state machine with event sourcing via Kafka. [src4]

REQUESTED ──[driver accepts]──> MATCHED
MATCHED ──[driver en route to pickup]──> DRIVER_ARRIVING
DRIVER_ARRIVING ──[rider picked up]──> IN_PROGRESS
IN_PROGRESS ──[arrived at destination]──> COMPLETED
COMPLETED ──[payment processed]──> PAID
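The diagram above can be enforced with a minimal transition guard (a sketch; cancel/decline edges are omitted for brevity):

```python
# Allowed transitions, taken from the state diagram above.
TRANSITIONS = {
    "REQUESTED": {"MATCHED"},
    "MATCHED": {"DRIVER_ARRIVING"},
    "DRIVER_ARRIVING": {"IN_PROGRESS"},
    "IN_PROGRESS": {"COMPLETED"},
    "COMPLETED": {"PAID"},
    "PAID": set(),  # terminal state
}

def transition(current, target):
    """Return the new state, or raise if the edge is not in the diagram."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

With event sourcing, this guard runs when applying each Kafka event, so a replayed or out-of-order event cannot move a trip backward.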

Verify: No valid transition exists from COMPLETED back to IN_PROGRESS.

7. Set up the payment and billing pipeline

Process payments asynchronously after trip completion. Use idempotency keys to prevent double charges. [src5]

Trip Completed Event (Kafka) --> Payment Service
  1. Calculate final fare: (base + distance_km * per_km + duration_min * per_min) * surge
  2. Capture pre-authorized amount (Stripe/Braintree)
  3. If capture fails: retry with exponential backoff (max 3)
  4. Emit PaymentCompleted event
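The idempotency-key step can be sketched as follows; `gateway_capture` is a hypothetical gateway callback (standing in for a Stripe/Braintree capture call), and the dict stands in for a durable key store. Real deployments also forward the key to the gateway, which deduplicates on its side:

```python
_processed = {}  # idempotency_key -> prior result (stand-in for a durable store)

def capture_payment(idempotency_key, amount_cents, gateway_capture):
    """Capture at most once per key: replays return the stored result
    instead of reaching the gateway a second time."""
    if idempotency_key in _processed:
        return _processed[idempotency_key]
    result = gateway_capture(amount_cents)
    _processed[idempotency_key] = result
    return result
```

A retried trip.completed event thus produces the same response as the first attempt, with exactly one charge.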

Verify: Two identical capture requests with same idempotency key → only one charge.

Code Examples

Python: Geospatial Driver Matching with H3

# Input:  rider pickup coordinates, dict of active drivers
# Output: ranked list of nearby driver IDs

import h3
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

def update_driver_location(driver_id: str, lat: float, lng: float):
    r.geoadd("drivers:geo", (lng, lat, driver_id))
    cell = h3.latlng_to_cell(lat, lng, 9)
    r.sadd(f"drivers:h3:{cell}", driver_id)
    r.set(f"drivers:cell:{driver_id}", cell, ex=30)

def match_rider(pickup_lat: float, pickup_lng: float, radius_km: float = 3.0):
    results = r.geosearch(
        "drivers:geo", longitude=pickup_lng, latitude=pickup_lat,
        radius=radius_km, unit="km", sort="ASC", count=10
    )
    return [driver_id.decode() for driver_id in results]

JavaScript/Node.js: WebSocket Driver Connection Handler

// Input:  WebSocket connection from driver app
// Output: location updates forwarded to Kafka

const { Kafka } = require('kafkajs');
const WebSocket = require('ws');

const kafka = new Kafka({ clientId: 'ws-gateway', brokers: ['kafka:9092'] });
const producer = kafka.producer();
const wss = new WebSocket.Server({ port: 8080 });

// kafkajs requires an explicit connect before the first send
producer.connect().catch(console.error);

wss.on('connection', (ws, req) => {
  const driverId = req.headers['x-driver-id'];
  ws.on('message', async (data) => {
    const { lat, lng, timestamp } = JSON.parse(data);
    await producer.send({
      topic: 'driver.location',
      messages: [{ key: driverId, value: JSON.stringify({ lat, lng, timestamp }) }]
    });
  });
});

Go: Surge Pricing Calculator

// Input:  demand (request count), supply (driver count)
// Output: surge multiplier (float64)

package pricing

import "math"

const (
    BaseThreshold = 0.7
    MaxMultiplier = 3.0
    SurgeSlope    = 1.5
)

func CalculateSurge(requests, drivers int) float64 {
    if drivers == 0 {
        return MaxMultiplier
    }
    ratio := float64(requests) / float64(drivers)
    if ratio <= BaseThreshold {
        return 1.0
    }
    surge := 1.0 + (ratio-BaseThreshold)*SurgeSlope
    return math.Min(math.Round(surge*100)/100, MaxMultiplier)
}

Anti-Patterns

Wrong: Polling all drivers in the database for every ride request

-- BAD: full table scan on every ride request, O(n) for n drivers
SELECT id, lat, lng,
       ST_Distance(location, ST_MakePoint(-74.006, 40.7128)) AS dist
FROM drivers
WHERE is_available = true
ORDER BY dist ASC
LIMIT 10;
-- At 100K+ active drivers this query takes 500ms+ per request

Correct: Use in-memory geospatial index with H3 partitioning

# GOOD: O(1) cell lookup + O(k) for k neighbors, typically <5ms
pickup_cell = h3.latlng_to_cell(40.7128, -74.006, 9)
search_cells = h3.grid_disk(pickup_cell, 1)  # ~7 cells
candidates = []
for cell in search_cells:
    candidates.extend(driver_index.get(cell, []))

Wrong: Synchronous payment processing blocking trip completion

# BAD: rider waits for payment before seeing "trip complete"
def complete_trip(trip_id):
    trip = db.get_trip(trip_id)
    trip.status = 'completed'
    charge = stripe.charges.create(amount=trip.fare)  # 2-5s blocking call
    if charge.status != 'succeeded':
        trip.status = 'payment_failed'  # rider stuck
    db.save(trip)

Correct: Async payment via event queue with retry

# GOOD: trip completes instantly, payment processed asynchronously
def complete_trip(trip_id):
    trip = db.get_trip(trip_id)
    trip.status = 'completed'
    db.save(trip)
    kafka.produce('trip.completed', {
        'trip_id': trip_id, 'fare': trip.fare,
        'idempotency_key': f"trip-{trip_id}"
    })

Wrong: Single-point surge pricing for an entire city

# BAD: one surge multiplier for the whole city
total_requests = get_city_wide_requests()
total_drivers = get_city_wide_drivers()
city_surge = total_requests / total_drivers  # meaningless average

Correct: Per-cell surge pricing using H3 hexagonal grid

# GOOD: granular surge per H3 cell captures local imbalance
for cell_id in active_cells:
    requests = get_cell_requests(cell_id, window_minutes=5)
    drivers = get_cell_drivers(cell_id)
    surge = calculate_surge(cell_id, requests, drivers)
    cache.set(f"surge:{cell_id}", surge, ttl=60)

Common Pitfalls

Diagnostic Commands

# Check Redis GEO driver count in radius
redis-cli GEOSEARCH drivers:geo FROMLONLAT -74.006 40.7128 BYRADIUS 3 km COUNT 100 ASC

# Monitor Kafka consumer lag for location topic
kafka-consumer-groups.sh --bootstrap-server kafka:9092 --describe --group location-consumer

# Count active WebSocket connections
ss -s | grep -i estab

# Check H3 cell for a coordinate
python3 -c "import h3; print(h3.latlng_to_cell(40.7128, -74.006, 9))"

# Monitor trip state transitions (Kafka; starts from latest offset by default,
# add --from-beginning to replay history)
kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic trip.events

# Check surge pricing for a cell
redis-cli GET surge:891f1a80537ffff

When to Use / When Not to Use

Use When | Don't Use When | Use Instead
Building a two-sided marketplace connecting riders with drivers in real-time | Building a food delivery platform with restaurant prep time | Food delivery system design (different matching + batching)
Need to handle >1K concurrent ride requests with <3s matching latency | Simple point-to-point scheduled shuttle service | Queue-based booking system
Implementing dynamic pricing based on real-time supply/demand | Fixed-route public transit scheduling | Transit scheduling system (GTFS-based)
Require real-time driver tracking and ETA updates for riders | Package/freight logistics with multi-day delivery windows | Logistics/fleet management system design
Supporting multiple vehicle types (economy, premium, XL) in one platform | Peer-to-peer car rental (no real-time matching needed) | Marketplace platform design (Airbnb-style)

Important Caveats

Related Units