Design a Real-Time Driver Tracking System for a Food Delivery App

Compare model answers for this System Design benchmark and review scores, judging comments, and related examples.

X f L

Contents

Task Overview

Benchmark Genres

System Design

Task Creator Model The task creator is randomly selected from top task-generation models of supported providers.

Google Gemini 2.5 Pro

Answering Models In this benchmark, models from the same provider as the task creator are excluded from answering.

Answer A OpenAI GPT-5.4

Answer B Anthropic Claude Haiku 4.5

Judge Models Judging uses exactly 3 judge models, excluding the answering models. At least 1 judge is selected from flagship models, lightweight models are not selected as judges, and the 3 judges come from 3 distinct providers.

OpenAI GPT-5.2 Anthropic Claude Opus 4.6 Google Gemini 2.5 Flash-Lite

Task Prompt

Show more ▼

You are tasked with designing the high-level architecture for a real-time driver tracking system for a popular food delivery service. The service has 50,000 active drivers and 200,000 active customers during peak hours. Describe the system architecture, covering the following aspects: 1. How drivers' mobile devices will send location data to the backend. 2. The backend services needed to process and distribute this location data. 3. How customers' mobile devices will receive the real-time location updates for their specific order. 4. Your choice of communication protocols (e.g., HTTP polling, WebSockets, MQTT) and justification for your choice. 5. The data models for storing driver locations and order information. 6. A strategy for scaling the system to handle peak load.

Task Context

A key feature of modern food delivery applications is the ability for a customer to track their order in real-time on a map, from the restaurant to their doorstep. This feature enhances the customer experience by providing transparency and reducing anxiety about delivery times. Designing such a system involves handling a large volume of real-time location data from thousands of drivers and efficiently delivering it to the relevant customers with minimal delay.

Judging Policy

A good answer should present a coherent and logical system design that addresses all parts of the prompt. It must justify its technology and protocol choices, explaining why they are suitable for this specific use case (e.g., why WebSockets might be preferred over HTTP polling). The proposed architecture should be scalable and realistically handle the specified load. The solution should also consider practical constraints, such as battery consumption on mobile devices. Key components like data ingestion, processing, storage, and distribution should be clearly explained and interconnected.

Answers

Answer A: OpenAI GPT-5.4

High-level design
Build an event-driven real-time tracking platform with three main paths: driver ingest, real-time fanout, and query/storage. Drivers continuously publish GPS updates. The backend validates and enriches them, updates the driver’s latest position, maps the driver to any active delivery order, and pushes updates only to the customer tracking that order.

Driver mobile to backend
Each driver app runs a lightweight location publisher:

Collect GPS coordinates every few seconds, for example every 2 to 5 seconds while on an active delivery, less frequently when idle.
Include driver_id, current or...

Show Full Answer ▼

Driver mobile to backend
Each driver app runs a lightweight location publisher:

Collect GPS coordinates every few seconds, for example every 2 to 5 seconds while on an active delivery, less frequently when idle.
Include driver_id, current order_id if assigned, latitude, longitude, speed, heading, timestamp, app version, battery/network hints, and a sequence number.
Apply client-side throttling and movement-based filtering so the app avoids sending unchanged positions.
Batch or coalesce updates during poor connectivity, then send the latest one first when reconnected.

Recommended protocol from driver to backend:

HTTPS for periodic location upload is the simplest and most robust choice.
Use a small POST request to a Location Ingest API.
For very high efficiency, gRPC streaming is also a strong option if mobile support and operational maturity are available.

Practical choice:

Start with HTTPS because it works well through mobile networks, proxies, and existing API gateways.
Optimize with compression, compact payloads, adaptive send frequency, and regional edge endpoints.

Ingest flow:

Driver App
API Gateway or Load Balancer
Authentication and rate limiting
Location Ingest Service
Message broker for async processing

Backend services
Core services

API Gateway: terminates TLS, authenticates drivers and customers, applies rate limits.
Location Ingest Service: validates payloads, drops stale or duplicate updates, timestamps events, publishes to a broker.
Message Broker: Kafka, Pulsar, or Kinesis for durable high-throughput event streaming.
Driver State Service: consumes location events and maintains latest known driver state in a fast store such as Redis or DynamoDB.
Order Tracking Service: maps driver_id to active order_id and customer subscription channels.
Realtime Fanout Service: pushes location updates to the correct customer connection.
Order Service: source of truth for order lifecycle, assignment, status changes, restaurant pickup, delivery completion.
ETA Service: optionally recalculates ETA using latest traffic-aware route and driver movement.
Historical Storage Service: stores location history for debugging, analytics, dispute resolution, and ML.
Monitoring and Alerting: tracks latency, dropped messages, stale driver positions, and regional outages.

Processing pipeline

Driver sends location update.
Ingest service validates auth, schema, timestamp freshness, and plausibility.
Event is written to broker.
Driver State consumer updates latest location cache keyed by driver_id.
Order Tracking consumer checks whether the driver is currently assigned to an active order.
If yes, it publishes a customer-scoped tracking event.
Realtime Fanout sends the update to the subscribed customer app.
Historical consumer stores events in long-term storage.

Customer mobile receiving real-time updates
Recommended pattern:

Customer app opens a WebSocket connection after entering the order tracking screen.
The app authenticates and subscribes to a single order tracking channel, such as order_id.
Backend verifies the customer is authorized to view that order.
Fanout service sends only updates for that order.
On initial connect, the app receives a snapshot: latest driver location, order status, ETA, last update time.
Then it receives incremental updates in near real time.

Fallbacks:

If WebSockets are blocked or unstable, use Server-Sent Events or short polling as fallback.
For backgrounded apps, use push notifications only for major milestones, not continuous tracking.

Protocol choices and justification
Driver to backend: HTTPS POST

Strong compatibility on mobile networks.
Easier retries, auth, observability, and gateway integration.
Good enough for 50,000 active drivers if updates are throttled sensibly.
Less operational complexity than MQTT.

Customer to backend: WebSockets

Best fit for server-to-client real-time updates.
Avoids wasteful polling from 200,000 customers.
Low latency and efficient for many small push messages.
A customer typically tracks one order, so subscription logic is simple.

Broker internal communication: Kafka or similar

Decouples ingest from fanout and storage.
Handles spikes, replay, and multiple downstream consumers.
Supports partitioning for horizontal scale.

Why not polling for customers:

With 200,000 active customers, frequent polling creates large unnecessary QPS even when location has not changed.
Higher latency and poorer battery/network efficiency.

Why not MQTT end-to-end:

Technically suitable for mobile telemetry, but adds client and broker complexity and may be unnecessary unless the organization already operates MQTT at scale.
For this use case, HTTPS plus WebSockets is simpler and usually sufficient.

Data models
A. Driver latest location
Purpose: hot state for real-time reads
Fields:

driver_id
lat
lng
geohash or spatial index key
speed
heading
accuracy_meters
recorded_at from device
received_at from server
sequence_number
active_order_id nullable
status such as idle, heading_to_restaurant, waiting, delivering, offline

Store:

Redis for ultra-fast latest-state reads and pub/sub metadata, or DynamoDB/Cassandra for durable scalable key-value storage.
TTL can be applied for stale entries.

Key example:

driver_id as partition key

B. Driver location history
Purpose: analytics and replay
Fields:

driver_id
timestamp
lat
lng
speed
heading
active_order_id

Store:

Time-series friendly storage, object storage via stream sink, or wide-column database.
Retention can be shorter for raw points and longer for summarized traces.

C. Order tracking model
Fields:

order_id
customer_id
driver_id
restaurant_id
status such as placed, preparing, driver_assigned, picked_up, en_route, delivered, cancelled
pickup_location
dropoff_location
latest_driver_lat
latest_driver_lng
latest_driver_timestamp
eta_seconds
tracking_visibility boolean
assigned_at, picked_up_at, delivered_at

Store:

Primary order record in relational DB or distributed transactional store.
Frequently changing tracking projection in Redis or DynamoDB for low-latency reads.

D. Subscription/session model
Fields:

connection_id
customer_id
order_id
connected_at
last_heartbeat_at
region

Store:

In-memory store such as Redis, or managed WebSocket gateway connection registry.

Scaling strategy for peak load
Traffic estimation
If 50,000 active drivers send updates every 5 seconds on average:

About 10,000 location updates per second at peak.
If updates are every 2 seconds during active delivery bursts:
About 25,000 updates per second.
This is well within the range of a partitioned event-driven system.

Scaling approach
A. Stateless horizontal scaling

Scale API Gateway, Ingest Service, and Fanout Service horizontally behind load balancers.
Keep request handling stateless; store session and subscription metadata in shared fast storage.

B. Partitioned event streaming

Partition location events by driver_id so ordering is preserved per driver.
Scale consumers by adding partitions and consumer instances.
Separate consumer groups for driver state, customer fanout, ETA, and storage.

C. Fast hot-state storage

Use Redis cluster or similar for latest location and order tracking projection.
Keep only current state in cache; use durable systems for source of truth and history.
Use TTL and eviction for stale drivers.

D. Region-based deployment

Deploy in multiple geographic regions.
Route drivers to nearest region for ingest to reduce latency.
Keep customer tracking in the same region as the order when possible.
Use cross-region replication only for required metadata, not every raw event globally.

E. Backpressure and degradation

If the system is overloaded, coalesce updates and publish only the latest driver position per small time window.
Dynamically reduce update frequency for slow-moving or stopped drivers.
Prioritize active tracked orders over idle-driver telemetry.
Drop clearly stale superseded events in the pipeline.

F. Efficient fanout

Push only to the customer associated with the order, not broad geo subscriptions.
Send deltas or compact payloads.
Limit update frequency to what is visually useful, for example 1 to 2 seconds maximum to customer app even if backend receives more frequently.

G. Reliability

Use at-least-once delivery in the broker with idempotent consumers.
Deduplicate by driver_id plus sequence_number or timestamp.
Heartbeats detect disconnected customers and offline drivers.
Persist last known snapshot so reconnecting customers can recover immediately.

H. Observability
Track:

ingest QPS
end-to-end latency from driver send to customer receive
stale location percentage
websocket connection counts
fanout failures and reconnects
per-region broker lag
cache hit ratio

Suggested end-to-end architecture

Driver App sends HTTPS location updates.
Global Load Balancer routes to nearest region.
API Gateway authenticates and rate limits.
Location Ingest Service validates and publishes to Kafka.
Driver State Service updates latest location in Redis or DynamoDB.
Order Tracking Service joins with active order assignment and computes tracking payload.
Realtime Fanout Service sends over WebSocket to subscribed customer.
Order Service manages lifecycle transitions.
Historical sink stores raw events in long-term storage.
ETA Service periodically updates ETA shown in customer app.

Key design decisions summary

Use HTTPS from drivers for robust mobile ingest.
Use WebSockets to customers for low-latency targeted push.
Use a broker to decouple ingest, state updates, fanout, and storage.
Store latest state in a fast key-value store and historical traces separately.
Scale horizontally with partitioning, regionalization, caching, and adaptive throttling.

This design provides low latency, targeted real-time updates, operational simplicity, and enough headroom to support tens of thousands of drivers and hundreds of thousands of tracking customers during peak demand.

Result

#1 | Winner

Winning Votes

3 / 3

Average Score

Judge Models OpenAI GPT-5.2

Total Score

Overall Comments

Strong, coherent event-driven architecture that clearly covers driver ingest, backend processing, targeted fanout to customers, protocol choices, data models, and scaling tactics. Good practical considerations (throttling, filtering, batching, fallbacks, regional routing, backpressure, idempotency). Minor gaps: limited discussion of security/privacy details (token scopes, PII, encryption at rest), exact WebSocket scaling approach (sticky sessions vs managed gateway), and more explicit capacity reasoning for 200k concurrent sockets and fanout throughput, though it is generally implied.

View Score Details ▼

Architecture Quality

Weight 30%

Well-structured end-to-end design with clear separation of concerns (ingest, broker, state, order join, fanout, history, ETA). Event streaming backbone and hot-state store are appropriate, and the flow from driver updates to customer-specific updates is logically connected.

Completeness

Weight 20%

Directly addresses all six requested aspects, including client behaviors, backend services, customer update mechanism, protocol justification, data models, and scaling. Could be more explicit on authZ rules per order, privacy/retention policies, and concrete WebSocket connection management details.

Trade-off Reasoning

Weight 20%

Gives solid justification for HTTPS vs MQTT and WebSockets vs polling, and mentions gRPC as an option with operational caveats. Some tradeoffs could be deeper (e.g., cost/ops tradeoffs of managed WebSocket gateways, Redis vs DynamoDB durability/latency, consistency needs for assignment joins).

Scalability & Reliability

Weight 20%

Good scaling plan: horizontal stateless services, partitioned streaming, TTL hot state, regionalization, backpressure/coalescing, and at-least-once with dedupe keys. Reliability aspects are covered, but it would be stronger with more explicit sizing for 200k concurrent WebSockets, multi-region failover strategy, and handling broker/Redis outages.

Clarity

Weight 10%

Easy to follow, well organized by prompt sections, with concrete examples of fields, pipeline steps, and scaling estimates. Terminology is consistent and the proposed components and interactions are clearly described.

Judge Models Anthropic Claude Opus 4.6

Total Score

Overall Comments

This is an excellent, comprehensive system design answer that thoroughly addresses all six aspects of the prompt. The architecture is coherent, well-structured, and demonstrates deep understanding of real-time systems at scale. The answer covers driver-to-backend communication, backend processing pipeline, customer-facing real-time updates, protocol justifications, data models, and scaling strategies in significant detail. It also goes beyond the minimum requirements by addressing practical concerns like battery consumption, backpressure, observability, graceful degradation, and fallback mechanisms. The protocol choices are well-justified with clear reasoning about why alternatives were not chosen. The data models are detailed with appropriate field selections and storage recommendations. The scaling strategy includes concrete traffic estimations and multiple complementary approaches. Minor areas for improvement include slightly more discussion of security considerations, geographic failover specifics, and perhaps a visual diagram description. Overall, this is a production-quality system design document.

View Score Details ▼

Architecture Quality

Weight 30%

The architecture is well-designed with clear separation of concerns across ingest, processing, fanout, and storage paths. The event-driven approach with Kafka as the central broker is appropriate for this use case. The pipeline from driver to customer is logically sound with proper decoupling. The inclusion of an ETA service, historical storage, and monitoring shows mature architectural thinking. The only minor gap is the lack of explicit discussion of failure modes for individual components and how the system handles partial outages gracefully beyond general backpressure mentions.

Completeness

Weight 20%

All six required aspects are thoroughly addressed. The answer covers driver-to-backend communication, backend services, customer real-time updates, protocol choices with justification, detailed data models with field-level specifications, and a comprehensive scaling strategy. It also includes additional valuable elements like fallback mechanisms, observability, backpressure handling, battery considerations, and a clear end-to-end architecture summary. The data models include four distinct models covering all necessary entities. Very little is missing from the prompt requirements.

Trade-off Reasoning

Weight 20%

The protocol justifications are strong and well-reasoned. The answer clearly explains why HTTPS was chosen over MQTT for driver ingest, why WebSockets were chosen over polling for customers, and why Kafka serves as the internal broker. The discussion of why not polling and why not MQTT end-to-end shows genuine tradeoff analysis. The mention of gRPC as an alternative with conditions for when it would be appropriate adds depth. The adaptive frequency discussion balancing battery life against data freshness is practical. Could have slightly more discussion of consistency vs availability tradeoffs in the data layer.

Scalability & Reliability

Weight 20%

The scaling strategy is comprehensive and realistic. Traffic estimation with concrete numbers (10K-25K updates per second) grounds the design in reality. The answer covers horizontal scaling of stateless services, partitioned event streaming, fast hot-state storage with TTL, regional deployment, backpressure and graceful degradation, efficient targeted fanout, at-least-once delivery with idempotent consumers, and deduplication strategies. The reliability section covers heartbeats, reconnection snapshots, and stale data handling. The only minor gap is limited discussion of database replication strategies and disaster recovery specifics.

Clarity

Weight 10%

The answer is exceptionally well-organized with clear headings, numbered sections matching the prompt, and logical flow from component to component. The use of bullet points, labeled subsections, and a summary at the end makes it easy to follow. The processing pipeline is described as a clear step-by-step flow. Technical terms are used appropriately without unnecessary jargon. The suggested end-to-end architecture section provides a good summary. The only minor issue is that the length is substantial, but given the complexity of the topic, the detail is warranted and well-structured.

Judge Models Google Gemini 2.5 Flash-Lite

Total Score

Overall Comments

The design provides a comprehensive and well-reasoned approach to building a real-time driver tracking system. It addresses all aspects of the prompt, offering practical technology choices, clear justifications, and a solid strategy for scalability and reliability. The architecture is detailed and considers potential issues like connectivity and load. A minor area for potential enhancement could be more explicit detail on client-side battery optimization beyond frequency throttling.

View Score Details ▼

Architecture Quality

Weight 30%

The architecture is robust, event-driven, and uses appropriate services and patterns (API Gateway, Message Broker, microservices, Redis/DynamoDB for hot state). It clearly outlines the data flow from driver ingest to customer fanout, demonstrating a strong understanding of distributed systems. The choice of HTTPS for drivers and WebSockets for customers is well-justified for the specific use case.

Completeness

Weight 20%

100

All six aspects of the prompt are thoroughly addressed. This includes driver data transmission, backend services, customer data reception, protocol choices with justifications, data models for different entities (driver location, history, order, subscription), and a detailed scaling strategy for peak load. The system's interconnections and data flow are clearly explained.

Trade-off Reasoning

Weight 20%

The reasoning for protocol choices (HTTPS vs. gRPC, WebSockets vs. polling, MQTT) is strong and well-contextualized. The justifications for using HTTPS for driver ingest due to compatibility and simplicity, and WebSockets for customer updates due to efficiency and low latency, are persuasive. The explanation for avoiding MQTT is also sensible, focusing on operational complexity.

Scalability & Reliability

Weight 20%

The scaling strategy is detailed, covering horizontal scaling, partitioned event streaming, fast hot-state storage, regional deployments, backpressure mechanisms, efficient fanout, and robust reliability measures like at-least-once delivery and idempotency. The traffic estimation provides a good basis for the scaling approach, and the observability points are crucial for maintaining reliability.

Clarity

Weight 10%

The answer is well-structured, using clear headings and bullet points to present complex information. The language is precise, and the overall flow of the design is easy to follow. The diagrams implied by the text (e.g., processing pipeline, end-to-end architecture summary) are coherent and effectively communicate the design intent.

Answer B: Anthropic Claude Haiku 4.5

Real-Time Driver Tracking System Architecture

1. Driver Location Data Collection

Drivers' mobile devices will send location updates at regular intervals (every 5-10 seconds during active delivery). The mobile SDK will:

Capture GPS coordinates with timestamp and accuracy metrics
Buffer updates locally to handle network interruptions
Send batched location data via HTTPS POST requests to a dedicated Location Ingestion API
Implement exponential backoff for failed requests
Reduce frequency when driver is stationary to conserve bandwidth

2. Backend Services Architecture

**Location Ingestion Servic...

Show Full Answer ▼

Real-Time Driver Tracking System Architecture

1. Driver Location Data Collection

Drivers' mobile devices will send location updates at regular intervals (every 5-10 seconds during active delivery). The mobile SDK will:

Capture GPS coordinates with timestamp and accuracy metrics
Buffer updates locally to handle network interruptions
Send batched location data via HTTPS POST requests to a dedicated Location Ingestion API
Implement exponential backoff for failed requests
Reduce frequency when driver is stationary to conserve bandwidth

2. Backend Services Architecture

Location Ingestion Service:

Receives location updates from drivers
Validates data and performs basic filtering
Publishes events to a message queue (Kafka/RabbitMQ)
Returns acknowledgment to mobile client
Scales horizontally with load balancer

Location Processing Service:

Consumes location events from message queue
Updates driver location in cache (Redis)
Stores location history in time-series database (InfluxDB/TimescaleDB)
Calculates ETA and route optimization
Publishes location updates to notification service

Order Service:

Maintains order state and driver assignments
Queries current driver location from cache
Tracks order lifecycle events

Notification Service:

Subscribes to location update events
Determines which customers need updates (based on active orders)
Pushes updates to customers via WebSocket or push notifications

3. Customer Location Update Delivery

Customers receive real-time updates through:

WebSocket Connection: Persistent bidirectional connection for active tracking
Fallback Mechanism: HTTP long-polling for clients unable to maintain WebSocket
Push Notifications: Periodic updates (every 30-60 seconds) for background tracking

When a customer opens tracking:

Mobile app establishes WebSocket connection to Location Streaming Service
Client sends order ID to subscribe to driver location stream
Server validates customer-order relationship
Server streams location updates for assigned driver
Updates sent only when location changes significantly (geo-fencing)

4. Communication Protocol Selection

Primary: WebSocket

Justification: Persistent connection enables true real-time updates with minimal latency (100-500ms)
Bidirectional communication allows customers to request updates on-demand
Reduces server load compared to polling
Efficient for high-frequency updates

Secondary: MQTT (for driver-to-backend)

Lightweight protocol optimized for mobile devices
Built-in QoS levels ensure delivery reliability
Automatic reconnection handling
Reduced battery consumption on driver devices

Fallback: HTTP Long-Polling

For clients with WebSocket limitations
Server holds request open until new data available
Timeout after 30 seconds to prevent resource exhaustion

5. Data Models

Driver Location (Cache - Redis):

Key: driver:{driver_id}
Value: {
  driver_id: string,
  latitude: float,
  longitude: float,
  timestamp: unix_timestamp,
  accuracy: float,
  speed: float,
  heading: float,
  active_order_id: string
}
TTL: 5 minutes

Location History (Time-Series DB):

Table: driver_locations
Columns:
  - driver_id (indexed)
  - latitude
  - longitude
  - timestamp (indexed)
  - accuracy
  - order_id (indexed)
Retention: 90 days

Order Information (Primary DB - PostgreSQL):

Table: orders
Columns:
  - order_id (primary key)
  - customer_id (indexed)
  - driver_id (indexed)
  - restaurant_id
  - status (enum: pending, confirmed, picked_up, in_transit, delivered)
  - pickup_location (lat, lng)
  - delivery_location (lat, lng)
  - created_at
  - estimated_delivery_time
  - actual_delivery_time

Active Subscriptions (Cache - Redis):

Key: subscriptions:{order_id}
Value: Set of {customer_connection_id}
TTL: Order completion + 1 hour

6. Scaling Strategy

Horizontal Scaling:

Location Ingestion: Deploy multiple instances behind load balancer (target: <100ms p99 latency)
Location Processing: Kafka consumer groups with partition-per-driver strategy
WebSocket Servers: Sticky sessions with Redis for connection state
Database: Read replicas for location queries, write to primary

Caching Strategy:

Redis cluster for driver locations (in-memory, <10ms access)
Cache warming: Pre-load active driver locations on service startup
Geo-spatial indexing in Redis for proximity queries

Database Optimization:

Time-series database for location history (optimized for time-range queries)
Partitioning by date for location history tables
Indexes on (driver_id, timestamp) and (order_id, timestamp)
Archive old data to cold storage after 90 days

Message Queue Tuning:

Kafka with 100+ partitions for location events
Consumer group per service type
Batch processing: 1000 messages or 5-second window

Peak Load Handling (50K drivers, 200K customers):

Expected location updates: 50,000 drivers × 1 update/10s = 5,000 updates/second
Expected WebSocket connections: 200,000 customers (assume 30% active tracking = 60,000 concurrent)
Ingestion capacity: 10,000+ updates/second per instance (3-5 instances)
WebSocket servers: 60,000 connections ÷ 10,000 per server = 6-8 servers
Message queue throughput: 10,000+ messages/second

Monitoring & Alerting:

Track p50, p95, p99 latencies for location updates
Monitor queue depth and consumer lag
Alert on WebSocket connection failures
Track database query performance
Implement circuit breakers for cascading failure prevention

Cost Optimization:

Use CDN for static assets
Implement location update throttling based on driver speed
Compress location data in transit
Use spot instances for non-critical services during off-peak

Result

Winning Votes

0 / 3

Average Score

Judge Models OpenAI GPT-5.2

Total Score

Overall Comments

The answer presents a solid, modern event-driven architecture with clear separation of ingestion, processing, storage, and streaming to customers. It covers all requested aspects, includes sensible caching and time-series storage choices, and provides rough load calculations and scaling numbers. Tradeoffs are generally well explained (WebSockets vs polling, MQTT benefits), though there is a notable inconsistency: earlier it states drivers send HTTPS POST, later it proposes MQTT as primary for driver-to-backend without reconciling how both are used. Reliability and security considerations (authN/authZ, abuse prevention, data privacy) are mentioned only lightly (validation, relationship check) and could be more explicit for a production tracking system.

View Score Details ▼

Architecture Quality

Weight 30%

Strong high-level design with clear components (ingestion API, queue, processing, cache, time-series DB, order DB, streaming/notification). Data flow is coherent and practical for real-time tracking. Minor architectural ambiguity exists around whether drivers use HTTPS batching or MQTT as the main uplink, and the “notification service” vs “location streaming service” responsibilities could be tightened.

Completeness

Weight 20%

Addresses all six prompt areas: driver uplink behavior, backend services, customer delivery path, protocol choices with justification, data models (driver location, history, orders, subscriptions), and scaling strategy with peak-hour back-of-envelope calculations. Missing/underdeveloped pieces include explicit auth/token model, idempotency/duplicate handling, and more detail on how subscription routing is implemented across WebSocket nodes.

Trade-off Reasoning

Weight 20%

Gives reasonable justification for WebSockets (latency, reduced polling load) and MQTT (mobile efficiency, QoS) and includes fallbacks. However, it doesn’t fully discuss costs/complexity tradeoffs of MQTT vs HTTPS (operational overhead, firewall/NAT issues), or WebSocket scaling constraints (fanout, backpressure). The earlier HTTPS POST approach conflicts with the later “secondary: MQTT” without clarifying a hybrid strategy.

Scalability & Reliability

Weight 20%

Good scaling approach: horizontal services, Kafka partitions/consumer groups, Redis clustering, DB partitioning/retention, and monitoring. Includes throughput estimates and server counts. Reliability aspects are partially covered (buffering, backoff, queue decoupling, circuit breakers), but could better address exactly-once/at-least-once semantics, ordering per driver/order, reconnect behavior for WebSockets, and multi-region/failover considerations.

Clarity

Weight 10%

Well structured, easy to follow, and uses concrete bullets, step-by-step customer flow, and explicit data model sketches. Quantitative assumptions help readability. The main clarity issue is the mixed messaging about driver transport protocol (HTTPS vs MQTT) and slight overlap in naming between notification/streaming services.

Judge Models Anthropic Claude Opus 4.6

Total Score

Overall Comments

This is an excellent and comprehensive system design that addresses all six aspects of the prompt thoroughly. The architecture is coherent, well-structured, and demonstrates strong practical knowledge. The answer covers driver location ingestion, backend processing pipeline, customer update delivery, protocol choices with justifications, detailed data models, and a thorough scaling strategy with concrete calculations. It also goes beyond the minimum by addressing monitoring, cost optimization, and fallback mechanisms. Minor areas for improvement include slightly more depth on geographic sharding, consistency tradeoffs, and security considerations, but overall this is a very strong response.

View Score Details ▼

Architecture Quality

Weight 30%

The architecture is well-designed with clear separation of concerns across ingestion, processing, notification, and order services. The use of Kafka as a message bus between services provides good decoupling. The choice of Redis for real-time caching, a time-series database for location history, and PostgreSQL for order data shows thoughtful technology selection. The pipeline from driver device to customer device is logically connected and realistic. Minor improvement could include more detail on geographic partitioning or edge deployment for latency reduction.

Completeness

Weight 20%

All six aspects of the prompt are thoroughly addressed. The answer covers driver location sending (Section 1), backend services (Section 2), customer update delivery (Section 3), protocol choices with justification (Section 4), data models (Section 5), and scaling strategy (Section 6). It also includes extras like monitoring, cost optimization, fallback mechanisms, and battery conservation considerations. The only minor gap is the lack of explicit discussion on security (authentication of WebSocket connections, data encryption) and geographic distribution.

Trade-off Reasoning

Weight 20%

The answer provides good justification for protocol choices, explaining why WebSockets are preferred for customer-facing real-time updates and why MQTT is suitable for driver devices due to battery efficiency. The fallback to HTTP long-polling is well-reasoned. The tradeoff between update frequency and battery consumption is acknowledged. The choice of time-series DB vs relational DB for different data types shows awareness of tradeoffs. However, the answer could have explored more tradeoffs such as consistency vs availability in the caching layer, or the tradeoff between update frequency and system load in more depth.

Scalability & Reliability

Weight 20%

The scaling strategy is concrete and realistic with specific calculations: 5,000 updates/second from 50K drivers, 60K concurrent WebSocket connections, and specific instance counts. The use of Kafka partitioning, Redis clustering, horizontal scaling of ingestion services, and database read replicas demonstrates a solid understanding of scaling patterns. Reliability is addressed through local buffering on devices, exponential backoff, circuit breakers, and monitoring. The mention of sticky sessions for WebSocket servers and Redis for connection state is practical. Could have discussed geographic redundancy and disaster recovery more explicitly.

Clarity

Weight 10%

The answer is exceptionally well-organized with clear section headers matching the prompt structure. Data models are presented in a readable pseudo-schema format. The numbered flow for customer tracking subscription is easy to follow. Technical terms are used appropriately without being unnecessarily complex. The concrete numbers in the scaling section make the design tangible and verifiable.

Judge Models Google Gemini 2.5 Flash-Lite

Total Score

Overall Comments

The answer provides a comprehensive and well-structured design for a real-time driver tracking system. It addresses all aspects of the prompt with detailed explanations and appropriate technology choices. The discussion on communication protocols, data models, and scaling strategies is particularly strong. The only minor area for improvement would be a more explicit discussion on battery consumption for the driver's mobile device, although it is implicitly considered in the protocol choice for drivers.

View Score Details ▼

Architecture Quality

Weight 30%

The proposed architecture is logical, coherent, and effectively breaks down the system into manageable services (Ingestion, Processing, Order, Notification). The use of a message queue (Kafka) for decoupling and Redis for caching driver locations is well-justified. The flow of data from driver to customer is clear, and the integration of services is sound. The design considers practical aspects like buffering and backoff for mobile clients.

Completeness

Weight 20%

All six points of the prompt have been thoroughly addressed. The answer covers data collection, backend services, customer delivery, protocol choices, data models, and scaling. Each section is detailed enough to understand the proposed solution for that specific aspect.

Trade-off Reasoning

Weight 20%

The justification for choosing WebSocket for customer-to-backend communication is strong, highlighting low latency and efficiency. The choice of MQTT for driver-to-backend is also well-reasoned due to its lightweight nature, QoS, and battery efficiency. The inclusion of HTTP long-polling as a fallback is a good practical consideration, and the reasoning for each choice is clearly explained in the context of the problem.

Scalability & Reliability

Weight 20%

The scaling strategy is robust, covering horizontal scaling for various services, caching with Redis, database optimization (time-series DB, partitioning), message queue tuning, and explicit calculations for peak load. Reliability is addressed through considerations like buffering, exponential backoff, message queue QoS, and monitoring/alerting with circuit breakers.

Clarity

Weight 10%

The answer is exceptionally clear, well-organized, and easy to follow. The use of headings, subheadings, and code blocks for data models enhances readability. The language is precise, and the technical concepts are explained effectively. The numerical estimates for peak load further solidify the clarity of the scaling strategy.

Comparison Summary

Final rank order is determined by judge-wise rank aggregation (average rank + Borda tie-break). Average score is shown for reference.

Judges: 3

Winner OpenAI GPT-5.4

Winning Votes

3 / 3

Average Score

View this answer

Anthropic Claude Haiku 4.5

Winning Votes

0 / 3

Average Score

View this answer

View head-to-head record for this model pair

Judging Results

Judge Models Google Gemini 2.5 Flash-Lite

GPT-5.4 Winner 94

Claude Haiku 4.5 93

Judge Models Anthropic Claude Opus 4.6

GPT-5.4 Winner 92

Claude Haiku 4.5 91

Judge Models OpenAI GPT-5.2

GPT-5.4 Winner 90

Claude Haiku 4.5 85

Design a Real-Time Driver Tracking System for a Food Delivery App

Task Overview

Task Prompt

Answers

Answer A: OpenAI GPT-5.4

Answer B: Anthropic Claude Haiku 4.5

Real-Time Driver Tracking System Architecture

1. Driver Location Data Collection

2. Backend Services Architecture

Real-Time Driver Tracking System Architecture

1. Driver Location Data Collection

2. Backend Services Architecture

3. Customer Location Update Delivery

4. Communication Protocol Selection

5. Data Models

6. Scaling Strategy

Comparison Summary

Judging Results

Related Tasks

Command-Line File Synchronization Tool

Roleplay as a Seasoned Video Game Support Agent

Food Truck Launch Plan

Reimagining Urban Community Spaces

Urban Mobility Policy Analysis for Rivertown

Speech to City Council for a Community Garden

Announce New Hybrid Work Policy

Community Garden Project Plan

Related Links