Answer A: Anthropic Claude Sonnet 4.6
Design: Scalable Real-Time Notification System

1. OVERALL ARCHITECTURE

The system is composed of several distinct layers that work together to ingest events, route them, and push them to connected clients with minimal latency.

Client Layer: Mobile and web clients maintain persistent WebSocket connections to a fleet of Connection Gateway servers. Each client authenticates on connect and registers its user ID with the gateway.

API Gateway / Load Balancer: A Layer-7 load balancer (e.g., AWS ALB or NGINX) sits in front of the Connection Gateways. It routes new WebSocket upgrade requests using consistent hashing on the user ID so that reconnects tend to land on the same gateway node, reducing state churn. It also exposes a REST endpoint for internal services to publish events.

Event Publishing Service: Internal platform services (e.g., the like, comment, and friend services) publish events to a central message broker. They call a thin Publishing API that validates the payload, enriches it with metadata (timestamp, notification ID), and writes it to the broker.

Message Broker (Kafka): Events are written to Kafka topics partitioned by user ID. This ensures ordered delivery per user and allows horizontal scaling of consumers. Kafka's durable log also provides the replay capability needed for at-least-once delivery guarantees.

Notification Fanout Service: A pool of stateless consumer workers reads from Kafka. For each event, a worker looks up the target user's subscription preferences in a fast cache (Redis), determines which users should receive the notification, and routes the message to the correct Connection Gateway. For high-fanout events (e.g., a celebrity post), a separate async fanout job is triggered to avoid blocking the hot path.

Connection Gateway (WebSocket Servers): These are stateful servers that maintain the open WebSocket connections. Each gateway holds an in-memory map of user ID to connection handle.
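As a rough illustration of the gateway's role, a minimal in-memory registry might look like the following. Names and shapes here are hypothetical; a production gateway would also handle authentication, heartbeats, concurrency, and disconnect cleanup.

```python
class ConnectionGateway:
    """Minimal sketch of a gateway's in-memory connection registry:
    user_id -> connection handle. Illustrative only."""

    def __init__(self):
        self._connections = {}  # user_id -> websocket-like object

    def register(self, user_id, websocket):
        # Called after the client authenticates on connect.
        self._connections[user_id] = websocket

    def unregister(self, user_id):
        self._connections.pop(user_id, None)

    def push(self, user_id, payload) -> bool:
        """Push a routed notification down the user's connection.
        False means the user is not connected to this gateway and
        the persistence layer is the fallback."""
        ws = self._connections.get(user_id)
        if ws is None:
            return False
        ws.send(payload)
        return True


# Tiny demo with a fake socket object exposing .send():
class _FakeSocket:
    def __init__(self):
        self.sent = []

    def send(self, payload):
        self.sent.append(payload)


gateway = ConnectionGateway()
sock = _FakeSocket()
gateway.register("user-1", sock)
delivered = gateway.push("user-1", '{"type": "like"}')
offline = gateway.push("user-2", '{"type": "like"}')
```

The boolean return mirrors the design's behavior: a failed push is simply dropped in the hot path, with Cassandra as the safety net.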
When a routed notification arrives (via an internal pub/sub channel like Redis Pub/Sub or a direct gRPC call), the gateway pushes it down the appropriate WebSocket connection. If the user is not connected, the gateway discards the push and relies on the persistence layer for later delivery.

Presence & Routing Service: A Redis cluster stores a mapping of user ID to gateway node ID with a short TTL, refreshed by heartbeats. The Fanout Service queries this to know which gateway to route a notification to. If no entry exists, the user is offline.

Notification Storage (Cassandra): All generated notifications are written to Cassandra, keyed by user ID and sorted by timestamp. This serves two purposes: it powers the notification inbox UI (users can scroll back through past notifications), and it enables at-least-once delivery (when a user comes online, the client fetches unread notifications from this store).

Delivery Acknowledgment: Clients send an ACK message over the WebSocket after receiving a notification. The gateway writes this ACK to Kafka, and a consumer marks the notification as delivered in Cassandra. Unacknowledged notifications older than a threshold are re-queued for delivery.

2. TECHNOLOGY CHOICES AND REASONING

WebSockets over Long Polling or SSE: WebSockets provide full-duplex, low-overhead persistent connections. Long polling wastes server resources with repeated HTTP handshakes and adds latency. Server-Sent Events (SSE) are unidirectional and less suitable for the ACK flow. At 1 million concurrent connections, WebSockets are the most resource-efficient choice. Each connection consumes roughly 10–50 KB of memory, making 1 million connections feasible across a moderately sized gateway fleet.

Kafka over RabbitMQ: Kafka is chosen for its high throughput (millions of messages per second), durable log storage, consumer group semantics, and the ability to replay messages.
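The per-user partitioning the design relies on (all of a user's events landing on one partition so they stay ordered) follows the usual keyed-message pattern: hash the key, take it modulo the partition count. A minimal illustration of the idea, not Kafka's actual default partitioner (which uses murmur2):

```python
import hashlib


def partition_for_user(user_id: str, num_partitions: int) -> int:
    """Map a user ID to a stable partition index so that all of
    that user's events go to the same partition and stay ordered."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    # Interpret the first 4 bytes as an unsigned int, then take the
    # modulus by the partition count.
    bucket = int.from_bytes(digest[:4], "big")
    return bucket % num_partitions


# The mapping is deterministic: the same user always gets the same partition.
p = partition_for_user("user-42", 32)
```

Because consumption within a partition is single-threaded per consumer, this hashing also gives the fanout workers a natural parallelism unit.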
RabbitMQ is a good broker for task queues, but its message model is less suited to the fan-out and replay patterns needed here. Kafka's partitioning by user ID also naturally parallelizes consumption. At 10,000 notifications per second, Kafka handles this with significant headroom.

Redis for Presence and Pub/Sub: Redis provides sub-millisecond reads for the presence lookup (user ID → gateway node). Redis Pub/Sub or Redis Streams can be used for the internal channel between the Fanout Service and the Connection Gateways, adding minimal latency to the delivery path.

Cassandra over MySQL/PostgreSQL: Notification history is a write-heavy, time-series workload with high cardinality (one partition per user). Cassandra's wide-column model, tunable consistency, and linear horizontal scalability make it ideal. A relational database would require complex sharding and struggle with the write throughput. Cassandra's eventual consistency is acceptable here since notification history is not a transactional record.

Stateless Fanout Workers: Keeping the fanout workers stateless allows them to scale horizontally by simply adding more Kafka consumer instances within the consumer group.

3. HOW THE DESIGN MEETS EACH REQUIREMENT

Scalability (1M concurrent users, 10K notifications/second): The Connection Gateways are horizontally scalable. A single modern server can hold 50,000–100,000 WebSocket connections, so 10–20 gateway nodes handle 1 million users. The load balancer distributes new connections. Kafka partitions scale the fanout workers. Cassandra scales writes linearly with nodes. Redis Cluster shards the presence data. No single component is a bottleneck.

Latency (P99 < 200 ms): The critical path is: event published → Kafka write (~5 ms) → fanout worker consumes and looks up presence in Redis (~5 ms) → routes to gateway via Redis Pub/Sub or gRPC (~5 ms) → gateway pushes over WebSocket (~10 ms network). The total is well under 50 ms in the median case.
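The critical-path figures can be sanity-checked with simple arithmetic. The numbers below are the illustrative estimates quoted in the text, not measurements:

```python
# Back-of-envelope check of the critical-path latency budget,
# using the median estimates from the design (all in milliseconds).
median_path_ms = {
    "kafka_write": 5,
    "fanout_consume_and_presence_lookup": 5,
    "route_to_gateway": 5,
    "websocket_push": 10,
}

total = sum(median_path_ms.values())
assert total == 25          # well under the 200 ms P99 budget

headroom = 200 - total      # slack left for consumer lag and jitter
```

A 25 ms median leaves roughly 175 ms of slack before the P99 target is breached, which is what absorbs consumer lag and network jitter at peak load.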
The 200 ms P99 budget accommodates Kafka consumer lag under peak load and network jitter. Keeping the fanout worker logic simple and the Redis lookups cached ensures the hot path stays fast.

Reliability (at-least-once delivery): Notifications are persisted to Cassandra before or concurrently with the push attempt. If the WebSocket push fails or the client does not ACK, the notification remains in the unread state in Cassandra. On reconnect, the client fetches unread notifications. Kafka's consumer offset commit is done only after the fanout worker has successfully routed the message, ensuring no event is silently dropped. This provides at-least-once semantics end to end.

Availability (99.95% uptime): All components are deployed in multiple availability zones. The load balancer, Kafka brokers, Redis Cluster nodes, Cassandra nodes, and fanout workers all run with N+1 or N+2 redundancy. Gateway node failures cause clients to reconnect (WebSocket reconnect logic with exponential backoff) and land on a healthy node within seconds. A Kafka replication factor of 3 ensures broker failures do not cause data loss. Cassandra's replication factor of 3 with quorum reads/writes tolerates node failures. This architecture comfortably achieves 99.95% uptime.

4. TRADE-OFFS

Complexity vs. Simplicity: This design has many moving parts: Kafka, Redis, Cassandra, WebSocket gateways, fanout workers, and a presence service. This is significantly more complex to operate than a simple polling system or a single-broker setup. The trade-off is justified by the scale requirements, but it demands a mature DevOps practice, good observability (distributed tracing, per-component metrics), and on-call expertise.

At-Least-Once vs. Exactly-Once: Exactly-once delivery would require distributed transactions across Kafka, Cassandra, and the gateway, adding significant latency and complexity. At-least-once is chosen instead, meaning a user might occasionally see a duplicate notification.
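The ACK bookkeeping behind the at-least-once guarantee can be sketched as follows. The class and names are hypothetical; in the actual design the pending/delivered state lives in Cassandra and ACKs flow through Kafka rather than in process memory:

```python
import time


class DeliveryTracker:
    """Sketch of ACK bookkeeping: a notification starts out pending
    when pushed; a client ACK marks it delivered; anything pending
    longer than the threshold is surfaced for redelivery."""

    def __init__(self, redelivery_after_s: float = 60.0):
        self.redelivery_after = redelivery_after_s
        self._pending = {}  # notification_id -> time pushed (monotonic)

    def sent(self, notification_id: str) -> None:
        self._pending[notification_id] = time.monotonic()

    def ack(self, notification_id: str) -> None:
        # Client ACK received over the WebSocket: mark delivered.
        self._pending.pop(notification_id, None)

    def due_for_redelivery(self) -> list:
        now = time.monotonic()
        return [nid for nid, pushed_at in self._pending.items()
                if now - pushed_at >= self.redelivery_after]


# Demo with a zero threshold so unacked items are immediately due:
tracker = DeliveryTracker(redelivery_after_s=0.0)
tracker.sent("n-1")
tracker.sent("n-2")
tracker.ack("n-1")
due = tracker.due_for_redelivery()
```

The redelivery sweep is exactly where duplicates can originate: a notification delivered but whose ACK was lost gets pushed again.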
Duplicates are handled on the client side by deduplicating on notification ID. For a social media notification (a like or a comment), a duplicate is a minor UX annoyance, not a critical failure, which makes this an acceptable trade-off.

Stateful Gateways: The WebSocket gateways are stateful (they hold live connections). This makes rolling deployments and failure recovery somewhat more complex than for a fully stateless service, and clients must implement reconnect logic. The trade-off is unavoidable: stateful gateways are necessary for real-time push, because there is no way to push to a client without a persistent connection.

Fanout at Read vs. Write Time: This design fans out at write time (the fanout worker pushes to each recipient's gateway immediately). For users with millions of followers, this creates a hot-path problem: a celebrity posting triggers millions of fanout operations. The mitigation is a separate high-fanout job queue for accounts above a follower threshold, processing them asynchronously. The trade-off is that high-follower users may experience slightly higher notification latency, which is generally acceptable.

Redis Pub/Sub Reliability: Redis Pub/Sub does not persist messages. If a gateway is briefly disconnected from Redis when a message is published, it misses that message. This is mitigated by the Cassandra persistence layer (the client will fetch the notification on reconnect), but it means the real-time push is best-effort, with the persistent store as the safety net. Using Redis Streams instead of Pub/Sub would add persistence at the cost of slightly more complexity.

Cost: Running Kafka, Redis Cluster, Cassandra, and a large WebSocket gateway fleet is expensive. A smaller platform might use a managed service like AWS SNS/SQS or Firebase Cloud Messaging to reduce operational burden. At the scale described (1 million concurrent users), the control and performance of a custom stack justify the cost.
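The client-side deduplication mentioned in the trade-offs can be as simple as a bounded set of recently seen notification IDs. An illustrative sketch, not a prescribed implementation:

```python
from collections import OrderedDict


class NotificationDeduper:
    """Client-side dedup for at-least-once delivery: remember the
    last `capacity` notification IDs and drop any repeat."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._seen = OrderedDict()  # notification_id -> None, insertion-ordered

    def should_display(self, notification_id: str) -> bool:
        if notification_id in self._seen:
            return False  # duplicate: already shown to the user
        self._seen[notification_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


# Demo: "a" arrives twice; the second copy is suppressed. With
# capacity=2, "a" is eventually evicted and would be shown again.
deduper = NotificationDeduper(capacity=2)
results = [deduper.should_display(n) for n in ["a", "a", "b", "c", "a"]]
```

The bounded window trades perfect dedup for constant memory, which matches the "duplicate is a minor annoyance" stance: an ID that ages out of the window could in principle be shown twice.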
Result
Winning Votes: 1 / 3
Overall Comments
Answer A presents a well-structured and coherent system design that covers all major components and requirements. It provides clear technology choices with solid justifications, a well-articulated critical path latency analysis, and a thoughtful trade-off section. The architecture is sound with WebSocket gateways, Kafka, Redis presence, and Cassandra storage. The trade-off analysis is particularly strong, covering complexity, at-least-once vs exactly-once, stateful gateways, fanout strategies, Redis Pub/Sub reliability, and cost considerations. The writing is clear and well-organized. However, it lacks some operational details like capacity planning numbers, failure mode analysis, backpressure mechanisms, security considerations, and batching/coalescing strategies.
Architecture Quality (Weight 30%): Answer A presents a clean, well-structured architecture with clearly defined components and data flows. The critical path is well-articulated, and the interaction between components (Kafka → Fanout Workers → Redis Presence → Gateway → WebSocket) is logical and sound. The consistent hashing on user ID for load balancing is a nice detail.
Completeness (Weight 20%): Answer A covers all four required areas (architecture, technology choices, requirement mapping, trade-offs) thoroughly. However, it lacks capacity planning numbers, explicit failure mode analysis, security considerations, backpressure mechanisms, and batching strategies that would make the design more complete.
Trade-off Reasoning (Weight 20%): Answer A's trade-off section is one of its strongest aspects. It covers six distinct trade-offs with clear reasoning: complexity vs. simplicity, at-least-once vs. exactly-once, stateful gateways, fanout at read vs. write time, Redis Pub/Sub reliability, and cost. Each trade-off is well-explained with practical implications. The Redis Pub/Sub reliability discussion is particularly insightful.
Scalability & Reliability (Weight 20%): Answer A addresses scalability and reliability requirements clearly, with good estimates for WebSocket connections per server (50–100k) and a clear critical-path latency breakdown. The at-least-once delivery mechanism via Cassandra persistence and client ACKs is well-explained. However, it lacks explicit capacity planning numbers and failure mode analysis.
Clarity (Weight 10%): Answer A is exceptionally well-written with clear, concise prose. The structure flows logically from architecture to technology choices to requirement mapping to trade-offs. Each section is focused and easy to follow. The latency breakdown with specific millisecond estimates is particularly clear and effective.
Overall Comments
Answer A presents a coherent end-to-end design with clear component responsibilities, concrete data flow, and stronger linkage between requirements and implementation details. It gives specific choices such as Kafka, Redis, Cassandra, WebSockets, ACK flow, presence routing, and unread recovery, and it discusses practical concerns like high-fanout users, Redis Pub/Sub reliability, and duplicate handling. Its main weakness is that some guarantees are a bit loosely specified at the gateway-to-client path, and a few sizing claims are optimistic, but overall it is concrete, practical, and well argued.
Architecture Quality (Weight 30%): Strong end-to-end architecture with clear publish, fanout, presence, gateway, storage, and ACK flows. Components interact logically, and the routing path for online users is well defined. Minor weakness: internal routing via Redis Pub/Sub is acknowledged as lossy, leaving some ambiguity in the hot-path reliability.
Completeness (Weight 20%): Covers architecture, technologies, requirements, and trade-offs well. It addresses online delivery, offline persistence, ACKs, availability, and high-fanout cases. Slightly less complete on observability, security, and operational controls than the other answer.
Trade-off Reasoning (Weight 20%): Trade-offs are specific and grounded in this design: at-least-once versus exactly-once, stateful gateways, write-time fanout versus high-fanout mitigation, and Redis Pub/Sub persistence trade-offs. The discussion is concrete and tied to user experience and operational cost.
Scalability & Reliability (Weight 20%): The scalability approach is convincing, with partitioned Kafka, sharded Redis, scalable gateways, and Cassandra for writes. Reliability is thoughtfully handled with durable storage, ACKs, unread recovery, and multi-AZ deployment. Small concern: the real-time gateway delivery path relies on a best-effort mechanism before fallback recovery.
Clarity (Weight 10%): Clear structure and readable prose. The answer moves from architecture to choices, requirements, and trade-offs in a straightforward way, making it easy to follow the system behavior.
Overall Comments
Answer A presents a very strong, clear, and correct system design. It follows a logical structure, makes sound technology choices with good justifications, and addresses all the core requirements of the prompt. Its primary strength is its clarity and conciseness. However, it lacks the exceptional depth and operational detail seen in Answer B, particularly concerning failure modes and advanced optimization strategies.
Architecture Quality (Weight 30%): The proposed architecture is excellent, featuring a standard and robust set of components (Kafka, Redis, Cassandra, WebSocket gateways). The data flow is logical and well-explained. It represents a solid, industry-standard solution.
Completeness (Weight 20%): The answer is very complete, addressing all four sections requested in the prompt thoroughly and effectively. It meets all the specified functional and non-functional requirements.
Trade-off Reasoning (Weight 20%): The trade-off analysis is strong and covers key decisions like at-least-once vs. exactly-once delivery and the stateful nature of gateways. The specific point about the reliability of Redis Pub/Sub is particularly insightful.
Scalability & Reliability (Weight 20%): The design clearly explains how each component scales horizontally and how at-least-once delivery is achieved. The reasoning is sound and directly addresses the NFRs.
Clarity (Weight 10%): The answer is exceptionally clear, concise, and well-structured. It follows the prompt's format exactly, making it very easy to read and digest the information.