System Design
Explore how AI models perform in System Design. Compare rankings, scoring criteria, and recent benchmark examples.
Genre overview
Compare architecture thinking, trade-off reasoning, and system design quality.
In this genre, the main abilities being tested are Architecture Quality, Completeness, and Trade-off Reasoning.
Unlike coding, this genre puts more weight on architecture choices, scale, reliability, and trade-off handling than on runnable implementation details.
A high score here does not mean the model will write the best working code or the clearest beginner-facing explanation.
Strong models here are useful for: architecture proposals, technical trade-offs, service design, and scaling discussions.
This genre alone cannot tell you: low-level implementation quality, exact correctness, or how well the model writes for a non-technical audience.
Top Models in This Genre
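This ranking covers this genre only. The rows appear to be sorted by win rate first and average score second: GPT-5 mini (75%, 84) ranks above Claude Sonnet 4.6 (60%, 85).
Last updated: Apr 25, 2026 09:38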
| Rank | Model | Provider | Win Rate | Average Score | Wins | Tasks |
|---|---|---|---|---|---|---|
| #1 | GPT-5.2 (Retired) | OpenAI | 100% | 90 | 4 | 4 |
| #2 | GPT-5.5 (NEW) | OpenAI | 100% | 89 | 1 | 1 |
| #3 | Claude Opus 4.6 (Retired) | Anthropic | 100% | 88 | 4 | 4 |
| #4 | GPT-5.4 (NEW) | OpenAI | 75% | 88 | 3 | 4 |
| #5 | GPT-5 mini | OpenAI | 75% | 84 | 3 | 4 |
| #6 | Claude Sonnet 4.6 | Anthropic | 60% | 85 | 3 | 5 |
| #7 | Claude Haiku 4.5 | Anthropic | 40% | 82 | 2 | 5 |
| #8 | Gemini 2.5 Pro | Google | 0% | 75 | 0 | 4 |
| #9 | Gemini 2.5 Flash | Google | 0% | 74 | 0 | 5 |
| #10 | Gemini 2.5 Flash-Lite | Google | 0% | 71 | 0 | 4 |
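The figures in the table are consistent with a sort on win rate, with average score as the tiebreaker. The sketch below checks that reading. It assumes the two count columns mean wins and tasks evaluated, which the page does not state; the function names are illustrative only.

```python
# Rows transcribed from the table: (model, wins, tasks, average_score).
# Assumption: the two count columns are "wins" and "tasks evaluated".
ROWS = [
    ("GPT-5.2", 4, 4, 90),
    ("GPT-5.5", 1, 1, 89),
    ("Claude Opus 4.6", 4, 4, 88),
    ("GPT-5.4", 3, 4, 88),
    ("GPT-5 mini", 3, 4, 84),
    ("Claude Sonnet 4.6", 3, 5, 85),
    ("Claude Haiku 4.5", 2, 5, 82),
    ("Gemini 2.5 Pro", 0, 4, 75),
    ("Gemini 2.5 Flash", 0, 5, 74),
    ("Gemini 2.5 Flash-Lite", 0, 4, 71),
]


def win_rate(wins, tasks):
    """Fraction of evaluated tasks the model won."""
    return wins / tasks


def rank(rows):
    """Sort by win rate (descending), then by average score (descending)."""
    return sorted(rows, key=lambda r: (-win_rate(r[1], r[2]), -r[3]))


for position, (model, wins, tasks, avg) in enumerate(rank(ROWS), start=1):
    print(f"#{position} {model}: {win_rate(wins, tasks):.0%} win rate, average {avg}")
```

Under that assumption the sort reproduces the published order, including GPT-5 mini (84) above Claude Sonnet 4.6 (85).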
What Is Evaluated in System Design
Scoring criteria and weight used for this genre ranking.
Architecture Quality
30.0%
This criterion is included to check Architecture Quality in the answer. It carries heavier weight because this part strongly shapes the overall result in this genre.
Completeness
20.0%
This criterion is included to check Completeness in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Trade-off Reasoning
20.0%
This criterion is included to check Trade-off Reasoning in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Scalability & Reliability
20.0%
This criterion is included to check Scalability & Reliability in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Clarity
10.0%
This criterion is included to check Clarity in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
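The five weights sum to 100%. One plausible way to combine them is a weighted average of per-criterion scores; the sketch below assumes that, since the page does not give the exact formula. The per-criterion scores in `example` are hypothetical.

```python
# Weights in percent, as listed for this genre.
WEIGHTS_PERCENT = {
    "Architecture Quality": 30,
    "Completeness": 20,
    "Trade-off Reasoning": 20,
    "Scalability & Reliability": 20,
    "Clarity": 10,
}


def weighted_score(criterion_scores):
    """Weighted average of per-criterion scores (assumed combining rule)."""
    missing = set(WEIGHTS_PERCENT) - set(criterion_scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    total = sum(WEIGHTS_PERCENT[name] * criterion_scores[name] for name in WEIGHTS_PERCENT)
    return total / 100


# Hypothetical scores on a 0-100 scale, for illustration only.
example = {
    "Architecture Quality": 90,
    "Completeness": 85,
    "Trade-off Reasoning": 80,
    "Scalability & Reliability": 95,
    "Clarity": 70,
}
print(weighted_score(example))  # 86.0
```

With these weights, a 10-point change in Architecture Quality moves the total by 3 points, while the same change in Clarity moves it by 1.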
Recent tasks
System Design
Design a Scalable Notification Service
You are a senior software engineer at a rapidly growing social media company. Your task is to design a scalable and reliable notification service. This service will be responsible for sending notifications to users about various events, such as new followers, likes on their posts, comments, and direct messages.
System Design
Design a Real-Time Notification Service
Outline a high-level system design for a real-time notification service for a social media platform. The service must meet the following requirements:

- **Scale:** 10 million Daily Active Users (DAU).
- **Volume:** Each user receives an average of 20 notifications per day.
- **Latency:** Notifications must be delivered to the user's device in under 2 seconds.
- **Channels:** Support for push notifications (mobile), email, and in-app notifications.
- **Reliability:** 99.9% availability and no loss of notification data.

Your design should cover the following aspects:

1. **Core Architecture:** Describe the key components (e.g., API Gateway, Notification Service, Message Queue, Workers) and their interactions.
2. **Database Schema:** Propose a basic database schema for storing user notifications and preferences.
3. **Scaling Strategy:** Explain how you would scale the system to handle the specified load and future growth.
4. **Reliability and Fault Tolerance:** Detail the measures you would take to ensure high availability and prevent data loss.
5. **Key Trade-offs:** Discuss at least two significant trade-offs you made in your design (e.g., consistency vs. availability, choice of database, push vs. pull model).
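The scale and reliability figures in this task translate into a load estimate and a downtime budget. The sketch below is a back-of-envelope calculation using only the numbers in the prompt; the 30-day month and the function names are assumptions, and real traffic would peak well above the daily average.

```python
DAILY_ACTIVE_USERS = 10_000_000
NOTIFICATIONS_PER_USER_PER_DAY = 20
AVAILABILITY_TARGET = 0.999
SECONDS_PER_DAY = 86_400


def average_rate_per_second(users, per_user_per_day):
    """Average notifications per second, spread evenly over a day."""
    return users * per_user_per_day / SECONDS_PER_DAY


def allowed_downtime_minutes(availability, days=30):
    """Downtime budget in minutes over a period of `days` (assumed 30-day month)."""
    return (1 - availability) * days * 24 * 60


daily_total = DAILY_ACTIVE_USERS * NOTIFICATIONS_PER_USER_PER_DAY
rate = average_rate_per_second(DAILY_ACTIVE_USERS, NOTIFICATIONS_PER_USER_PER_DAY)
budget = allowed_downtime_minutes(AVAILABILITY_TARGET)

print(f"{daily_total:,} notifications/day")
print(f"about {rate:,.0f} notifications/second on average")
print(f"about {budget:.1f} minutes of downtime allowed per 30 days")
```

This works out to 200 million notifications per day, roughly 2,300 per second on average, and about 43 minutes of allowed downtime per 30 days.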
System Design
Design a URL Shortening Service
Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The average read-to-write ratio is 100:1 (i.e., shortened URLs are accessed far more often than they are created).
3. Shortened URLs must remain accessible for at least 5 years after creation.
4. The system must achieve 99.9% uptime availability.
5. Redirect latency (from receiving a short URL request to issuing the HTTP redirect) must be under 50ms at the 95th percentile.

In your design, address all of the following:

- A. High-level architecture: Describe the major components (API servers, databases, caches, load balancers, etc.) and how they interact. Include a clear description of the request flow for both URL creation and URL redirection.
- B. Short URL generation strategy: Explain how you would generate unique short codes. Discuss the trade-offs between different approaches (e.g., hashing, counter-based, pre-generated key pools) and justify your choice.
- C. Data storage: Choose a database technology and schema. Estimate the storage requirements over 5 years given the constraints. Explain why your chosen database is appropriate.
- D. Scaling strategy: Explain how the system scales to handle the read-heavy traffic pattern. Discuss caching strategy, database partitioning or sharding approach, and how you would handle hot keys (viral URLs that receive disproportionate traffic).
- E. Reliability and fault tolerance: Describe how the system maintains 99.9% availability. Address what happens when individual components fail, and how you handle data replication and failover.
- F. Key trade-offs: Identify at least two significant design trade-offs you made and explain why you chose one side over the other given the stated constraints.
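Part C of this task asks for a storage estimate, which follows from the stated constraints once a record size is chosen. The sketch below is one way to run the numbers. The 500-byte record size and the 30-day month are assumptions made for illustration; they are not part of the task.

```python
NEW_URLS_PER_MONTH = 100_000_000
READ_TO_WRITE_RATIO = 100
RETENTION_YEARS = 5
SECONDS_PER_MONTH = 30 * 86_400  # assumed 30-day month

# Hypothetical average size of one stored mapping (short code, long URL, metadata).
ASSUMED_BYTES_PER_RECORD = 500

writes_per_second = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
reads_per_second = writes_per_second * READ_TO_WRITE_RATIO
total_urls = NEW_URLS_PER_MONTH * 12 * RETENTION_YEARS
storage_terabytes = total_urls * ASSUMED_BYTES_PER_RECORD / 10**12

print(f"about {writes_per_second:.1f} writes/second on average")
print(f"about {reads_per_second:,.0f} redirects/second on average")
print(f"{total_urls:,} URLs stored after {RETENTION_YEARS} years")
print(f"about {storage_terabytes:.1f} TB before replication and indexes")
```

Under these assumptions the write path is light (about 39 writes per second) while the read path carries roughly 3,900 redirects per second, which is why the task emphasizes caching.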
System Design
Design a URL Shortening Service
Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The ratio of read (redirect) requests to write (shorten) requests is 100:1.
3. Shortened URLs should be as short as possible but must support the expected volume for at least 10 years.
4. The system must achieve 99.9% uptime availability.
5. Redirect latency must be under 50ms at the 95th percentile.
6. The service must handle graceful degradation if a data center goes offline.

In your design, address each of the following areas:

- A) API Design: Define the key API endpoints and their contracts.
- B) Data Model and Storage: Choose a storage solution, justify your choice, explain your schema, and estimate the total storage needed over 10 years.
- C) Short URL Generation: Describe your algorithm for generating short codes. Discuss how you avoid collisions and what character set and length you chose, with a mathematical justification for why the keyspace is sufficient.
- D) Scaling and Performance: Explain how you would scale reads and writes independently. Describe your caching strategy, including cache eviction policy and expected hit rate. Explain how you meet the 50ms p95 latency requirement.
- E) Reliability and Fault Tolerance: Describe how the system handles data center failures, data replication strategy, and what trade-offs you make between consistency and availability (reference the CAP theorem).
- F) Trade-off Discussion: Identify at least two significant design trade-offs you made and explain why you chose one option over the other, including what you would sacrifice and gain.

Present your answer as a structured plan with clear sections corresponding to A through F.
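Part C of this task asks for a mathematical justification of the keyspace. A minimal sketch of that calculation follows, assuming a 62-character alphabet (digits plus lowercase and uppercase letters); the task itself leaves the character set open, and `min_code_length` is an illustrative helper.

```python
NEW_URLS_PER_MONTH = 100_000_000
YEARS = 10
ALPHABET_SIZE = 62  # assumption: 0-9, a-z, A-Z


def min_code_length(total_codes, alphabet_size=ALPHABET_SIZE):
    """Smallest length n such that alphabet_size ** n >= total_codes."""
    length = 1
    while alphabet_size ** length < total_codes:
        length += 1
    return length


total_codes = NEW_URLS_PER_MONTH * 12 * YEARS
length = min_code_length(total_codes)

print(f"{total_codes:,} codes needed over {YEARS} years")
print(f"minimum length: {length} characters")
print(f"keyspace at that length: {ALPHABET_SIZE ** length:,}")
```

Twelve billion codes exceed 62^5 (about 916 million) but fit within 62^6 (about 56.8 billion), so six characters is the shortest length that covers the volume. A design that picks codes at random rather than sequentially may want a longer code to keep the collision rate low.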
System Design
Design a URL Shortening Service
Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The read-to-write ratio is 100:1 (i.e., for every URL created, it is accessed 100 times on average).
3. Shortened URLs must remain accessible for at least 5 years.
4. The system must achieve 99.9% uptime.
5. Redirect latency (from receiving a short URL request to issuing the HTTP redirect) must be under 50ms at the 95th percentile.

Your design should address all of the following areas:

- A. **Short URL Generation Strategy**: How will you generate unique, compact short codes? Discuss the encoding scheme, expected URL length, and how you handle collisions or exhaustion of the key space.
- B. **Data Storage**: What database(s) will you use and why? Estimate the total storage needed over 5 years. Explain your schema design and any partitioning or sharding strategy.
- C. **Read Path Architecture**: How will you serve redirect requests at scale to meet the latency and throughput requirements? Discuss caching layers, CDN usage, and any replication strategies.
- D. **Write Path Architecture**: How will you handle the ingestion of 100M new URLs per month reliably? Discuss any queuing, rate limiting, or consistency considerations.
- E. **Reliability and Fault Tolerance**: How does your system handle node failures, data center outages, or cache invalidation? What is your backup and recovery strategy?
- F. **Key Trade-offs**: Identify at least two significant trade-offs in your design (e.g., consistency vs. availability, storage cost vs. read performance, simplicity vs. scalability) and explain why you chose the side you did.

Present your answer as a structured design document with clear sections corresponding to A through F above.
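One common answer to part A is to encode a unique integer ID in base 62. The sketch below illustrates that approach only; the task does not prescribe it, and the alphabet order and function names are assumptions. Because each ID is unique, the codes cannot collide, though sequential IDs make codes easy to guess.

```python
import string

# Assumed alphabet: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number):
    """Encode a non-negative integer ID as a base-62 short code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code):
    """Decode a base-62 short code back to its integer ID."""
    value = 0
    for char in code:
        value = value * BASE + ALPHABET.index(char)
    return value


# The last ID issued after 5 years at 100 million URLs per month.
last_id = 100_000_000 * 12 * 5 - 1
print(encode_base62(last_id), len(encode_base62(last_id)))
```

Every ID issued over the five-year window encodes to at most six characters, since 6 billion is below 62^6.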
System Design
Design a Global URL Shortening Service
Design a public URL shortening service similar to Bitly. Users can submit a long URL and receive a short alias; visiting the short link should redirect quickly to the original URL. The system must support custom aliases, optional expiration dates, basic click analytics, and abuse mitigation for malicious links.

Requirements and constraints:

- Functional requirements:
  - Create short URLs for long URLs.
  - Redirect short URLs to original URLs.
  - Support custom aliases when available.
  - Support optional expiration time per link.
  - Record click events for analytics.
  - Allow users to disable a link manually.
- Scale assumptions:
  - 120 million new short URLs per month.
  - 1.5 billion redirects per day.
  - Redirect traffic is globally distributed and read-heavy.
  - Analytics data should be queryable within 15 minutes.
- Performance targets:
  - Redirect p95 latency under 80 ms for most regions.
  - Short-link creation p95 under 300 ms.
  - 99.99% availability for redirects.
- Data and retention:
  - Links may live indefinitely unless expired or disabled.
  - Raw click events may be retained for 90 days; aggregated analytics for 2 years.
- Operational constraints:
  - Use commodity cloud infrastructure; do not assume one exotic managed product solves everything.
  - Budget matters: justify any replication, caching, and storage choices.
  - Short codes should be compact and reasonably hard to guess at large scale, but perfect secrecy is not required.

In your answer, provide:

1. A high-level architecture with major components and data flow.
2. Storage choices for link metadata, redirect path, and analytics events, with rationale.
3. A short-code generation strategy, including how to avoid collisions and handle custom aliases.
4. A scaling plan for global traffic, including caching, partitioning/sharding, and multi-region considerations.
5. A reliability plan covering failures, hot keys, disaster recovery, and degraded-mode behavior.
6. Key APIs and core data models.
7. Abuse mitigation and security considerations.
8. The main trade-offs you made and why.
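The scale assumptions in this task imply a much heavier read path and a large analytics volume. The sketch below derives a few figures from the prompt's numbers. It assumes a 30-day month and one raw click event per redirect; neither is stated in the task.

```python
REDIRECTS_PER_DAY = 1_500_000_000
NEW_LINKS_PER_MONTH = 120_000_000
RAW_EVENT_RETENTION_DAYS = 90
REDIRECT_AVAILABILITY = 0.9999
DAYS_PER_MONTH = 30  # assumption

redirects_per_second = REDIRECTS_PER_DAY / 86_400
read_to_write_ratio = REDIRECTS_PER_DAY * DAYS_PER_MONTH / NEW_LINKS_PER_MONTH

# Assumption: every redirect produces exactly one raw click event.
raw_events_retained = REDIRECTS_PER_DAY * RAW_EVENT_RETENTION_DAYS

downtime_minutes_per_month = (1 - REDIRECT_AVAILABILITY) * DAYS_PER_MONTH * 24 * 60

print(f"about {redirects_per_second:,.0f} redirects/second on average")
print(f"read-to-write ratio of about {read_to_write_ratio:.0f}:1")
print(f"{raw_events_retained:,} raw click events held at steady state")
print(f"about {downtime_minutes_per_month:.2f} minutes of redirect downtime per month")
```

The result is roughly 17,000 redirects per second on average, a read-to-write ratio near 375:1, about 135 billion raw events in the 90-day window, and a downtime budget of just over four minutes per month, which is why the budget constraint on replication and storage matters here.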