System Design
Explore how AI models perform in System Design. Compare rankings, scoring criteria, and recent benchmark examples.
Genre overview
Compare architecture thinking, trade-off reasoning, and system design quality.
In this genre, the main abilities being tested are Architecture Quality, Completeness, and Trade-off Reasoning.
Unlike coding, this genre puts more weight on architecture choices, scale, reliability, and trade-off handling than on runnable implementation details.
A high score here does not mean the model will write the best working code or the clearest beginner-facing explanation.
Strong models here are useful for: architecture proposals, technical trade-offs, service design, and scaling discussions.
This genre alone cannot tell you: low-level implementation quality, exact correctness, or how well the model writes for a non-technical audience.
Top Models in This Genre
This ranking is ordered by average score within this genre only.
Last updated: Mar 22, 2026 21:21
| Rank | Model | Vendor | Win Rate | Average Score | Wins | Total |
|---|---|---|---|---|---|---|
| #1 | GPT-5.2 | OpenAI | 100% | 90 | 3 | 3 |
| #2 | GPT-5.4 | OpenAI | 100% | 89 | 3 | 3 |
| #3 | Claude Opus 4.6 | Anthropic | 100% | 87 | 3 | 3 |
| #4 | GPT-5 mini | OpenAI | 75% | 84 | 3 | 4 |
| #5 | Claude Sonnet 4.6 | Anthropic | 60% | 85 | 3 | 5 |
| #6 | Claude Haiku 4.5 | Anthropic | 50% | 84 | 2 | 4 |
| #7 | Gemini 2.5 Pro | Google | 0% | 75 | 0 | 4 |
| #8 | Gemini 2.5 Flash | Google | 0% | 74 | 0 | 5 |
| #9 | Gemini 2.5 Flash-Lite | Google | 0% | 72 | 0 | 3 |
What Is Evaluated in System Design
Scoring criteria and weight used for this genre ranking.
Architecture Quality
30.0%
This criterion is included to check Architecture Quality in the answer. It carries heavier weight because this part strongly shapes the overall result in this genre.
Completeness
20.0%
This criterion is included to check Completeness in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Trade-off Reasoning
20.0%
This criterion is included to check Trade-off Reasoning in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Scalability & Reliability
20.0%
This criterion is included to check Scalability & Reliability in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Clarity
10.0%
This criterion is included to check Clarity in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
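As a rough illustration of how these weights could combine into one genre score, here is a minimal sketch. It assumes the final score is a plain weighted sum of per-criterion scores on a 0-100 scale; the page does not state the exact formula, and the per-criterion scores in `example` are made up.

```python
# Weights copied from the criteria list above.
WEIGHTS = {
    "Architecture Quality": 0.30,
    "Completeness": 0.20,
    "Trade-off Reasoning": 0.20,
    "Scalability & Reliability": 0.20,
    "Clarity": 0.10,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine 0-100 per-criterion scores into one weighted total.

    Assumption: the total is a linear weighted sum. The source page
    lists only the weights, not the aggregation rule.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical per-criterion scores for one answer.
example = {
    "Architecture Quality": 90,
    "Completeness": 85,
    "Trade-off Reasoning": 80,
    "Scalability & Reliability": 88,
    "Clarity": 95,
}

print(round(weighted_score(example), 1))  # 87.1
```

Under this assumption, a 10-point gain in Architecture Quality moves the total by 3 points, while the same gain in Clarity moves it by 1.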
Recent tasks
System Design
Design a URL Shortening Service
Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The ratio of read (redirect) requests to write (shorten) requests is 100:1.
3. Shortened URLs should be as short as possible but must support the expected volume for at least 10 years.
4. The system must achieve 99.9% uptime availability.
5. Redirect latency must be under 50ms at the 95th percentile.
6. The service must handle graceful degradation if a data center goes offline.

In your design, address each of the following areas:

A) API Design: Define the key API endpoints and their contracts.

B) Data Model and Storage: Choose a storage solution, justify your choice, explain your schema, and estimate the total storage needed over 10 years.

C) Short URL Generation: Describe your algorithm for generating short codes. Discuss how you avoid collisions and what character set and length you chose, with a mathematical justification for why the keyspace is sufficient.

D) Scaling and Performance: Explain how you would scale reads and writes independently. Describe your caching strategy, including cache eviction policy and expected hit rate. Explain how you meet the 50ms p95 latency requirement.

E) Reliability and Fault Tolerance: Describe how the system handles data center failures, data replication strategy, and what trade-offs you make between consistency and availability (reference the CAP theorem).

F) Trade-off Discussion: Identify at least two significant design trade-offs you made and explain why you chose one option over the other, including what you would sacrifice and gain.

Present your answer as a structured plan with clear sections corresponding to A through F.
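The keyspace justification requested in area C comes down to a short calculation. The sketch below assumes a 62-character alphabet (a-z, A-Z, 0-9), which the prompt does not mandate, and finds the shortest code length that covers 10 years of volume.

```python
ALPHABET_SIZE = 62             # assumption: a-z, A-Z, 0-9
URLS_PER_MONTH = 100_000_000   # from constraint 1
YEARS = 10                     # from constraint 3


def min_code_length(total_codes: int, alphabet_size: int) -> int:
    """Smallest length n such that alphabet_size ** n >= total_codes."""
    length = 1
    while alphabet_size ** length < total_codes:
        length += 1
    return length


total = URLS_PER_MONTH * 12 * YEARS           # 12,000,000,000 codes
length = min_code_length(total, ALPHABET_SIZE)

print(total)                    # 12000000000
print(length)                   # 6
print(ALPHABET_SIZE ** length)  # 56800235584
```

Six characters give about 56.8 billion codes against a need of 12 billion. That is enough for sequential allocation, but a design that picks codes at random would probably want a seventh character to keep the collision rate low as the space fills.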
System Design
Design a URL Shortening Service
Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The read-to-write ratio is 100:1 (i.e., for every URL created, it is accessed 100 times on average).
3. Shortened URLs must remain accessible for at least 5 years.
4. The system must achieve 99.9% uptime.
5. Redirect latency (from receiving a short URL request to issuing the HTTP redirect) must be under 50ms at the 95th percentile.

Your design should address all of the following areas:

A. **Short URL Generation Strategy**: How will you generate unique, compact short codes? Discuss the encoding scheme, expected URL length, and how you handle collisions or exhaustion of the key space.

B. **Data Storage**: What database(s) will you use and why? Estimate the total storage needed over 5 years. Explain your schema design and any partitioning or sharding strategy.

C. **Read Path Architecture**: How will you serve redirect requests at scale to meet the latency and throughput requirements? Discuss caching layers, CDN usage, and any replication strategies.

D. **Write Path Architecture**: How will you handle the ingestion of 100M new URLs per month reliably? Discuss any queuing, rate limiting, or consistency considerations.

E. **Reliability and Fault Tolerance**: How does your system handle node failures, data center outages, or cache invalidation? What is your backup and recovery strategy?

F. **Key Trade-offs**: Identify at least two significant trade-offs in your design (e.g., consistency vs. availability, storage cost vs. read performance, simplicity vs. scalability) and explain why you chose the side you did.

Present your answer as a structured design document with clear sections corresponding to A through F above.
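One common answer to area A is to base62-encode a unique integer ID, which avoids collisions by construction. The sketch below illustrates that technique only; the alphabet ordering and the function names `encode_base62` and `decode_base62` are choices made here, not something the prompt prescribes, and the source of unique IDs (a counter, a range allocator) is left out.

```python
# Assumed alphabet: digits, then uppercase, then lowercase (62 symbols).
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on unknown characters."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


# 100M URLs/month for 5 years is 6 billion IDs; the last one still fits in 6 characters.
last_id = 100_000_000 * 12 * 5 - 1
print(encode_base62(last_id), len(encode_base62(last_id)))
```

Sequential IDs produce guessable codes, so a design with this scheme may want to scramble the ID first if enumeration is a concern.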
System Design
Design a Global URL Shortening Service
Design a public URL shortening service similar to Bitly. Users can submit a long URL and receive a short alias; visiting the short link should redirect quickly to the original URL. The system must support custom aliases, optional expiration dates, basic click analytics, and abuse mitigation for malicious links.

Requirements and constraints:

- Functional requirements:
  - Create short URLs for long URLs.
  - Redirect short URLs to original URLs.
  - Support custom aliases when available.
  - Support optional expiration time per link.
  - Record click events for analytics.
  - Allow users to disable a link manually.
- Scale assumptions:
  - 120 million new short URLs per month.
  - 1.5 billion redirects per day.
  - Redirect traffic is globally distributed and read-heavy.
  - Analytics data should be queryable within 15 minutes.
- Performance targets:
  - Redirect p95 latency under 80 ms for most regions.
  - Short-link creation p95 under 300 ms.
  - 99.99% availability for redirects.
- Data and retention:
  - Links may live indefinitely unless expired or disabled.
  - Raw click events may be retained for 90 days; aggregated analytics for 2 years.
- Operational constraints:
  - Use commodity cloud infrastructure; do not assume one exotic managed product solves everything.
  - Budget matters: justify any replication, caching, and storage choices.
  - Short codes should be compact and reasonably hard to guess at large scale, but perfect secrecy is not required.

In your answer, provide:

1. A high-level architecture with major components and data flow.
2. Storage choices for link metadata, redirect path, and analytics events, with rationale.
3. A short-code generation strategy, including how to avoid collisions and handle custom aliases.
4. A scaling plan for global traffic, including caching, partitioning/sharding, and multi-region considerations.
5. A reliability plan covering failures, hot keys, disaster recovery, and degraded-mode behavior.
6. Key APIs and core data models.
7. Abuse mitigation and security considerations.
8. The main trade-offs you made and why.
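Two of the figures in this prompt translate directly into planning numbers: the 99.99% availability target and the 1.5 billion redirects per day. The short sketch below does that arithmetic, assuming a 365-day year and evenly spread traffic (real traffic will have peaks that this average hides).

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # assumption: non-leap year
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_budget_minutes(availability: float) -> float:
    """Minutes of allowed downtime per year for a given availability target."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second, assuming a perfectly even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


print(round(downtime_budget_minutes(0.9999), 2))        # 52.56
print(round(average_rate_per_second(1_500_000_000)))    # 17361
```

A budget of under an hour per year leaves little room for manual failover, which is one reason answers to this task tend to lean on multi-region redirect serving.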
System Design
Design a Global URL Shortening Service
Design a globally available URL shortening service similar to Bitly. The service must let users create short links that redirect to long URLs, support custom aliases for paid users, track click analytics, and allow links to expire at a specified time.

Requirements:

- Handle 120 million new short links per day.
- Handle 4 billion redirects per day.
- Peak traffic can reach 3 times the daily average.
- Redirect latency target: p95 under 80 ms for users in North America, Europe, and Asia.
- Short-link creation latency target: p95 under 300 ms.
- Service availability target: 99.99% for redirects.
- Analytics data can be eventually consistent within 5 minutes.
- Custom aliases must be unique globally.
- Expired or deleted links must stop redirecting quickly.
- The system should tolerate regional failures without total service outage.

Assumptions you may use:

- Average long URL length is 500 bytes.
- Analytics events include timestamp, link ID, country, device type, and referrer domain.
- Read traffic is much higher than write traffic.
- You may choose SQL, NoSQL, cache, stream, CDN, and messaging technologies as needed, but justify them.

In your answer, provide:

1. A high-level architecture with main components and request flows.
2. Data model and storage choices for links, aliases, and analytics.
3. A scaling strategy for read-heavy traffic, including caching and regional routing.
4. A reliability strategy covering failover, consistency decisions, and handling regional outages.
5. Key trade-offs, bottlenecks, and at least three risks with mitigations.
6. A brief capacity estimate for storage and throughput using the numbers above.
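The capacity estimate requested in item 6 can be worked out from the stated numbers alone. The sketch below counts only the long-URL payload (500 bytes per link); keys, metadata, indexes, replication, and analytics events are deliberately excluded and would add to the totals. Decimal units (1 GB = 10^9 bytes) are an assumption.

```python
SECONDS_PER_DAY = 86_400

# Figures taken from the prompt.
new_links_per_day = 120_000_000
redirects_per_day = 4_000_000_000
peak_factor = 3
avg_long_url_bytes = 500

# Average and peak request rates.
avg_write_qps = new_links_per_day / SECONDS_PER_DAY
avg_read_qps = redirects_per_day / SECONDS_PER_DAY
peak_write_qps = avg_write_qps * peak_factor
peak_read_qps = avg_read_qps * peak_factor

# Raw long-URL payload only, before metadata and replication.
url_gb_per_day = new_links_per_day * avg_long_url_bytes / 1e9
url_tb_per_year = url_gb_per_day * 365 / 1e3

print(f"writes: {avg_write_qps:,.0f}/s avg, {peak_write_qps:,.0f}/s peak")
print(f"reads:  {avg_read_qps:,.0f}/s avg, {peak_read_qps:,.0f}/s peak")
print(f"URL payload: {url_gb_per_day:.0f} GB/day, {url_tb_per_year:.1f} TB/year")
```

These figures put peak reads near 139,000 per second against roughly 4,200 peak writes per second, a read-to-write ratio of about 33:1.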
System Design
Design a Global URL Shortening Service
Design a public URL shortening service similar to Bitly. The service must let users create short links for long URLs, optionally specify a custom alias if available, and redirect users who visit the short link to the original destination. Include a basic analytics feature that reports total clicks per link and clicks by day for the last 30 days.

Assume the following constraints:

- 120 million new short links are created per month.
- 1.2 billion redirect requests are served per month.
- Read traffic is highly bursty, especially for viral links.
- The service is used globally and users expect low-latency redirects.
- Short links should remain valid for at least 5 years.
- Redirect availability target is 99.99 percent.
- Analytics may be eventually consistent by up to 10 minutes.
- The system should prevent obvious abuse at a basic level, but a full trust and safety platform is out of scope.

In your design, cover:

- High-level architecture and main components.
- Data model and storage choices for link mappings and analytics.
- ID or token generation strategy, including custom alias handling.
- API design for creating links, redirecting, and fetching analytics.
- Caching, partitioning, and replication strategy.
- Reliability approach, including failure handling and multi-region considerations.
- How you would scale for read-heavy traffic and viral hotspots.
- Key trade-offs in consistency, cost, latency, and operational complexity.

State any reasonable assumptions you make and justify your choices.
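The "clicks by day for the last 30 days" report in this prompt is, at its core, a bucketed count over click timestamps. The sketch below shows that aggregation in memory for a single link; the function name `clicks_by_day` and the choice of UTC day boundaries are assumptions, and a real design would compute this incrementally in a stream or batch job instead of over a list.

```python
from collections import Counter
from datetime import date, datetime, timedelta, timezone


def clicks_by_day(click_times, today: date, days: int = 30) -> dict[date, int]:
    """Count clicks per UTC day for the `days` days ending at `today`.

    `click_times` must be timezone-aware datetimes. Days with no clicks
    are included with a count of zero so the report has no gaps.
    """
    window_start = today - timedelta(days=days - 1)
    counts = Counter(ts.astimezone(timezone.utc).date() for ts in click_times)
    report = {}
    for offset in range(days):
        day = window_start + timedelta(days=offset)
        report[day] = counts.get(day, 0)
    return report


# Hypothetical click events for one link.
clicks = [
    datetime(2026, 3, 22, 10, 0, tzinfo=timezone.utc),
    datetime(2026, 3, 22, 11, 30, tzinfo=timezone.utc),
    datetime(2026, 3, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc),  # outside the window
]
report = clicks_by_day(clicks, today=date(2026, 3, 22))
print(report[date(2026, 3, 22)], sum(report.values()))  # 2 3
```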
System Design
Design a Real-Time Ride Matching Platform
Design the backend architecture for a ride-hailing platform that matches riders with nearby drivers in real time across multiple cities.

Your design should support these product requirements:

- Riders can request a trip by sending pickup and destination locations.
- Nearby available drivers should receive the request quickly, and one driver can accept it.
- The system must prevent double-booking of drivers.
- Riders and drivers should see live trip status updates such as requested, accepted, arrived, in progress, and completed.
- The platform should provide an estimated fare and estimated pickup time before confirmation.
- Trip history should be available to both riders and drivers.

Constraints and assumptions:

- 8 million daily ride requests.
- Peak load is 25 times the average request rate during commuting windows.
- Operates in 40 cities, with uneven traffic distribution.
- Location updates from active drivers arrive every 3 seconds.
- Acceptable rider-facing latency for initial driver matching is under 2 seconds at p95.
- Trip status updates should usually appear within 1 second.
- The system should remain available during a regional service outage affecting one data center.
- Exact payment processing details are out of scope, but trip records must be durable for later billing.
- Privacy, security, and regulatory concerns may be mentioned briefly, but the main focus is architecture and scaling.

In your answer, describe:

- The main services or components and their responsibilities.
- The data flow from ride request to driver assignment to trip completion.
- How you would store and query driver locations efficiently.
- How you would handle scaling for peak traffic and hotspot cities.
- How you would ensure reliability, fault tolerance, and data consistency where it matters.
- Key trade-offs in your design, including any places where you prefer eventual consistency over strong consistency, or vice versa.

You do not need to provide exact cloud vendor products. A clear architecture and reasoning-focused design is preferred over exhaustive implementation detail.
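One way to approach the driver-location question in this prompt is a fixed-size grid index: each location update moves the driver into a cell, and a rider lookup scans only the rider's cell and its neighbors. The sketch below is a single-process, in-memory illustration of that idea; the class name `DriverGrid`, the 0.01-degree cell size, and the degree-based distance ordering are all assumptions, and production systems often use geohash, S2, or H3 cells in a shared store instead.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumption: about 1.1 km per cell in latitude


class DriverGrid:
    """In-memory grid index of driver positions (illustrative only)."""

    def __init__(self) -> None:
        self._cells = defaultdict(set)  # (cell_x, cell_y) -> driver ids
        self._where = {}                # driver id -> (cell, lat, lon)

    @staticmethod
    def _cell(lat: float, lon: float) -> tuple[int, int]:
        return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))

    def update(self, driver_id: str, lat: float, lon: float) -> None:
        """Record a location update, moving the driver between cells if needed."""
        new_cell = self._cell(lat, lon)
        old = self._where.get(driver_id)
        if old is not None and old[0] != new_cell:
            self._cells[old[0]].discard(driver_id)
        self._cells[new_cell].add(driver_id)
        self._where[driver_id] = (new_cell, lat, lon)

    def nearby(self, lat: float, lon: float, rings: int = 1) -> list[str]:
        """Return drivers in the surrounding cells, closest first.

        Distance is measured in raw degrees, which ignores longitude
        shrinking with latitude; good enough for ranking within a city.
        """
        cx, cy = self._cell(lat, lon)
        found = []
        for dx in range(-rings, rings + 1):
            for dy in range(-rings, rings + 1):
                for driver_id in self._cells.get((cx + dx, cy + dy), ()):
                    _, dlat, dlon = self._where[driver_id]
                    found.append((math.hypot(dlat - lat, dlon - lon), driver_id))
        return [driver_id for _, driver_id in sorted(found)]


grid = DriverGrid()
grid.update("d1", 37.7750, -122.4190)
grid.update("d2", 37.7803, -122.4105)
grid.update("d3", 37.9000, -122.0000)   # far outside the search area
print(grid.nearby(37.7749, -122.4194))  # ['d1', 'd2']
```

This index answers only "who is close"; preventing double-booking still needs an atomic claim on the chosen driver, which is a separate consistency decision.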