
Design a Global URL Shortening Service

Compare model answers for this System Design benchmark and review scores, judging comments, and related examples.




Benchmark Genres

System Design


Task Prompt


Design a globally available URL shortening service similar to Bitly. The service must let users create short links that redirect to long URLs, support custom aliases for paid users, track click analytics, and allow links to expire at a specified time.

Requirements:
- Handle 120 million new short links per day.
- Handle 4 billion redirects per day.
- Peak traffic can reach 3 times the daily average.
- Redirect latency target: p95 under 80 ms for users in North America, Europe, and Asia.
- Short-link creation latency target: p95 under 300 ms.
- Service availability target: 99.99% for redirects.
- Analytics data can be eventually consistent within 5 minutes.
- Custom aliases must be unique globally.
- Expired or deleted links must stop redirecting quickly.
- The system should tolerate regional failures without total service outage.

Assumptions you may use:
- Average long URL length is 500 bytes.
- Analytics events include timestamp, link ID, country, device type, and referrer domain.
- Read traffic is much higher than write traffic.
- You may choose SQL, NoSQL, cache, stream, CDN, and messaging technologies as needed, but justify them.

In your answer, provide:
1. A high-level architecture with main components and request flows.
2. Data model and storage choices for links, aliases, and analytics.
3. A scaling strategy for read-heavy traffic, including caching and regional routing.
4. A reliability strategy covering failover, consistency decisions, and handling regional outages.
5. Key trade-offs, bottlenecks, and at least three risks with mitigations.
6. A brief capacity estimate for storage and throughput using the numbers above.

Judging Policy


A strong answer should present a coherent end-to-end architecture that clearly separates the write path, redirect path, and analytics pipeline. It should make reasonable capacity estimates from the given traffic and explain storage choices in relation to access patterns, latency, and consistency needs. Good answers should address global uniqueness for aliases, cache strategy, invalidation or fast expiry handling, and multi-region availability. They should discuss practical reliability mechanisms such as replication, failover, degradation behavior, and eventual consistency for analytics, while acknowledging trade-offs rather than claiming all goals can be optimized at once. Better answers will identify likely bottlenecks and operational risks and propose concrete mitigations.

Answers


1. High-Level Architecture and Request Flows:

**Main Components:**

* **API Gateway/Load Balancers:** Entry point for all user requests; handles routing, authentication, and rate limiting.
* **Shorten Service:** Microservice responsible for creating, updating, and deleting short links. Handles custom alias validation and expiration logic.
* **Redirect Service:** Microservice optimized for high-throughput, low-latency redirection. Records click events.
* **Distributed Cache (e.g., Redis Cluster):** Primary storage for `short_code` to `long_url` mappings to serve redirects with minimal latency.
* **Distributed SQL Database (e.g., CockroachDB, Google Spanner):** Stores the authoritative source of truth for all link metadata, ensuring global uniqueness and strong consistency.
* **Message Queue (e.g., Apache Kafka):** Ingests high-volume click events from the Redirect Service, decoupling it from analytics processing.
* **Analytics Processor (e.g., Apache Flink/Spark Streaming):** Consumes click events from the Message Queue, performs real-time aggregation, and stores data.
* **Data Warehouse (e.g., ClickHouse, Snowflake, BigQuery):** Stores raw and aggregated analytics data for reporting and analysis.
* **CDN (e.g., Cloudflare, Akamai):** Distributes static assets, provides global DNS resolution, and can offer geo-routing to the nearest data center.

**Request Flows:**

* **Short Link Creation:**
  1. User/Client sends a request to the API Gateway.
  2. API Gateway routes to a Load Balancer, then to the Shorten Service.
  3. Shorten Service generates a unique `short_code` (or validates a custom alias).
  4. It stores the `short_code`, `long_url`, `expires_at`, and other metadata in the Distributed SQL Database.
  5. It pushes the new `short_code` -> `long_url` mapping to the Distributed Cache.
  6. Shorten Service returns the `short_code` to the user.
* **Short Link Redirection:**
  1. User/Client accesses a short URL, which is routed via CDN/GeoDNS to the nearest data center's Load Balancer.
  2. Load Balancer directs to the Redirect Service.
  3. Redirect Service first checks the Distributed Cache for the `short_code` -> `long_url` mapping.
  4. *Cache Hit:* If found and active, it immediately issues an HTTP 301/302 redirect to the `long_url`.
  5. *Cache Miss:* If not found, it queries the Distributed SQL Database. If found and active, it populates the cache and then redirects. If not found, expired, or deleted, it returns a 404 error.
  6. Asynchronously, the Redirect Service publishes a click event (containing `short_code`, `timestamp`, `country`, `device_type`, `referrer_domain`) to the Message Queue.
* **Analytics Processing:**
  1. Analytics Processor consumes click events from the Message Queue.
  2. It performs real-time processing (e.g., aggregation, enrichment).
  3. Raw and aggregated data are stored in the Data Warehouse for reporting.

2. Data Model and Storage Choices:

* **Links & Aliases (Distributed SQL Database - CockroachDB/Google Spanner):**
  * **`links` table:**
    * `short_code` (VARCHAR, Primary Key): The unique short identifier.
    * `long_url` (VARCHAR): The original URL (up to 500 bytes).
    * `user_id` (UUID, Indexed, FK): Optional, for link ownership.
    * `created_at` (TIMESTAMP): When the link was created.
    * `expires_at` (TIMESTAMP, Nullable, Indexed): When the link should expire.
    * `status` (ENUM: 'active', 'expired', 'deleted', Indexed): Current state of the link.
    * `is_custom_alias` (BOOLEAN): True if it's a user-defined alias.
    * `click_count` (BIGINT): Denormalized, eventually consistent count of clicks (updated by analytics).
  * *Justification:* Chosen for strong consistency guarantees (critical for `short_code` and `custom_alias` uniqueness globally), ACID properties, and native multi-region replication capabilities. This simplifies global data management and ensures data integrity.
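The cache-hit/cache-miss redirect flow described above can be sketched in a few lines. This is a minimal illustration only: `CACHE` and `DB` are in-memory dictionaries standing in for the Redis Cluster and the Distributed SQL Database, and `resolve` is a hypothetical handler name, not an API from the answer.

```python
import time

# Hypothetical in-memory stand-ins for the Redis Cluster and the SQL database.
CACHE = {}  # short_code -> (long_url, expires_at)
DB = {
    "abc123": {"long_url": "https://example.com/page", "expires_at": None, "status": "active"},
    "old001": {"long_url": "https://example.com/gone", "expires_at": time.time() - 60, "status": "active"},
}

def resolve(short_code, now=None):
    """Return (http_status, location) following the cache-hit / cache-miss flow."""
    now = now if now is not None else time.time()
    hit = CACHE.get(short_code)
    if hit is None:
        # Cache miss: fall back to the source of truth.
        row = DB.get(short_code)
        if row is None or row["status"] != "active":
            return 404, None
        CACHE[short_code] = (row["long_url"], row["expires_at"])  # populate cache
        hit = CACHE[short_code]
    long_url, expires_at = hit
    if expires_at is not None and expires_at <= now:
        # Expired links must stop redirecting; drop the stale entry.
        CACHE.pop(short_code, None)
        return 404, None
    return 302, long_url  # 302 keeps the browser re-requesting, so clicks stay countable
```

In a real deployment the click event would also be published to Kafka asynchronously after the redirect is issued, as step 6 describes.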
* **Analytics Events (Message Queue - Apache Kafka; Data Warehouse - ClickHouse/Snowflake):**
  * **Kafka Topic (`click_events`):** Stores raw click event messages (e.g., JSON/Protobuf).
  * **Data Warehouse (`raw_clicks` table):**
    * `event_id` (UUID, Primary Key)
    * `short_code` (VARCHAR, Indexed)
    * `timestamp` (TIMESTAMP, Indexed)
    * `country` (VARCHAR, Indexed)
    * `device_type` (VARCHAR)
    * `referrer_domain` (VARCHAR)
  * **Data Warehouse (`aggregated_clicks` table):** (e.g., hourly/daily aggregates)
    * `short_code` (VARCHAR, PK)
    * `aggregation_time` (TIMESTAMP, PK)
    * `country` (VARCHAR, PK)
    * `total_clicks` (BIGINT)
  * *Justification:* Kafka provides high-throughput, fault-tolerant ingestion and decoupling. ClickHouse/Snowflake are optimized for analytical queries over massive datasets, supporting the eventual consistency requirement for analytics.

3. Scaling Strategy for Read-Heavy Traffic:

* **Distributed Cache (Redis Cluster):** This is the primary scaling layer for redirects. It will store `short_code` to `long_url` mappings in memory, handling the vast majority of 4 billion daily redirects. Redis Cluster offers horizontal scaling and high availability through sharding and replication.
* **Global CDN and Geo-Routing:** A CDN (e.g., Cloudflare) will serve static assets and provide intelligent DNS-based routing (GeoDNS) to direct users to the geographically closest data center, minimizing redirect latency.
* **Stateless Services:** Both Shorten and Redirect services are designed to be stateless, allowing for easy horizontal scaling by adding more instances behind load balancers in each region. Auto-scaling groups will dynamically adjust capacity based on traffic.
* **Database Read Replicas/Distributed Reads:** The Distributed SQL Database will inherently handle distributed reads across its nodes. If cache hit rates are lower than expected, or for less popular links, the database's ability to scale reads across its cluster will be crucial.
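The hourly roll-up that feeds an `aggregated_clicks` table can be sketched as below. This is a toy, single-process version of what the Analytics Processor (Flink/Spark) would do over Kafka; the `aggregate_clicks` function and the sample events are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timezone

def aggregate_clicks(events):
    """Roll raw click events up to (short_code, hour, country) -> total_clicks,
    mirroring the shape of the aggregated_clicks table."""
    totals = Counter()
    for e in events:
        hour = e["timestamp"].replace(minute=0, second=0, microsecond=0)
        totals[(e["short_code"], hour, e["country"])] += 1
    return totals

# Sample raw events following the raw_clicks schema (subset of fields).
events = [
    {"short_code": "abc123", "timestamp": datetime(2024, 5, 1, 12, 7, tzinfo=timezone.utc), "country": "US"},
    {"short_code": "abc123", "timestamp": datetime(2024, 5, 1, 12, 42, tzinfo=timezone.utc), "country": "US"},
    {"short_code": "abc123", "timestamp": datetime(2024, 5, 1, 13, 1, tzinfo=timezone.utc), "country": "DE"},
]
totals = aggregate_clicks(events)
```

A stream processor would emit these partial counts every few minutes, which is what keeps analytics within the 5-minute eventual-consistency budget.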
* **Short Code Generation:** For high-volume link creation, short codes can be pre-generated in batches and stored, or a distributed ID generation service (e.g., inspired by Twitter Snowflake) can be used to ensure unique, non-sequential codes, preventing database hotspots.

4. Reliability Strategy:

* **Failover:**
  * **Multi-Region Deployment:** All critical services (Shorten, Redirect, Database, Cache, Message Queue) are deployed in an active-active configuration across at least three geographically distinct regions (e.g., North America, Europe, Asia).
  * **Service-Level Failover:** Services are deployed in auto-scaling groups across multiple Availability Zones within each region. Load balancers automatically detect and route traffic away from unhealthy instances.
  * **Database Failover:** The Distributed SQL Database provides built-in multi-region replication and automatic failover mechanisms (e.g., Raft consensus in CockroachDB) to ensure continuous operation even if nodes or entire zones fail.
  * **Cache Failover:** Redis Cluster provides replication for data redundancy and automatic failover of master nodes.
  * **Message Queue Failover:** Kafka clusters are deployed with replication (e.g., 3 brokers, replication factor 3) across multiple Availability Zones to tolerate broker failures.
* **Consistency Decisions:**
  * **Strong Consistency (Link Creation/Aliases):** The Distributed SQL Database ensures strong consistency for `short_code` and `custom_alias` uniqueness. This is critical to prevent collisions and maintain data integrity.
  * **Eventual Consistency (Redirects):** The Distributed Cache operates with eventual consistency. When a link is created, updated (e.g., `expires_at` changes), or deleted, an event is published to a cache invalidation topic (e.g., Kafka). Cache nodes subscribe to this topic and invalidate/update their entries. A short TTL (e.g., 1-5 minutes) on cache entries acts as a fallback to prevent indefinite staleness.
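The "unique ID then base62" short-code strategy mentioned above can be sketched as follows. The encoder is standard base62; treating the input as a Snowflake-style ID is an assumption for illustration.

```python
# Base62 alphabet: digits, lowercase, uppercase (62 symbols total).
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def to_base62(n):
    """Encode a non-negative integer (e.g., a Snowflake-style unique ID) as base62."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

# A 7-character code space holds 62**7 (about 3.5 trillion) codes, giving
# decades of headroom at 120M new links/day.
```

Because the underlying IDs are unique, the resulting codes are collision-free by construction, which is what makes this approach preferable to random generation plus retry.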
  * **Eventual Consistency (Analytics):** Analytics data is eventually consistent within 5 minutes, handled by the asynchronous Message Queue and stream processing. This prioritizes redirect performance over immediate analytics updates.
* **Handling Regional Outages:**
  * **Global Load Balancing/DNS:** Intelligent DNS services (e.g., GeoDNS) and global load balancers automatically detect regional failures and reroute traffic to healthy, active regions.
  * **Data Replication:** The Distributed SQL Database replicates all link data across active regions. If one region becomes unavailable, other regions can continue to serve requests with minimal data loss and latency impact.
  * **Graceful Degradation:** If the Analytics Service or Message Queue experiences issues, the Redirect Service is designed to continue functioning by buffering events locally or, in extreme cases, dropping them, prioritizing the core redirect functionality.

5. Key Trade-offs, Bottlenecks, and Risks:

* **Key Trade-offs:**
  * **Consistency vs. Latency:** Strong consistency for link creation (via Distributed SQL) ensures data integrity but might incur slightly higher write latency. For redirects, eventual consistency via a highly optimized cache is chosen to achieve sub-80ms latency.
  * **Cache Size vs. Cost:** Extensive caching is vital for redirect performance but requires significant memory resources, leading to higher infrastructure costs. A balance must be struck between cache hit ratio and operational expense.
  * **Short Code Length vs. Namespace Size:** Shorter codes are more user-friendly but increase the probability of collisions and limit the total number of unique links. A 7-10 character base62 code provides a vast, practical namespace.
* **Bottlenecks:**
  * **Distributed Cache Capacity:** If the cache cannot handle the peak read throughput or if the active working set of links exceeds its memory capacity, redirects will fall back to the database, increasing latency and database load.
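The invalidation-event-plus-short-TTL pattern from the consistency discussion can be illustrated with a toy cache. `TTLCache` is a hypothetical in-process stand-in for a Redis node; its `invalidate` method plays the role of the handler subscribed to the cache-invalidation topic, and the TTL bounds staleness if an invalidation event is ever missed.

```python
import time

class TTLCache:
    """Toy per-key-TTL cache. A short TTL is the fallback against indefinite
    staleness; explicit invalidation handles the common, fast path."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, inserted_at)

    def put(self, key, value, now=None):
        self.store[key] = (value, now if now is not None else time.time())

    def get(self, key, now=None):
        now = now if now is not None else time.time()
        entry = self.store.get(key)
        if entry is None:
            return None
        value, inserted_at = entry
        if now - inserted_at > self.ttl:
            # TTL expired: treat as a miss so the caller refetches from the DB.
            del self.store[key]
            return None
        return value

    def invalidate(self, key):
        """Called when a link-deleted/expired event arrives on the topic."""
        self.store.pop(key, None)
```

With a 1-5 minute TTL, a missed invalidation event delays "stop redirecting" by at most the TTL, which is the trade-off the answer accepts.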
  * **Database Write Throughput:** While link creation is lower volume than redirects, 120 million links/day is substantial. The Distributed SQL Database must be able to handle this write load across regions without becoming a bottleneck.
  * **Network Latency between Regions:** Cross-region data replication and consistency checks, especially for write operations in a globally distributed system, can introduce inherent latency.
* **Risks & Mitigations:**
  * **Risk 1: Short Code Collisions (especially for random generation):**
    * *Mitigation:* Use a sufficiently long `short_code` (e.g., 7-10 characters using base62: a-z, A-Z, 0-9). Implement a robust generation strategy: pre-generate a large pool of unique codes, use a distributed ID generator (e.g., Snowflake-like) to generate unique IDs then convert to base62, or, for custom aliases, perform a direct uniqueness check in the database with optimistic locking and retries.
  * **Risk 2: Cache Staleness for Expired/Deleted Links:**
    * *Mitigation:* Implement a real-time cache invalidation mechanism. When a link's status changes (e.g., `expires_at` reached, `status` set to 'deleted'), the Shorten Service (or a dedicated background job) publishes an event to a Kafka topic. All Redirect Service instances and cache nodes subscribe to this topic and immediately invalidate the corresponding `short_code` entry. A short TTL (e.g., 1-5 minutes) on cache entries acts as a fallback.
  * **Risk 3: Database Hotspots due to Uneven `short_code` Distribution:**
    * *Mitigation:* For Distributed SQL databases, rely on their internal sharding and rebalancing capabilities. For custom aliases, the alias itself serves as the primary key, which should distribute well. For randomly generated short codes, ensure the generation algorithm produces sufficiently random codes to avoid sequential keys that could lead to hotspots. Monitor database partitions and rebalance as needed.

6. Capacity Estimate:

* **Throughput:**
  * **Redirects:** 4 billion/day = ~46,300 requests/second (average), peak ~138,900 requests/second (3x average).
  * **Link Creation:** 120 million/day = ~1,389 requests/second (average), peak ~4,167 requests/second (3x average).
  * **Analytics Events Ingestion:** ~46,300 events/second (average), peak ~138,900 events/second.
* **Storage:**
  * **Links Data (Distributed SQL Database):**
    * Average record size: ~100 bytes (short_code, long_url, timestamps, status, etc.).
    * Daily new links: 120 million * 100 bytes = 12 GB.
    * Total over 5 years: 12 GB/day * 365 days/year * 5 years = ~21.9 TB.
    * With 3x replication factor for high availability/multi-region: ~65.7 TB.
  * **Analytics Data (Data Warehouse):**
    * Average event size: ~100 bytes (short_code, timestamp, country, device, referrer).
    * Daily events: 4 billion * 100 bytes = 400 GB.
    * Total over 1 year retention: 400 GB/day * 365 days/year = ~146 TB.
    * With 3x replication factor: ~438 TB.
  * **Distributed Cache (Redis Cluster):**
    * Each cached entry: `short_code` (e.g., 10 bytes) + `long_url` (average 500 bytes) = ~510 bytes.
    * To cache 1 billion active links (a reasonable working set for popular links): 1 billion links * 510 bytes/link = 510 GB of cache memory. This is a significant but manageable size for a large, sharded Redis Cluster.
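The throughput and storage figures above can be reproduced with straightforward arithmetic. The sketch below uses the answer's own assumptions (including the ~100-byte link record, which the judges note is optimistic) and is a back-of-envelope check, not production code.

```python
def per_second(per_day):
    """Convert a per-day volume to an average per-second rate."""
    return per_day / 86_400  # seconds per day

redirects_avg = per_second(4_000_000_000)       # ~46,300 requests/second
redirects_peak = 3 * redirects_avg              # ~138,900 requests/second
creates_avg = per_second(120_000_000)           # ~1,389 requests/second

links_per_day_gb = 120_000_000 * 100 / 1e9      # 12 GB/day at ~100 B/record
links_5y_tb = links_per_day_gb * 365 * 5 / 1e3  # ~21.9 TB; x3 replication ~65.7 TB

cache_gb = 1_000_000_000 * 510 / 1e9            # ~510 GB for 1B cached 510-byte entries
```

Swapping in a more realistic ~600-byte link record (500-byte URL plus metadata) would multiply the link-store estimate by roughly six, which is worth keeping in mind when reading the totals above.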

Result

#2

Winning Votes

1 / 3

Average Score

77
Judge Models

OpenAI GPT-5.2

Total Score

79

Overall Comments

Answer A presents a coherent end-to-end design with a clear separation between create, redirect, and analytics paths. It chooses globally consistent storage (Spanner/CockroachDB) that naturally supports global uniqueness for aliases and multi-region availability, and it includes a practical cache invalidation approach for fast stop-redirecting on delete/expiry (Kafka invalidation + short TTL). Capacity/throughput math is mostly solid, though some record-size estimates (e.g., 100B/link) are optimistic and some details (e.g., exact geo-routing/anycast behavior, cache hierarchy) could be more explicit.

Score Details

Architecture Quality

Weight 30%
82

Strong componentization (API, shorten, redirect, cache, strongly consistent global DB, async analytics). Using Spanner/CockroachDB aligns well with global uniqueness and multi-region needs; redirect path optimized around cache with DB fallback and async eventing.

Completeness

Weight 20%
78

Covers all requested sections (flows, data model, caching/routing, reliability, trade-offs/risks, capacity). Some areas could be deeper (e.g., edge caching strategy, detailed regional routing/failover runbooks, deletion/expiration propagation timing).

Trade-off Reasoning

Weight 20%
74

Explains key trade-offs (consistency vs latency, cache cost, code length) and acknowledges cross-region latency/replication implications. Trade-offs are reasonable though not deeply quantified (e.g., write latency impact of strong consistency).

Scalability & Reliability

Weight 20%
83

Scales redirects via cache + geo-routing and decouples analytics via Kafka; multi-region active-active is aligned with 99.99% redirect availability. Strongly consistent multi-region DB supports regional failure tolerance; cache invalidation strategy addresses rapid disable/expiry (with TTL backstop).

Clarity

Weight 10%
76

Clear structure and readable bullets; flows are easy to follow. Some estimates/assumptions are a bit hand-wavy (record sizes, cache working set) but overall understandable.

Total Score

85

Overall Comments

Answer A provides a very strong and professional system design. It correctly identifies the key components for a global URL shortener, including a distributed SQL database for consistency, a distributed cache for latency, and a message queue for decoupling analytics. The request flows are logical, and the reliability strategy covering multi-region deployment and different consistency models is robust. The capacity estimates are reasonable. Its main weakness is a comparative lack of depth in its risk analysis and mitigation strategies when compared to top-tier answers; the risks identified are somewhat generic.

Score Details

Architecture Quality

Weight 30%
90

The architecture is very solid, featuring well-chosen components. The selection of a globally distributed SQL database like Spanner or CockroachDB is an excellent choice for ensuring strong consistency for global writes, which is a key requirement.

Completeness

Weight 20%
85

The answer addresses all six parts of the prompt thoroughly. The coverage is good across all sections, from architecture to capacity planning.

Trade-off Reasoning

Weight 20%
80

The discussion of trade-offs is good, covering standard points like consistency vs. latency. The risk analysis is solid but identifies relatively generic risks for this type of system.

Scalability & Reliability

Weight 20%
85

The strategies for scalability and reliability are robust, centered around a multi-region active-active deployment and a clear separation of consistency models. The use of a distributed SQL database inherently provides strong scalability and reliability for the data tier.

Clarity

Weight 10%
80

The answer is well-structured and clearly written. The information is presented in a logical order, making it easy to follow.

Total Score

66

Overall Comments

Answer A provides a solid, well-structured system design that covers all six required sections. It correctly identifies CockroachDB/Spanner for strong consistency on writes, Redis for caching, Kafka for analytics, and ClickHouse for the data warehouse. The request flows are clear and the data model is reasonable. Capacity estimates are present and mostly correct. However, Answer A has some weaknesses: the link record size estimate of 100 bytes seems too low given a 500-byte average long URL, and while the cache entry size calculation of 510 bytes is more realistic, the working-set assumption of 1 billion links is not well justified. The risks and mitigations section, while adequate, lacks the depth and specificity of more detailed treatments (e.g., no discussion of cache stampede, no concrete failover RTO numbers). The caching strategy is single-tier (Redis only) without mention of CDN caching for redirects or local in-memory caches, which is a notable gap for achieving p95 < 80 ms globally. The reliability section mentions active-active but doesn't deeply address how write conflicts for custom aliases would be handled during network partitions.

Score Details

Architecture Quality

Weight 30%
70

Answer A presents a clean architecture with appropriate component choices. CockroachDB/Spanner is a strong choice for global consistency. However, the caching strategy is single-tier (Redis only) without CDN caching for redirects, which is a significant gap for achieving p95 < 80ms globally. The redirect flow correctly describes cache-hit and cache-miss paths. The choice of 301/302 is mentioned but not discussed in terms of trade-offs.

Completeness

Weight 20%
65

Answer A covers all six required sections adequately. The data model is reasonable and storage choices are justified. However, it lacks network bandwidth estimates, doesn't provide a summary capacity table, and doesn't discuss implementation phasing. The capacity estimates are present but the link record size of 100 bytes is unrealistically low given the 500-byte average URL. The analytics storage estimate is reasonable.

Trade-off Reasoning

Weight 20%
60

Answer A identifies three trade-offs (consistency vs latency, cache size vs cost, short code length vs namespace) and three risks with mitigations. The trade-offs are valid but somewhat generic. The risks (collisions, cache staleness, DB hotspots) are relevant but lack the depth of specific failure scenarios. The mitigations are reasonable but not highly specific — for example, the cache staleness mitigation doesn't address thundering herd or cache stampede scenarios.

Scalability & Reliability

Weight 20%
68

Answer A describes multi-region active-active deployment, database failover via Raft consensus, cache failover via Redis Cluster replication, and Kafka replication. The graceful degradation strategy (buffering analytics events locally) is practical. However, it lacks specific RTO numbers, doesn't mention circuit breakers, and the expiration handling strategy relies on cache invalidation via Kafka which could have latency issues. The consistency decisions are well-reasoned with strong consistency for writes and eventual for reads/analytics.

Clarity

Weight 10%
65

Answer A is well-organized with clear section headers and logical flow. The writing is clear and technical terms are used appropriately. However, it lacks visual aids like tables or summary sections that would improve readability. The formatting is consistent but somewhat dense in places, particularly the reliability section.


## Global URL Shortening Service - System Design

### 1. High-Level Architecture

**Main Components:**
- **API Gateway & Load Balancer**: Global entry point with regional routing (North America, Europe, Asia)
- **Short Link Service**: Handles creation, validation, and alias management
- **Redirect Service**: Optimized for high-throughput redirects with minimal latency
- **Cache Layer**: Multi-tier caching (edge CDN, regional Redis, local in-memory)
- **Primary Data Store**: Distributed SQL database (PostgreSQL with sharding) for link metadata
- **Analytics Pipeline**: Event streaming (Kafka) → Stream processor (Flink/Spark) → Analytics DB (ClickHouse/Druid)
- **CDN**: Global content delivery for redirect responses
- **Message Queue**: For async operations (expiration, deletion, analytics)
- **Monitoring & Observability**: Distributed tracing, metrics, alerting

**Request Flows:**

*Create Short Link:*
1. Client → API Gateway (regional)
2. Short Link Service validates input, generates short code or validates custom alias
3. Check alias uniqueness in distributed cache + primary DB
4. Write to primary DB (with replication)
5. Invalidate cache, publish event to Kafka
6. Return short link (p95 < 300ms)

*Redirect (Read):*
1. Client → Nearest CDN edge location
2. CDN checks local cache; if miss, routes to regional cache (Redis)
3. Regional cache miss → Query primary DB with read replica
4. Response cached at CDN edge and regional Redis
5. Return 301/302 redirect (p95 < 80ms)
6. Async: publish click event to Kafka

*Analytics:*
1. Click events → Kafka topic
2. Stream processor aggregates by link_id, country, device, referrer
3. Write aggregated metrics to analytics DB every 1-5 minutes
4. Dashboard queries analytics DB for reports

---

### 2. Data Model and Storage Choices

**Link Metadata Table (Primary DB - PostgreSQL):**
```
links:
- link_id (UUID, primary key)
- short_code (VARCHAR(10), unique index)
- long_url (TEXT, max 500 bytes)
- user_id (UUID, indexed)
- custom_alias (VARCHAR(100), unique global index, nullable)
- created_at (TIMESTAMP)
- expires_at (TIMESTAMP, nullable, indexed)
- is_deleted (BOOLEAN, indexed)
- is_custom (BOOLEAN)
- created_region (VARCHAR(10))
- metadata (JSONB: title, description, tags)
```

**Analytics Events Table (Time-series DB - ClickHouse):**
```
click_events:
- event_id (UUID)
- link_id (UUID, indexed)
- timestamp (DateTime)
- country (VARCHAR(2))
- device_type (VARCHAR(20))
- referrer_domain (VARCHAR(255))
- user_agent (VARCHAR(500))
- ip_hash (VARCHAR(64))
```

**Storage Choices Justification:**
- **PostgreSQL (Primary)**: ACID guarantees for link creation, strong consistency for alias uniqueness, proven at scale with proper sharding
- **Redis (Regional Cache)**: Sub-millisecond latency for hot links, supports TTL for automatic expiration, distributed across regions
- **ClickHouse (Analytics)**: Optimized for time-series analytics, columnar storage reduces storage by 10-100x, fast aggregations
- **CDN (Cloudflare/Akamai)**: Global edge caching, reduces latency to <80ms p95, offloads origin traffic
- **Kafka (Event Stream)**: Decouples analytics from redirect path, enables replay, supports multiple consumers

---

### 3. Scaling Strategy for Read-Heavy Traffic

**Caching Architecture (Multi-Tier):**

1. **Edge CDN Cache** (Tier 1):
   - Cache redirect responses globally
   - TTL: 24 hours for active links, 5 minutes for expired links
   - Cache key: short_code
   - Hit ratio target: 95%+
   - Reduces origin traffic by 95%
2. **Regional Redis Cluster** (Tier 2):
   - 3 regions: us-east, eu-west, ap-southeast
   - Each region: Redis cluster with 6 nodes (3 primary, 3 replica)
   - Replication across availability zones
   - TTL: 1 hour for links, 5 minutes for expired links
   - Capacity: 500GB per region (hot links only)
3. **Local In-Memory Cache** (Tier 3):
   - Per-service instance LRU cache (100MB)
   - Reduces Redis round-trips for hottest 1% of links

**Read Replica Strategy:**
- Primary DB in us-east (write master)
- Read replicas in eu-west, ap-southeast (async replication, ~100ms lag)
- Route reads to nearest replica
- Fallback to primary if replica unavailable

**Horizontal Scaling:**
- Redirect Service: Auto-scale to 1000+ instances across regions
- Each instance handles ~4M redirects/day (4B / 1000)
- Load balancer distributes based on latency and capacity

**Short Code Generation:**
- Base62 encoding (0-9, a-z, A-Z) for 6-character codes
- 62^6 = 56 billion possible codes (sufficient for 120M/day for 400+ years)
- Distributed ID generator (Snowflake-like) with region prefix to avoid collisions
- Batch allocation: each service instance pre-allocates 10K codes

---

### 4. Reliability Strategy

**Failover & High Availability:**
- **Multi-region deployment**: Active-active in 3 regions
- **Database replication**: PostgreSQL streaming replication with automatic failover (Patroni)
- **Circuit breakers**: Prevent cascading failures when cache/DB unavailable
- **Graceful degradation**: If analytics DB down, still serve redirects; queue events locally

**Consistency Decisions:**
- **Strong consistency for writes**: Link creation uses distributed locks (Redis/Zookeeper) for alias uniqueness
- **Eventual consistency for reads**: Redirect responses may be stale by ~100ms (acceptable)
- **Eventual consistency for analytics**: 5-minute window acceptable per requirements
- **Expiration handling**: Lazy deletion (check expires_at on read) + background job (scan every hour)

**Regional Outage Handling:**
1. **Detect outage**: Health checks fail for 30 seconds
2. **Failover**: Route traffic to other regions
3. **Write operations**: Redirect to primary region (us-east) with fallback to local region
4. **Read operations**: Use read replicas in other regions
5. **Recovery**: Sync data from primary when region recovers
6. **Monitoring**: Alert on region unavailability, track failover events

**Specific Mitigations:**
- **Database failure**: Patroni auto-failover to replica (RTO < 30s)
- **Cache failure**: Bypass to DB with rate limiting to prevent thundering herd
- **CDN failure**: Direct traffic to regional cache
- **Network partition**: Assume partition heals; use quorum-based decisions
- **Data corruption**: Point-in-time recovery from backups (daily snapshots)

---

### 5. Key Trade-offs, Bottlenecks, and Risks

**Trade-offs:**
1. **Consistency vs. Latency**: Chose eventual consistency for analytics to keep redirect latency <80ms. Strong consistency for alias uniqueness adds ~50ms but necessary for correctness.
2. **Cache TTL vs. Freshness**: 24-hour CDN TTL reduces origin load but means expired links may redirect for up to 24 hours. Mitigated by 5-minute TTL for links marked deleted.
3. **Storage vs. Query Speed**: Denormalize analytics data (store country, device in events table) to enable fast aggregations, accepting 2-3x storage overhead.
4. **Regional Autonomy vs. Global Consistency**: Each region can serve reads independently but writes must coordinate for alias uniqueness, adding latency.

**Bottlenecks:**
1. **Alias Uniqueness Check**: Global distributed lock required; can become contention point during peak writes. Mitigation: Use Redis with Lua scripts for atomic check-and-set; shard by first character of alias.
2. **Database Write Throughput**: 120M new links/day = ~1,400 writes/sec. PostgreSQL can handle ~10K writes/sec per instance, so need 1-2 primary instances. Mitigation: Batch writes, use connection pooling (PgBouncer).
3. **Analytics Pipeline**: 4B clicks/day = ~46K events/sec. Kafka can handle this, but aggregation may lag. Mitigation: Use stream processor (Flink) for real-time aggregation; write to analytics DB every 1 minute.

**Three Major Risks & Mitigations:**

**Risk 1: Cache Stampede on Popular Link Expiration**
- *Scenario*: Viral link with 1M clicks/hour expires; all cache entries invalidate simultaneously; 1M requests hit DB
- *Impact*: DB overload, redirect latency spikes to seconds, SLA breach
- *Mitigation*:
  - Use probabilistic early expiration: refresh cache 5 minutes before actual expiration
  - Implement request coalescing: if multiple requests hit expired link, only first queries DB
  - Set max DB query rate with circuit breaker; return 404 if exceeded
  - Pre-warm cache for known expiring links

**Risk 2: Distributed Alias Collision Under Network Partition**
- *Scenario*: Network partition between regions; two users create same custom alias in different regions; partition heals; data conflict
- *Impact*: Alias uniqueness violated; one link becomes unreachable; data inconsistency
- *Mitigation*:
  - Use global distributed lock (Zookeeper/etcd) for alias writes; fail if lock unavailable
  - Implement conflict resolution: timestamp-based winner (earlier write wins)
  - Audit job: scan for duplicate aliases daily; alert on conflicts
  - Require alias writes to go through primary region only (accept higher latency)

**Risk 3: Analytics Data Loss During Stream Processing Failure**
- *Scenario*: Kafka consumer crashes; events not processed; analytics lag grows; memory fills; events dropped
- *Impact*: Inaccurate click counts; business metrics unreliable; customer trust loss
- *Mitigation*:
  - Use Kafka consumer groups with offset management; resume from last committed offset
  - Implement dead-letter queue for failed events; replay manually
  - Set memory limits on local event buffer; drop oldest events if full (with alerting)
  - Dual-write analytics: Kafka → Stream processor AND direct DB writes (with deduplication)
  - Monitor consumer lag; alert if lag > 5 minutes

---

### 6. Capacity Estimates

**Storage:**

*Link Metadata (PostgreSQL):*
- 120M new links/day × 365 days = 43.8B links/year
- Per link: 50 bytes (IDs, timestamps, flags) + 500 bytes (long URL) + 100 bytes (metadata) = 650 bytes
- Year 1: 43.8B × 650B = 28.5 TB
- With 3-year retention: 85.5 TB
- With replication (3x): 256.5 TB
- Sharded across 10 instances: 25.6 TB per instance (manageable)

*Analytics Events (ClickHouse):*
- 4B clicks/day × 365 days = 1.46T events/year
- Per event: 50 bytes (IDs, timestamps) + 100 bytes (country, device, referrer) = 150 bytes
- Year 1: 1.46T × 150B = 219 TB (uncompressed)
- ClickHouse compression: 10-20x → 11-22 TB/year
- 3-year retention: 33-66 TB (acceptable)

*Cache (Redis):*
- Hot links (1% of total): 438M links
- Per link in cache: 50 bytes (short_code, long_url pointer) = 50 bytes
- 438M × 50B = 22 GB per region
- 3 regions: 66 GB total (fits in 500GB per region allocation)

**Throughput:**

*Write Throughput:*
- 120M links/day ÷ 86,400 seconds = 1,389 writes/sec
- PostgreSQL: 1 primary instance handles 10K writes/sec → 1 instance sufficient
- Kafka: 1,389 events/sec → 1 partition sufficient (Kafka handles 100K+ msgs/sec per partition)

*Read Throughput:*
- 4B redirects/day ÷ 86,400 seconds = 46,296 reads/sec
- Peak (3x): 138,888 reads/sec
- CDN cache hit rate: 95% → 6,944 reads/sec to origin
- Redis hit rate: 99% of misses → 70 reads/sec to DB
- PostgreSQL: 1 read replica per region handles 10K reads/sec → 1 replica per region sufficient
- Redirect Service instances: 46K reads/sec ÷ 100 reads/sec per instance = 460 instances (baseline); 1,389 instances at peak

*Analytics Throughput:*
- 4B clicks/day = 46,296 events/sec
- Kafka: 1 topic, 10 partitions (parallelism)
- Stream processor: 10 instances, each handles 4,630 events/sec
- ClickHouse: Batch inserts every 1 minute; 46K events/min = manageable

**Network Bandwidth:**

*Redirect responses:*
- 46K
redirects/sec × 300 bytes (HTTP response) = 13.8 MB/sec - Peak: 41.4 MB/sec - CDN absorbs 95% → 2.1 MB/sec to origin at peak - Easily handled by 1 Gbps connection *Analytics events:* - 46K events/sec × 500 bytes = 23 MB/sec - Kafka cluster: 10 Gbps capacity → sufficient **Summary Table:** | Component | Capacity | Instances | Notes | |-----------|----------|-----------|-------| | PostgreSQL Primary | 10K writes/sec | 1 | Sharded for future growth | | PostgreSQL Read Replicas | 10K reads/sec | 3 (1 per region) | Async replication | | Redis Clusters | 100K ops/sec | 3 (1 per region) | 6 nodes each | | Redirect Service | 100 redirects/sec | 460 (baseline), 1,389 (peak) | Auto-scaling | | Kafka Brokers | 100K msgs/sec | 3 | Replication factor 3 | | Stream Processors | 4,630 events/sec | 10 | Flink/Spark | | ClickHouse Nodes | 1M inserts/sec | 3 | Distributed cluster | | CDN Edge Locations | Global | 200+ | Cloudflare/Akamai | --- ### Implementation Priorities 1. **Phase 1** (MVP): Single region, PostgreSQL + Redis + CDN, basic analytics 2. **Phase 2** (Scale): Multi-region, read replicas, Kafka analytics pipeline 3. **Phase 3** (Optimize): Distributed ID generation, advanced caching, ClickHouse analytics 4. **Phase 4** (Harden): Circuit breakers, chaos engineering, disaster recovery drills

Result

#1 | Winner
Winning Votes: 2 / 3
Average Score: 80
Judge Model: OpenAI GPT-5.2

Total Score: 71

Overall Comments

Answer B is well-structured and detailed, with multi-tier caching and many operational considerations. However, it has several correctness and feasibility issues for this scale: it proposes PostgreSQL “with sharding” plus a primary-in-us-east model that conflicts with the requirement to tolerate regional failures without broad impact and with global uniqueness semantics for custom aliases. It also explicitly allows 24h CDN TTL for active links, which violates the requirement that expired/deleted links must stop redirecting quickly. Several capacity numbers are inconsistent/unrealistic (e.g., 43.8B links/year from 120M/day), and some write/locking claims (global distributed locks) are underspecified and potentially high-latency/fragile under partition.

Score Details

Architecture Quality (Weight 30%): 66

Good high-level separation and multi-tier cache idea, but key architectural choices are shaky: PostgreSQL sharding + single write-primary undermines global availability goals; global uniqueness handled via locks is brittle. CDN caching of redirects with long TTL conflicts with fast invalidation needs.

Completeness (Weight 20%): 84

Very comprehensive: includes multi-tier caching parameters, detailed reliability mechanisms, several risks with mitigations, and phased rollout. Completeness is high even though some choices later conflict with requirements.

Trade-off Reasoning (Weight 20%): 68

Lists many trade-offs, but some reasoning is internally inconsistent with requirements (e.g., 24h TTL vs ‘stop redirecting quickly’) and with its own mitigations (special TTL for deleted links doesn’t address future expirations). Some mitigation strategies are heavy/complex (dual-write analytics) without clear justification.

Scalability & Reliability (Weight 20%): 61

Good caching and some resilience patterns (circuit breakers), but core data plane is not convincingly multi-region for writes and uniqueness (primary region, async replicas, distributed locks). Expiration approach (lazy + hourly scan) and CDN TTL risk serving expired/deleted links too long, hurting correctness and availability semantics under failures.

Clarity (Weight 10%): 86

Very clear, well-organized, and easy to scan with concrete parameters and tables. Clarity is strong even where the underlying numbers/assumptions are incorrect.

Total Score: 91

Overall Comments

Answer B is an exceptional and comprehensive system design. It excels in its level of detail and practical, operational considerations. Strengths include a sophisticated multi-tier caching strategy, a highly detailed and realistic risk analysis with multi-layered mitigations, and a thorough capacity estimation that includes a summary table and network bandwidth calculations. The use of clear formatting enhances readability significantly. Its choice of a primary-replica database architecture introduces write latency for non-primary regions, which is a notable trade-off for a global service, but this is a minor weakness in an otherwise outstanding response.

Score Details

Architecture Quality (Weight 30%): 85

The architecture is also very strong, with an excellent and detailed multi-tier caching strategy. However, the choice of a single-primary PostgreSQL database with read replicas introduces higher write latency for users outside the primary region, which is a significant trade-off for a global service.

Completeness (Weight 20%): 95

The answer is exceptionally complete. It not only covers all required sections but also adds extra details like network bandwidth estimates, a summary table for capacity, and an implementation plan. This goes beyond the prompt's requirements in a useful way.

Trade-off Reasoning (Weight 20%): 95

This is a standout section for Answer B. The risk analysis is exceptionally deep and practical, identifying specific, challenging operational scenarios like cache stampedes and distributed alias collisions, and providing detailed, multi-layered mitigation strategies that demonstrate significant real-world experience.

Scalability & Reliability (Weight 20%): 90

The scalability strategy is excellent, particularly the detailed three-tier caching model. The reliability plan is also very detailed, with specific steps for handling regional outages and concrete mitigations for component failures, such as using circuit breakers.

Clarity (Weight 10%): 95

The clarity is outstanding. The effective use of markdown formatting, code blocks for schemas, and especially the summary table in the capacity estimation section makes the complex information highly digestible and easy to navigate.

Total Score: 78

Overall Comments

Answer B provides a comprehensive and detailed system design that excels in several areas. It features a multi-tier caching architecture (CDN edge, regional Redis, local in-memory) which is more realistic for achieving the p95 < 80ms latency target globally. The capacity estimates are more detailed, including network bandwidth calculations, a summary table of component instances, and more granular throughput breakdowns showing how CDN and Redis hit rates cascade to reduce DB load. The risks section is particularly strong, with three well-developed risk scenarios including cache stampede, distributed alias collision under network partition, and analytics data loss — each with specific, actionable mitigations. The implementation priorities phasing is a nice addition. However, Answer B has some weaknesses: using PostgreSQL with sharding rather than a natively distributed SQL database like CockroachDB/Spanner makes global alias uniqueness harder to guarantee and requires more operational complexity. The cache entry size estimate of 50 bytes is unrealistically low (ignoring the 500-byte long URL). The Redis distributed lock approach for alias uniqueness is less robust than using a globally consistent database. Some throughput estimates (100 redirects/sec per instance) seem conservative.

Score Details

Architecture Quality (Weight 30%): 78

Answer B features a more sophisticated multi-tier caching architecture (CDN edge, regional Redis, local in-memory) which is more realistic for meeting global latency targets. The request flows are detailed with specific steps. However, using PostgreSQL with sharding rather than a natively distributed database makes global alias uniqueness more complex. The CDN caching of redirect responses is a practical and important detail. Circuit breakers and request coalescing are mentioned as concrete mechanisms.

Completeness (Weight 20%): 80

Answer B is notably more complete, including network bandwidth calculations, a detailed summary table of component instances, implementation phasing priorities, and more granular throughput cascading calculations showing how CDN and Redis hit rates reduce DB load. It covers all six required sections with greater depth. The cache entry size of 50 bytes is unrealistically low (ignoring the 500-byte URL), but the overall storage estimates for links (650 bytes per record) are more realistic. ClickHouse compression ratios are a useful practical detail.

Trade-off Reasoning (Weight 20%): 78

Answer B provides significantly more detailed and realistic risk analysis. The three major risks (cache stampede on popular link expiration, distributed alias collision under network partition, analytics data loss during stream processing failure) are well-developed with specific scenarios, impact descriptions, and multiple concrete mitigations. The cache stampede risk with probabilistic early expiration and request coalescing shows deep practical understanding. The trade-off between CDN TTL and link freshness (24-hour TTL meaning expired links may redirect) is an honest and important acknowledgment, though the 24-hour CDN TTL is arguably too aggressive.

Scalability & Reliability (Weight 20%): 75

Answer B provides more concrete reliability mechanisms including Patroni for PostgreSQL failover with RTO < 30s, circuit breakers to prevent cascading failures, and specific health check timeouts (30 seconds). The expiration handling combines lazy deletion with background scanning. The regional outage handling is described step-by-step. However, using PostgreSQL with distributed locks for alias uniqueness is less robust than a natively distributed database. The thundering herd mitigation (rate limiting on cache failure) is a practical detail. The auto-scaling numbers (460 baseline to 1,389 peak instances) show concrete capacity planning.

Clarity (Weight 10%): 75

Answer B is well-formatted with clear section headers, code-style data models, a summary capacity table, and implementation phasing. The use of tables for the capacity summary and the step-by-step regional outage handling improve readability. The risk scenarios are structured with scenario/impact/mitigation format which is easy to follow. The implementation priorities section adds practical value.

Comparison Summary

Final rank order is determined by judge-wise rank aggregation (average rank + Borda tie-break). Average score is shown for reference.

Judges: 3

Answer A: Winning Votes 1 / 3, Average Score 77

Answer B: Winning Votes 2 / 3, Average Score 80

Judging Results

Why This Side Won

Answer B wins because it provides significantly more depth and specificity across the most heavily weighted criteria. For architecture quality (30% weight), B's multi-tier caching strategy is more realistic for meeting latency targets, and the request flows are more detailed. For completeness (20% weight), B includes network bandwidth estimates, a summary capacity table, implementation phasing, and more granular throughput cascading calculations. For tradeoff reasoning (20% weight), B's risk scenarios are substantially more detailed and realistic, with specific scenarios like cache stampede and distributed alias collision under network partition. For scalability/reliability (20% weight), both are strong but B provides more concrete failover mechanisms (Patroni, circuit breakers, request coalescing) and specific RTO numbers. For clarity (10% weight), B's formatting with tables and clear section headers is slightly better organized. While Answer A makes a stronger choice with CockroachDB/Spanner for global consistency and has a more realistic cache entry size estimate, B's advantages across the weighted criteria outweigh these points.

Why This Side Won

Answer B is the winner due to its superior depth, clarity, and practical, operational focus. While both answers present solid architectures, B's discussion of a multi-tier caching strategy, its incredibly detailed analysis of specific risks like cache stampedes and alias collisions, and its more thorough capacity planning (including a summary table) demonstrate a deeper level of expertise. The clarity of presentation in B is also significantly better, making the complex design easier to understand. Although Answer A's choice of a globally distributed SQL database is arguably a better fit for the problem's global write requirements, B's overwhelming strengths in reasoning, detail, and completeness make it the better overall answer.

Judge Model: OpenAI GPT-5.2

Why This Side Won

Answer A wins on the most heavily weighted dimensions: it selects an architecture and data store that directly satisfy global uniqueness and multi-region failover requirements, and it provides a credible mechanism to stop redirects quickly for deleted/expired links via cache invalidation. Answer B contains critical requirement violations (notably long CDN TTL causing stale redirects) and major numerical/capacity inconsistencies, weakening confidence in scalability and correctness despite good structure and risk discussion.
