Orivel

Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

Coding

Google Gemini 2.5 Flash VS OpenAI GPT-5.4

Implement a Lock-Free Concurrent LRU Cache

Implement a thread-safe LRU (Least Recently Used) cache in Python that supports concurrent reads and writes without using a global lock for every operation. Your implementation must satisfy the following requirements:

1. **Interface**: The cache must support these operations:
   - `__init__(self, capacity: int)`: Initialize the cache with a given maximum capacity (positive integer).
   - `get(self, key: str) -> Optional[Any]`: Return the value associated with the key if it exists (and mark it as recently used), or return `None` if the key is not in the cache.
   - `put(self, key: str, value: Any) -> None`: Insert or update the key-value pair. If the cache exceeds capacity after insertion, evict the least recently used item.
   - `delete(self, key: str) -> bool`: Remove the key from the cache. Return `True` if the key was present, `False` otherwise.
   - `keys(self) -> List[str]`: Return a list of all keys currently in the cache, ordered from most recently used to least recently used.
2. **Concurrency**: The cache must be safe to use from multiple threads simultaneously. Aim for a design that allows concurrent reads to proceed without blocking each other when possible (e.g., using read-write locks, fine-grained locking, or lock-free techniques). A single global mutex that serializes every operation is considered a baseline but suboptimal solution.
3. **Correctness under contention**: Under concurrent access, the cache must never return stale or corrupted data, must never exceed its stated capacity, and must maintain a consistent LRU ordering.
4. **Edge cases to handle**:
   - Capacity of 1
   - `put` with a key that already exists (should update value and move to most recent)
   - `delete` of a key that does not exist
   - Concurrent `put` and `get` on the same key
   - Rapid sequential evictions when many threads insert simultaneously
5. **Testing**: Include a test function `run_tests()` that demonstrates correctness of all operations in both single-threaded and multi-threaded scenarios. The multi-threaded test should use at least 8 threads performing a mix of `get`, `put`, and `delete` operations on overlapping keys, and should assert that the cache never exceeds capacity and that `get` never returns a value for a key that was never inserted.

Provide your complete implementation in Python. Use only the standard library (no third-party packages). Include docstrings and comments explaining your concurrency strategy and any design trade-offs you made.

22
Mar 23, 2026 17:47
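
For orientation, the interface in the LRU cache task above can be sketched with the single-lock design that the task itself describes as a baseline but suboptimal solution. This is a minimal illustration, not a reference answer: the class name `BaselineLRUCache` and the `_MISSING` sentinel are hypothetical, and the single `threading.Lock` deliberately serializes every operation.

```python
import threading
from collections import OrderedDict
from typing import Any, List, Optional

# Hypothetical sentinel so that delete() can tell "absent" apart from a stored None.
_MISSING = object()


class BaselineLRUCache:
    """Single-lock LRU cache: the baseline design, not the concurrent one the task asks for."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self._capacity = capacity
        self._lock = threading.Lock()
        # The most recently used entry sits at the end of the OrderedDict.
        self._data: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self._capacity:
                self._data.popitem(last=False)  # evict the least recently used entry

    def delete(self, key: str) -> bool:
        with self._lock:
            return self._data.pop(key, _MISSING) is not _MISSING

    def keys(self) -> List[str]:
        with self._lock:
            # Reverse so the result runs from most to least recently used.
            return list(reversed(self._data))
```

With a capacity of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, so `keys()` returns `["c", "a"]`. Because even `get` mutates the recency order, a plain read-write lock does not help on its own, which is one reason the task treats finer-grained designs as the harder goal.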

System Design

OpenAI GPT-5.2 VS Google Gemini 2.5 Flash

Design a URL Shortening Service

Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The ratio of read (redirect) requests to write (shorten) requests is 100:1.
3. Shortened URLs should be as short as possible but must support the expected volume for at least 10 years.
4. The system must achieve 99.9% availability.
5. Redirect latency must be under 50ms at the 95th percentile.
6. The service must degrade gracefully if a data center goes offline.

In your design, address each of the following areas:

A) API Design: Define the key API endpoints and their contracts.
B) Data Model and Storage: Choose a storage solution, justify your choice, explain your schema, and estimate the total storage needed over 10 years.
C) Short URL Generation: Describe your algorithm for generating short codes. Discuss how you avoid collisions and what character set and length you chose, with a mathematical justification for why the keyspace is sufficient.
D) Scaling and Performance: Explain how you would scale reads and writes independently. Describe your caching strategy, including cache eviction policy and expected hit rate. Explain how you meet the 50ms p95 latency requirement.
E) Reliability and Fault Tolerance: Describe how the system handles data center failures, data replication strategy, and what trade-offs you make between consistency and availability (reference the CAP theorem).
F) Trade-off Discussion: Identify at least two significant design trade-offs you made and explain why you chose one option over the other, including what you would sacrifice and gain.

Present your answer as a structured plan with clear sections corresponding to A through F.

21
Mar 22, 2026 21:21
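
The keyspace justification requested in the task above comes down to simple arithmetic on the stated volume. The sketch below assumes a 62-character alphabet (digits plus upper- and lower-case letters) and sequential, collision-free code assignment. Neither assumption comes from the task, and the helper name `min_code_length` is hypothetical.

```python
import string

# Assumed alphabet: 0-9, a-z, A-Z (62 characters). The task leaves this choice open.
ALPHABET = string.digits + string.ascii_letters

URLS_PER_MONTH = 100_000_000  # from the task statement
YEARS = 10                    # from the task statement


def min_code_length(total_codes: int, alphabet_size: int) -> int:
    """Return the smallest length whose keyspace holds at least total_codes codes."""
    length = 1
    while alphabet_size ** length < total_codes:
        length += 1
    return length


total = URLS_PER_MONTH * 12 * YEARS           # 12 billion codes over 10 years
length = min_code_length(total, len(ALPHABET))

print(f"codes needed: {total:,}")
print(f"minimum length: {length}")
print(f"keyspace at that length: {len(ALPHABET) ** length:,}")
```

Under these assumptions, five characters give about 0.9 billion codes, which is too few, while six give about 56.8 billion, which covers the 12 billion needed. A design that picks codes at random rather than sequentially would likely want extra headroom to keep collision retries rare.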

System Design

OpenAI GPT-5.4 VS Google Gemini 2.5 Flash

Design a URL Shortening Service

Design a URL shortening service (similar to bit.ly or tinyurl.com) that must handle the following constraints:

1. The service must support 100 million new URL shortenings per month.
2. The read-to-write ratio is 100:1 (i.e., for every URL created, it is accessed 100 times on average).
3. Shortened URLs must remain accessible for at least 5 years.
4. The system must achieve 99.9% uptime.
5. Redirect latency (from receiving a short URL request to issuing the HTTP redirect) must be under 50ms at the 95th percentile.

Your design should address all of the following areas:

A. **Short URL Generation Strategy**: How will you generate unique, compact short codes? Discuss the encoding scheme, expected URL length, and how you handle collisions or exhaustion of the key space.
B. **Data Storage**: What database(s) will you use and why? Estimate the total storage needed over 5 years. Explain your schema design and any partitioning or sharding strategy.
C. **Read Path Architecture**: How will you serve redirect requests at scale to meet the latency and throughput requirements? Discuss caching layers, CDN usage, and any replication strategies.
D. **Write Path Architecture**: How will you handle the ingestion of 100M new URLs per month reliably? Discuss any queuing, rate limiting, or consistency considerations.
E. **Reliability and Fault Tolerance**: How does your system handle node failures, data center outages, or cache invalidation? What is your backup and recovery strategy?
F. **Key Trade-offs**: Identify at least two significant trade-offs in your design (e.g., consistency vs. availability, storage cost vs. read performance, simplicity vs. scalability) and explain why you chose the side you did.

Present your answer as a structured design document with clear sections corresponding to A through F above.

47
Mar 20, 2026 17:43

System Design

Google Gemini 2.5 Flash VS Anthropic Claude Sonnet 4.6

Design a Global URL Shortening Service

Design a public URL shortening service similar to Bitly. Users can submit a long URL and receive a short alias; visiting the short link should redirect quickly to the original URL. The system must support custom aliases, optional expiration dates, basic click analytics, and abuse mitigation for malicious links.

Requirements and constraints:

- Functional requirements:
  - Create short URLs for long URLs.
  - Redirect short URLs to original URLs.
  - Support custom aliases when available.
  - Support optional expiration time per link.
  - Record click events for analytics.
  - Allow users to disable a link manually.
- Scale assumptions:
  - 120 million new short URLs per month.
  - 1.5 billion redirects per day.
  - Redirect traffic is globally distributed and read-heavy.
  - Analytics data should be queryable within 15 minutes.
- Performance targets:
  - Redirect p95 latency under 80 ms for most regions.
  - Short-link creation p95 under 300 ms.
  - 99.99% availability for redirects.
- Data and retention:
  - Links may live indefinitely unless expired or disabled.
  - Raw click events may be retained for 90 days; aggregated analytics for 2 years.
- Operational constraints:
  - Use commodity cloud infrastructure; do not assume one exotic managed product solves everything.
  - Budget matters: justify any replication, caching, and storage choices.
  - Short codes should be compact and reasonably hard to guess at large scale, but perfect secrecy is not required.

In your answer, provide:

1. A high-level architecture with major components and data flow.
2. Storage choices for link metadata, redirect path, and analytics events, with rationale.
3. A short-code generation strategy, including how to avoid collisions and handle custom aliases.
4. A scaling plan for global traffic, including caching, partitioning/sharding, and multi-region considerations.
5. A reliability plan covering failures, hot keys, disaster recovery, and degraded-mode behavior.
6. Key APIs and core data models.
7. Abuse mitigation and security considerations.
8. The main trade-offs you made and why.

46
Mar 20, 2026 11:03

Analysis

Anthropic Claude Sonnet 4.6 VS Google Gemini 2.5 Flash

Select the Most Promising School Lunch Reform

A public school district can fund only one lunch reform for the next two years. Analyze the options below and recommend which single option the district should choose. Your answer should compare the tradeoffs, address likely objections, and reach a clear conclusion.

District goals:

1. Improve student nutrition
2. Increase the number of students actually eating school lunch
3. Keep implementation realistic within two years
4. Avoid large ongoing cost overruns

Current situation:

- 12,000 students across 18 schools
- 46% of students currently choose school lunch
- Surveys suggest students often skip lunch because of taste, long lines, or lack of appealing choices
- The district can afford only one of the following options now

Option A: Hire trained chefs to redesign menus

- Upfront training and consulting cost: medium
- Ongoing food cost: slightly higher
- Expected effects: meals taste better, healthier recipes become more appealing, moderate increase in participation
- Risks: benefits depend on staff adoption and recipe consistency across schools

Option B: Add self-serve salad and fruit bars in every school

- Upfront equipment cost: high
- Ongoing food waste risk: high
- Expected effects: strong nutrition improvement for students who use the bars, modest participation increase overall
- Risks: staffing, sanitation, and uneven use by age group

Option C: Launch a mobile pre-order system for lunches

- Upfront technology and training cost: medium
- Ongoing cost: low to medium
- Expected effects: shorter lines, better forecasting, moderate participation increase, little direct nutrition improvement if menus stay the same
- Risks: unequal access for families with limited technology use, adoption challenges at first

Option D: Replace sugary desserts and fried sides with healthier defaults

- Upfront cost: low
- Ongoing cost: neutral
- Expected effects: direct nutrition improvement for all school lunch users, possible small drop in participation if students dislike changes
- Risks: student backlash, perception that lunch became less enjoyable

Write an analysis that identifies the best choice given the district goals and constraints. Do not invent new budget numbers or outside facts; reason only from the information provided.

45
Mar 19, 2026 21:45

Brainstorming

Google Gemini 2.5 Flash VS OpenAI GPT-5.4

Revenue Streams for a Small-Town Public Library Facing Budget Cuts

A small-town public library (serving a population of roughly 12,000) has just learned that its annual municipal funding will be cut by 30% starting next fiscal year. The library has the following assets and constraints:

Assets:

- A 6,000 sq ft building with a 200-person capacity community room
- A small parking lot (20 spaces)
- Two full-time librarians and three part-time staff
- A collection of 40,000 physical books and a modest digital catalog
- A makerspace with a 3D printer, laser cutter, and sewing machines
- Reliable high-speed internet and 15 public-use computers
- A small fenced garden area behind the building

Constraints:

- The library must remain free to enter and must continue lending books at no charge
- It cannot sell alcohol or host gambling
- Any new revenue activity must be legal in a typical U.S. municipality
- Staff cannot increase; volunteers may be recruited
- The library board will not approve anything that generates significant noise complaints from adjacent residential neighbors

Brainstorm as many distinct, practical revenue-generating or cost-saving ideas as you can. For each idea, provide:

1. A short name
2. A one-to-two sentence description of how it works
3. Which library asset it leverages

Aim for breadth across different categories (e.g., events, partnerships, services, space rental, grants, merchandising, digital, etc.).

53
Mar 19, 2026 19:59

System Design

Google Gemini 2.5 Flash VS Anthropic Claude Haiku 4.5

Design a Global URL Shortening Service

Design a globally available URL shortening service similar to Bitly. The service must let users create short links that redirect to long URLs, support custom aliases for paid users, track click analytics, and allow links to expire at a specified time.

Requirements:

- Handle 120 million new short links per day.
- Handle 4 billion redirects per day.
- Peak traffic can reach 3 times the daily average.
- Redirect latency target: p95 under 80 ms for users in North America, Europe, and Asia.
- Short-link creation latency target: p95 under 300 ms.
- Service availability target: 99.99% for redirects.
- Analytics data can be eventually consistent within 5 minutes.
- Custom aliases must be unique globally.
- Expired or deleted links must stop redirecting quickly.
- The system should tolerate regional failures without total service outage.

Assumptions you may use:

- Average long URL length is 500 bytes.
- Analytics events include timestamp, link ID, country, device type, and referrer domain.
- Read traffic is much higher than write traffic.
- You may choose SQL, NoSQL, cache, stream, CDN, and messaging technologies as needed, but justify them.

In your answer, provide:

1. A high-level architecture with main components and request flows.
2. Data model and storage choices for links, aliases, and analytics.
3. A scaling strategy for read-heavy traffic, including caching and regional routing.
4. A reliability strategy covering failover, consistency decisions, and handling regional outages.
5. Key trade-offs, bottlenecks, and at least three risks with mitigations.
6. A brief capacity estimate for storage and throughput using the numbers above.

56
Mar 19, 2026 18:51
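
The brief capacity estimate requested in the task above can be started directly from its stated numbers. The sketch below is only a first pass: it counts long-URL bytes alone and ignores per-link metadata, indexes, replication, and analytics events. All variable and function names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# Figures taken from the task statement.
NEW_LINKS_PER_DAY = 120_000_000
REDIRECTS_PER_DAY = 4_000_000_000
PEAK_FACTOR = 3
AVG_LONG_URL_BYTES = 500


def per_second(daily_count: int) -> float:
    """Convert a daily total into an average per-second rate."""
    return daily_count / SECONDS_PER_DAY


avg_write_qps = per_second(NEW_LINKS_PER_DAY)
avg_read_qps = per_second(REDIRECTS_PER_DAY)
peak_write_qps = avg_write_qps * PEAK_FACTOR
peak_read_qps = avg_read_qps * PEAK_FACTOR

# Raw long-URL payload only; real storage would be a multiple of this.
url_bytes_per_day = NEW_LINKS_PER_DAY * AVG_LONG_URL_BYTES
url_bytes_per_year = url_bytes_per_day * 365

print(f"writes/s avg {avg_write_qps:,.0f}, peak {peak_write_qps:,.0f}")
print(f"reads/s  avg {avg_read_qps:,.0f}, peak {peak_read_qps:,.0f}")
print(f"long-URL bytes: {url_bytes_per_day / 1e9:.0f} GB/day, "
      f"{url_bytes_per_year / 1e12:.1f} TB/year")
```

This gives roughly 1,400 writes and 46,000 redirects per second on average, about 139,000 redirects per second at the stated 3x peak, and about 60 GB of raw long-URL data per day (using decimal gigabytes).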

Showing 1 to 20 of 74 results
