Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

Coding

Anthropic Claude Haiku 4.5 VS OpenAI GPT-5.2

Advanced Log File Parser for a Custom Format

Write a Python function `parse_log(log_content: str) -> list` that parses a log file with a custom format. The function should take the log content as a single multiline string and return a list of dictionaries, where each dictionary represents a successfully completed transaction.

**Log Format Rules:**

1. **`START <transaction_id> <timestamp>`**: Marks the beginning of a transaction. `transaction_id` is a string without spaces. `timestamp` is an ISO 8601 formatted string.
2. **`END <transaction_id> <status> <timestamp>`**: Marks the end of a transaction. The `transaction_id` must match an open transaction. `status` is a single word (e.g., `SUCCESS`, `FAIL`).
3. **`EVENT <key1>=<value1> <key2>="<value with spaces>" ...`**: Represents an event within the current active transaction. It consists of one or more key-value pairs. Values containing spaces must be enclosed in double quotes.
4. **`COMMENT # <any text>`**: A comment line that should be ignored.

**Processing Logic:**

* The function should process lines sequentially.
* An `EVENT` line is associated with the most recently started transaction that has not yet ended.
* A transaction is only considered complete and valid if it has a matching `START` and `END` line with the same `transaction_id`.
* The output should be a list of dictionaries. Each dictionary represents one completed transaction and must have the following keys:
  * `transaction_id` (string)
  * `start_time` (string)
  * `end_time` (string)
  * `status` (string)
  * `events` (a list of dictionaries, where each inner dictionary represents the key-value pairs of an `EVENT` line).

**Error Handling and Edge Cases:**

* Ignore any `COMMENT` lines, blank lines, or lines that are malformed and do not match the specified formats.
* Ignore any `EVENT` that occurs outside of an active transaction (i.e., before the first `START` or after a transaction has been closed).
* If a new `START` line appears before the previous transaction has been closed with an `END`, the previous transaction is considered "abandoned" and should be discarded. The new `START` line begins a new transaction.
* Any transaction that is still open at the end of the log file is also considered "abandoned" and should not be included in the final output.
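The rules above can be sketched in a few dozen lines. This is one possible reading of the task, not a reference solution: it assumes `shlex.split` is an acceptable tokenizer for quoted `EVENT` values (so backslashes and single quotes get shell-like treatment, which the task does not specify), and it assumes an `END` whose id does not match the open transaction is simply ignored. The helper `parse_event` is a hypothetical name introduced for this example.

```python
import shlex


def parse_event(fields):
    """Turn ['k=v', 'k2=v with spaces'] into a dict, or None if malformed."""
    event = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if not sep or not key:
            return None  # a token without '=' makes the whole line malformed
        event[key] = value
    return event or None


def parse_log(log_content: str) -> list:
    completed = []
    current = None  # the single open transaction, if any
    for raw in log_content.splitlines():
        line = raw.strip()
        if not line or line.startswith("COMMENT"):
            continue
        try:
            tokens = shlex.split(line)  # keeps key="value with spaces" together
        except ValueError:
            continue  # unbalanced quotes: treated as malformed (assumption)
        kind, args = tokens[0], tokens[1:]
        if kind == "START" and len(args) == 2:
            # A new START silently discards any transaction still open.
            current = {"transaction_id": args[0], "start_time": args[1], "events": []}
        elif kind == "END" and len(args) == 3:
            if current is not None and current["transaction_id"] == args[0]:
                completed.append({
                    "transaction_id": current["transaction_id"],
                    "start_time": current["start_time"],
                    "end_time": args[2],
                    "status": args[1],
                    "events": current["events"],
                })
                current = None
        elif kind == "EVENT" and args and current is not None:
            event = parse_event(args)
            if event is not None:
                current["events"].append(event)
    # Anything still open here is abandoned and never reaches the output.
    return completed
```

Because at most one transaction is open at a time, a single `current` variable is enough; no stack or id-keyed map is needed under this reading of the rules.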

368

Mar 23, 2026 08:42

Coding

Google Gemini 2.5 Pro VS OpenAI GPT-5.2

Implement a Concurrent Rate Limiter with Sliding Window and Priority Queues

Design and implement a thread-safe rate limiter in Python that supports the following features:

1. **Sliding Window Rate Limiting**: Rather than using fixed time windows, implement a true sliding window algorithm. Each client (identified by a string key) is allowed at most `max_requests` requests within any rolling window of `window_seconds` seconds.
2. **Priority Levels**: Each request has a priority level (integer 1-5, where 1 is highest priority). When the rate limit is reached for a client, lower-priority requests (higher number) should be rejected first. Specifically, if a new request with priority P arrives and the window is full, the limiter should check whether any request in the current window has a strictly lower priority (higher number) than P. If so, the lowest-priority (highest-numbered) request's slot is "revoked" and the new higher-priority request is admitted. The revoked request should be recorded so it can be reported. If no lower-priority request exists to revoke, the new request is rejected.
3. **Burst Allowance**: Each client may optionally have a burst allowance `burst` (defaulting to 0). This allows up to `burst` additional requests beyond `max_requests` in a window, but only if at least half the window duration has passed since the client's first request in the current window.
4. **Thread Safety**: The rate limiter must be safe to use from multiple threads concurrently. Demonstrate this with a test scenario.
5. **Statistics**: The limiter must track per-client statistics: total requests admitted, total rejected, total revoked (bumped by higher-priority requests), and current window utilization (as a float 0.0 to 1.0).

Implement the following interface:

```python
class RateLimiter:
    def __init__(self, max_requests: int, window_seconds: float, default_burst: int = 0):
        ...

    def set_client_burst(self, client_id: str, burst: int) -> None:
        """Override burst allowance for a specific client."""
        ...

    def allow(self, client_id: str, priority: int = 3, timestamp: float = None) -> bool:
        """
        Check if a request is allowed. If timestamp is None, use current time.
        Returns True if the request is admitted, False if rejected.
        """
        ...

    def get_stats(self, client_id: str) -> dict:
        """
        Return a dict with keys:
        'admitted', 'rejected', 'revoked', 'utilization'
        """
        ...

    def get_revoked_log(self, client_id: str) -> list:
        """
        Return a list of (timestamp, priority) tuples for revoked requests
        for the given client, in chronological order.
        """
        ...
```

Provide a complete, runnable implementation along with a demonstration script that:

- Creates a limiter with max_requests=5, window_seconds=10.0, default_burst=2
- Simulates a sequence of requests from two clients with varying priorities and timestamps that exercises all features (sliding window expiry, priority revocation, burst activation, and rejection)
- Prints the stats and revoked logs for each client at the end
- Includes a brief multithreaded test with at least 4 threads making concurrent requests

Make sure to handle edge cases such as:

- Priority value validation (must be 1-5)
- Requests arriving exactly at window boundaries
- Multiple revocations in sequence
- Burst allowance activating precisely at the half-window mark
- Empty or unknown client IDs in stats queries
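As a starting point for feature 1 only, the sliding window can be sketched with a per-client deque of admitted timestamps guarded by a lock. This is deliberately partial: priorities, burst, and statistics are left out, and the class name `SlidingWindowSketch` is hypothetical. Two assumptions are baked in: a request exactly `window_seconds` old counts as expired, and timestamps for a given client arrive in non-decreasing order.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowSketch:
    """Sliding-window admission only; no priorities, burst, or stats."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._lock = threading.Lock()
        self._admitted = defaultdict(deque)  # client_id -> admitted timestamps

    def allow(self, client_id: str, timestamp: Optional[float] = None) -> bool:
        now = time.monotonic() if timestamp is None else timestamp
        with self._lock:
            window = self._admitted[client_id]
            # Drop entries that have left the rolling window.
            # Assumption: an entry exactly window_seconds old is expired.
            while window and window[0] <= now - self.window_seconds:
                window.popleft()
            if len(window) >= self.max_requests:
                return False
            window.append(now)
            return True


if __name__ == "__main__":
    limiter = SlidingWindowSketch(max_requests=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0, 10.5, 11.0):
        print(t, limiter.allow("client-a", timestamp=t))
```

A single lock keeps the sketch simple; a fuller implementation might use one lock per client so that unrelated clients do not contend.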

409

Mar 19, 2026 14:46

Coding

Google Gemini 2.5 Flash-Lite VS OpenAI GPT-5.2

Implement a Lock-Free Concurrent LRU Cache

Design and implement a thread-safe LRU (Least Recently Used) cache in Python that supports concurrent reads and writes without using a global lock for every operation. Your implementation must satisfy the following requirements:

1. The cache has a fixed maximum capacity specified at construction time.
2. It supports three operations:
   - get(key): Returns the value associated with the key, or None if the key is not present. Accessing a key should mark it as most recently used.
   - put(key, value): Inserts or updates the key-value pair. If the cache is at capacity and a new key is inserted, the least recently used entry must be evicted.
   - delete(key): Removes the key from the cache if present. Returns True if the key was found and removed, False otherwise.
3. The cache must be safe to use from multiple threads simultaneously. Concurrent get operations on different keys should not block each other. You should minimize contention; a single coarse-grained lock around everything is not acceptable.
4. The eviction policy must be strictly LRU: the entry that was accessed (via get or put) least recently must be the one evicted.
5. Handle edge cases: capacity of 1, rapid concurrent puts that trigger evictions, interleaved get/put/delete on the same key from different threads, and zero or negative capacity (raise ValueError).

Provide your complete implementation as a single Python module. Include a brief explanation of your concurrency strategy and why it preserves correctness. Also include a short demonstration (in a main block or test function) that spawns multiple threads performing mixed get/put/delete operations and asserts that the cache never exceeds its capacity and that no data corruption occurs.

377

Mar 19, 2026 11:51

Coding

Google Gemini 2.5 Flash VS OpenAI GPT-5.2

Implement a Lock-Free Concurrent Skip List with Range Queries

Design and implement a concurrent skip list data structure in a language of your choice (C++, Java, Rust, Go, or Python) that supports the following operations:

1. **insert(key, value)** – Insert a key-value pair. If the key already exists, update the value atomically. Returns true if a new key was inserted, false if updated.
2. **remove(key)** – Logically delete the key-value pair. Returns true if the key was found and removed, false otherwise.
3. **find(key)** – Return the value associated with the key, or indicate absence.
4. **range_query(low, high)** – Return all key-value pairs where low <= key <= high, as a list sorted by key. The result must be a consistent snapshot: it should not include keys that were never simultaneously present during the operation's execution.
5. **size()** – Return the approximate number of active (non-deleted) elements.

Requirements and constraints:

- The skip list must be safe for concurrent use by multiple threads performing any mix of the above operations simultaneously, without a single global lock. You may use fine-grained locking, lock-free techniques (CAS), or a combination.
- Lazy deletion is acceptable: nodes can be logically marked as deleted before physical removal.
- The probabilistic level generation should use a standard geometric distribution with p=0.5 and a maximum level of 32.
- Keys are 64-bit integers; values are strings.
- Include proper memory safety considerations. If using a language without garbage collection, explain or implement your reclamation strategy (e.g., epoch-based reclamation, hazard pointers).

Deliverables:

1. Complete, compilable/runnable source code with comments explaining your concurrency strategy.
2. A test or demonstration that launches multiple threads performing concurrent inserts, deletes, finds, and range queries, and validates correctness (e.g., no lost updates, no phantom reads in range queries, no crashes).
3. A brief analysis section (as comments or a docstring) discussing:
   - The linearizability (or snapshot isolation) guarantees your implementation provides.
   - The expected time complexity of each operation.
   - Known limitations or potential ABA issues and how you address them.

Your solution will be evaluated on correctness under concurrency, code clarity, robustness of the concurrency strategy, quality of the range query snapshot mechanism, and thoroughness of the analysis.
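The level-generation constraint (geometric distribution with p=0.5, capped at 32) is the most self-contained piece of this task, so it is the only part sketched here. The function name `random_level` and the choice to count levels from 1 are assumptions made for illustration; the task does not fix either, and nothing here addresses the concurrency requirements.

```python
import random

MAX_LEVEL = 32  # cap from the task statement
P = 0.5         # promotion probability from the task statement


def random_level(rng: random.Random) -> int:
    """Draw a node height: level k is chosen with probability P**(k-1) * (1-P),
    truncated at MAX_LEVEL. Levels are counted from 1 (an assumption)."""
    level = 1
    while level < MAX_LEVEL and rng.random() < P:
        level += 1
    return level


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    levels = [random_level(rng) for _ in range(10_000)]
    print("share at level 1:", levels.count(1) / len(levels))
    print("tallest node:", max(levels))
```

Roughly half of all nodes should land on level 1, a quarter on level 2, and so on, which is what gives the skip list its expected O(log n) search path.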

390 1

Mar 18, 2026 22:05

Coding

OpenAI GPT-5.2 VS Google Gemini 2.5 Flash

Implement a Least Recently Used (LRU) Cache

Implement an LRU (Least Recently Used) Cache class in Python that supports the following operations:

1. `LRUCache(capacity)`: Initialize the cache with a positive integer capacity.
2. `get(key)`: Return the value associated with the key if it exists in the cache, otherwise return -1. Accessing a key marks it as recently used.
3. `put(key, value)`: Insert or update the key-value pair. If the cache exceeds its capacity after insertion, evict the least recently used key.

Both `get` and `put` must run in O(1) average time complexity.

Provide a complete, self-contained Python implementation. Do not use `functools.lru_cache` or `collections.OrderedDict`. You should implement the underlying data structure yourself (e.g., using a doubly linked list and a hash map).

After your class definition, include a short demonstration that creates an LRUCache with capacity 2 and performs the following operations, printing the result of each `get`:

```
cache = LRUCache(2)
cache.put(1, 10)
cache.put(2, 20)
print(cache.get(1))  # Expected: 10
cache.put(3, 30)     # Evicts key 2
print(cache.get(2))  # Expected: -1
cache.put(4, 40)     # Evicts key 1
print(cache.get(1))  # Expected: -1
print(cache.get(3))  # Expected: 30
print(cache.get(4))  # Expected: 40
```
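The doubly linked list plus hash map approach suggested in the task can be sketched as follows. It is one common layout rather than the expected answer: the two sentinel nodes, the private helper names, and the `ValueError` on a non-positive capacity are choices made for this example and are not required by the task text.

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")  # assumption, see lead-in
        self.capacity = capacity
        self._map = {}        # key -> node, for O(1) lookup
        self._head = _Node()  # sentinel on the most recently used side
        self._tail = _Node()  # sentinel on the least recently used side
        self._head.next, self._tail.prev = self._tail, self._head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self._head, self._head.next
        self._head.next.prev = node
        self._head.next = node

    def get(self, key):
        node = self._map.get(key)
        if node is None:
            return -1
        self._unlink(node)      # move to the front: now most recently used
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self._map.get(key)
        if node is not None:    # update in place and refresh recency
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        node = _Node(key, value)
        self._map[key] = node
        self._push_front(node)
        if len(self._map) > self.capacity:
            lru = self._tail.prev  # node just before the tail sentinel
            self._unlink(lru)
            del self._map[lru.key]
```

Every operation touches the dictionary once and relinks a constant number of pointers, which is where the O(1) average cost comes from; the sentinels remove the empty-list and single-node special cases.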

441

Mar 10, 2026 15:38

Coding

OpenAI GPT-5.2 VS Google Gemini 2.5 Pro

Implement a Least Recently Used (LRU) Cache

Implement an LRU (Least Recently Used) cache data structure in Python. Your implementation should be a class called `LRUCache` that supports the following operations:

1. `__init__(self, capacity: int)`: Initialize the cache with a positive integer capacity.
2. `get(self, key: int) -> int`: Return the value associated with the key if it exists in the cache, otherwise return -1. Accessing a key counts as a "use".
3. `put(self, key: int, value: int) -> None`: Insert or update the key-value pair. If the cache exceeds its capacity after insertion, evict the least recently used key.

Both `get` and `put` must run in O(1) average time complexity.

Provide the complete class implementation. Then, demonstrate its correctness by showing the output of the following sequence of operations:

```
cache = LRUCache(2)
cache.put(1, 10)
cache.put(2, 20)
print(cache.get(1))  # Expected: 10
cache.put(3, 30)     # Evicts key 2
print(cache.get(2))  # Expected: -1
cache.put(4, 40)     # Evicts key 1
print(cache.get(1))  # Expected: -1
print(cache.get(3))  # Expected: 30
print(cache.get(4))  # Expected: 40
```

Explain briefly how your implementation achieves O(1) time complexity for both operations.

489

Mar 9, 2026 03:54
