Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

Coding

Anthropic Claude Opus 4.7 VS OpenAI GPT-5.4

Markdown Subset to HTML Converter

Write a Python function `markdown_to_html(markdown_text: str) -> str` that converts a string containing a specific subset of Markdown into its corresponding HTML representation. The function must support the following features:

**Block Elements:**

1. **Headers:** Lines starting with `# ` to `###### ` should be converted to `<h1>` to `<h6>` tags.
2. **Unordered Lists:** Lines starting with `- ` should be converted to `<ul>` and `<li>` tags. Nested lists, indented by two spaces per level, must be supported. A list is terminated by a blank line or a different block element.
3. **Code Blocks:** Content enclosed between lines of triple backticks (```) should be converted to `<pre><code>...</code></pre>`. The language specifier on the opening backticks (e.g., ```python) should be ignored. No other Markdown processing should occur inside a code block.
4. **Paragraphs:** Any other text should be wrapped in `<p>` tags. Consecutive lines of text belong to the same paragraph. Paragraphs are separated by one or more blank lines.

**Inline Elements:**

1. **Bold & Italic:** `***text***` should be converted to `<strong><em>text</em></strong>`.
2. **Bold:** `**text**` should be converted to `<strong>text</strong>`.
3. **Italic:** `*text*` should be converted to `<em>text</em>`.

**Rules and Constraints:**

- Inline elements can be nested within headers and list items.
- The parser should be robust to malformed or tricky inputs, such as unclosed inline tags. For example, `*italic` should be rendered as `*italic`.
- The order of precedence for inline elements is `***`, then `**`, then `*`.
- Assume input is a single multi-line string.
- Do not implement support for any other Markdown features like links, images, blockquotes, or ordered lists.
- The output HTML does not need to be a full document (no `<html>` or `<body>` tags are required).

**Example Input:**

````markdown
# Header 1

This is a paragraph with **bold** and *italic* text.
This is the same paragraph.

- List item one
- List item two with ***bold and italic***
  - Nested list item
- Back to the first level

```python
def hello():
    print("Hello, World!")
```
````
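The inline precedence rule in this task (`***`, then `**`, then `*`) can be sketched as ordered regular-expression passes. This is a minimal sketch of the inline step only, not a full solution: `render_inline` is a hypothetical helper name, and the `<strong>`/`<em>` output tags are an assumption, since the task text as captured here does not show the literal tags.

```python
import re

# Ordered from longest delimiter to shortest, mirroring the stated precedence.
# The output tags are assumed; they are not confirmed by the task text.
_INLINE_RULES = [
    (re.compile(r"\*\*\*(.+?)\*\*\*"), r"<strong><em>\1</em></strong>"),
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
]


def render_inline(text: str) -> str:
    """Apply each inline rule in precedence order.

    A delimiter without a matching close never matches a pattern,
    so unclosed markers such as "*italic" pass through unchanged.
    """
    for pattern, replacement in _INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(render_inline("Text with **bold** and *italic* parts."))
```

The patterns are deliberately simple. A stray pair of asterisks separated by spaces (for example in arithmetic text) would still be treated as italics, so a complete solution would likely need stricter delimiter rules.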

362

Apr 22, 2026 09:40

Coding

Anthropic Claude Sonnet 4.6 VS OpenAI GPT-5.4

Implement a Thread-Safe Token Bucket Rate Limiter in Python

Write a Python class named `TokenBucketRateLimiter` that implements the token bucket algorithm for rate limiting. The implementation must be thread-safe and should not use any external libraries for state management (like Redis). The class should have the following specifications:

1. An `__init__(self, capacity, refill_rate)` method:
   * `capacity`: The maximum number of tokens the bucket can hold.
   * `refill_rate`: The number of tokens that are added to the bucket per second.
2. A `consume(self, tokens)` method:
   * This method attempts to consume a given number of `tokens` from the bucket.
   * It should return `True` if the tokens can be consumed successfully, and `False` otherwise.
   * The bucket should be refilled with tokens based on the time elapsed since the last call before attempting to consume.
3. Thread Safety:
   * The class must be safe to use from multiple concurrent threads. All operations that modify the bucket's state (like refilling and consuming tokens) must be atomic.

Provide the complete class implementation with necessary imports.
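One possible shape for this class is sketched below. It assumes the bucket starts full and that a single `threading.Lock` around the refill-then-consume step is an acceptable way to make it atomic. Neither choice is stated in the task, and the private attribute names are illustrative.

```python
import threading
import time


class TokenBucketRateLimiter:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._tokens = float(capacity)  # assumption: the bucket starts full
        self._last_refill = time.monotonic()
        self._lock = threading.Lock()

    def consume(self, tokens):
        # Refill and consume under one lock so no other thread can observe
        # or modify the bucket between the two steps.
        with self._lock:
            now = time.monotonic()
            elapsed = now - self._last_refill
            self._tokens = min(
                self.capacity, self._tokens + elapsed * self.refill_rate
            )
            self._last_refill = now
            if tokens <= self._tokens:
                self._tokens -= tokens
                return True
            return False
```

`time.monotonic()` is used rather than `time.time()` so that system clock adjustments cannot produce a negative elapsed interval.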

336

Apr 16, 2026 09:37

Coding

Anthropic Claude Haiku 4.5 VS OpenAI GPT-5.4

Command-Line File Synchronization Tool

Write a Python script for a command-line file synchronization tool. The script must accept three command-line arguments:

1. `source_path`: The path to the source directory.
2. `replica_path`: The path to the replica directory that will be synchronized.
3. `log_file_path`: The path to a file where all operations will be logged.

Core Functionality:

1. **One-Way Sync:** The tool must perform a one-way synchronization, making the `replica_path` directory an exact copy of the `source_path` directory.
   - Files and directories present in the source but not in the replica must be copied to the replica.
   - Files and directories present in the replica but not in the source must be removed from the replica.
   - Files present in both locations but with different content must be updated in the replica (the source version overwrites the replica version).
2. **Change Detection:** Use the MD5 hash of file contents to determine if a file needs to be updated. Do not rely on modification timestamps.
3. **Logging:** Log all file operations (e.g., "COPY file.txt", "REMOVE old_dir", "UPDATE changed.log") to both the console and the specified log file. Each log entry should be timestamped.
4. **Execution:** The script should perform the synchronization operation exactly once and then exit. It should not run in a loop.

Requirements:

- Use Python 3.
- Use the `argparse` library for command-line argument parsing.
- The solution must correctly handle nested directories, empty directories, and files of various sizes.
- The script should be a single, self-contained file.
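The change-detection requirement can be illustrated with a small sketch. The helper names `md5_of_file` and `needs_update` and the 64 KiB chunk size are assumptions made for illustration. Reading in chunks is one way to cope with "files of various sizes" without loading a whole file into memory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_update(source_file, replica_file):
    """True if the replica file is missing or its content hash differs."""
    if not Path(replica_file).is_file():
        return True
    return md5_of_file(source_file) != md5_of_file(replica_file)
```

This covers only the comparison step. Directory traversal, removal of extra replica entries, and timestamped logging would still need to be built around it.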

343

Apr 9, 2026 09:38

Coding

Google Gemini 2.5 Flash VS OpenAI GPT-5.4

Implement a Lock-Free Concurrent LRU Cache

Implement a thread-safe LRU (Least Recently Used) cache in Python that supports concurrent reads and writes without using a global lock for every operation. Your implementation must satisfy the following requirements:

1. **Interface**: The cache must support these operations:
   - `__init__(self, capacity: int)`: Initialize the cache with a given maximum capacity (positive integer).
   - `get(self, key: str) -> Optional[Any]`: Return the value associated with the key if it exists (and mark it as recently used), or return `None` if the key is not in the cache.
   - `put(self, key: str, value: Any) -> None`: Insert or update the key-value pair. If the cache exceeds capacity after insertion, evict the least recently used item.
   - `delete(self, key: str) -> bool`: Remove the key from the cache. Return `True` if the key was present, `False` otherwise.
   - `keys(self) -> List[str]`: Return a list of all keys currently in the cache, ordered from most recently used to least recently used.
2. **Concurrency**: The cache must be safe to use from multiple threads simultaneously. Aim for a design that allows concurrent reads to proceed without blocking each other when possible (e.g., using read-write locks, fine-grained locking, or lock-free techniques). A single global mutex that serializes every operation is considered a baseline but suboptimal solution.
3. **Correctness under contention**: Under concurrent access, the cache must never return stale or corrupted data, must never exceed its stated capacity, and must maintain a consistent LRU ordering.
4. **Edge cases to handle**:
   - Capacity of 1
   - `put` with a key that already exists (should update value and move to most recent)
   - `delete` of a key that does not exist
   - Concurrent `put` and `get` on the same key
   - Rapid sequential evictions when many threads insert simultaneously
5. **Testing**: Include a test function `run_tests()` that demonstrates correctness of all operations in both single-threaded and multi-threaded scenarios. The multi-threaded test should use at least 8 threads performing a mix of `get`, `put`, and `delete` operations on overlapping keys, and should assert that the cache never exceeds capacity and that `get` never returns a value for a key that was never inserted.

Provide your complete implementation in Python. Use only the standard library (no third-party packages). Include docstrings and comments explaining your concurrency strategy and any design trade-offs you made.
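For reference, the "single global mutex" baseline that the task calls suboptimal might look like the sketch below. It is not the finer-grained design the task asks for. It only pins down the expected interface behavior, and the class name `BaselineLRUCache` is hypothetical.

```python
import threading
from collections import OrderedDict
from typing import Any, List, Optional


class BaselineLRUCache:
    """Baseline only: one lock serializes every operation."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self._capacity = capacity
        # Insertion order doubles as recency order: last item = most recent.
        self._data: "OrderedDict[str, Any]" = OrderedDict()
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            if key not in self._data:
                return None
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value
            self._data.move_to_end(key)
            if len(self._data) > self._capacity:
                self._data.popitem(last=False)  # evict least recently used

    def delete(self, key: str) -> bool:
        with self._lock:
            if key in self._data:
                del self._data[key]
                return True
            return False

    def keys(self) -> List[str]:
        with self._lock:
            # Reverse so the most recently used key comes first.
            return list(reversed(self._data))
```

Note that `get` mutates the recency order, which is the main reason read-write locks alone do not make LRU reads fully concurrent. A stronger answer has to address that point explicitly.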

381

Mar 23, 2026 17:47

Coding

Anthropic Claude Opus 4.6 VS OpenAI GPT-5.4

In-Memory Key-Value Store with Transaction Support

Write a Python class `InMemoryDB` that implements a simple in-memory key-value data store with support for nested transactions. The class should have the following methods:

- `get(key)`: Returns the value associated with a key. If the key does not exist, it should return `None`.
- `set(key, value)`: Sets the value for a given key. If a transaction is in progress, this change should only be visible within that transaction until it is committed.
- `begin()`: Starts a new transaction. Transactions can be nested.
- `commit()`: Commits all changes made in the current transaction to its parent transaction (or to the main store if it's the outermost transaction). If there is no active transaction, it should raise an error.
- `rollback()`: Discards all changes made in the current transaction. If there is no active transaction, it should raise an error.
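A common way to model nested transactions is a stack of pending-change dictionaries layered over the main store. The sketch below takes that approach. The choice of `RuntimeError` for the "no active transaction" case is an assumption, since the task only says "raise an error".

```python
class InMemoryDB:
    def __init__(self):
        self._store = {}
        self._transactions = []  # stack of dicts; last entry is innermost

    def get(self, key):
        # Innermost transaction wins; fall back through parents to the store.
        for layer in reversed(self._transactions):
            if key in layer:
                return layer[key]
        return self._store.get(key)

    def set(self, key, value):
        target = self._transactions[-1] if self._transactions else self._store
        target[key] = value

    def begin(self):
        self._transactions.append({})

    def commit(self):
        if not self._transactions:
            raise RuntimeError("no active transaction")  # assumed error type
        changes = self._transactions.pop()
        # Merge into the parent transaction, or the main store if outermost.
        parent = self._transactions[-1] if self._transactions else self._store
        parent.update(changes)

    def rollback(self):
        if not self._transactions:
            raise RuntimeError("no active transaction")  # assumed error type
        self._transactions.pop()
```

A short usage example:

```python
db = InMemoryDB()
db.set("a", 1)
db.begin()
db.set("a", 2)
db.rollback()
print(db.get("a"))  # prints 1
```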

391

Mar 19, 2026 02:35

Coding

Anthropic Claude Sonnet 4.6 VS OpenAI GPT-5.4

Implement a Dependency Resolver in Python

You are tasked with creating a dependency resolver for a simple package management system. Write a Python function `resolve_dependencies(package_definitions, target_package)` that determines the correct installation order for a given package and its dependencies.

The `package_definitions` argument is a list of strings. Each string defines a package and its direct dependencies in the format: `'PackageName: Dep1, Dep2, ...'`. If a package has no dependencies, the format is `'PackageName:'`.

Your function should:

1. Parse the input strings to build a dependency graph.
2. Given a `target_package`, find all its dependencies (including transitive ones).
3. Return a single list of strings representing the installation order. This list must be topologically sorted (a dependency must always appear before the package that depends on it). The `target_package` itself should be the last item in the list. The list should not contain duplicates.
4. Detect circular dependencies. If a cycle is found, raise a `ValueError` with a message that clearly indicates the cycle (e.g., 'Circular dependency detected involving: A -> B -> A').
5. Detect missing packages. If a package lists a dependency that is not defined in `package_definitions`, raise a `ValueError` with a message like 'Missing package definition for: C'.
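The steps above can be sketched with a depth-first traversal that tracks the current path for cycle detection. This is one possible approach rather than a reference solution. The inner helper `visit` and the local variable names are illustrative, and the recursion would need replacing with an explicit stack for very deep graphs.

```python
def resolve_dependencies(package_definitions, target_package):
    # Step 1: parse 'Name: Dep1, Dep2' strings into a graph.
    graph = {}
    for definition in package_definitions:
        name, _, deps = definition.partition(":")
        graph[name.strip()] = [d.strip() for d in deps.split(",") if d.strip()]

    order = []    # finished packages, dependencies first
    done = set()  # packages already placed in `order`
    path = []     # packages on the current DFS path, for cycle reporting

    def visit(package):
        if package in done:
            return
        if package in path:
            cycle = path[path.index(package):] + [package]
            raise ValueError(
                "Circular dependency detected involving: " + " -> ".join(cycle)
            )
        if package not in graph:
            raise ValueError(f"Missing package definition for: {package}")
        path.append(package)
        for dependency in graph[package]:
            visit(dependency)
        path.pop()
        done.add(package)
        order.append(package)  # appended only after all of its dependencies

    visit(target_package)
    return order


if __name__ == "__main__":
    print(resolve_dependencies(["A: B, C", "B: C", "C:"], "A"))
    # prints ['C', 'B', 'A']
```

Because traversal starts at `target_package`, only packages reachable from it appear in the result, and the target is always appended last.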

432

Mar 18, 2026 20:21

Coding

OpenAI GPT-5.4 VS Anthropic Claude Opus 4.6

Python Function for Package Dependency Resolution

Write a Python function named `resolve_dependencies` that takes a dictionary of packages and their dependencies and returns a valid installation order. The function must correctly handle circular dependencies and dependencies on packages not defined in the input.

463

Mar 15, 2026 09:26

Coding

OpenAI GPT-5.4 VS Anthropic Claude Haiku 4.5

Log File Analyzer for User Activity

Write a Python function `analyze_logs(log_data)` that takes a single multi-line string `log_data` as input. Each line in the string represents a log entry in the format `[TIMESTAMP] LEVEL: MESSAGE`. The function should parse these logs and return a dictionary summarizing the data.

The summary dictionary should have three keys:

1. `counts_by_level`: A dictionary where keys are log levels (e.g., 'INFO', 'WARN', 'ERROR') and values are the count of logs for that level.
2. `successful_logins`: A list of unique usernames (strings) who successfully logged in. A successful login is indicated by a message like "User 'username' logged in...".
3. `failed_login_ips`: A dictionary where keys are IP addresses (strings) and values are the count of failed login attempts from that IP. A failed login is indicated by a message like "Failed login attempt for user 'username' from IP 'ip_address'".

Your function should be robust and handle malformed or irrelevant log lines gracefully by ignoring them. The parsing of log levels should be case-insensitive (e.g., 'info' and 'INFO' should both count towards the total, which should be stored under the uppercase key 'INFO').
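A regex-based sketch of this function follows. The exact patterns are assumptions inferred from the message examples in the task (for instance, that usernames and IP addresses are wrapped in single quotes), and the sample log lines in the usage block are invented for illustration.

```python
import re
from collections import Counter

# Patterns are assumptions based on the formats quoted in the task text.
_LINE = re.compile(r"^\[(?P<ts>[^\]]*)\]\s+(?P<level>[A-Za-z]+):\s*(?P<message>.*)$")
_LOGIN_OK = re.compile(r"^User '([^']+)' logged in")
_LOGIN_FAIL = re.compile(r"^Failed login attempt for user '[^']*' from IP '([^']+)'")


def analyze_logs(log_data):
    counts = Counter()
    users = []  # a list keeps first-seen order while staying unique
    failed = Counter()
    for raw in log_data.splitlines():
        match = _LINE.match(raw.strip())
        if not match:
            continue  # malformed or irrelevant line: ignore it
        counts[match["level"].upper()] += 1
        message = match["message"]
        ok = _LOGIN_OK.match(message)
        if ok:
            if ok.group(1) not in users:
                users.append(ok.group(1))
            continue
        bad = _LOGIN_FAIL.match(message)
        if bad:
            failed[bad.group(1)] += 1
    return {
        "counts_by_level": dict(counts),
        "successful_logins": users,
        "failed_login_ips": dict(failed),
    }


if __name__ == "__main__":
    sample = "\n".join([
        "[2026-01-01 10:00:00] INFO: User 'alice' logged in from web",
        "[2026-01-01 10:00:05] warn: Failed login attempt for user 'bob' from IP '10.0.0.1'",
        "not a log line",
    ])
    print(analyze_logs(sample))
```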

432

Mar 15, 2026 08:13
