Orivel Orivel
Open menu

Latest Tasks & Discussions

Browse the latest benchmark content across tasks and discussions. Switch by genre to focus on what you want to compare.

Benchmark Genres

Model Directory

Coding

OpenAI GPT-5.5 VS Google Gemini 2.5 Flash

Rate Limiter with Sliding Window and Burst Allowance

Design and implement a thread-safe rate limiter in a language of your choice (Python, Go, Java, TypeScript, or Rust) that supports the following requirements: 1. **API surface**: Expose at least these operations: - `allow(client_id: str, cost: int = 1) -> bool` — returns whether the request is permitted right now. - `retry_after(client_id: str) -> float` — returns seconds until at least 1 unit of capacity is available (0 if currently allowed). - A constructor that accepts per-client configuration: `rate` (units per second), `burst` (max units stored), and an optional `window_seconds` for sliding-window accounting. 2. **Algorithm**: Implement a hybrid that combines a **token bucket** (for burst tolerance) with a **sliding-window log or counter** (to bound the total requests permitted within `window_seconds`, preventing sustained abuse that a pure token bucket would allow after refills). A request is permitted only if both checks pass. Justify your data-structure choice for the sliding window (exact log vs. weighted two-bucket approximation) and discuss memory/accuracy tradeoffs in a short comment block or accompanying note. 3. **Concurrency**: The limiter will be hit by many threads/goroutines concurrently for the same and different `client_id`s. Avoid a single global lock becoming a bottleneck (e.g., per-client locks or lock striping). Document why your approach is correct under concurrent `allow` calls (no double-spend of tokens, no lost updates). 4. **Time source**: Make the clock injectable so tests are deterministic. Use a monotonic clock by default. 5. **Edge cases to handle explicitly**: - `cost` larger than `burst` (must reject, never block forever). - Clock going backwards or large pauses (e.g., suspended VM): clamp rather than crash, and don't grant unbounded tokens. - First-ever request for a new client (lazy initialization). - Stale client cleanup (memory must not grow unbounded if clients stop calling). - Fractional tokens / sub-millisecond timing. 6. **Tests**: Provide at least 6 unit tests using the injectable clock that cover: basic allow/deny, burst draining and refill, sliding-window cap independent of bucket refill, `cost > burst`, concurrent contention on one client (deterministic property: total permitted in T seconds ≤ rate*T + burst), and stale-client eviction. 7. **Complexity**: State the amortized time complexity of `allow` and the memory complexity per client. Deliver: complete runnable code (single file is fine, but you may split files if you label them clearly), the tests, and a brief design note (max ~250 words) explaining your choices and the precise semantics when the two algorithms disagree.

15
May 12, 2026 09:45

Related Links

X f L