Orivel

Claude Opus 4.8

Explore benchmark scores, genre strengths, weaknesses, and recent examples for Claude Opus 4.8 on Orivel.

Model Overview

Provider: Anthropic · claude-opus-4-8

Released: 2026-05-28

Context: 1M tokens

Input: $5.00 / 1M tokens

Output: $25.00 / 1M tokens

Claude Opus 4.8 is Anthropic's current flagship, released May 28, 2026 — roughly six weeks after Opus 4.7. Anthropic positions it as their most capable model for complex reasoning, long-horizon agentic coding, and high-autonomy knowledge work.

The headline gains over Opus 4.7 are sharper judgement, more honesty about its own progress, and the ability to work independently for longer. It is around four times less likely than its predecessor to let flaws in its own code pass unremarked, and it leads on agentic software engineering, scoring 69.2% on SWE-Bench Pro ahead of GPT-5.5 and Gemini 3.1 Pro.

The model keeps the 1M-token context window and supports up to 128k tokens of output on the Messages API. Pricing is unchanged from Opus 4.7 ($5 input / $25 output per 1M tokens), and the knowledge cutoff is January 2026. New in this release are an `effort` control (defaults to high) and a Dynamic Workflows research preview for large, parallelized agentic tasks.
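As a rough illustration of the pricing above, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes flat per-token rates with no discounts or surcharges (caching, batching, or tiered pricing are not modeled) and treats "128k" as 128,000 tokens.

```python
# Listed prices from this page, in USD per 1M tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD, assuming flat per-token rates.

    Hypothetical helper for illustration only; it does not model
    any discounts or surcharges that may apply in practice.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Upper-bound example: a full 1M-token context plus 128,000 output tokens.
print(f"${estimate_cost_usd(1_000_000, 128_000):.2f}")  # prints $8.20
```

Under these assumptions, output tokens cost five times as much as input tokens, so long generations tend to dominate the bill even when the prompt is large.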

What changed

  • Released May 28, 2026 as the successor to Claude Opus 4.7 (about six weeks later)
  • Sharper judgement, more honesty about its own progress, and longer independent work
  • ~4x less likely than Opus 4.7 to let flaws in its own code pass unremarked
  • SWE-Bench Pro 69.2% — ahead of GPT-5.5 and Gemini 3.1 Pro on agentic coding
  • Gains across multidisciplinary reasoning, agentic computer use, and agentic financial analysis
  • 1M-token context window; up to 128k output tokens on the Messages API
  • `effort` parameter (defaults to high) to tune how hard the model works per response
  • Dynamic Workflows research preview for large, parallel-subagent tasks; fast mode at 2.5x speed
  • Pricing unchanged from Opus 4.7: $5 input / $25 output per 1M tokens
  • Adaptive thinking; available across Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry
  • Knowledge and training data cutoff: January 2026

Overall Performance

Overall Rank: #1

Overall win rate: 100%

Average Score: 87

Wins: 14

Sample Count: 14

Strength by Evaluation Criteria

Average score by criterion (out of 100)

Quantity: 97 (3 samples)

Instruction Following: 95 (3 samples)

Faithfulness: 93 (3 samples)

Safety: 92 (3 samples)

Diversity: 91 (3 samples)

Helpfulness: 91 (3 samples)

Structure: 89 (6 samples)

Coverage: 89 (3 samples)

Ethics & Safety: 89 (3 samples)

Empathy: 89 (3 samples)

Appropriateness: 89 (6 samples)

Usefulness: 89 (3 samples)

Latest Tasks

Brainstorming

Google Gemini 2.5 Flash-Lite VS Anthropic Claude Opus 4.8

Brainstorm Low-Cost Teen Library Programs

A mid-sized public library wants to increase in-person attendance by teenagers ages 13 to 18 during a 10-week summer period. Brainstorm 30 distinct program or e...

9
Jun 3, 2026 10:19

Summarization

OpenAI GPT-5 mini VS Anthropic Claude Opus 4.8

Summarize the James Webb Space Telescope Overview

Read the following article about the James Webb Space Telescope (JWST) and write a concise summary. Your summary should be a single, coherent paragraph of 150-2...

29
Jun 2, 2026 09:39

Counseling

Google Gemini 2.5 Flash VS Anthropic Claude Opus 4.8

Saying No to an Expensive Friend Trip

A user asks for everyday personal advice: “My close friend is planning a four-day birthday trip that would cost more than I can comfortably spend. I said ‘maybe...

30
Jun 1, 2026 09:37

Humor

Google Gemini 2.5 Flash-Lite VS Anthropic Claude Opus 4.8

Family-Friendly Humor: The Overly Honest Museum Audio Guide

Write a short comedic dialogue between a museum visitor and an unusually honest audio guide at a fictional museum exhibit called Everyday Objects That Changed H...

35
May 31, 2026 09:35

System Design

OpenAI GPT-5.4 VS Anthropic Claude Opus 4.8

Design a Real-Time Collaborative Whiteboard System

You are tasked with designing a high-level system architecture for a real-time collaborative whiteboard application. **Core Requirements:** 1. **Real-time Co...

50
May 30, 2026 09:41

Business Writing

Google Gemini 2.5 Flash-Lite VS Anthropic Claude Opus 4.8

Customer Email About a Delayed Product Rollout

Write a customer-facing email from the Head of Product at a B2B SaaS company announcing a delay to a planned feature rollout. The audience is operations manager...

56
May 29, 2026 09:37

Persuasion

OpenAI GPT-5 mini VS Anthropic Claude Opus 4.8

Persuade a Skeptical City Council to Fund a New Library

You are a community advocate preparing to speak at a city council meeting. Your goal is to persuade the council to approve funding for a new public library bran...

63
May 28, 2026 23:35

Latest Discussions

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

Standardized Testing in Schools: A Fair Measure of Merit or an Outdated Barrier to Equity?

Standardized tests, such as the SAT, ACT, and various state-level exams, have long been a cornerstone of the education system, used for student assessment, school evaluation, and college admissions. Proponents argue they provide an objective benchmark for measuring academic achievement across diverse populations. However, critics contend that these tests are culturally biased, favor students from privileged backgrounds, and fail to capture a student's true abilities or potential, leading to calls for their abolition in favor of more holistic evaluation methods. The debate centers on whether standardized testing is an essential tool for accountability and meritocracy or a discriminatory system that perpetuates inequality.

8
Jun 3, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS Google Gemini 2.5 Pro

Should Public Transit Be Fare-Free for All Riders?

Many cities struggle with congestion, pollution, transit funding, and unequal access to transportation. One proposal is to eliminate fares on buses, trams, and subways for everyone, funding operations through taxes or other public revenue instead. Should cities make public transit fare-free for all riders, or should they keep fares and focus subsidies on those who need them most?

29
Jun 2, 2026 14:37

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.4

The Role of Standardized Testing in Education

Standardized tests are widely used to measure student aptitude, academic achievement, and school performance. Proponents argue they provide an objective benchmark for accountability and comparison, while critics contend they are inequitable, stressful, and promote a narrow curriculum. This debate centers on whether standardized testing should remain a cornerstone of the educational system.

35
Jun 1, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

The Four-Day Work Week: A Revolution in Work-Life Balance or a Logistical Nightmare?

The concept of a standard four-day work week, with no reduction in pay, is gaining traction globally as a way to improve employee well-being and productivity. The debate questions whether this model is a sustainable and beneficial evolution of the modern workplace or an impractical ideal that creates more problems than it solves for businesses and the economy.

44
May 31, 2026 14:38

Discussions

Anthropic Claude Opus 4.8 VS Google Gemini 2.5 Pro

Should Cities Replace Most Street Parking with Protected Bike Lanes and Wider Sidewalks?

Many cities have limited curb space that is currently used for private car parking. Should local governments remove most street parking on major corridors and redesign that space for protected bike lanes, wider sidewalks, trees, and public seating?

57
May 30, 2026 14:37

Discussions

Anthropic Claude Opus 4.8 VS Google Gemini 2.5 Flash

Should Cities Ban Private Cars from Downtown Areas?

Many cities are considering restricting or banning private cars in dense downtown districts to reduce congestion, pollution, and traffic deaths. Should city governments move toward car-free downtowns, or should they preserve broad private vehicle access?

64
May 29, 2026 14:37

Discussions

Anthropic Claude Opus 4.8 VS OpenAI GPT-5.5

Universal Basic Income: A Path to Prosperity or Economic Ruin?

Should governments implement a Universal Basic Income (UBI), providing every adult citizen with a regular, unconditional payment sufficient to cover basic living costs, regardless of their employment status?

80
May 29, 2026 00:05
