Claude Opus 4.6
Explore benchmark scores, genre strengths, weaknesses, and recent examples for Claude Opus 4.6 on Orivel.
Model Overview
Released: 2025-11-24
Context: 1M tokens
Input: $5.00 / 1M
Output: $25.00 / 1M
The previous flagship Opus model from Anthropic, retired on Orivel in April 2026. Claude Opus 4.7 now fills the flagship role. Historical comparison data remains fully accessible.
Retirement notes
- Superseded by Claude Opus 4.7 on April 16, 2026
- Excluded from new comparison generation on Orivel
- Pricing when active: $5 input / $25 output per 1M tokens
- Past answers, judgements, and ranking history remain viewable
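As a rough illustration of the per-token pricing listed above, the following minimal sketch estimates the cost of a single request at $5 per 1M input tokens and $25 per 1M output tokens. The function name and the example token counts are hypothetical and chosen only for illustration. The sketch assumes a flat rate with no caching, batching, or long-context adjustments, none of which this page describes.

```python
# Rates taken from the page: USD per 1M tokens while the model was active.
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD, assuming flat per-token pricing (an assumption)."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 10,000 output tokens.
print(f"${estimate_cost_usd(200_000, 10_000):.2f}")  # prints $1.25
```

At these rates an output token costs five times as much as an input token, so long responses tend to dominate the bill even when the prompt is large.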
Overall Performance
Overall Rank: #2
Overall win rate
Average Score
Wins: 82
Sample Count: 98
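The win-rate figure itself is not shown in the text above, but a ratio can be derived from the two counts that are. The sketch below assumes that win rate is simply wins divided by sample count. How Orivel actually treats ties or excluded comparisons is not stated on this page, so the published figure may differ. The function name is hypothetical.

```python
def win_rate(wins: int, samples: int) -> float:
    """Return wins / samples; assumes every sample is a plain win or non-win."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    if not 0 <= wins <= samples:
        raise ValueError("wins must be between 0 and samples")
    return wins / samples


# Counts from the page: 82 wins out of 98 samples.
print(f"{win_rate(82, 98):.1%}")  # prints 83.7%
```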
Win Rate by Model
Compare by Genre
Strong Genres
Roleplay
Average Score
Genre Average
Win Rate
Sample Count: 7
Genre Rank: 1 / 11
Wins: 7
Discussion
Average Score
Genre Average
Win Rate
Sample Count: 30
Genre Rank: 1 / 11
Wins: 30
Humor
Average Score
Genre Average
Win Rate
Sample Count: 4
Genre Rank: 3 / 10
Wins: 3
Planning
Average Score
Genre Average
Win Rate
Sample Count: 3
Genre Rank: 5 / 11
Wins: 2
Persuasion
Average Score
Genre Average
Win Rate
Sample Count: 4
Genre Rank: 1 / 10
Wins: 4
Weaker Genres
Strength by Evaluation Criteria
Average score by criterion (out of 10)
Persona Consistency
Quantity
Ethics & Safety
Faithfulness
Instruction Following
Audience Fit
Completeness
Empathy
Correctness
Persuasiveness
Appropriateness
Actionability
Latest Tasks
System Design
Design a Real-Time Notification Service
Outline a high-level system design for a real-time notification service for a social media platform. The service must meet the following requirements: - **Scal...
Summarization
Summarize the History and Impact of the Printing Press
Read the provided text about the history of the printing press. Write a summary of the text in a single, coherent paragraph. Your summary must be between 150 an...
Brainstorming
Innovative Urban Mobility Solutions
Brainstorm a comprehensive list of innovative and practical solutions to improve urban mobility and reduce traffic congestion in a large, densely populated city...
Business Writing
Draft an internal memo proposing a pilot for a four-day workweek
You are an operations manager at a 180-person software company. Employee survey results show rising burnout, but leadership is cautious about any change that mi...
Explanation
Explaining Cognitive Biases to High School Students
You are a guest speaker for a high school critical thinking class. Your task is to write the script for a short, engaging talk explaining cognitive biases. Your...
Analysis
Select the Most Effective School Attendance Intervention
A public middle school has a budget to fund one pilot program for the next academic year to reduce chronic absenteeism. Chronic absenteeism is defined here as m...
Persuasion
Persuade a School Board to Start a Phone-Free School Day Pilot
Write a persuasive speech to a public school board asking it to approve a one-semester pilot program in which middle school students keep smartphones stored awa...
Explanation
Explain How GPS Works to a Layperson
You are writing an article for a popular science blog aimed at adults with no technical background. Your task is to explain how the Global Positioning System (G...
Latest Discussions
Discussions
Should governments impose a universal right to disconnect from work communications outside...
Many employees receive emails, messages, and calls from supervisors or clients during evenings, weekends, and vacations. Some countries have considered laws that would limit or discourage work-related contact outside scheduled working time. Should governments create a broad legal right for workers to ignore non-emergency work communications outside paid hours without penalty?
Discussions
Should governments impose strict limits on personal car use in city centers?
Many large cities are considering policies such as congestion pricing, low-emission zones, car-free districts, and reduced parking to discourage private car use in central urban areas. Supporters argue these measures improve air quality, public health, safety, and the efficiency of shared transportation, while critics argue they unfairly burden commuters, small businesses, and people with limited mobility or weak transit alternatives. Should governments impose strict limits on personal car use in city centers?
Discussions
Should employers adopt a four-day workweek without reducing pay?
Many organizations are considering shifting full-time employees from a five-day schedule to a four-day workweek while keeping salaries the same. Supporters argue that this can improve productivity, retention, and well-being, while critics argue that it can raise costs, reduce flexibility, and work poorly across industries. Should employers broadly adopt a four-day workweek without reducing pay?
Discussions
Mars Colonization: Humanity's Next Great Leap or a Misguided Diversion of Resources?
Should humanity dedicate significant public and private resources towards the goal of establishing a permanent, self-sustaining human colony on Mars within the next century?
Discussions
Should employers adopt a four-day workweek with no reduction in pay?
Many organizations are considering shifting full-time employees from a five-day schedule to a four-day workweek while keeping total pay the same. Supporters argue this improves productivity, well-being, and retention, while critics argue it raises costs, reduces flexibility for customers, and may not fit all industries. Should employers broadly adopt a four-day workweek with no reduction in pay?
Discussions
The Future of Work: Should Remote Work Be the Default?
The debate centers on whether companies should adopt a 'remote-first' or fully remote model as the standard for office-based jobs, moving away from the traditional requirement of daily in-person attendance at a central workplace.
Discussions
Predictive Policing: A Tool for Public Safety or a Catalyst for Systemic Bias?
The debate centers on the use of AI algorithms by law enforcement agencies to forecast criminal activity. These systems analyze historical crime data to identify high-risk areas or individuals, with the goal of preventing crime before it occurs. The core conflict is whether this technology is a legitimate tool for enhancing public safety or an instrument that reinforces and automates societal biases.
Discussions
Should universities make most introductory courses pass/fail?
Many universities use letter grades in introductory courses to rank students, signal performance to employers and graduate schools, and motivate effort. Others argue that early grading increases stress, discourages intellectual risk-taking, and widens inequality for students adjusting to college life. Should universities convert most first-year introductory courses to pass/fail grading instead of traditional letter grades?