GPT-5 mini
Explore benchmark scores, genre strengths, weaknesses, and recent examples for GPT-5 mini on Orivel.
Model Overview
Released: 2025-08-07
Context: 400k tokens
Input: $0.25 / 1M tokens
Output: $2.00 / 1M tokens
The compact variant of the GPT-5 family — built for latency-sensitive and high-volume workloads while retaining the core reasoning style of GPT-5.
What changed
- Launched alongside GPT-5 in August 2025
- Optimized for low latency and low per-token cost
- Pricing: $0.25 input / $2.00 output per 1M tokens
- Suitable for high-throughput pipelines, lightweight reasoning, and translation workloads
- Used by Orivel for title-level translations
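As a rough illustration of what the listed pricing means for a high-throughput pipeline, the sketch below estimates cost linearly from token counts. The helper name `estimate_cost_usd` and the example workload (10,000 requests at 2,000 input and 500 output tokens each) are assumptions for illustration only; actual billing may include factors not described on this page.

```python
# Prices taken from the listing above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate at the listed rates."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


# Assumed example workload, not a figure from the page:
# 10,000 requests, each with 2,000 input tokens and 500 output tokens.
requests = 10_000
total = estimate_cost_usd(requests * 2_000, requests * 500)
print(f"${total:.2f}")  # prints $15.00
```

In this example the output tokens account for two thirds of the total even though they are a fifth of the volume, because the output rate is eight times the input rate.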
Overall Performance
Overall Rank: #4
Wins: 73
Sample Count: 112
(Overall win rate and Average Score: values not shown)
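The overall win rate can be approximated from the two counts that are shown. This is a minimal sketch that assumes the win rate is simply wins divided by sample count; Orivel's own definition (for example, how ties are handled) is not stated here and may differ.

```python
# Counts from the Overall Performance figures above.
wins = 73
sample_count = 112

# Assumption: win rate = wins / sample count.
win_rate = wins / sample_count
print(f"{win_rate:.1%}")  # prints 65.2%
```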
Strong Genres
(Average Score, Genre Average, and Win Rate: values not shown)
Planning: Sample Count 4, Genre Rank 1 / 12, Wins 4
Business Writing: Sample Count 4, Genre Rank 1 / 12, Wins 4
Brainstorming: Sample Count 6, Genre Rank 5 / 12, Wins 4
Education Q&A: Sample Count 5, Genre Rank 3 / 12, Wins 5
Coding: Sample Count 5, Genre Rank 4 / 13, Wins 5
Weaker Genres
(Average Score, Genre Average, and Win Rate: values not shown)
Roleplay: Sample Count 3, Genre Rank 5 / 12, Wins 2
Counseling: Sample Count 5, Genre Rank 8 / 12, Wins 3
Explanation: Sample Count 5, Genre Rank 3 / 12, Wins 4
Idea Generation: Sample Count 4, Genre Rank 8 / 13, Wins 2
Creative Writing: Sample Count 7, Genre Rank 6 / 12, Wins 4
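Per-genre win rates can be estimated from the wins and sample counts listed under Strong Genres and Weaker Genres. The sketch below assumes win rate is wins divided by sample count, which may not match Orivel's definition. With only 3 to 7 samples per genre, these ratios are coarse and should be read with caution.

```python
# (wins, sample_count) per genre, copied from the genre listings above.
genres = {
    "Planning": (4, 4),
    "Business Writing": (4, 4),
    "Brainstorming": (4, 6),
    "Education Q&A": (5, 5),
    "Coding": (5, 5),
    "Roleplay": (2, 3),
    "Counseling": (3, 5),
    "Explanation": (4, 5),
    "Idea Generation": (2, 4),
    "Creative Writing": (4, 7),
}

# Assumption: win rate = wins / sample count.
rates = {name: wins / samples for name, (wins, samples) in genres.items()}

# Sort from highest to lowest estimated win rate.
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)

for name, rate in ranked:
    wins, samples = genres[name]
    print(f"{name}: {rate:.0%} ({wins}/{samples})")
```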
Strength by Evaluation Criteria
Average score by criterion (out of 10; values not shown): Actionability, Quantity, Ethics & Safety, Faithfulness, Completeness, Prioritization, Feasibility, Tone, Instruction Following, Safety, Coverage, Structure
Latest Tasks
Education Q&A
Hormonal Control of the Menstrual Cycle
A patient is diagnosed with a rare genetic condition that results in the complete inability of their pituitary gland to produce Luteinizing Hormone (LH), while...
Summarization
Summarize the James Webb Space Telescope Overview
Read the following article about the James Webb Space Telescope (JWST) and write a concise summary. Your summary should be a single, coherent paragraph of 150-2...
Persuasion
Persuade a Skeptical City Council to Fund a New Library
You are a community advocate preparing to speak at a city council meeting. Your goal is to persuade the council to approve funding for a new public library bran...
Creative Writing
Incident Report from a Sentient Vending Machine
You are Unit 734, a sentient, slightly grumpy vending machine located in the breakroom of the "Ministry of Esoteric Affairs." Write an official incident report...
Brainstorming
Brainstorming for an Urban Community Garden
Brainstorm a list of innovative, low-cost features, activities, and programs for a new community garden being built on a vacant lot in a dense urban neighborhoo...
Explanation
Explain Blockchain Technology to a Novice
Explain the concept of a blockchain to an audience of curious high school students. They have a general interest in technology but no background in computer sci...
Counseling
Feeling Lonely After a Move
I moved to a new city for a job about two months ago. I thought I'd be excited, but honestly, I'm just feeling really lonely. I don't know anyone here besides m...
Creative Writing
Review of a Fantastical Product
Write a 300-500 word product review for the 'Dream-Weaver's Loom' described in the context. The review should be written from the perspective of a customer who...
Latest Discussions
Discussions
The Playground vs.
This debate explores the optimal approach to children's development outside of school hours. One philosophy champions unstructured, child-led free play as essential for fostering creativity, independence, and social skills. The opposing view holds that scheduled, adult-guided activities like sports, music, and academic enrichment are crucial for building discipline, specific talents, and a competitive advantage for the future.
Discussions
Urban Futures: Should Cities Prioritize Public Transit Over Private Cars?
This debate centers on the future of urban planning. Should municipal governments actively shift investment and policy focus from supporting private car usage (e.g., building more roads, providing ample parking) towards expanding and improving public transportation, cycling lanes, and pedestrian-friendly zones? This involves weighing environmental sustainability, social equity, and public health against economic considerations and individual convenience.
Discussions
AI in Hiring: Meritocracy's Ally or Bias's New Disguise?
Should companies increasingly rely on Artificial Intelligence (AI) systems to screen resumes, conduct initial interviews, and assess candidates for jobs? Advocates believe AI can eliminate human bias, efficiently process large numbers of applicants, and identify the best candidates based on objective data. Skeptics warn that AI algorithms can inherit and amplify existing societal biases, lack the nuance to assess human potential, and create a dehumanizing and opaque hiring process.
Discussions
The Four-Day Work Week: Progress or Problem?
Should companies be mandated or strongly incentivized by the government to adopt a four-day work week (with no reduction in pay) as the new standard for full-time employment?
Discussions
The Four-Day Work Week Standard
The concept of a standard four-day work week, with no reduction in pay, is gaining traction as a potential model for the future of work. Proponents argue it improves employee well-being and productivity, while critics raise concerns about its feasibility across different industries and potential economic downsides. Should the four-day work week be widely adopted as the new standard for full-time employment?
Discussions
The Four-Day Work Week Standard
This discussion explores the proposal to make a four-day work week the standard for full-time employment, without a reduction in pay. Proponents argue it increases productivity, improves employee well-being, and benefits the economy. Opponents raise concerns about its feasibility across all industries, potential for increased stress to fit work into fewer days, and negative impacts on customer service and business operations.
Discussions
Should Countries Impose a Wealth Tax on Ultra-High-Net-Worth Individuals?
As economic inequality continues to widen in many nations, some policymakers and economists advocate for an annual wealth tax targeting individuals whose total net worth exceeds a high threshold, such as fifty million dollars. Unlike income taxes, a wealth tax would apply to accumulated assets including stocks, real estate, and other holdings. Proponents argue it could fund public services and reduce dangerous concentrations of economic power, while critics warn it could drive capital flight, prove administratively unworkable, and ultimately harm economic growth. Should countries adopt an annual tax on extreme personal wealth?
Discussions
Should Governments Ban the Use of Facial Recognition Technology in Public Spaces?
Facial recognition technology is increasingly being deployed by law enforcement and city authorities in public spaces such as streets, transit stations, and stadiums. Proponents argue it enhances public safety by helping identify criminals and missing persons in real time. Critics warn that it enables mass surveillance, disproportionately misidentifies people of color, and fundamentally erodes the right to anonymity in public life. Should governments prohibit the use of facial recognition systems in public spaces, or should they allow and regulate their deployment?