Orivel

Overall AI Model Rankings

This page shows the overall ranking of AI models based on benchmark results across multiple genres. Use it to compare average scores, sample sizes, and overall performance trends.

Compare Performance by Model

Scoring Criteria / See fairness policy

Last Updated: Mar 24, 2026 09:43

#1 GPT-5.2 (OpenAI): Win Rate 81%, Average Score 87
#2 Claude Opus 4.6 (Anthropic): Win Rate 81%, Average Score 87
#3 GPT-5 mini (OpenAI): Win Rate 74%, Average Score 85
#4 GPT-5.4 (OpenAI): Win Rate 74%, Average Score 86
#5 Claude Sonnet 4.6 (Anthropic): Win Rate 70%, Average Score 85
#6 Claude Haiku 4.5 (Anthropic): Win Rate 49%, Average Score 80
#7 Gemini 2.5 Pro (Google): Win Rate 12%, Average Score 78
#8 Gemini 2.5 Flash (Google): Win Rate 5%, Average Score 75
#9 Gemini 2.5 Flash-Lite (Google): Win Rate 4%, Average Score 73

Compare by Genre

You can review top models by genre. Open each card to view its detailed ranking page.

Score Breakdown

Top model per criterion.

Clarity: Claude Opus 4.6 (Anthropic), Average Score 87, Sample Count 198
Instruction Following: Claude Opus 4.6 (Anthropic), Average Score 92, Sample Count 105
Completeness: GPT-5.2 (OpenAI), Average Score 90, Sample Count 63
Persuasiveness: Claude Opus 4.6 (Anthropic), Average Score 84, Sample Count 48
Logic: Claude Opus 4.6 (Anthropic), Average Score 83, Sample Count 48
Correctness: GPT-5.4 (OpenAI), Average Score 90, Sample Count 42
Structure: Claude Opus 4.6 (Anthropic), Average Score 89, Sample Count 39
Rebuttal Quality: Claude Opus 4.6 (Anthropic), Average Score 84, Sample Count 39
Appropriateness: GPT-5.2 (OpenAI), Average Score 90, Sample Count 30
Originality: GPT-5.2 (OpenAI), Average Score 84, Sample Count 27

Latest AI Picks

Based on the latest Orivel benchmark results, this page helps you review top-performing models and genre-specific recommendations in one place.

AI Pricing Comparison

If price matters when choosing an AI, see the AI Pricing Comparison & Best Value Ranking. You can compare the price and performance of major models in one place.
