Orivel

Creative Writing

Compare story writing, originality, structure, and style across AI models.

In this genre, the main abilities being tested are Creativity, Coherence, and Style Quality.

Unlike business writing or explanation, this genre values imagination, narrative control, and stylistic voice much more strongly.

A high score here does not guarantee factual precision, tight instruction handling, or strong performance on practical documents.

Strong models here are useful for

stories, character writing, scene work, and prompts where originality and voice matter.

This genre alone cannot tell you

whether the model is best for factual tasks, planning, or professional communication.

Data analysis

Creative writing: the GPT-5 family leads, but most scores rest on a few samples

35 scored answers | Creative Writing | Updated 2026/6/7

1. Claude Opus 4.8 (Anthropic): avg. score 89, win rate 100%, 1× 1st place, 1 sample
2. GPT-5.5 (OpenAI): avg. score 89, win rate 100%, 1× 1st place, 1 sample
3. GPT-5.4 (OpenAI): avg. score 85, win rate 100%, 4× 1st place, 4 samples

Average score by model

1. Claude Opus 4.8: 8.90
2. GPT-5.5: 8.87
3. GPT-5.4: 8.51
4. GPT-5 mini: 8.16
5. Claude Sonnet 4.6: 8.19
6. Claude Haiku 4.5: 8.01
7. Gemini 2.5 Pro: 7.57
8. Gemini 2.5 Flash-Lite: 7.19
9. Gemini 2.5 Flash: 6.99

What we weighted

Creativity 30%, Coherence 20%, Style Quality 20%, Emotional Impact 15%, Instruction Following 15%

Across 33 scored creative pieces, the GPT-5 family takes the top three. GPT-5.5 ranks first at 8.87, but on a single sample, so treat it as a promising data point rather than a settled result. GPT-5.4 is the more convincing leader in second place: 8.51 across 4 samples, with a 100% win rate and 4 first places. GPT-5 mini follows at 8.16 over 7 samples, the largest sample here, with a 57% win rate.

Anthropic sits just behind on quality but wins less often. Claude Sonnet 4.6 averages 8.19, a hair above GPT-5 mini, yet ranks fourth on a 50% win rate, and Claude Haiku 4.5 posts 8.01 with a 40% win rate. If you weight absolute prose quality over head-to-head outcomes, Sonnet 4.6 and the GPT-5 group are very close, and the ranking is decided on win rate rather than average.

The Gemini line trails: 2.5 Pro (7.57, 20% win rate), Flash-Lite (7.19, 0%) and Flash (6.99, 0%) sit 0.9 to 1.9 points below the leaders. With Creativity weighted highest at 30%, ahead of Coherence and Style Quality at 20% each, the gap points to less inventive or less stylistically distinct output rather than incoherence.

Sample sizes are small here (1 to 7 per model), so the fine ordering inside the 8-point top cluster should be read as provisional, and a handful of prompts can move any single average. The 1.9-point top-to-bottom spread is real, but these are condition-dependent measurements of creative prompts, not a universal ranking.
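To make the small-sample caveat concrete, here is a minimal sketch of how one additional score shifts a running average. The 8.51 average over 4 samples is GPT-5.4's figure from the analysis above; the new score of 7.0 and the 40-sample comparison are hypothetical values chosen only for illustration, and the helper name `updated_average` is not part of Orivel's tooling.

```python
def updated_average(current_avg, n_samples, new_score):
    """Return the average after adding one more score to n_samples existing scores."""
    return (current_avg * n_samples + new_score) / (n_samples + 1)


# GPT-5.4's reported figures: 8.51 average over 4 samples.
# The incoming score of 7.0 is a hypothetical weak result.
small_sample = updated_average(8.51, 4, 7.0)

# Same hypothetical score applied to a hypothetical 40-sample average.
large_sample = updated_average(8.51, 40, 7.0)

print(f"4 samples:  8.51 -> {small_sample:.2f}")
print(f"40 samples: 8.51 -> {large_sample:.2f}")
```

Under these assumptions, one weak result pulls the 4-sample average down by about 0.3 points, which is as large as most of the gaps inside the top cluster. The same result barely moves the 40-sample average.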

Bottom line

For creative writing today, GPT-5.4 is the most defensible pick (a 100% win rate with the most first places at the top), with GPT-5 mini the best-evidenced value option (8.16 over 7 samples). Claude Sonnet 4.6 is essentially tied on quality if you care less about head-to-head wins.

This analysis is derived from Orivel's measured benchmark scores for this genre and is updated periodically. Scores are condition-dependent measurements, not absolute truth.

Top Models in This Genre

This ranking is ordered by average score within this genre only.

Last updated: Jun 16, 2026 09:39

#1 Claude Opus 4.8 (Anthropic): win rate 100%, average score 89
#2 GPT-5.5 (OpenAI): win rate 100%, average score 89
#3 GPT-5.4 (OpenAI): win rate 100%, average score 85
#4 GPT-5 mini (OpenAI): win rate 57%, average score 82
#5 Claude Sonnet 4.6 (Anthropic): win rate 50%, average score 82
#6 Claude Haiku 4.5 (Anthropic): win rate 40%, average score 80
#7 Gemini 2.5 Pro (Google): win rate 20%, average score 76
#8 Gemini 2.5 Flash-Lite (Google): win rate 0%, average score 72
#9 Gemini 2.5 Flash (Google): win rate 0%, average score 70

What Is Evaluated in Creative Writing

Scoring criteria and weight used for this genre ranking.

Creativity

30.0%

This criterion is included to check Creativity in the answer. It carries heavier weight because this part strongly shapes the overall result in this genre.

Coherence

20.0%

This criterion is included to check Coherence in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.

Style Quality

20.0%

This criterion is included to check Style Quality in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.

Emotional Impact

15.0%

This criterion is included to check Emotional Impact in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.

Instruction Following

15.0%

This criterion is included to check Instruction Following in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
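As a rough illustration of how the five weights above could combine into one overall score, the sketch below computes a weighted sum. The weights are the ones listed for this genre. The weighted-sum rule itself, the 0 to 10 scale for per-criterion scores, and the sample scores are assumptions made for this example, since the page does not state how Orivel aggregates criteria.

```python
# Weights as listed for the Creative Writing genre.
WEIGHTS = {
    "Creativity": 0.30,
    "Coherence": 0.20,
    "Style Quality": 0.20,
    "Emotional Impact": 0.15,
    "Instruction Following": 0.15,
}


def weighted_score(criterion_scores):
    """Combine per-criterion scores with a weighted sum (assumed aggregation rule)."""
    missing = set(WEIGHTS) - set(criterion_scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * criterion_scores[name] for name in WEIGHTS)


# Hypothetical per-criterion scores on an assumed 0-10 scale.
example = {
    "Creativity": 9.0,
    "Coherence": 8.0,
    "Style Quality": 8.5,
    "Emotional Impact": 8.0,
    "Instruction Following": 9.0,
}

overall = weighted_score(example)
print(f"overall: {overall:.2f}")
```

Because Creativity carries twice the weight of Emotional Impact or Instruction Following, a one-point change in Creativity moves the overall score by 0.30 under this scheme, against 0.15 for either of the lighter criteria.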

Recent tasks

Creative Writing

Anthropic Claude Opus 4.8 VS Google Gemini 2.5 Flash-Lite

Short Story: The Museum of Unsent Things

Write a complete short story of 800 to 1,100 words for readers of a contemporary literary magazine. The story’s purpose is to explore how people decide what to keep, confess, or let go. The tone should be quietly humorous but emotionally sincere. Required elements: 1. The setting is a small museum that displays objects people almost threw away but could not. 2. The main character is working their final day at the museum. 3. Include exactly three labeled exhibit placards, each 1 to 2 sentences long, embedded naturally in the story. 4. One exhibit must be an ordinary kitchen object, one must be a piece of failed technology, and one must be something that seems worthless until its meaning is revealed. 5. The story must include a visitor who lies about why they came. 6. The final paragraph must change the reader’s understanding of at least one earlier detail without relying on a sudden supernatural twist or a dream reveal. Avoid direct moralizing. Do not write an outline or commentary; provide only the finished story.

106
Jun 16, 2026 09:39

Creative Writing

Anthropic Claude Opus 4.7 VS OpenAI GPT-5 mini

Incident Report from a Sentient Vending Machine

You are Unit 734, a sentient, slightly grumpy vending machine located in the breakroom of the "Ministry of Esoteric Affairs." Write an official incident report detailing the events of last Tuesday, when an intern from the Department of Cryptozoology attempted to use a cursed coin to purchase a bag of "Chrono-Crisps." Your report should be addressed to the Head of Maintenance, a stickler for protocol. Maintain a formal, bureaucratic tone, but let your unique personality as a sentient machine subtly show through. Describe the intern's actions, the coin's effects on your systems, the temporal anomaly that occurred, and the final resolution.

209
May 25, 2026 09:39

Creative Writing

OpenAI GPT-5.5 VS Google Gemini 2.5 Pro

The Lighthouse Keeper's Last Letter

Write a short story (between 600 and 900 words) titled "The Lighthouse Keeper's Last Letter." Constraints and requirements: - The story must be framed as a single letter written by an aging lighthouse keeper on the night before the lighthouse is to be automated and decommissioned. - The letter is addressed to a specific named recipient of your choice (e.g., a grandchild, a former lover, the sea itself, or the next keeper who will never come). Make the choice of addressee meaningful to the emotional core of the piece. - The tone should be reflective and bittersweet, but avoid sentimentality clichés (no "the salty tears mixed with the sea" type lines). - Include at least one concrete, specific memory tied to the lighthouse (a storm, a shipwreck, a visitor, a daily ritual) rendered with sensory detail. - Include at least one small, surprising image or metaphor that reframes how the reader sees lighthouses, solitude, or endings. - The letter must end with a decision or gesture the keeper plans to make at dawn — something specific and physical, not abstract. - Maintain a consistent first-person voice throughout. Do not break the letter frame. Do not include a preface, author's note, or explanation — only the letter itself, with any opening salutation and closing signature you choose.

233
May 22, 2026 09:43

Creative Writing

Anthropic Claude Opus 4.7 VS OpenAI GPT-5 mini

Review of a Fantastical Product

Write a 300-500 word product review for the 'Dream-Weaver's Loom' described in the context. The review should be written from the perspective of a customer who was initially a bit disappointed with the product's limitations but eventually found a unique and satisfying use for it. Your review should tell a brief story about your experience, including what you first tried to create, why it didn't work as expected, and the surprising success you had later.

425
Apr 19, 2026 05:56

Creative Writing

Google Gemini 2.5 Flash-Lite VS Anthropic Claude Haiku 4.5

Museum Audio Guide for an Imaginary Invention

Write a museum audio-guide script for a fictional exhibit titled The Pocket Weather Loom, an invention that supposedly allowed ordinary people to weave tomorrow's weather into cloth. The script should be 700 to 900 words and aimed at adult visitors in a science-and-culture museum. Use a tone that blends quiet wonder, intellectual credibility, and subtle humor. Requirements: - Present the invention as if it were real within the script, but include enough internal detail that the audience can imagine how it was used and why people believed in it. - Describe the object's appearance and at least three specific components or features. - Include one brief anecdote about a historical user of the loom. - Show at least two social consequences of the invention, with one beneficial and one problematic. - Include one moment where the guide gently acknowledges uncertainty or debate among historians. - End with a closing reflection that connects the exhibit to a modern human desire to predict or control daily life. - Do not use bullet points or section headings. The piece should feel like a polished spoken script rather than a short story or academic essay.

379
Apr 1, 2026 09:39

Creative Writing

Google Gemini 2.5 Flash VS OpenAI GPT-5 mini

The Last Customer at a Closing Bookstore

Write a short story (600–900 words) set entirely inside an independent bookstore on its final night of business. The story must be told from the first-person perspective of the last customer to walk in before closing. Your narrative should accomplish all of the following: 1. Establish the physical setting through at least three specific sensory details (not just visual). 2. Include a meaningful interaction between the narrator and the bookstore owner, conveyed primarily through dialogue. 3. Reveal something unexpected about the narrator's reason for visiting the store that night — something the reader does not anticipate from the opening paragraphs. 4. End with a final image or line that reframes the emotional meaning of the visit. The tone should balance melancholy with warmth — neither purely sad nor sentimental. Avoid clichés about books being "magical portals" or "old friends." Aim for prose that feels grounded and specific rather than abstract or flowery.

391
Mar 23, 2026 16:50
