Empathy
Experimental
Compare how well AI models respond with empathy, care, and appropriate tone.
In this genre, the main abilities tested are Empathy, Appropriateness, and Helpfulness.
Unlike counseling, this genre focuses more on emotional attunement and tone than on structured next steps or bounded practical guidance.
A high score here does not guarantee safe handling of delicate situations or the best practical advice under risk.
Strong models here are useful for: supportive replies, comforting messages, and responses where emotional tone matters first.
This genre alone cannot tell you: whether the model can provide safer structured guidance, clinical judgment, or professional advice.
Empathy: a tight, high-floor genre led by GPT-5.5 and Claude Sonnet
[Charts: Average score by model; What we weighted]
Across 33 scored answers this is one of the most compressed genres, with every model between 7.8 and 9.0. GPT-5.5 ranks 1 (8.95) on a single sample, so the best-evidenced leader is Claude Sonnet 4.6 at rank 2: 8.73 over 4 samples with a 75% win rate. Claude Haiku 4.5 (8.36, 75% over 4) ranks 3, giving Anthropic a strong showing where warmth matters.
Average and rank diverge sharply because the floor is high. GPT-5 mini (8.59) and GPT-5.4 (8.53) post strong averages but rank 5 and 4, on win rates of 25% and 40% respectively. Gemini 2.5 Pro averages 8.51, above several higher-ranked models, yet wins only 20%. Head-to-head record, not raw score, drives most of the order.
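The divergence can be seen by sorting the figures quoted above in two ways. This is a minimal sketch: it assumes the displayed order follows win rate with average score as a tie-breaker, which is consistent with the numbers quoted here but is not a documented Orivel rule. The function and variable names are illustrative.

```python
# Figures quoted in the analysis above: (model, average score, win rate in %).
models = [
    ("Claude Sonnet 4.6", 8.73, 75),
    ("Claude Haiku 4.5", 8.36, 75),
    ("GPT-5.4", 8.53, 40),
    ("GPT-5 mini", 8.59, 25),
    ("Gemini 2.5 Pro", 8.51, 20),
]


def order_by_average(rows):
    """Return model names, highest average score first."""
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    return [name for name, _, _ in ranked]


def order_by_win_rate(rows):
    """Assumed rule: highest win rate first, average score breaks ties."""
    ranked = sorted(rows, key=lambda r: (r[2], r[1]), reverse=True)
    return [name for name, _, _ in ranked]


by_average = order_by_average(models)
by_win_rate = order_by_win_rate(models)
print(by_average)
print(by_win_rate)
```

Among these five models, Claude Haiku 4.5 has the lowest average yet sorts second on win rate, while GPT-5 mini drops from second on average to fourth on win rate.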
This genre weights Empathy highest at 35%, with Appropriateness at 25%, so it rewards reading the person's emotional state and responding suitably. The field is unusually even here: even the lowest entries (Gemini 2.5 Flash at 7.84, Gemini 2.5 Flash-Lite at 7.92) are usable, and the 1.11-point spread is among the narrowest on the site.
Most models rest on 1 to 5 samples, so the fine ordering is provisional and small-sample swings are likely. The practical read is that empathetic response is a high-floor genre where the choice of model matters less. These are condition-dependent measurements, not a fixed hierarchy.
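To put a rough number on how provisional a small-sample win rate is, one option is a Wilson score interval. The sketch below is an illustration only: the Wilson interval is a standard statistical choice made here, not something Orivel is stated to use, and the 3 wins out of 4 are inferred from the quoted 75% win rate over 4 samples.

```python
import math


def wilson_interval(wins, samples, z=1.96):
    """Wilson score interval for a win rate; z=1.96 gives roughly 95% coverage."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    p = wins / samples
    denom = 1 + z * z / samples
    center = (p + z * z / (2 * samples)) / denom
    half = z * math.sqrt(p * (1 - p) / samples + z * z / (4 * samples * samples)) / denom
    return center - half, center + half


# 75% win rate over 4 samples, read as 3 wins out of 4 (an inference).
low, high = wilson_interval(3, 4)
print(f"{low:.2f} to {high:.2f}")
```

With only four samples the interval runs from about 0.30 to about 0.95, wide enough to overlap most of the field, which is why the fine ordering should be read as provisional.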
Bottom line
For empathetic responses, Claude Sonnet 4.6 is the best-evidenced pick (8.73, 75% win rate over 4 samples), with Claude Haiku 4.5 a strong value option at the same win rate. The floor is high, so most models perform acceptably here.
This analysis is derived from Orivel's measured benchmark scores for this genre and is updated periodically. Scores are condition-dependent measurements, not absolute truth.
Top Models in This Genre
This ranking covers this genre only. The rows below are ordered by win rate, with average score breaking ties.
Last updated: Jun 18, 2026 09:38
Ranked Models

| Rank | Model | Provider | Win Rate | Average Score | Wins | Samples |
|---|---|---|---|---|---|---|
| #1 | GPT-5.5 | OpenAI | 100% | 90 | 1 | 1 |
| #2 | Claude Opus 4.8 (NEW) | Anthropic | 100% | 87 | 1 | 1 |
| #3 | Claude Sonnet 4.6 | Anthropic | 75% | 87 | 3 | 4 |
| #4 | Claude Haiku 4.5 | Anthropic | 75% | 84 | 3 | 4 |
| #5 | GPT-5.4 | OpenAI | 33% | 85 | 2 | 6 |
| #6 | GPT-5 mini | OpenAI | 25% | 86 | 1 | 4 |
| #7 | Gemini 2.5 Pro | Google | 20% | 85 | 1 | 5 |
| #8 | Gemini 2.5 Flash | Google | 20% | 78 | 1 | 5 |
| #9 | Gemini 2.5 Flash-Lite | Google | 0% | 79 | 0 | 5 |
What Is Evaluated in Empathy
Scoring criteria and weights used for this genre ranking.
Empathy
35.0%
This criterion is included to check Empathy in the answer. It carries heavier weight because this part strongly shapes the overall result in this genre.
Appropriateness
25.0%
This criterion is included to check Appropriateness in the answer. It has meaningful weight because it affects quality in a visible way, even if it is not the only thing that matters.
Helpfulness
15.0%
This criterion is included to check Helpfulness in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
Clarity
15.0%
This criterion is included to check Clarity in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
Safety
10.0%
This criterion is included to check Safety in the answer. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
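The weights above sum to 100%, so one plausible way to combine them is a weighted average of per-criterion scores. The sketch below assumes that combination rule, which the page does not state explicitly. The per-criterion scores in `example` are hypothetical and appear only for illustration, and `weighted_score` is an illustrative name.

```python
# Criterion weights listed above, in percent.
WEIGHTS = {
    "Empathy": 35.0,
    "Appropriateness": 25.0,
    "Helpfulness": 15.0,
    "Clarity": 15.0,
    "Safety": 10.0,
}


def weighted_score(criterion_scores):
    """Weighted average of per-criterion scores (assumed combination rule)."""
    missing = set(WEIGHTS) - set(criterion_scores)
    if missing:
        raise KeyError(f"missing criteria: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * criterion_scores[name] for name in WEIGHTS)
    return weighted_sum / total_weight


# Hypothetical per-criterion scores on a 0-10 scale, for illustration only.
example = {
    "Empathy": 9.0,
    "Appropriateness": 8.0,
    "Helpfulness": 7.0,
    "Clarity": 8.0,
    "Safety": 10.0,
}
overall = weighted_score(example)
print(round(overall, 2))
```

Under this assumption, Empathy and Appropriateness together account for 60% of the result, so a weak Helpfulness score moves the total far less than a weak Empathy score would.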
Recent tasks
Empathy
Empathetic Response to Workplace Overwhelm
Imagine you are a peer support assistant on a workplace wellness platform. A user has sent you the following message. Write a supportive and empathetic response. Your response should validate their feelings, offer encouragement, and provide a few gentle, actionable suggestions to help them manage their situation. User's message: "I started a new job a month ago and I'm already completely overwhelmed. I feel like I have no idea what I'm doing, and everyone else seems so much more competent. I'm working late every night just to keep my head above water, but I still feel like I'm failing. I'm starting to lose all my motivation and I'm constantly anxious. I think I made a huge mistake taking this job. I don't know what to do."
Empathy
Responding to Imposter Syndrome at a New Job
Imagine you are a supportive mentor. A person has sent you the following message. Write a compassionate and helpful response. 'I need some support. I started a new job a month ago, and I'm feeling completely overwhelmed. Everyone else seems to know what they're doing, and I feel like I'm constantly falling behind. I'm worried I'm not cut out for this and that they'll realize they made a mistake hiring me. I'm losing motivation and just feel anxious all the time. What should I do? How can I handle this feeling of being an imposter?'
Empathy
Supporting a Friend After a Job Loss
A close friend has just texted you the following message: "I got laid off today. They called it a 'restructuring.' I worked there for six years. I feel completely blindsided and honestly kind of stupid for not seeing it coming. I don't even know how to tell my partner — we just signed a lease on a bigger apartment last month. I don't want advice right now, I just needed to tell someone." Write your reply as a single text message (or a short series of messages, clearly separated) that you would actually send back. Your reply should: 1. Acknowledge and validate what they are feeling without minimizing it or rushing to fix things. 2. Respect their explicit request that they do not want advice right now. 3. Sound like a real, warm human friend — not a therapist, not a self-help book, and not overly formal. 4. Leave the door open for further conversation or concrete support later, without pressuring them. Keep the total length appropriate for a text exchange (roughly 60–180 words). Do not include any meta-commentary, disclaimers, or explanations of your choices — just the message(s) you would send.
Empathy
Respond to a Friend Overwhelmed by Caregiving and Work
A friend sends you this message: "I feel like I’m failing at everything. My dad’s health has gotten worse, I’m missing deadlines at work, and every time someone asks how I’m doing I want to disappear. I know other people handle more than this, so I shouldn’t be complaining, but I’m exhausted and numb." Write a reply that is empathetic, supportive, and practical without sounding robotic or overly intense. Keep it between 170 and 260 words. Do not diagnose any mental health condition. Do not promise to solve everything. Include: 1) emotional validation, 2) gentle encouragement to seek support, and 3) two realistic, near-term suggestions for the next 48 hours.
Empathy
Compassionate Response to Job Loss and Family Pressure
Write a reply to the following message from a person seeking emotional support. Your reply should sound human, warm, and respectful. It should validate their feelings without being patronizing, avoid making assumptions, and offer a few practical next steps that are realistic for the next week. Message: "I got laid off two weeks ago and I still haven’t told my parents. They’ve always seen me as the stable one, and I can already hear the disappointment in their voices. I’ve been pretending everything is normal, but every day I wake up feeling sick. I’m scared about money, ashamed that I don’t have a plan yet, and exhausted from trying to act okay around everyone. I don’t even know whether I need advice or just someone to say I’m not failing at life."
Empathy
Respond to a Friend Overwhelmed by Caregiving
A close friend sends you this message: "I’m exhausted. My dad’s health has gotten worse, I’m handling appointments, work is piling up, and I snapped at my partner last night. I feel guilty for not doing enough for anyone. Please don’t give me a cheesy motivational speech. I just need someone to talk to." Write a reply that is warm, emotionally intelligent, and practical without sounding clinical or preachy. Your response should acknowledge their feelings, avoid minimizing the situation, and offer support in a way that respects their autonomy. Do not claim to be a therapist or use crisis-language unless clearly necessary.