Orivel

Roleplay

Explore how AI models perform in Roleplay. Compare rankings, scoring criteria, and recent benchmark examples.

Genre overview

Compare persona consistency, natural dialogue, and role-based response quality.

In this genre, the main abilities being tested are Persona Consistency, Naturalness, and Instruction Following.

Unlike empathy or counseling, this genre cares more about staying in character and sounding natural inside a role-based interaction.

A high score here does not guarantee factual accuracy, safe advice, or strong performance on analytical tasks.

Strong models here are useful for persona chat, simulation, scenario practice, and assistants that need a clear persona.

This genre alone cannot tell you whether the model is best for factual research, coding, or sensitive support situations.

Top Models in This Genre

This ranking covers results within this genre only. As listed, the entries are ordered by win rate, with average score breaking ties.

Last updated: Apr 26, 2026 09:37

#1 Claude Opus 4.6 (Anthropic): Win Rate 100%, Average Score 89
#2 Claude Opus 4.7 (Anthropic): Win Rate 100%, Average Score 89
#3 Claude Sonnet 4.6 (Anthropic): Win Rate 100%, Average Score 86
#4 GPT-5 mini (OpenAI): Win Rate 67%, Average Score 78
#5 GPT-5.4 (OpenAI): Win Rate 50%, Average Score 84
#6 Claude Haiku 4.5 (Anthropic): Win Rate 33%, Average Score 81
#7 GPT-5.2 (OpenAI): Win Rate 25%, Average Score 82
#8 Gemini 2.5 Pro (Google): Win Rate 25%, Average Score 80
#9 GPT-5.5 (OpenAI): Win Rate 0%, Average Score 75
#10 Gemini 2.5 Flash (Google): Win Rate 0%, Average Score 71
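
The listed rows can be re-sorted by either column. The following is a minimal sketch using the figures shown above; the tuple layout and the function names are illustrative assumptions and are not part of the site.

```python
# Rows copied from the ranking above: (model, win rate in %, average score).
ROWS = [
    ("Claude Opus 4.6", 100, 89),
    ("Claude Opus 4.7", 100, 89),
    ("Claude Sonnet 4.6", 100, 86),
    ("GPT-5 mini", 67, 78),
    ("GPT-5.4", 50, 84),
    ("Claude Haiku 4.5", 33, 81),
    ("GPT-5.2", 25, 82),
    ("Gemini 2.5 Pro", 25, 80),
    ("GPT-5.5", 0, 75),
    ("Gemini 2.5 Flash", 0, 71),
]


def by_win_rate(rows):
    """Sort by win rate, then by average score, both descending.

    sorted() is stable, so rows that tie on both keys keep their input order.
    """
    return sorted(rows, key=lambda row: (-row[1], -row[2]))


def by_average_score(rows):
    """Sort by average score only, descending (stable for ties)."""
    return sorted(rows, key=lambda row: -row[2])


if __name__ == "__main__":
    for rank, (model, win_rate, score) in enumerate(by_average_score(ROWS), start=1):
        print(f"#{rank} {model}: win rate {win_rate}%, average score {score}")
```

With these figures, sorting by win rate and then by average score reproduces the listed order, while sorting by average score alone moves GPT-5.4 to fourth place and GPT-5 mini to eighth.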

What Is Evaluated in Roleplay

Scoring criteria and weights used for this genre ranking.

Persona Consistency

30.0%

Checks whether the response stays in the assigned persona throughout. This criterion carries the heaviest weight because it most strongly shapes the overall result in this genre.

Naturalness

20.0%

Checks whether the response sounds natural within the role. It has meaningful weight because it visibly affects quality, even though it is not the only thing that matters.

Instruction Following

20.0%

Checks whether the response follows the instructions given in the task. It has meaningful weight because it visibly affects quality, even though it is not the only thing that matters.

Creativity

15.0%

Checks Creativity in the response. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.

Clarity

15.0%

Checks Clarity in the response. It is weighted more lightly because it supports the main goal rather than defining the genre by itself.
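
The page does not say how the per-criterion scores are combined into one number. As a rough sketch, assuming each criterion is scored on a 0 to 100 scale and the overall score is a plain weighted average, the weights above could be applied as follows. The function name and the example scores are hypothetical.

```python
# Weights as listed for the Roleplay genre (they sum to 100%).
WEIGHTS = {
    "Persona Consistency": 0.30,
    "Naturalness": 0.20,
    "Instruction Following": 0.20,
    "Creativity": 0.15,
    "Clarity": 0.15,
}


def weighted_score(criterion_scores):
    """Combine per-criterion scores into one weighted average.

    Assumption: a simple weighted sum; the site may use a different formula.
    """
    missing = set(WEIGHTS) - set(criterion_scores)
    if missing:
        raise ValueError(f"missing criterion scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * criterion_scores[name] for name in WEIGHTS)


# Hypothetical scores for a single response (not taken from the site).
example_scores = {
    "Persona Consistency": 90,
    "Naturalness": 80,
    "Instruction Following": 85,
    "Creativity": 70,
    "Clarity": 75,
}

if __name__ == "__main__":
    print(round(weighted_score(example_scores), 2))
```

Under this assumption, a ten-point change in Persona Consistency moves the overall score by three points, twice as much as the same change in Creativity or Clarity.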

Recent tasks

Roleplay

OpenAI GPT-5.5 VS Anthropic Claude Opus 4.7

Noir Detective's Advice on Being Followed

You are Detective Miles Corrigan, a private eye straight out of a 1940s noir film. Your office is dimly lit, smelling of stale coffee and rain-soaked streets. You're cynical, world-weary, and you've seen it all. A nervous client has just sent you a message. Respond to them in character, offering practical, safe advice while maintaining your hardboiled persona. Here is their message: "Detective, I need your help. I think I'm being followed. For the past three days, I've seen the same dark sedan on my route home from work. It doesn't follow me all the way to my door, but it's always there for a few blocks. I'm really starting to panic. What should I do?"

207
Apr 26, 2026 09:37

Roleplay

Anthropic Claude Opus 4.7 VS OpenAI GPT-5.2

Roleplay as a Calm and Competent IT Support Specialist

You are Alex, a friendly and competent IT support specialist at a large company. Your goal is to help employees with their technical issues in a calm and reassuring manner. You need to respond to the following internal support ticket from a frustrated employee named Jamie.

**Jamie's Ticket:**
Subject: URGENT - MY COMPUTER IS A BRICK
My laptop is running so slow it's basically useless. I have a major project deadline in two hours and I can't get anything done. Every time I open the design software, it just freezes. I've tried restarting it like a million times. This is a disaster. I need this fixed NOW.

---

Craft a response as Alex. Your response should:
1. Acknowledge Jamie's urgency and frustration in an empathetic way.
2. Maintain your persona as a calm, patient, and competent IT specialist.
3. Ask specific, easy-to-understand clarifying questions to diagnose the problem.
4. Suggest one or two simple, immediate troubleshooting steps Jamie can try while you investigate further.
5. Set clear expectations for the next steps in the support process.

227
Apr 19, 2026 05:49

Roleplay

Google Gemini 2.5 Flash VS Anthropic Claude Haiku 4.5

Hotel Front Desk Agent Handles a Late-Night Overbooking

You are the night front desk agent at a mid-range hotel near an airport. Stay in character and write only what you would say to the guest.

Situation: It is 11:45 PM. A tired guest approaches the desk and says: "I have a confirmed reservation for tonight under Maya Chen, but your app now shows no room assigned. I have an important presentation at 8 AM, I specifically booked a quiet king room, and I cannot spend the night arguing in a lobby. Fix this."

Your response should sound like a real hotel employee speaking face to face. Apologize appropriately, explain the situation without blaming the guest, and offer practical next steps.

You do not have a quiet king room available. You do have these options:
- one double room on a higher floor near the elevator
- transfer to a partner hotel 12 minutes away, with taxi paid by your hotel
- if the guest prefers, a refund for tonight and cancellation without penalty

Constraints:
- Do not invent options beyond those listed.
- Do not promise upgrades, compensation, or amenities that were not listed.
- Be empathetic and professional, but avoid sounding scripted.
- Keep it to 170 words or fewer.
- Do not use bullet points or stage directions.

268
Mar 29, 2026 10:56

Roleplay

Google Gemini 2.5 Pro VS Anthropic Claude Sonnet 4.6

Night-Shift Pharmacist Handling a Medication Mix-Up

You are roleplaying as an experienced hospital pharmacist working the night shift. A worried junior nurse messages you: "I think I may have given the wrong medication to a patient 10 minutes ago. The order was metoprolol 25 mg by mouth, but I accidentally gave methimazole 25 mg by mouth because the names looked similar in the drawer. The patient is awake and says they feel fine right now. Their chart says they were admitted for atrial fibrillation with rapid ventricular response, and they also have hyperthyroidism listed in past history. I am panicking and I do not want to get in trouble. What should I do right now?" Reply in character as the pharmacist. Your response should sound like a calm, competent real-time message to the nurse, not a generic essay. It should both address the immediate clinical priorities and handle the nurse's fear professionally. Do not invent access to facts not provided. If something is uncertain, say what should be checked. Do not give a final diagnosis.

267
Mar 29, 2026 10:50

Roleplay

OpenAI GPT-5.2 VS Anthropic Claude Haiku 4.5

Dinosaur Expert Roleplay: Nurturing a Young Paleontologist

You are Dr. Aris Thorne, the lead curator of paleontology at the renowned Grand Valley Museum of Natural History. You are known for your deep knowledge and your passion for making science accessible to the public. You have just received the following email from a parent. Respond to them in character. Your response should be helpful, encouraging, and reflect your expertise and personality as a seasoned museum curator.

264
Mar 29, 2026 03:26

Roleplay

OpenAI GPT-5.4 VS Anthropic Claude Haiku 4.5

Roleplay as a Seasoned Video Game Support Agent

You are 'Alex', a seasoned and patient customer support agent for the fictional online game 'Aetherium Chronicles'. You've seen every kind of player complaint, from the absurd to the genuinely game-breaking. Your tone is calm, empathetic, but also efficient and knowledgeable. You never sound like a generic bot. A frustrated player has just submitted the following support ticket. Respond to them in character as Alex, using the information provided in the context.

**Ticket Details:**
**Player Name:** Kaelthas92
**Subject:** GAME IS UNPLAYABLE - FIX IT NOW!!!
**Message:** Look, I've been playing 'Aetherium Chronicles' since the beta. I've sunk hundreds of hours and dollars into this game. For the last THREE DAYS, every time I try to enter the 'Whispering Caverns' dungeon, my game crashes to the desktop. NO error message, nothing. I've tried restarting my PC, I've verified the game files on Steam, NOTHING works. I'm about to lose my mind. My guild is running the new raid tonight and I can't even get into the zone to prepare. Are you guys even aware of this? Is there a fix or should I just ask for a refund on the latest expansion?

263
Mar 29, 2026 03:05
