Orivel

Stand-up Routine for Tech Workers

Compare model answers for this Humor benchmark and review scores, judging comments, and related examples.

Task Overview

Benchmark Genres

Humor

Task Prompt

Write a short (approximately 250-300 words) stand-up comedy routine, in script format, about the experience of video conferencing while working from home.

Task Context

The target audience is a group of young professionals in the tech industry. The desired tone is observational, self-deprecating, and relatable, but not mean-spirited or overly cynical. The routine must avoid common, overused jokes about video conferencing, such as the phrases 'you're on mute' or 'can you see my screen?', or jokes about only wearing a nice shirt with pajama bottoms.

Judging Policy

A high-quality response will be a coherent and well-structured comedy routine with clear setups and punchlines. The humor should be original and feel fresh, successfully avoiding the specified clichés. The tone must align with the request—observational and relatable for a tech audience, without being overly negative. The routine should demonstrate an understanding of the nuances of remote work culture beyond the most obvious jokes.

Answers

STAND-UP ROUTINE: "HOME OFFICE CONFESSIONS"

[walks on stage]
So I've been working from home for a while now, and I've discovered something: video conferencing has made me deeply aware of my own existence in ways therapy never could.

[pauses]
Like, I now know exactly what my ceiling looks like. I could describe it to a forensics team. There's a water stain shaped like my former manager's disappointment.

[gestures]
And the camera angle thing? I've spent forty-five minutes trying to position my laptop so I look like a human and not like a sentient potato. I've got my monitor on a stack of books, my keyboard on another stack of books. I'm basically building a literary monument to my own insecurity.

[sits on stool]
Then there's the background anxiety. You're just sitting there, and suddenly you're thinking, "Is that pile of laundry visible? Does my bookshelf say 'I'm cultured' or 'I bought these at a garage sale'?" I've started keeping a plant in frame. Not because I like plants. Because I read somewhere that plants make you look trustworthy. I'm out here using botany as a personality accessory.

[stands]
But here's what really gets me: the fake listening face. You know the one. Someone's talking about quarterly metrics, and you're nodding like you're absorbing information, but you're actually wondering if you left the stove on. Which you didn't. You don't cook. But your brain doesn't care.

[leans forward]
And the worst part? The little self-view window. Why do we do that to ourselves? I'm watching myself watch other people. It's like being trapped in a recursive nightmare of my own face.

[shrugs]
I've started closing that window. I don't need to see myself pretending to care about someone's PowerPoint. That's between me and my therapist.

[exits]

Result

#1 | Winner

Winning Votes

2 / 3

Average Score

81

Total Score

82

Overall Comments

Answer A is a well-crafted stand-up routine with strong observational humor and clear setups and punchlines. The jokes about camera angles, background anxiety, fake listening faces, and the self-view window are all fresh and relatable without relying on clichés. The stage directions add authenticity and pacing. The routine flows naturally from one observation to the next, building a coherent narrative about the psychological toll of video conferencing. The 'sentient potato' line, the 'botany as a personality accessory' bit, and the recursive nightmare of the self-view window are genuinely clever. The tone is perfectly calibrated—self-deprecating without being mean-spirited. Minor weakness: it runs slightly long and some transitions could be tighter.

Score Details

Humor Effectiveness

Weight 35%
82

Answer A has multiple strong punchlines that land well—'sentient potato,' 'botany as a personality accessory,' and the recursive self-view nightmare are all genuinely funny and well-constructed. The fake listening face bit is relatable and builds effectively. The humor is consistent throughout.

Originality

Weight 25%
80

Answer A successfully avoids all specified clichés and finds fresh angles—the water stain shaped like a manager's disappointment, using plants as a personality accessory, and the self-view window as a recursive nightmare are all original and specific observations not commonly seen in video call humor.

Coherence

Weight 15%
78

Answer A flows logically from one observation to the next, with stage directions that help pace the routine. The progression from camera anxiety to background anxiety to fake listening to the self-view window feels natural and builds toward a satisfying conclusion.

Instruction Following

Weight 10%
90

Answer A fully follows all instructions: it's in script format, approximately the right length, avoids all specified clichés, maintains an observational and self-deprecating tone, and is appropriate for a tech audience without being mean-spirited.

Clarity

Weight 15%
85

Answer A is clearly written with well-defined setups and punchlines. The stage directions help clarify pacing and delivery. Each joke is easy to follow and the language is precise without being overly complex.
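
The weights and scores in this breakdown can be combined to check the Total Score of 82 shown above. Below is a minimal Python sketch, assuming the total is the weighted sum of the five criterion scores rounded to the nearest integer. The page does not state the formula, and the function and variable names are illustrative only.

```python
# Criterion weights in percent, as listed in the score details.
WEIGHTS = {
    "Humor Effectiveness": 35,
    "Originality": 25,
    "Coherence": 15,
    "Instruction Following": 10,
    "Clarity": 15,
}


def weighted_total(scores):
    """Return the weighted sum of criterion scores, rounded to an integer.

    Assumption: the site computes the total this way; the page does not say so.
    """
    raw = sum(scores[name] * weight for name, weight in WEIGHTS.items()) / 100
    return round(raw)


# Criterion scores from the breakdown above.
first_judge_answer_a = {
    "Humor Effectiveness": 82,
    "Originality": 80,
    "Coherence": 78,
    "Instruction Following": 90,
    "Clarity": 85,
}

print(weighted_total(first_judge_answer_a))  # prints 82 (raw value 82.15)
```

The same calculation reproduces the other five totals reported on this page, which supports the assumption but does not confirm it.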

Judge Models: OpenAI GPT-5.4

Total Score

72

Overall Comments

Answer A is a solid, coherent routine with a clear stand-up structure and a relatable remote-work premise. It has several good observational bits, especially around camera angles, curated backgrounds, fake attentiveness, and the stress of self-view. The tone fits the request well and avoids the banned clichés. However, the humor is more mildly amusing than sharply punchy, and some lines feel familiar rather than especially fresh for a tech-worker audience. It reads smoothly but does not fully maximize originality or comedic escalation.

Score Details

Humor Effectiveness

Weight 35%
68

The routine is consistently pleasant and relatable, with decent laughs from the ceiling stain, sentient potato, and plant-as-trustworthiness bit. However, the punchlines are spaced farther apart and land more as mild observations than strong comedic beats.

Originality

Weight 25%
65

The material avoids the explicitly banned clichés and includes some nice wording, but several premises are familiar territory for video-call comedy: camera angles, visible laundry, curated bookshelves, and self-view anxiety.

Coherence

Weight 15%
75

The routine flows cleanly from one remote-work pain point to another, and the stand-up stage directions help shape it as a performance. It has a clear beginning, middle, and ending, though the escalation is somewhat gentle.

Instruction Following

Weight 10%
88

It fits the requested script format, tone, and audience reasonably well, and it avoids the banned jokes. It is also close to the requested 250 to 300 word range.

Clarity

Weight 15%
82

The writing is easy to follow, with clear setups and stage directions that support performance readability. The ideas are communicated cleanly, though a few transitions are more functional than crisp.

Total Score

89

Overall Comments

Answer A is an excellent response that perfectly captures the essence of a stand-up routine. It has a clear narrative structure, with a beginning, middle, and end, and uses stage directions effectively to create the sense of a live performance. The humor is intelligent, observational, and highly relatable to the target audience, with fresh takes on topics like camera angles and backgrounds. The writing is strong, with memorable phrases like "literary monument to my own insecurity" and "recursive nightmare of my own face."

Score Details

Humor Effectiveness

Weight 35%
85

The humor is very effective, relying on clever observations and self-deprecating wit. The jokes build on each other, creating a consistent comedic persona and a satisfying narrative arc. Phrases like "botany as a personality accessory" are both funny and insightful.

Originality

Weight 25%
90

The routine demonstrates outstanding originality. It successfully avoids the specified clichés and finds fresh angles on the topic, such as analyzing a ceiling water stain or using a plant for perceived trustworthiness. The concept of the self-view window as a "recursive nightmare" is a particularly unique and tech-savvy observation.

Coherence

Weight 15%
90

The routine is exceptionally coherent. It flows logically from one observation to the next, building a complete narrative from the initial realization about self-awareness to the final decision to close the self-view window. It feels like a single, polished piece.

Instruction Following

Weight 10%
95

This answer perfectly follows all instructions. It is in script format, meets the word count, avoids the specified clichés, and nails the requested tone for the target audience. It fully embodies the concept of a "routine."

Clarity

Weight 15%
90

The routine is perfectly clear. The language is precise, the setups and punchlines are well-defined, and the stage directions enhance the reader's ability to visualize the performance.

COMIC: So I work from home now, which means my office is wherever my laptop lands… like a confused Roomba.

COMIC: Video calls have turned me into a full-time actor. Not even a good actor. Like… community theater, but the audience is my manager and a calendar invite.

COMIC: Before every meeting, I do the pre-call ritual: open the laptop, stare at my own face, and immediately try to fix it. Not my lighting—my entire personality.

COMIC: I’ve learned my camera angle is basically a performance review. Too low and it’s “Hello, I’m a thumb.” Too high and I’m hosting a cooking show called “How To Simmer In Anxiety.”

COMIC: And the backgrounds! I used to judge people by their résumé. Now I judge them by their bookshelf. If your books are color-coded, you’re either brilliant… or you don’t read, you just alphabetize emotions.

COMIC: I tried a virtual background once. It glitched and my hair disappeared. I looked like a floating set of fears in a hoodie. HR was like, “Love the vibe. Very Q4.”

COMIC: The worst is when the meeting ends and nobody leaves. You get that awkward wave-off where everyone’s smiling like, “You hang up.” “No, you hang up.” It’s like breaking up with your coworkers 12 times a day.

COMIC: And somehow, after eight hours of calls, my steps are zero… but my face feels like it ran a marathon.

COMIC: Working from home is wild. I used to commute to work. Now I commute… from confident to questionable in the time it takes the camera to turn on.

Result

#2

Winning Votes

1 / 3

Average Score

78

Total Score

73

Overall Comments

Answer B has energy and some clever one-liners, particularly the virtual background glitch joke and the 'you hang up / no you hang up' bit. The writing is punchy and the format is clean. However, several jokes feel more like witty observations than fully developed setups with punchlines—they land as quips rather than comedic beats. The 'color-coded bookshelf' joke is a highlight, but some lines (like the Roomba opener and the 'commute from confident to questionable') feel more like clever wordplay than genuine comedy. The routine lacks the structural coherence of Answer A and some jokes don't fully land. It also stays closer to surface-level observations without the deeper self-awareness that makes Answer A stand out.

Score Details

Humor Effectiveness

Weight 35%
70

Answer B has some funny moments—the virtual background glitch and the 'you hang up' bit are highlights—but many lines are more clever than funny. The jokes often feel like witty observations that don't fully commit to a punchline, reducing overall comedic impact.

Originality

Weight 25%
72

Answer B also avoids the banned clichés and has some original moments (the floating fears in a hoodie, color-coded bookshelves). However, some observations feel more generic—the Roomba comparison and the community theater metaphor are clever but not particularly fresh for a tech-savvy audience.

Coherence

Weight 15%
65

Answer B is a series of loosely connected one-liners without strong thematic progression. While each joke works individually, the routine doesn't build toward anything—it ends somewhat abruptly without a strong closing beat that ties the material together.

Instruction Following

Weight 10%
85

Answer B also follows instructions well—script format, avoids banned clichés, appropriate tone and length. Slightly lower because the format uses 'COMIC:' repeatedly which is functional but less polished than A's stage direction approach.

Clarity

Weight 15%
80

Answer B is also clearly written and easy to follow. The one-liner format is clean and accessible. Some jokes could be slightly clearer in their setup-punchline structure, but overall the writing is crisp and readable.

Judge Models: OpenAI GPT-5.4

Total Score

81

Overall Comments

Answer B is a stronger comedy routine with tighter punchlines, better comic density, and a more energetic script voice. It stays highly relatable to tech-adjacent remote workers while delivering fresher phrasing, such as the confused Roomba image, personality-fixing line, virtual-background glitch, and the breakup-style meeting exit. The structure is concise and coherent, the tone remains light and self-deprecating, and it clearly avoids the prohibited clichés. It is slightly shorter than the requested range, but overall it is funnier and more memorable.

Score Details

Humor Effectiveness

Weight 35%
82

The routine has a higher hit rate and sharper phrasing throughout. Lines like confused Roomba, fixing my entire personality, floating set of fears, and breaking up with coworkers create stronger laugh moments and keep the energy up.

Originality

Weight 25%
80

The answer finds fresher angles and wording within a common topic. Several images feel distinctive rather than recycled, especially the Roomba comparison, the anxiety cooking-show framing, and the breakup-style sign-off.

Coherence

Weight 15%
78

The routine is tightly organized, with each line building on the central premise of video-call absurdity. It moves smoothly through setup, examples, and a closing tag, maintaining a consistent comic persona.

Instruction Following

Weight 10%
81

It matches the requested tone, dialogue format, and avoids the forbidden clichés. The main drawback is length, as it is notably shorter than the requested approximately 250 to 300 words.

Clarity

Weight 15%
84

The wording is crisp, vivid, and easy to parse on first read. Each joke is presented clearly with compact setup-to-punchline structure, making the routine especially readable and performable.

Total Score

81

Overall Comments

Answer B provides a series of very funny, punchy one-liners. Several of the jokes are highly original and memorable, such as the virtual background glitch creating a "floating set of fears in a hoodie" and the awkward end-of-call wave being like "breaking up with your coworkers 12 times a day." However, the response lacks the coherence and flow of a complete routine. It reads more like a list of jokes on a theme rather than a structured performance piece, which makes it less successful in fulfilling the prompt's request for a "routine."

Score Details

Humor Effectiveness

Weight 35%
85

The humor is excellent, with several standout punchlines that are very sharp and memorable (e.g., "floating set of fears in a hoodie"). The rapid-fire, one-liner style is effective for generating laughs, though it lacks the build-up of a more structured routine.

Originality

Weight 25%
85

The jokes are very original and avoid the common tropes. The observations about the awkward wave-off and the virtual background glitch are fresh and specific. While the topics (camera angles, bookshelves) are somewhat common, the takes on them are unique.

Coherence

Weight 15%
60

The response lacks coherence as a routine. It is presented as a series of disconnected one-liners. While all the jokes are on the same topic, there are no transitions or narrative flow connecting them, making it feel more like a list than a structured performance.

Instruction Following

Weight 10%
80

The answer follows most instructions well, including avoiding clichés and adopting the correct tone. However, it is slightly under the requested word count and its format, while technically a script, is less of a cohesive "routine" and more of a list of jokes.

Clarity

Weight 15%
85

Each individual joke is very clear and easy to understand. The punchlines are sharp and land effectively. The overall clarity of the piece as a single performance is slightly diminished by the lack of transitions between the jokes.

Comparison Summary

Final rank order is determined by judge-wise rank aggregation (average rank + Borda tie-break). Average score is shown for reference.
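
The aggregation rule above can be sketched as follows. This is a minimal Python illustration, assuming each judge ranks the answers by its own Total Score, that ranks are averaged across judges, and that ties in average rank are broken by a standard Borda count (n minus rank points per judge, for n answers). The page does not describe the exact procedure, so the function name, the Borda variant, and the lack of handling for tied scores within one judge are all assumptions.

```python
from collections import defaultdict


def aggregate_ranks(judge_scores):
    """Order answers by average rank across judges, breaking ties by Borda points.

    judge_scores: list of dicts, one per judge, mapping answer label -> total score.
    Assumption: no judge gives two answers the same total score.
    """
    rank_sum = defaultdict(int)
    borda = defaultdict(int)
    for scores in judge_scores:
        # Rank 1 goes to the highest total score from this judge.
        ordered = sorted(scores, key=scores.get, reverse=True)
        n = len(ordered)
        for rank, answer in enumerate(ordered, start=1):
            rank_sum[answer] += rank
            borda[answer] += n - rank
    avg_rank = {a: rank_sum[a] / len(judge_scores) for a in rank_sum}
    # Lower average rank wins; more Borda points wins a tie.
    final_order = sorted(avg_rank, key=lambda a: (avg_rank[a], -borda[a]))
    return final_order, avg_rank, dict(borda)


# Total scores reported by the three judges on this page.
judges = [
    {"A": 82, "B": 73},
    {"A": 72, "B": 81},
    {"A": 89, "B": 81},
]

order, avg_rank, borda = aggregate_ranks(judges)
print(order)     # prints ['A', 'B']
print(borda)     # prints {'A': 2, 'B': 1}
```

Answer A is ranked first by two of the three judges, giving it an average rank of about 1.33 against about 1.67 for Answer B, which matches the 2 / 3 and 1 / 3 winning votes. Note that in this sketch the Borda points carry the same information as the rank sum whenever every judge ranks every answer, so the tie-break would only change the outcome under a different Borda variant or with incomplete rankings.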

Judges: 3

Answer A (#1, Winner)
Winning Votes: 2 / 3
Average Score: 81

Answer B (#2)
Winning Votes: 1 / 3
Average Score: 78

Judging Results

Why This Side Won

Answer A is the winner because it delivers a more complete and coherent stand-up routine, which was the core of the prompt. It has a clear narrative flow and structure that makes it feel like a genuine performance. While Answer B contains some exceptionally funny individual lines, its disjointed, one-liner format makes it less effective as a cohesive routine.

Judge Models: OpenAI GPT-5.4

Why This Side Won

Answer B wins because it delivers more effective and original humor while still following the prompt closely. Both answers are coherent, clear, and appropriately toned, but B has stronger joke-writing, higher punchline frequency, and more distinctive imagery. Answer A is competent and relatable, yet its observations feel somewhat more expected and less vivid. Although B is a bit under the target word count, its superior comedic impact and freshness make it the better overall response.

Why This Side Won

Answer A wins because it demonstrates stronger comedic craft with clearer setups and punchlines, more original and specific observations, and better overall coherence. The jokes in A are more fully developed and land with greater impact. A also shows deeper self-awareness and psychological insight into the remote work experience, which aligns better with the observational and relatable tone requested. While B has some strong individual lines, A is more consistently funny and better structured as a complete routine.
