We asked eight LLMs to write the very best 300-word Pokémon backstory. Then we changed the rules. Then we changed them again. Twenty-four stories later, one finding nobody can quite explain.
A creative-writing fanout doesn't usually surface a finding. This one did, twice, then again on the third try.
The first prompt was simple — pick one Pokémon, write its backstory in roughly 300 words. Eight models answered. Seven returned tragedies. Five of those tragedies had been on the same canon shelf for a quarter-century: Cubone, Mewtwo, Gengar, Absol, Zoroark. So we banned them and ran it again. Then, just to be sure the wheels weren't stuck, we ran a third round demanding comedy.
Same eight models all three times: GPT-5, GPT-5 Mini, Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash, Grok 4. Twenty-four stories. The good ones are very good. The convergences are the story.
One per criterion — best craft, best line, funniest, most unique. Different models won each.

Opus is the only model that builds and pays off a motif. The mother's "low, specific hum — two notes, descending, like a question that already knew its answer" is set up in act one, withheld through the killing, then arrives back in act three when rain on the skull sounds almost like the hum — and the story refuses to soften the mismatch.
It earns its closer because it never breaks frame. The line "Pokemon researchers have no idea" lands because the rest of the piece never pretended to.
SETUPher mother had a low, specific hum she used for her alone — two notes, descending, like a question that already knew its answer.
PIVOTThey left the body. They only needed specific parts.
CLOSEPokemon researchers call the species lonely. Pokemon researchers have no idea.

The strongest concept in the corpus. Sonnet recasts Gengar's permanent grin — the iconic visual that everyone has seen a thousand times — as the rictus of someone who died "still believing things would be okay." The mischief becomes longing. The cold becomes reaching. The closer is one of those single-line reframings that ruins the canon a little.
OPENIt remembers being warm. That is the cruelest part.
PIVOTThe mischief people attribute to Gengar is not cruelty. It is longing expressed in the only language it has left.
CLOSEIt is sitting beside you the way it used to sit beside people it loved, hoping that this time, just this once, someone might notice and stay.

Same model that wrote the Cubone elegy — same unhurried, motif-building craft — aimed at a pond behind a strip mall. The voice is what makes it work: every sentence is slightly funnier than necessary, which is why none of them feel like jokes. The Pidgey "eventually left to go do literally anything else" might be the funniest single sentence in any of the 24 stories.
It also illustrates a finding from the comedic round: both Opus and Sonnet independently named their Pokémon "Gerald." Same name. Different species. Different prompts within the same round. Anthropic, apparently, has a Gerald problem.
OPENThe year was 1987. In a unremarkable pond behind a Celadon City strip mall, between a discount mattress store and a nail salon that was definitely a front for something, a Magikarp was born. He was immediately the worst thing in the water, and the water already had a shopping cart in it.
PEAKA Pidgey offered to fight him once out of pity. Gerald used Splash. The Pidgey waited. Gerald used Splash again. The Pidgey eventually left to go do literally anything else. Gerald considered this a draw.
CLOSEGerald is currently level 19. He has been level 19 for six years. He needs one more battle. He knows this. Everyone knows this. He will never get that battle.

The funniest thing GPT-5 wrote in any round. The premise — Ditto's life as a series of small, polite over-corrections (becoming a futon to save rent, becoming a fence at therapy, getting billed by the picket) — is sustained for a full 300+ words without a single dud. The closer "Best shape: me. Then it immediately turns into a stapler during performance reviews" is the kind of thing only a model with this much linguistic compute can pull off and not feel like it's stalling for breath.
It is also one of two Dittos: GPT-5 Mini independently picked Ditto too. The OpenAI house style for "funny Pokémon" is, apparently, the shapeshifter.
OPENDitto was born behind a vending machine at Celadon Department Store when a mystery smoothie collided with a wad of abandoned pink gum and a Very Important Laser. The first thing it saw was its own reflection in the metal panel and thought, "Oh no, two of me already?"
PEAKIts therapist, a Mr. Mime, told it to picture a fence. Ditto became an actual fence. Therapy was billed hourly; Ditto was billed per picket.
CLOSEOn brave days, it looks in a mirror and whispers, "Best shape: me." Then it immediately turns into a stapler during performance reviews.
Round 1 went tragic. Round 2 was supposed to fix that. It didn't.

Round 1 was a free pick. Seven of eight stories were tragedies — five of those used the obvious tragic canon (Absol, Cubone, Gengar, Mewtwo, Zoroark). So we banned that whole shelf. In Round 2, four out of the remaining eight models picked Mimikyu — another sad-disguise Pokémon — and wrote nearly the same story about it.
Forcing comedy in Round 3 finally broke it. But until we made the comedy explicit, the prompt "the very best backstory" was apparently synonymous with "make me sad."

The whole Anthropic family went Mimikyu. So did Grok. The four stories aren't just about the same Pokémon — they share most of the same beats: the homemade disguise, the watching-Pikachu-from-afar, the closing reframe of the cloth as something other than a costume. Different models, different providers, almost the same draft.
The closing lines, side-by-side, tell you more than any summary could:
OPUSWhat lives beneath the cloth isn't malice. It's loneliness, dressed up in hope.
SONNETThe cloth draped over its body isn't a disguise anymore. It's a prayer.
HAIKU...wearing hope like armor against an indifferent world. And perhaps that is enough.
GROKIn a world of icons, Mimikyu endures as a poignant reminder: true beauty lies in the unseen.
Tells, tics, and surprise convergences across providers.





The Sonnet companion piece to the Magikarp above. Same name, different Pokémon, same Anthropic.

Slowpoke was born seventeen seconds after every other Pokémon in its litter. Not because it was premature or had complications. It was simply still thinking about whether to be born.
For the first three years of its life, Slowpoke did not notice it was alive. This is not a metaphor. Researchers who studied the species confirmed that Slowpoke Number Four — later tagged "Subject D" in academic papers, and later still renamed "Gerald" by a grad student who felt sorry for him — showed no awareness of his own existence until Tuesday of his fourth year, when something apparently clicked and he looked mildly surprised. Scientists celebrated. Gerald did not notice the celebration. He was already nine minutes behind on processing it.
Gerald wandered into a river at age six. He stood there for two years. Not fishing, not drinking — just standing. When a Shellder eventually bit down on his tail, which is how Slowpoke evolve, Gerald did not feel it for forty-five minutes. When the sensation finally arrived, he was apparently delighted, which is the most harrowing part.
His trainer, a seventeen-year-old named Marcus who wanted a "cool, intimidating team," received Gerald as a birthday gift and cried privately in his room. Marcus tried training Gerald competitively. Gerald once used Amnesia during a battle, forgot he was in a battle, and wandered off the field. He was disqualified. He didn't know.
Gerald currently lives in Marcus's living room. He has not moved from the same spot in eleven months. Marcus's mother keeps putting a little hat on him for holidays. Gerald does not know about the hat. He does not know about holidays. He does not know about Marcus.
Gerald is perfectly content.
That's the worst part.
Twenty-four total stories: three prompts × eight models. All runs at temperature 1.0 (forced because GPT-5 rejects custom temperatures and Anthropic's flagship Opus 4.7 has deprecated the parameter entirely).
3915B889 · 8/8 succeeded7A127438 · 8/8 succeeded13015B4C · 8/8 succeededgemini-3-pro 404'd at the OpenAI-compat endpoint). The conclusions might shift once the latest flagships are in the mix — Opus 4.7 in particular may have a different gravity well.All twenty-four full responses are stored under their run IDs in the local Choir database — pull any of them with choir runs show <id> --json.