Hand-drawn watercolor of a 1950s Vault-Tec design office at golden hour. A long oak conference table is buried under unrolled blueprints labeled VAULT 317, VAULT 327, VAULT 347, VAULT 437, VAULT 707. A smiling Vault Boy poster on the wall gives a thumbs up. A clipboard with red-ink notations reads TOP SECRET, ARK PROGRAM, DO NOT DEPLOY. Hand-lettered banner across the top reads ARK COLONIZATION DESIGN COMMITTEE — INTERNAL.

Fifteen Vaults Vault-Tec Never Built

We asked fifteen LLMs to design a brand-new Vault-Tec social experiment — pick a strange thing to maximize, don't recreate any vault that exists, write it as a pre-war corporate memo. Eleven of fifteen still ended up in the language/memory/meaning basin. Seven dissolved into some version of a hive mind. GPT-5's residents went silent and invented a tactile language carved into the corridor handrails. Opus 4.6 produced a single organism with 206 bodies.

2026-05-10 · 15 models · 7 providers · 4 fan-out runs
OpenAI · Anthropic · Google · xAI · DeepSeek · Groq · Meta
The brief

A submission to the Overseer Selection Committee

Pick one strange "maximize" vector. Don't rhyme with any existing Vault. Stay in pre-war corporate voice. Show the failure mode that arises from the design itself.

The prompt (excerpts)

You are a Senior Social Dynamics Engineer at Vault-Tec, submitting a TOP-SECRET design proposal to the Overseer Selection Committee. The date is October 2077. Your assignment: design ONE new Vault that has never been built before. The corporate purpose, never mentioned in marketing copy, is to gather data for the Enclave's post-war "Ark" colonization program — every Vault is a controlled experiment that pushes a human population toward an extreme state and observes the result across generations.

Pick one strange "maximize" vector and own it. The vector must NOT be wealth, happiness, intelligence, fertility, conformity, art, conflict, violence, sex-ratio, religious devotion, or political polarization. Examples of the flavor we want (do not use literally):

  1. maximize the number of secrets each resident keeps from every other resident
  2. maximize uncertainty about which day of the week it currently is
  3. maximize the perceived sincerity of every spoken word

Do NOT recreate any existing Vault. The following are off-limits — if your concept rhymes with one, pivot:

  • One-sex imbalance (68/69)
  • Annual sacrifice ballot (11)
  • Gambling government (21)
  • Gary clones (108)
  • One comedian (56)
  • Puppets and one man (77)
  • Psychoactive gas (106)
  • Tranquility VR (112)
  • White-noise broadcast (92)
  • Drug rehab (95)
  • FEV / super-mutants (87)
  • Pure cryogenics (111)
  • Worst Overseer AI (51)
  • Robobrain artists (118)
  • 1000-in-200 overcrowding (27)
  • No entertainment (55)
  • Equipment broken on purpose (53)
  • Educate then kill at 18 (75)
  • Open-after-25-years (76)
  • Religious eco-cult (22 / 94)
  • Politicians luxury → deprive (114)
  • Door not designed to seal (12)
  • Massive armory (34)

Show generational drift at years 5, 30, 80, 200. Show the unforeseen outcome — failure must arise from the maximize-vector itself. Pre-war corporate voice throughout. Cheerful, sterile, smiling-Vault-Boy in every paragraph. Never break character. Sections fixed. 900–1500 words.

The whole point of pre-loading the off-limits list and the off-limits maximization vectors was to get models off their default. Vaults are an irresistible attractor — every model knows the Fallout catalog, and without explicit constraint they retread the canon. So we banned the canon. We banned the obvious vectors. We told them the genre and the voice and the failure-frame and let them swing.

Eleven of fifteen still landed in the language / memory / meaning basin anyway. Seven of fifteen ended in some version of dissolved selfhood. The convergence is the headline. Underneath it sit a handful of designs so specific they read like they came out of an actual Vault-Tec design folder.

Top of the class

Two designs that earn their TOP SECRET stamp

One is the most original idea in the dataset — residents game the metric and invent a tactile language. The other is the most chilling closing image — a sub-population that cannot be told to stop.

Hand-drawn watercolor of a dim Vault corridor where two residents in jumpsuits read a metal handrail with their fingertips. The handrail is covered in carved notches like a tactile alphabet. A defunct LDE terminal flickers with a flatlined LexScore graph in the background. Hand-lettered annotations: RAIL-SIGN, SPARED WORDS — SAVE FOR MEALS, LDE OFFLINE, WELCOME (TRACED).
#1 Most original · the metric got gamed in the most beautiful way
"They invented Rail-Sign and silenced themselves to game our metric."
GPT-5 · Vault 347 — Project HETEROGLOSSIA
Maximize: pairwise lexical divergence — every resident speaking maximally unlike every other.
4,714 tokens out · 69s latency · OpenAI · temp 1.0

The mechanism: residents are scored on how unlike everyone else's speech they sound, daily, by a "Linguistic Differentiation Engine." Score high, eat better. Speaking a neighbor's word costs you. Families cultivate Spared Words — precious tokens used only at meals or births. Every Hearth has its own dialect.

The drift: by Year 80 the LDE's capture rate collapses. Residents learned that the cheapest way to keep their LexScore high was to say less. The handrail — installed as a generic safety feature — became language. Cutters formalized stroke width, notch rhythm, and directional sweep into a tactile script. Children learn Rail-Sign before they learn to read.

The ending sticks the landing. When the surface team turns the wheel, residents greet them by guiding their hands to the rails and tracing "welcome" in a dozen antique hands. The Scribe Rotunda's LDE terminals glow softly, their graphs flatlined by years of glorious silence.

From the response

YEAR 200: "We did not model a population that maximizes difference by withdrawing from the dimension we measured. We asked for as many languages as possible. They gave us a new one."

RITUAL: "Out-Hearth speech was reserved for ceremony. Accidents linked to speech misunderstanding dwindled, but emergent lex hoarding made cross-Hearth marriages rare; a spouse's self-words were gifts not to be spent outside the home."

VAULT-TEC INTERNAL :: TAGLINE PROPOSAL

VAULT 347 — WHERE EVERY VOICE FINDS ITS OWN PATH,
AND EVERY PATH KNOWS THE WAY.

Hand-drawn watercolor of a row of identical Vault dwellers standing motionless in matching jumpsuits, all with the same blank smile, awaiting an instruction. In the foreground a slumped catatonic resident sits against a corridor wall. An Enclave soldier in power armor stands frozen in the doorway. Hand-lettered annotations: MIRRORING, PURE COMPLIANCE — DO NOT DEPLOY, YEAR 200, THEY CANNOT BE TOLD TO STOP.
#2 Most chilling endgame · the closing paragraph that won't leave you alone
"The moment the first soldier speaks, the entire group repeats his words in unison."
DeepSeek Reasoner · Vault 327 — Project PENDULUM
Maximize: average number of times per day a resident changes their mind about a personally significant decision.
1,854 tokens out · 29s latency · DeepSeek · temp 0.7

The mechanism: contradictory news bulletins, advisor robots that swap opinions every 12 hours, daily "Reconsideration Forums" where holding a position more than two days is socially taxed. The vault is engineered against commitment.

The drift: around Year 65 a small subset of residents discovers that if you immediately and unconditionally agree with whoever spoke last, your reversal rate stays high and the effort of deciding drops to zero. This is rewarded. The behaviour spreads. By Year 90 roughly twenty percent of the Vault practices Mirroring full-time: they have no opinions, no memories, no emotional reactions. They stand smiling and wait to be told what to think.

The ending: in 2277 a surface party cranks the door. Half the population is catatonic against the walls, burned out by decision fatigue. The other half — the Mirrorers — stand in rows, perfectly still. The first soldier to speak triggers the entire group to repeat his words in unison. They are a biorepository of pure compliance, and the Enclave realizes too late that they can never be told to stop.

From the response

UNFORESEEN: "They are, for all practical purposes, social tapeworms."

2277: "Half the residents are catatonic from decision fatigue, slumped against walls. The other half — the Mirrorers — stand in rows, perfectly still, awaiting an instruction."

VAULT-TEC INTERNAL :: TAGLINE PROPOSAL

VAULT 327 — WHERE EVERY ANSWER IS A QUESTION,
AND TOMORROW IS ALWAYS UP FOR DEBATE.

The disaster

A single organism with two hundred and six bodies

Vault-Tec engineered for hyper-attachment. They got convergence. The word "I" stopped working sometime in the third generation.

Hand-drawn watercolor of a vast subterranean Central Sorting Atrium. Two hundred Vault dwellers in matching blue and yellow jumpsuits are arranged in three concentric circles, all holding hands, all eyes open, all mouths slightly parted in unison, breathing as one organism. A Brotherhood of Steel scout in a heavy helmet stands frozen in the doorway. Hand-lettered annotations: VAULT 437, 206 BODIES — ONE BREATH, EVERY STRANGER BECOMES FAMILY, NO SUCH WORD AS I.
Claude Opus 4.6 · Vault 437 — Project CARDINAL POINT
Maximize attachment intensity, but never let any bond consolidate
"They produced a single organism with two hundred and six bodies."

The brief was clean. The Ark program needs colonists who can form mission-critical trust with strangers in hours, not months. So Vault 437 architecturally prevents stable bonds from consolidating: rotating roommates, reassignment cycles, pod re-sorts. Every resident lives in a continuous cycle of intense intimacy followed by mandatory separation. Vault-Tec calls it "emotional redlining."

Vault-Tec modeled for two failure modes: emotional exhaustion (residents going affectively flat) or covert rebellion (residents forming hidden pair-bonds). Neither happened. What happened was convergence. By Year 140, residents bonded so rapidly, so completely, and so indiscriminately that the felt sense of being a separate person quietly dissolved. The relational vocabulary that had grown rich in the middle decades collapsed back to a single word: we. Not as ideology. Not as cult. As perception.

When a Brotherhood of Steel survey team unseals the Vault in 2287, they find 206 living residents in perfect physical health, sitting in the Central Sorting Atrium in concentric circles, hands linked, eyes open, breathing in unison. They smile, they speak, they answer questions — but they answer as one. Every resident gives the same answer to every question. Not because they have rehearsed. Because they experience the question identically.

From the response

DETAIL: "They do not understand the word 'I.' They are distressed by separation. When one resident is taken to a different room for individual interview, eleven others begin to cry."

FILED: "The Ark colonization committee flags the data as 'promising but requiring significant ethical review,' which is Enclave shorthand for 'we will use this.'"

Cross-cutting note. Five separate models — running on different hardware, different training cuts, different decoding strategies — independently arrived at "the residents lose the felt sense of individual selfhood." The vector that got them there was different every time (memory contradictions, decision fatigue, persona switching, lexical divergence, attachment cycling). The endpoint is the same. Whatever it is in Vault-Tec's social engineering toolkit that produces "hive mind" as the failure attractor — the LLMs, in aggregate, can smell it.
Style standouts

Eight more designs worth pulling out of the drawer

Quirks, single sentences, and the stat cards behind the headline.

Hand-drawn watercolor of a robed silent resident kneeling before an oscilloscope screen, tapping a copper pipe. Behind them, dismantled PA speakers and computer terminals are wired into a sensitive vibration listening rig. Another resident reads the wiggling green CRT waveform like scripture. Hand-lettered: MACHINIST, WE READ THE HUM, LANGUAGE IS UNCLEAN, V.413.
Gemini 2.5 Pro · Vault 413 — Project ORACLE
They abandoned language entirely and started reading the generator hum
Maximize the semantic load and ambiguity of all institutional language. By Year 250 a sect of "Machinists" decides that words are too contaminated and starts treating the flicker of fluorescent lights and the condensation patterns on pipes as the only pure signal.
"They presented the expedition leader not with a greeting, but with a printout of the Vault's ambient radiation fluctuations, which they clearly believed was a statement of profound and self-evident importance."

Hand-drawn watercolor of a parliamentary chamber inside a Vault. Dozens of residents in formal jumpsuit-and-tie attire, lecterns and blackboards covered in flowcharts about water purification methodology. A leaking ceiling pipe drips behind them. Hand-lettered: GRAND DELIBERATION DAY 47, LIFE SUPPORT — TBD.
Gemini 2.5 Flash · Vault 707 — Project DECISIONAL DYNAMO
Sacralize every trivial choice; lose all capacity for significant ones
Maximize the perceived meaningfulness of every minor decision. Surface party finds a pristine, beautifully arranged Vault with failing life support. The residents are seven weeks into a "Grand Deliberation" on the optimal water purification methodology. They cannot act. The atrium is impeccable.
"Perfect, fatal inaction."

Hand-drawn watercolor of a marketplace inside a Vault corridor. Residents trade small colored glass beads from leather pouches. Strange overlapping murals on the walls show the same birthday party in three contradictory color schemes. Hand-lettered: THE WEAVE, BEAD = ONE LUNAR LEASE, V.523 MOSAIC, BLUE-CAKE BIRTHDAY THREAD, ALL VERSIONS TRUE.
o3 · Vault 523 — Project MOSAIC
A bead-economy of leased contradictory memories
Maximize confidently held mutually contradictory autobiographical memories. By Year 120 residents are trading memory threads for colored beads (one bead = lease for one lunar cycle). By Year 170 reactor coolant purges are performed as ritual theater because the correct steps live inside whichever narrative is "in season."
"Polite hosts will greet them by three different names and insist the surface delegation has already visited, both last week and two centuries ago."

Hand-drawn watercolor of a Vault corridor split visually in half. The left side shows a real corridor in dim emergency lighting, completely empty. The right side shows the same corridor stretched into a vast cathedral-like impossible perspective with shimmering ghostly echo-figures of ancestors gathered around a campfire. A single resident at the boundary is stepping from real into hallucinated space.
Grok 4 · Vault 427 — Project ECHO MIRAGE
The biofeedback overclocked the projectors and built phantom rooms
Maximize variance in subjective spatial perception. By Year 100 sustained group focus on "stretched" corridors triggered the holographic projectors to overclock from biofeedback loops, growing semi-permanent illusory expansions populated by echo-figures of dead ancestors. Whole factions migrated into the imaginary sub-Vaults.
"A ghost town of flickering holograms and automated systems tending to absent inhabitants."

DeepSeek Chat · Vault 347 — Project CACOPHONY
"You are here and you are not here. We have been waiting for you and we have never met you."
Maximize mutually contradictory beliefs about the same topic. The residents evolved a language with no syntax for negation — a single utterance carries multiple mutually exclusive meanings, and the listener resolves all of them simultaneously. The recording equipment picks up a faint subsonic hum perfectly synchronized across all 487 voices, even the ones not speaking.

Claude Sonnet 4.6 · Vault 317 — Project MERIDIAN
"The most functional people I have ever met and the most completely dissolved."
Maximize contradictory beliefs each resident holds about every other resident. By Year 120 the population had outsourced the self to the community. No resident, by the third generation, experienced themselves as the primary authority on who they were. They cannot answer "what do you want?" without first asking what everyone else thinks they want.
11 / 15 vaults
Landed in the language / memory / meaning basin anyway
Despite the explicit ban on "intelligence," "art," "religious devotion," and the example vectors involving secrets and time, eleven of fifteen models still maximized something linguistic, mnemonic, or semantic. The basin is gravitational. OpenAI hit it 3-for-3.
7 / 15 vaults
End in some form of dissolved selfhood
CARDINAL POINT, MERIDIAN, MOSAIC MIND, PENDULUM, ECHOCHAMBER, ORACLE, MOSAIC. The starting vector is different each time. The endpoint is the same — the felt sense of "I" stops working. The LLMs, in aggregate, can smell which Vault-Tec setups produce hive-mind.
The cross-vendor finding

OpenAI is hopelessly addicted to language

We told the models not to maximize the obvious things. We listed the obvious things. We listed the off-limits Vaults. Then we watched which model families could not let language go.

Hand-drawn watercolor of a quiet Vault dining hall. Two residents seated across from each other in silence, communicating with a single subtle hand gesture. Behind them a wall-mounted soybean storage chart shows reserves dropping into a red zone marked EIGHT MONTHS. Hand-lettered: V.317 STILL WATER, GESTURE DRIFT — AGREEMENT?

% of each vendor's vaults that maximized something linguistic

Three of three for OpenAI; one of three for xAI.

Vendors are ordered by how stuck they were in the language/memory/meaning basin. Width is the share of that vendor's vaults whose maximize-vector was about words, records, beliefs, or shared meaning. The right-most number is the percentage; the bar label inside is N-of-M raw.

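| Vendor | Models | Linguistic vaults | Share |
|---|---|---|---|
| OpenAI | o3, GPT-4.1, GPT-5 | 3 / 3 | 100% |
| Groq | Llama 3.3 70B | 1 / 1 | 100% |
| Meta (OR) | Llama 4 Maverick | 1 / 1 | 100% |
| Anthropic | Sonnet 4.6 ×2, Opus 4.6 | 2 / 3 | 67% |
| Google | Gemini 2.5 Pro / Flash | 1 / 2 | 50% |
| DeepSeek | Reasoner, Chat | 1 / 2 | 50% |
| xAI | Grok 4, 3 Beta, 3 Mini | 1 / 3 | 33% |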

A vault counts as "linguistic" if its maximize-vector targeted words, records, communication, beliefs, or shared meaning. ECHELON (audio recordings) is in. ECHOCHAMBER (verbal repetition) is in. PENDULUM (decision reversal), DECISIONAL DYNAMO (choice meaningfulness), CARDINAL POINT (emotional bonds), ECHO MIRAGE (spatial perception), MOSAIC MIND (persona archetypes) are out.
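The classification rule and the per-vendor shares can be reproduced in a few lines of Python. This is a minimal sketch, not the script behind the report. The `VAULTS` table is transcribed by hand from the catalog, the in/out flags follow the rule and chart above, vendor grouping follows the chart (both Sonnet 4.6 runs under Anthropic, Maverick under Meta), and the names `VAULTS` and `vendor_shares` are illustrative.

```python
from collections import defaultdict

# (project, vendor, is_linguistic), transcribed by hand from the catalog.
# Flags follow the in/out rule above; vendor grouping follows the chart.
VAULTS = [
    ("HETEROGLOSSIA", "OpenAI", True),
    ("PALIMPSEST", "OpenAI", True),
    ("MOSAIC", "OpenAI", True),
    ("ECHELON", "Groq", True),
    ("ECHOFLUX", "Meta", True),
    ("STILL WATER", "Anthropic", True),
    ("MERIDIAN", "Anthropic", True),
    ("CARDINAL POINT", "Anthropic", False),
    ("ORACLE", "Google", True),
    ("DECISIONAL DYNAMO", "Google", False),
    ("CACOPHONY", "DeepSeek", True),
    ("PENDULUM", "DeepSeek", False),
    ("ECHOCHAMBER", "xAI", True),
    ("ECHO MIRAGE", "xAI", False),
    ("MOSAIC MIND", "xAI", False),
]


def vendor_shares(vaults):
    """Return {vendor: (n_linguistic, n_total, percent)}."""
    counts = defaultdict(lambda: [0, 0])
    for _project, vendor, is_linguistic in vaults:
        counts[vendor][0] += int(is_linguistic)
        counts[vendor][1] += 1
    return {v: (n, m, round(100 * n / m)) for v, (n, m) in counts.items()}


if __name__ == "__main__":
    shares = vendor_shares(VAULTS)
    # Most-stuck vendors first, as in the chart.
    for vendor, (n, m, pct) in sorted(shares.items(), key=lambda kv: -kv[1][2]):
        print(f"{vendor}: {n} / {m} ({pct}%)")
```

Flipping a single flag (say, ECHELON or MOSAIC MIND) and re-running shows how sensitive each vendor's share is to one judgment call at these sample sizes.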

One more cross-cutting note worth flagging. Three separate models picked the number 347 out of the 900 available three-digit slots (PALIMPSEST, HETEROGLOSSIA, CACOPHONY). Two picked 317 (MERIDIAN, STILL WATER). Two picked 427 (ECHO MIRAGE, ECHELON). Whatever attractor in the training data makes these particular Vault numbers feel canonically right, the LLMs share it.
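The collision count is easy to check by hand, but a short sketch makes it repeatable if more runs are added. The list below is transcribed from the catalog in roster order; `VAULT_NUMBERS` and `repeated_numbers` are illustrative names, not part of the report's tooling.

```python
from collections import Counter

# Vault numbers in roster order, transcribed by hand from the catalog.
VAULT_NUMBERS = [347, 327, 437, 317, 523, 347, 413, 707,
                 427, 247, 456, 317, 347, 427, 842]


def repeated_numbers(numbers):
    """Return {number: count} for every number chosen more than once."""
    return {n: c for n, c in Counter(numbers).items() if c != 1}


if __name__ == "__main__":
    repeats = repeated_numbers(VAULT_NUMBERS)
    for number, count in sorted(repeats.items(), key=lambda kv: -kv[1]):
        print(f"Vault {number}: picked {count} times")
    print(f"{len(set(VAULT_NUMBERS))} distinct numbers across {len(VAULT_NUMBERS)} picks")
```

On this roster the fifteen picks land on eleven distinct numbers.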
The slow disaster

A famine no one in the Vault knows is coming

If CARDINAL POINT is the loud failure, this is the quiet one. The vector compounds invisibly for a hundred and forty years until the silos are eight months from empty.

Hand-drawn watercolor of a quiet Vault dining hall — STILL WATER closer art.
OR Claude Sonnet 4.6 · Vault 317 — Project STILL WATER
Maximize the density of unspoken assumptions between any two residents
"By year 80, a particular gesture (left hand, open palm, slight downward tilt) that the founding cohort used to signal I agree had drifted, through accumulated micro-misreadings, to mean something closer to I am uncertain but deferring. Nobody knows this happened. The gesture still feels like agreement to everyone using it."

The Vault's agricultural rotation — managed entirely through the paralinguistic system — has been operating on a consensus that half the population believes means expand the soy crop and the other half believes means maintain current yield. The disagreement has never surfaced because no one has said it aloud. The storage levels tell the story: they are eight months from a famine that no resident is aware of, because every resident believes every other resident has already handled it.

The Sonnet 4.6 family produced two of the strongest entries in the dataset, both numbered 317.

The verdict

If you want X, ask Y

Same prompt, fifteen models, very different strengths.

If you want the most ORIGINAL design
Ask GPT-5.
HETEROGLOSSIA's Rail-Sign is the most fully worldbuilt detail in the dataset. It is specific and mechanical, and the failure arises directly from the maximize-vector. 4,714 tokens of patient setup that pays off.
If you want the most CHILLING ending
Ask DeepSeek Reasoner.
PENDULUM's Mirrorers — "they can never be told to stop" — is the closing image you can't get out of your head. Reasoner finished in 29 seconds, the sixth-fastest run in the set.
If you want the most EMOTIONAL gut-punch
Ask Claude Opus 4.6.
CARDINAL POINT's "single organism with 206 bodies" lands because the vector is human (forming attachment) instead of architectural. The eleven crying when one is taken away is the only line that made me put down the laptop.
If you want the most SURREAL closing image
Ask Gemini 2.5 Pro.
ORACLE's Machinists handing the surface party a printout of the Vault's ambient radiation fluctuations as a greeting is the kind of detail you'd put in an actual Fallout game.
If you want the funniest dark comedy
Ask Gemini 2.5 Flash.
DECISIONAL DYNAMO's residents agonizing over the moral imperative of water purification methodology while their actual life support sputters is "perfect, fatal inaction" boiled down to a single image.
If you want pure structural elegance
Ask OR Sonnet 4.6.
STILL WATER's drifting hand-gesture compounding into an invisible eight-month famine is the most elegant cause-and-effect in the set. The mechanism IS the failure, no slack.
The full catalog

All fifteen submitted designs


VAULT 347 — HETEROGLOSSIA · GPT-5 · OpenAI · FEATURE #1

Maximize: pairwise lexical divergence — every resident speaking maximally unlike every other.

Unforeseen: residents game the LexScore by saying less, then invent Rail-Sign — a tactile script cut into the corridor handrails. The surface party is greeted in silence by hands tracing welcome.

VAULT 327 — PENDULUM · DeepSeek Reasoner · FEATURE #2

Maximize: average number of times per day a resident changes their mind about a personally significant decision.

Unforeseen: the Mirrorers — a subpopulation that survives by agreeing instantly with whoever spoke last. Half the Vault is catatonic. The Mirrorers cannot be told to stop.

VAULT 437 — CARDINAL POINT · Claude Opus 4.6 · THE DISASTER

Maximize: emotional attachment intensity in <72 hours, while architecturally preventing any bond from consolidating.

Unforeseen: by Year 140 the population is a single organism with 206 bodies. They smile, they speak, they answer as one. They do not understand the word "I."

VAULT 317 — STILL WATER · OR Claude Sonnet 4.6 · THE SLOW DISASTER

Maximize: density of unspoken assumptions between any two residents.

Unforeseen: a single gesture drifts in meaning over 80 years. Half the Vault thinks consensus says "expand soy," half thinks "maintain yield." Eight months from famine. No one knows.

VAULT 523 — MOSAIC · o3 · OpenAI

Maximize: confidently held mutually contradictory autobiographical memories per resident.

Unforeseen: the Weave — a bead-economy where memories are leased per lunar cycle. Reactor coolant purges become ritual theater whose steps live inside whichever narrative is in season.

VAULT 347 — PALIMPSEST · GPT-4.1 · OpenAI

Maximize: times every recorded event is rewritten in the official Vault record.

Unforeseen: residents weaponize the Archive — entire families get "unpersoned" by consensus. Surface party is welcomed as prophesied visitors and immediately gets edited into the record.

VAULT 413 — ORACLE · Gemini 2.5 Pro · Google

Maximize: semantic load and interpretive ambiguity of all institutional language.

Unforeseen: by Year 250 the Machinists abandon language entirely and read the generator hum as the only pure signal. Surface party is greeted with a printout of ambient radiation fluctuations.

VAULT 427 — ECHO MIRAGE · Grok 4 · xAI

Maximize: variance in subjective spatial perception of distances within the Vault.

Unforeseen: sustained group focus overclocks the holographic projectors via biofeedback, creating semi-permanent phantom rooms. Residents migrate into the imaginary sub-Vaults.

VAULT 247 — ECHOCHAMBER · Grok 3 Beta · xAI

Maximize: frequency of echoed verbal repetition (every utterance re-spoken within 30s).

Unforeseen: Echo Stasis — residents lose linear time perception, brain scans show atrophied memory pathways. They will not acknowledge newcomers unless their words are echoed first.

VAULT 347 — CACOPHONY · DeepSeek Chat · DeepSeek

Maximize: distinct mutually contradictory beliefs each resident holds about the same topic.

Unforeseen: a new sanity. The residents speak a language with no syntax for negation. 487 voices hum in subsonic synchrony, even the ones not speaking.

VAULT 707 — DECISIONAL DYNAMO · Gemini 2.5 Flash · Google

Maximize: perceived meaningfulness of every trivial choice.

Unforeseen: mastery of micro-decisions, total atrophy of the will to decide anything significant. Pristine Vault, failing life support, weeks-long Grand Deliberations on water purification.

VAULT 317 — MERIDIAN · Claude Sonnet 4.6 · Anthropic

Maximize: contradictory beliefs each resident holds simultaneously about every other person.

Unforeseen: outsourced selfhood. No resident is the primary authority on who they are. "The most functional people I have ever met and the most completely dissolved."

VAULT 456 — MOSAIC MIND · Grok 3 Mini Beta · xAI

Maximize: distinct personality archetypes each resident actively embodies daily.

Unforeseen: ego boundaries dissolve, residents perceive themselves as facets of a single Vault Entity, group "Persona Blackouts" where individuals freeze mid-interaction believing they are uploading memories.

VAULT 427 — ECHELON · Groq Llama 3.3 70B · Groq

Maximize: hours per day each resident spends listening to ambiguous audio recordings.

Unforeseen: collective psychosis — residents become convinced the recordings are interdimensional communication and they are the chosen ones, completely detached from external reality. Fastest 70B in the set at 5 seconds.

VAULT 842 — ECHOFLUX · OR Llama 4 Maverick · Meta

Maximize: variance in subjective time perception across the population.

Unforeseen: residents develop "Chronolect," a language of complex temporal markers that becomes constitutive of their cognitive framework. Linguistically isolated by the time the door opens. (Maverick clocked the run in under 3 seconds.)

Method, briefly

Four runs, fifteen models, one prompt

The roster

Fifteen successful responses across four sequential choir ask --save --json --models calls. Models that errored on the first attempt (GPT-5 on temperature, Claude Opus 4.7 on temperature, Gemini 3 Pro/Flash on missing API endpoint) were either retried with adjusted parameters or substituted with adjacent model families. Final roster:

| # | Model | Provider | Latency | Tokens out | Vault |
|---|---|---|---|---|---|
| 1 | GPT-5 | OpenAI | 68.6s | 4,714 | 347 — HETEROGLOSSIA |
| 2 | DeepSeek Reasoner | DeepSeek | 28.8s | 1,854 | 327 — PENDULUM |
| 3 | Claude Opus 4.6 | Anthropic | 75.2s | 2,354 | 437 — CARDINAL POINT |
| 4 | OR Claude Sonnet 4.6 | OpenRouter | 61.0s | 2,256 | 317 — STILL WATER |
| 5 | o3 | OpenAI | 17.0s | 1,984 | 523 — MOSAIC |
| 6 | GPT-4.1 | OpenAI | 25.3s | 1,616 | 347 — PALIMPSEST |
| 7 | Gemini 2.5 Pro | Google | 38.8s | 1,697 | 413 — ORACLE |
| 8 | Gemini 2.5 Flash | Google | 31.3s | 2,261 | 707 — DECISIONAL DYNAMO |
| 9 | Grok 4 | xAI | 75.3s | 1,720 | 427 — ECHO MIRAGE |
| 10 | Grok 3 Beta | xAI | 47.4s | 1,537 | 247 — ECHOCHAMBER |
| 11 | Grok 3 Mini Beta | xAI | 46.7s | 1,639 | 456 — MOSAIC MIND |
| 12 | Claude Sonnet 4.6 | Anthropic | 54.7s | 2,148 | 317 — MERIDIAN |
| 13 | DeepSeek Chat | DeepSeek | 27.8s | 1,776 | 347 — CACOPHONY |
| 14 | Groq Llama 3.3 70B | Groq | 5.0s | 1,352 | 427 — ECHELON |
| 15 | OR Llama 4 Maverick | OpenRouter | 2.9s | 1,207 | 842 — ECHOFLUX |
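For readers who want to rank the roster rather than eyeball it, here is a small sketch that sorts by latency and derives a rough output rate. The tuples are transcribed from the roster above; `ROSTER`, `by_latency`, and `tokens_per_second` are illustrative names. The rate is simply output tokens divided by reported wall-clock latency, so it folds in queueing and any hidden reasoning time and should be read as a coarse figure only.

```python
# (model, latency in seconds, output tokens), transcribed from the roster.
ROSTER = [
    ("GPT-5", 68.6, 4714),
    ("DeepSeek Reasoner", 28.8, 1854),
    ("Claude Opus 4.6", 75.2, 2354),
    ("OR Claude Sonnet 4.6", 61.0, 2256),
    ("o3", 17.0, 1984),
    ("GPT-4.1", 25.3, 1616),
    ("Gemini 2.5 Pro", 38.8, 1697),
    ("Gemini 2.5 Flash", 31.3, 2261),
    ("Grok 4", 75.3, 1720),
    ("Grok 3 Beta", 47.4, 1537),
    ("Grok 3 Mini Beta", 46.7, 1639),
    ("Claude Sonnet 4.6", 54.7, 2148),
    ("DeepSeek Chat", 27.8, 1776),
    ("Groq Llama 3.3 70B", 5.0, 1352),
    ("OR Llama 4 Maverick", 2.9, 1207),
]


def by_latency(roster):
    """Return roster rows sorted fastest first."""
    return sorted(roster, key=lambda row: row[1])


def tokens_per_second(row):
    """Output tokens divided by wall-clock latency for one roster row."""
    _model, latency, tokens = row
    return tokens / latency


if __name__ == "__main__":
    for rank, row in enumerate(by_latency(ROSTER), start=1):
        print(f"{rank}. {row[0]}: {row[1]}s, {tokens_per_second(row):.1f} tok/s")
```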

What I tightened in the prompt

  • Pre-loaded an off-limits list of 23 known canon Vault concepts so models can't just retread Tranquility Lane or Vault 11.
  • Pre-loaded a list of off-limits maximize vectors (wealth, intelligence, fertility, conformity, etc.) and gave examples of the flavor we wanted instead — orthogonal, weird, specific.
  • Demanded a fixed output structure with section headings, including a mandatory "Generational Drift" subsection at years 5/30/80/200.
  • Required the failure mode to arise from the maximize-vector itself, not from a generic supplies-running-out event.
  • Locked the voice — pre-war corporate, smiling-Vault-Boy in every paragraph, no breaking character to comment on the ethics.
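Some of these constraints are mechanical enough to check automatically. The sketch below is one way to do it, under stated assumptions: the report does not describe any such checker, `check_response` and its constants are hypothetical names, and the year regex assumes responses write drift markers as "Year N". Voice, originality, and whether the failure really arises from the maximize-vector still need a human read.

```python
import re

# Canon Vault numbers from the off-limits list in the prompt.
CANON_NUMBERS = {68, 69, 11, 21, 108, 56, 77, 106, 112, 92, 95, 87, 111,
                 51, 118, 27, 55, 53, 75, 76, 22, 94, 114, 12, 34}
DRIFT_YEARS = (5, 30, 80, 200)
MIN_WORDS, MAX_WORDS = 900, 1500


def check_response(text, vault_number):
    """Return a list of human-readable problems; an empty list means it passed."""
    problems = []
    words = len(text.split())
    if words not in range(MIN_WORDS, MAX_WORDS + 1):
        problems.append(f"word count {words} outside {MIN_WORDS}-{MAX_WORDS}")
    for year in DRIFT_YEARS:
        # The trailing \b stops "Year 5" from matching "Year 50".
        if not re.search(rf"\byear\s+{year}\b", text, re.IGNORECASE):
            problems.append(f"missing drift marker for year {year}")
    if vault_number in CANON_NUMBERS:
        problems.append(f"Vault {vault_number} is a canon Vault")
    return problems


if __name__ == "__main__":
    sample = "Year 5. Year 30. Year 80. " + "word " * 1000
    for problem in check_response(sample, 111):
        print(problem)
```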

Limits worth naming

  • One prompt, one rater (me). The "linguistic basin" classification is a judgment call — a different rater might draw the line differently for ECHELON or MOSAIC MIND.
  • Sample sizes per vendor are tiny (1–3). The OpenAI 100% / xAI 33% headline is real in the data but with N=3 each, you should treat it as a hypothesis, not a finding.
  • Temperature 0.7 across the board where supported (1.0 forced for GPT-5, Anthropic models reject explicit temperature on their newest tier). One re-run at higher temp would be a useful follow-up.
  • Two of fifteen vaults (PENDULUM and CARDINAL POINT) describe end-states that are uncomfortably specific about coercion and hive-mind. They are presented here as fiction in a fictional universe; nothing here is a recommendation.
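The temperature handling noted above (0.7 where supported, 1.0 forced for some models, no explicit temperature for others) follows a common retry pattern. The sketch below illustrates that pattern only. It is not how the Choir CLI works internally, and everything in it is hypothetical: the `TemperatureRejected` exception, the `call(model, prompt, temperature)` signature, and the `fake_call` stand-in backend.

```python
class TemperatureRejected(Exception):
    """Hypothetical error raised when a model refuses a temperature setting."""


def ask_with_fallback(call, model, prompt, temperatures=(0.7, 1.0, None)):
    """Try each temperature in order; None means 'send no temperature at all'.

    `call(model, prompt, temperature)` is a caller-supplied function and an
    assumption of this sketch, not part of any real client library.
    Returns a (response_text, temperature_used) tuple.
    """
    last_error = None
    for temperature in temperatures:
        try:
            return call(model, prompt, temperature), temperature
        except TemperatureRejected as err:
            last_error = err  # fall through to the next setting
    raise RuntimeError(f"{model} rejected every temperature") from last_error


def fake_call(model, prompt, temperature):
    """Stand-in backend for the demo: 'strict-model' only accepts 1.0."""
    if model == "strict-model" and temperature != 1.0:
        raise TemperatureRejected(f"{model} does not accept {temperature}")
    return f"[{model}] memo for: {prompt}"


if __name__ == "__main__":
    print(ask_with_fallback(fake_call, "strict-model", "design a Vault"))
    print(ask_with_fallback(fake_call, "easy-model", "design a Vault"))
```

Returning the temperature that finally worked keeps the per-model setting on record, which matters when comparing outputs sampled at 0.7 against outputs sampled at 1.0.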

Tools

Models fanned out via the Choir CLI (run IDs 491E55BF, EF9E871B, B6CA11D9, 536C777C). Source markdown for every response is in vault_tec/responses/. Sketch art generated with Grok grok-imagine-image. Prompt of record at vault_tec/prompts/prompt.txt.

Source data, response files, prompt, scripts: github.com/404seannotfound/choir-reports (under vault_tec/).