Scriptorium

A pipeline for building illustrated novels end-to-end. Brief in — story, art, layout, PDF, and audiobook out. The same multi-model fan-out and judging pattern as Choir, applied to making books.

10-stage pipeline | Multi-model fan-out | Internal judging panel | Folk-art · gouache · cut-paper · etc. | TTS audiobooks

Scriptorium takes a four-sentence brief and produces a finished illustrated novel: a readable HTML edition with art interleaved through the prose, a printable PDF, and a narrated audiobook with distinct voices for each character. The pipeline is ten short stages, each one a slash command. Most stages fan out across multiple premium models, run an internal judge to pick winners, and surface named elements (two-word evocative names like "Porch Rituals", "Threadbare Cloth") so the inspirer can mix and match across candidates instead of choosing between numbered options.

Everything stays on your Mac. The pipeline reads the brief from disk, calls models through Choir, the image-generation APIs, and ElevenLabs, and writes its outputs to a per-book folder. There is no Scriptorium server, no upload, and no cloud step you didn't approve.

How It Works — 10 Stages

Each stage is a slash command that consumes the previous stage's outputs and produces named, structured artifacts on disk. Most stages fan out across multiple models and run an internal judge to pick winners. The first six produce a menu; the last four bake the cake.

Stage 0

Interview

A short conversation captures the inspirer's intent. Title, concept, voice references, POV, tone, themes, boundaries, visual aesthetic. Captured in brief.md.

Stage 1+2

Arc fan-out + judging

Six structural shapes (tragic, comic, mystery, picaresque, ironic, wildcard) × three premium models = up to 18 candidate arcs. Two judges score each on inventiveness, coherence, stakes, illustrability, voice match. Top 5 named survivors.
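The fan-out and ranking arithmetic can be sketched in a few lines. This is a minimal illustration, not Scriptorium's code: the model names are placeholders, and both the scorecard layout and the mean-of-means aggregation are assumptions.

```python
from itertools import product
from statistics import mean

SHAPES = ["tragic", "comic", "mystery", "picaresque", "ironic", "wildcard"]
MODELS = ["model-a", "model-b", "model-c"]  # placeholder names (assumption)
CRITERIA = ["inventiveness", "coherence", "stakes", "illustrability", "voice_match"]

def fan_out(shapes, models):
    """One candidate slot per (shape, model) pair: 6 x 3 = 18."""
    return [{"shape": s, "model": m} for s, m in product(shapes, models)]

def panel_score(scorecards):
    """Average each judge's criteria, then average across judges (assumed aggregation)."""
    return mean(mean(card[c] for c in CRITERIA) for card in scorecards)

def top_survivors(candidates, scores, k=5):
    """scores[i] holds one scorecard per judge for candidates[i]."""
    ranked = sorted(
        zip(candidates, scores),
        key=lambda pair: panel_score(pair[1]),
        reverse=True,
    )
    return [candidate for candidate, _ in ranked[:k]]
```

Calling `fan_out(SHAPES, MODELS)` yields the 18 slots; `top_survivors` then keeps the five with the highest panel mean.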

Stage 3

Spread

Each survivor gets a chapter-by-chapter timeline, named characters, named subplots, named pivotal events, themes, and a cover subject. All five spread in parallel.

Stage 4

Style gallery

24 named visual styles in 6 families. Filtered by the brief (e.g. "no watercolor, no sketches" prunes 8 styles). Two sample illustrations per surviving style. The inspirer picks one in Stage 6.
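One way the brief filter could work is tag exclusion. The sketch below is illustrative only: the `tags` field, the two entries marked hypothetical, and the idea that the brief is reduced to a list of excluded tags are all assumptions.

```python
# Illustrative catalog; the "tags" field is an assumption, not Scriptorium's schema.
STYLES = [
    {"name": "Gouache Storybook", "family": "Kid-Book Painting", "tags": ["gouache"]},
    {"name": "Cut Paper", "family": "Kid-Book Painting", "tags": ["cut-paper"]},
    {"name": "Sunday Strip", "family": "Comics & Sequential", "tags": ["ink", "comic"]},
    {"name": "Wet Wash", "family": "(hypothetical)", "tags": ["watercolor"]},
    {"name": "Pencil Study", "family": "(hypothetical)", "tags": ["sketch"]},
]

def filter_styles(styles, excluded_tags):
    """Keep only styles that share no tag with the brief's exclusions."""
    excluded = {t.lower() for t in excluded_tags}
    return [s for s in styles if excluded.isdisjoint(t.lower() for t in s["tags"])]

# A brief saying "no watercolor, no sketches" becomes two excluded tags.
surviving = filter_styles(STYLES, ["watercolor", "sketch"])
```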

Stage 5

Preview site

A browsable static site assembles the 5 arcs with gantt charts, the style gallery, and a six-step lock-in wizard. The inspirer reviews and decides.

Stage 6

Lock-in

Commit to one spine arc, optional named elements mixed in from other arcs, one visual style, the cover, and the character roster. Cover and portraits get re-rendered in the chosen style.

Stage 7

Chapter weave

Each chapter is fanned out across two models and judged on momentum, voice, continuity, prose, and scene shape. The winner is promoted to chosen.md.

Stage 8

Illustrate the beats

For each chapter, the pipeline identifies six or more distinct narrative beats across six page positions (opening / early / midchapter / turning / climax / closing). Each beat is rendered once in the locked style and interleaved with the prose.
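Interleaving can be pictured as spreading the beat images evenly through the chapter's paragraphs. The sketch below assumes a chapter is a list of paragraph strings and that each beat lands after the paragraph at its proportional index; the function name and the placement rule are illustrative, not taken from Scriptorium.

```python
POSITIONS = ["opening", "early", "midchapter", "turning", "climax", "closing"]

def interleave(paragraphs, beats):
    """Return prose and image items in reading order.

    Beat i of k is placed after paragraph floor(i * (n - 1) / (k - 1)),
    so the first beat follows the first paragraph and the last beat
    follows the final paragraph.
    """
    n, k = len(paragraphs), len(beats)
    if n == 0:
        return [("image", b) for b in beats]
    slots = {}
    for i, beat in enumerate(beats):
        idx = 0 if k == 1 else (i * (n - 1)) // (k - 1)
        slots.setdefault(idx, []).append(beat)
    out = []
    for j, para in enumerate(paragraphs):
        out.append(("prose", para))
        out.extend(("image", b) for b in slots.get(j, []))
    return out
```

With twelve paragraphs and the six positions above, the images follow paragraphs 1, 3, 5, 7, 9, and 12.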

Stage 9

Compile

Cover + table of contents + chapters with beats inline + back matter, in one parchment-toned HTML. Chrome-headless print pass produces the PDF.
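A headless print pass of this kind is typically a single Chrome invocation with `--headless` and `--print-to-pdf`. The sketch below only shows the shape of that call: the file names are hypothetical, the binary path is the usual macOS install location, and Scriptorium's actual command line is not documented here.

```python
import subprocess
from pathlib import Path

# Usual macOS install location; adjust if Chrome lives elsewhere.
MAC_CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

def pdf_command(html_path, pdf_path, chrome=MAC_CHROME):
    """Build the argv for a headless print of one local HTML file."""
    html_uri = Path(html_path).resolve().as_uri()  # file:///... URI
    pdf_abs = Path(pdf_path).resolve()
    return [chrome, "--headless", f"--print-to-pdf={pdf_abs}", html_uri]

def print_to_pdf(html_path, pdf_path, chrome=MAC_CHROME):
    """Run Chrome and raise if it exits with a non-zero status."""
    subprocess.run(pdf_command(html_path, pdf_path, chrome), check=True)
```

Keeping `pdf_command` separate from `print_to_pdf` makes the argument list easy to inspect without launching a browser.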

Stage 10

Audiobook

Each chapter is segmented by Claude into voice-tagged dialogue / narration pairs. ElevenLabs renders each segment with the per-character voice. ffmpeg stitches segments with pauses.
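Stitching with pauses can be done with ffmpeg's concat demuxer by listing a short silence clip between consecutive segments. This is a sketch under assumptions: the file names are hypothetical, the silence clip is presumed to exist already in the same codec and sample rate as the segments (required for `-c copy`), and Scriptorium may assemble its audio differently.

```python
def entry(path):
    """One concat-demuxer line, with single quotes escaped for the list file."""
    escaped = str(path).replace("'", "'\\''")
    return f"file '{escaped}'"

def concat_list(segment_paths, silence_path):
    """List-file text: segments in order, with the silence clip between them."""
    lines = []
    for i, seg in enumerate(segment_paths):
        if i > 0:
            lines.append(entry(silence_path))
        lines.append(entry(seg))
    return "\n".join(lines) + "\n"

def stitch_command(list_path, out_path):
    """argv for ffmpeg's concat demuxer, copying streams without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(out_path)]
```

Write the `concat_list` output to a text file, then run the `stitch_command` argv with `subprocess.run` to produce one chapter file.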

The Lock-in Site

Stage 5 builds a browsable static site the inspirer reviews before committing to a spine arc, mix-in elements, visual style, cover, and cast. The pages are real artifacts from This Is Why We Can't Have Nice Things.

[Screenshot: Project landing page with five arc candidates as cards in a grid, each showing the cover thumbnail, structural shape, judge score, and pitch.]
Landing: the five arcs. Each surviving arc shows its cover thumbnail, structural shape, judge mean score, and a two-paragraph pitch. The inspirer clicks into any card for the full detail page.

[Screenshot: Detail page for the 'Porch Rituals' arc showing logline, hook / climax / resolution, cover thumbnail, and a horizontal gantt chart of named subplots with pivotal-event markers across nine chapters.]
Arc detail: gantt chart. Every arc detail page includes the full logline, hook / climax / resolution, and a gantt chart showing the named subplots' chapter spans plus pivotal-event markers. Click into one to read the whole spec.

[Screenshot: Visual styles gallery showing eleven named styles across five families: Comics & Sequential (Sunday Strip, Tintin Line, Riso Print), Kid-Book Painting (Gouache Storybook, Crayon Wax, Cut Paper), Print & Engraving, Decorative & Story-Specific, and Project-Specific.]
Visual style gallery. Two sample illustrations per style, all rendered from the same constant SUBJECT so the difference is purely the envelope. Filtered by the brief (this book said "no watercolor / no sketches" so eight styles got pruned).

[Screenshot: The lock-in wizard at step 1 of 6, "Pick the spine arc", showing five radio-button cards each with an arc cover and pitch. Back and Next buttons at the bottom.]
Six-step lock-in wizard. Pick a spine arc, then mix in named elements from the other four arcs, then pick a style, approve a cover, choose portraits, leave notes. Picks persist in localStorage; the final summary appears only after step 6.

Why It Works

Fan-out beats single-shot

A single Opus call gives you a good arc. Six structural shapes × three models × two judges gives you the best of 18, named and ranked. Same logic applies at every stage of the pipeline.

Names carry across stages

Every subplot, event, factor, and character gets a two-word evocative name ("Iron Letters", "Porch Rituals"). Named candidates are easy to discuss; numbered ones aren't. The chapter-weave prompt mentions mix-ins by name. The lineage doc reads like a recipe.

Beats, not banner art

Six or more distinct narrative moments per chapter, each rendered once and interleaved with the prose at proportional positions. The book reads as a graphic-novel-meets-prose hybrid, not a novel with a chapter heading image.

Inspirer-led picks

The pipeline produces a menu of named options at every decision point. The inspirer picks via a wizard with localStorage state. No model "decides" the arc, the cover, or the cast.

Style is reflexive

The visual style envelope wraps every chapter SUBJECT. For a book about a wagon, the style was folk-art painting on a wagon panel — the visual language echoes the narrative form.

Parallel where possible

Within each stage, work that can parallelize does. 73 illustrations don't render serially. 296 TTS segments don't render serially. End-to-end wall clock for a 9-chapter book: ~25–35 minutes.
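Because each render is an independent, network-bound API call, a thread pool is one straightforward way to run them concurrently. The sketch below is generic: `render_one` stands in for whatever function calls the image or TTS service, and the worker count is an arbitrary assumption rather than a Scriptorium setting.

```python
from concurrent.futures import ThreadPoolExecutor

def render_all(jobs, render_one, max_workers=8):
    """Run render_one over every job concurrently; results keep job order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(render_one, jobs))

# Stand-in renderer for illustration; a real one would call an API and save a file.
def fake_render(beat_number):
    return f"beat-{beat_number:03d}.png"

outputs = render_all(range(73), fake_render)
```

`pool.map` returns results in submission order, so the output list lines up with the beat list even when individual calls finish out of order.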

What's Under the Hood

Claude Opus 4.7

Primary writer. Arcs, chapter weave, beat identification, audiobook segmentation.

xAI Grok 4 · Google Gemini 3 Pro

Diversity in the fan-outs and on the second-judge panel. Gemini judging matches Opus quality at a fraction of the cost.

OpenAI gpt-image-1

Primary illustrator. Renders 1024×1024 PNG. Cleanest output for folk-art / gouache / kid-book traditions.

xAI grok-imagine · Imagen 4

Style variety. Grok for newspaper-comic and crayon-wax; Imagen for stained glass, linocut, lantern-ink.

ElevenLabs multilingual_v2

TTS. 11 premade voices per book, mapped per character. Renders in seconds per segment.

Choir

The CLI that routes prompts to any model. Scriptorium shells out to choir ask at every LLM step.