April and the Dragon Staff v2
A natural-prose retelling of the same story and arc as v1, with the actual flow-arts safety ritual the inspirer practices (perimeter, eye contact with your safety, close the fuel, spin off, call the body part if you catch fire, smother) and Cypher — an older tiger who spins poi — as April's first teacher. Same 98 Pixar-storybook illustrations; the form follows the moment now rather than the meter.
The numbers
A natural-prose retelling of v1, same arc and same illustrations. 37% shorter in words, 20 minutes shorter in audio. The verse stayed only where it earned its keep — about a third of v1's line count.
The form follows the moment
v1 was strict AABB throughout: every couplet rhymed. v2 drops the rule and lets the prose breathe. Verse stays only where it earns its keep: April's teaching chants, the friends' echoed responses, the final reversed chorus. The climax in v1 was two blank lines where Mango's rhyme should have been. The climax in v2 is Mango's confession the morning after, stripped of every exclamation mark she usually carries.
The friends sat close on the cobblestones — closer than they had sat all week. No one spoke. A lantern across the square guttered and steadied. Mango's wings were folded so tight they almost disappeared. She spoke first. "I didn't close the fuel." Her voice was low, stripped of every exclamation mark. "I knew the step. I've sung it back a hundred times. But my hands were already reaching for the spin, and my mouth said the words and my hands did not do them."
The cast — and their voices
Plus George (Narrator). Cypher replaces Laurel from v1 — he is based on the inspirer's real teacher; he spins poi on chains (not staff), and his voice is Bill — a wise mature register.
Cost breakdown
v2 reused all 98 v1 illustrations + the cover, so image-gen cost was almost nothing — just one new Cypher portrait (~$0.07) and zero re-renders. TTS dropped from 490 to 400 segments. The big metered cost line stays ElevenLabs.
| Bucket | Volume | Out-of-pocket | At-full-rates |
|---|---|---|---|
| xAI grok-imagine (grok-imagine-image): just 1 new Cypher portrait (older tiger replacing v1's Laurel); all 98 beats + cover + 5 friend portraits reused from v1 | 1 image | ~$0 (xAI credit grant) | ~$0.07 |
| Claude Code orchestration (claude-opus-4-6): v2 chapter weave + segmentation; arc/judges/spread reused from v1 | ~3M tokens | $0 (Claude Pro/Code subscription) | ~$10 |
| ElevenLabs TTS (eleven_multilingual_v2): 400 voice-tagged segments across 7 voices; zero quota issues this time (took 104 s end to end) | 400 requests · 33k chars | ~$6.00 ($0.015/request, Creator subscription) | ~$6 |
| Choir LLM calls (claude-opus-4-6): v2 chapter weave (10 in parallel) + v2 audio segmentation (10 in parallel) | ~20 calls · ~1.2M tokens | ~$0 | ~$2 |
| Total this version | | ~$6 | ~$28 |
The v2 savings. v2 didn't run arc fan-out, judging, spread, character generation, factor generation, cover render, or 5 of the 6 portrait renders — all of those were inherited from v1. The only fresh model calls were the 10 chapter weaves (with the new style guide + new safety steps), 10 audio segmentations, and 1 new Cypher portrait. ElevenLabs was the only line that scaled with the new prose, and it dropped from 490 to 400 segments — call-and-response chapters used to require a tag-switch per couplet; natural prose lets the narrator carry longer stretches before a voice swaps in.
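The ElevenLabs line is simple per-request arithmetic, so it can be checked in a few lines. This is a minimal sketch that assumes the flat $0.015/request rate quoted in the table is the whole cost model; the function name is illustrative, not part of the pipeline.

```python
from decimal import Decimal

# Per-request rate on the Creator subscription, as quoted in the cost table.
PER_REQUEST = Decimal("0.015")

def tts_cost(requests: int) -> Decimal:
    """Metered TTS cost in dollars for a given number of requests."""
    return PER_REQUEST * requests

v1 = tts_cost(490)   # v1: call-and-response verse
v2 = tts_cost(400)   # v2: natural prose
print(f"v1 ${v1:.2f}, v2 ${v2:.2f}, saved ${v1 - v2:.2f}")
```

Decimal is used instead of float so the cents come out exact.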
Timing
v2 was fast because most stages were reused from v1. Active runtime for the new work: about 10 minutes.
| Stage | What runs | Wall clock |
|---|---|---|
| 0 · v2 brief capture | Inspirer notes: drop AABB, real safety steps, Cypher (older tiger) replaces Laurel, compress | ~3 min |
| 1–5 · Arcs / judges / spread / style / preview | Reused from v1 as-is. Same spine arc (Silent Echo), same mix-ins, same factors, same timeline. | — |
| 3' · Cast tweak | Replace Laurel with Cypher in characters.json; update style envelope's per-character anchor block | < 5 s |
| 6 · Cypher portrait | 1 new render via Grok-Imagine (other 5 portraits + cover reused) | ~30 s |
| 7 · v2 chapter weave | 10 parallel chapters with the new natural-prose style guide + new safety steps | ~90 s |
| 8 · Beats | Reused all 98 illustrations from v1. The visual moments map cleanly to the prose retelling. | — |
| 9 · v2 compile | novel.html + Chrome-headless PDF | ~10 s |
| 10a · v2 audio segmentation | 10 parallel prose-aware segmentations (Cypher tag swapped in for Laurel) | ~60 s |
| 10b · v2 TTS render | 400 ElevenLabs segments, 5 concurrent, zero quota issues | 104 s |
| 10c · v2 stitch + tag + inject | Same chime + stitch + ID3 polish + player injection pipeline as v1 | ~30 s |
| 11 · v2 publish | scripts/publish-book.sh april-and-the-dragon-staff-v2 | ~60 s |
| Total v2-only work (excluding inherited stages) | | ~10 min |
What changed in v2
Same story, same arc, same illustrations. Different prose, different safety steps, different teacher.
Form: AABB → natural prose with selective verse
Safety ritual: invented → real flow-arts steps
Teacher: Laurel → Cypher (older tiger, poi on chains)
Length: 59 min → 38 min audiobook (text 37% shorter)
What stayed the same
The tooling
Same stack as v1. The only voice swap: Matilda → Bill for Cypher (mature mentor register for the older tiger).
choir
Sean's CLI for routing prompts to any model. v2 used it for the 10-chapter weave and 10-chapter audio segmentation only — arc fan-out and judging panel were inherited from v1.
Anthropic Opus 4.6
Claude Opus 4.6
Wrote every chapter, segmented every chapter for TTS, identified beats. v2 used the same model as v1 but with a different per-project style guide (drop AABB; verse only where it serves).
~$3 / 1M input · $15 / 1M output
xAI grok-imagine
Just 1 new render in v2: Cypher's portrait (older tiger spinning poi). All 98 beats + cover + 5 friend portraits reused from v1's renders.
~$0.07 per image
ElevenLabs multilingual_v2
7 voices: George (narrator), Sarah (April + chorus), Jessica (Mango), Lily (Fern), Laura (Riddle), Roger (Dusty), Bill (Cypher) — swapped in for Matilda from v1 to give Cypher a mature-male-mentor register fitting an older tiger.
$0.015 per request on Creator subscription
ffmpeg
Same three-note bell chime synthesized in v1 (C4 → E4 → G4 with reverb tail). Same per-chapter concat + ID3 metadata sidecar with chapter markers.
Free
mutagen
Same ID3v2.3 polish pass as v1: APIC cover, TIT2/TALB/TPE1/TCOM/TYER, named CTOC with TOP_LEVEL+ORDERED flags. Per-chapter mp3s tagged TRCK n/10.
Free
Chrome --headless --print-to-pdf
Same setup as v1. v2 PDF is 38 MB at full-resolution beat JPEGs and lives only in the source repo.
Free
scripts/publish-book.sh
The publish pipeline introduced for v1's publish — copies assets jpeg→jpg, re-encodes audio to 64 kbps mono 22050 Hz, re-tags ID3v2.3 + imports CHAP frames, rebuilds web novel.html, inserts the book card. v2 ran it with the slug april-and-the-dragon-staff-v2 in ~60 s.
Free
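The web re-encode step can be sketched as a single ffmpeg call. The snippet below only builds the argument list; the flags are standard ffmpeg options, but the file names are placeholders and the real publish-book.sh may pass different or additional options.

```python
import shlex

def reencode_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command for 64 kbps mono 22050 Hz web audio."""
    return [
        "ffmpeg", "-y",      # overwrite the output if it exists
        "-i", src,
        "-ac", "1",          # downmix to mono
        "-ar", "22050",      # resample to 22050 Hz
        "-b:a", "64k",       # target audio bitrate
        dst,
    ]

# Placeholder file names, for illustration only.
cmd = reencode_cmd("audiobook-master.mp3", "audiobook.mp3")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```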
Read it / hear it
Per-chapter mp3s live inside the novel page. Audio files are also in /books/april-and-the-dragon-staff-v2/audio/. The master audiobook.mp3 has ID3v2.3 chapter markers + embedded cover artwork — Apple Podcasts, Overcast, and Pocket Casts will show a clickable chapter list.
Want to see the original? Click v1 in the tabs above.
April and the Dragon Staff
A 10-chapter children's picture book in strict AABB rhymed verse — call-and-response between April the rainbow koala and her four friends, with 98 Pixar-storybook illustrations and a 59-minute audiobook narrated with 7 voices. The form does the moral: at the climax, the skipped friend's response couplet becomes a literal silence on the page and in the ear.
The numbers
A four-sentence brief became a 59-minute illustrated audiobook in strict rhymed verse. Here's the count of everything that got generated along the way.
The form does the moral
The book is in strict AABB rhymed verse with a call-and-response structure: April issues a call couplet, and the friends issue a response couplet. The whole book sings this back and forth — until chapter 8, when one friend skips a step in the safety ritual and her response couplet simply doesn't come. The page holds two blank lines where the rhyme should land. The audiobook holds 2.8 seconds of silence. Then the troupe catches the mistake together, and from chapter 10 the ritual returns — but inverted: the friends call, and April responds. The form earns the closing.
"The dragon staff has three bright wicks,
and fire is not for careless tricks."
"We hear you, April, loud and true —
now teach us what the dragon can do!"
…
"Now shake off the extra before the fire can grow.
Three flicks from the wrist and the excess falls free —
the burn-off comes FIRST. Mango — answer me."
(silence — two blank lines)
The night holds its breath. Not a bell. Not a sound.
Mango's wings are already mid-sweep off the ground.
The cast — and their voices
Plus George (Narrator) — warm British storyteller, the constant voice across all 10 chapters.
Cost breakdown
Two ways to read this. Out-of-pocket is what actually hit a credit card. At-full-rates is what a fresh account with no credits or subscriptions would have paid for the same work.
| Bucket | Volume | Out-of-pocket | At-full-rates |
|---|---|---|---|
| xAI grok-imagine (grok-imagine-image): cover + 6 character portraits + 98 chapter beats + 1 style verification render | 106 images | ~$0 (xAI credit grant) | ~$7 |
| Claude Code orchestration (claude-opus-4-6): the agent loop running the pipeline; heavy prompt caching brings effective rate to ~$3/M | ~5M tokens | $0 (Claude Pro/Code subscription) | ~$15 |
| ElevenLabs TTS (eleven_multilingual_v2): 490 voice-tagged segments across 7 voices; 463 rendered before the monthly Creator quota hit, 27 finished after top-up | 490 requests · 52k chars | ~$7.35 ($0.015/request, Creator subscription) | ~$10 |
| Choir LLM calls (claude-opus-4-6 + gemini-2.5-pro + gpt-4.1): arc fan-out (6 shapes × 3 models), 2-judge panel, spread, chapter weave, beats ID, audiobook segmentation | ~55 calls · ~2M tokens | ~$0 | ~$3 |
| Total this book | | ~$8 | ~$35 |
What absorbed the cost. xAI's image-gen credit grant covers all 106 renders here (this book is xAI-only on images — Grok-Imagine nails the Pixar Storybook 3D aesthetic on the first try, so OpenAI didn't need to run). ElevenLabs lands at $0.015 per request on Sean's Creator subscription — the real out-of-pocket line on this book. 490 requests came from the call-and-response form: every voice switch is its own segment, so the segmentation count is denser than a prose audiobook. Claude Code is a flat monthly subscription that covers all the agent's planning and tool calls.
What a fresh account would pay. ~$35 at-full-rates is what you'd spend if you opened brand-new accounts at every provider and rebuilt this exact book. That is about a sixth less than Can't Have Nice Things (~$42), chiefly because this book renders illustrations only with Grok-Imagine ($0.07/image) instead of OpenAI's gpt-image-1 ($0.04–$0.17/image), and because the audiobook is half the length.
Timing
Wall-clock time for each stage. Everything that could parallelize did. Total active runtime from "go" to a finished audiobook: roughly 20 minutes (plus a brief pause mid-way through chapter 10 to top up ElevenLabs).
| Stage | What runs | Wall clock |
|---|---|---|
| 0 · Interview | Capture the four-sentence seed brief + 4 follow-up choices (rhyme, cast, stakes, setting) | ~3 min |
| 1 + 2 · Arcs + judge | 18 parallel arc generations (6 shapes × 3 models), then a 2-judge panel scoring all 18 against the brief | ~2 min |
| 3 · Spread | Timeline + factors + cover subject in parallel, then character profiles (depends on timeline) | ~2 min |
| 4 · Style gallery | Skipped. Style pre-locked to Pixar Storybook 3D by a reference image during the interview, then verified with a single test render. | — |
| 5 · Preview site | Skipped. Auto-selected the highest-judged arc (Silent Echo, both judges agreed by a 6-point margin) and grafted the best mix-ins from the other top arcs. | — |
| 6 · Lock-in (cover + portraits) | 1 cover + 6 character portraits in parallel | ~30 s |
| 7 · Chapter weave | 10 parallel chapters; each prompt embeds the full arc + timeline + characters + 4 mix-ins + AABB style guide | ~90 s |
| 8a · Beat identification | 10 parallel beat-ID calls; output 98 distinct beats (8–11 per chapter) | ~90 s |
| 8b · Beat illustration | 98 Grok-Imagine renders, 8 concurrent via Python ThreadPoolExecutor — zero failures | 92 s |
| 9 · Compile | novel.html assembly + Chrome-headless print to PDF | ~10 s |
| 10a · Audio segmentation | 10 parallel verse-aware segmentations (never split mid-couplet; chapter-8 blank-echo preserved as a 2.8 s silence segment) | ~60 s |
| 10b · TTS render | 490 ElevenLabs segments via eleven_multilingual_v2, 5 concurrent (Creator tier). 463 in the first pass; quota cap; ~30 s pause to top up; 27 more in 10 s. | ~2:30 + pause |
| 10c · Chime + stitch + tag + inject | Synth 3-note bell chime (ffmpeg sine + reverb), stitch 10 chapter mp3s + master, embed ID3v2.3 chapter markers + cover art, inject audio players into novel.html | ~45 s |
| Total (active runtime, end to end) | | ~15 – 20 min |
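Stage 8b's fan-out (98 renders, 8 at a time) follows the usual ThreadPoolExecutor pattern. The sketch below shows the shape of it; render_beat is a hypothetical stand-in that just returns a file name, since the real render call is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def render_beat(beat_id: int) -> str:
    """Hypothetical stand-in for one image render; returns an output file name."""
    return f"beat-{beat_id:03d}.jpeg"

def render_all(beat_ids, max_workers: int = 8):
    """Run renders with bounded concurrency, collecting results and failures."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(render_beat, b): b for b in beat_ids}
        for fut in as_completed(futures):
            beat = futures[fut]
            try:
                results[beat] = fut.result()
            except Exception as exc:  # keep going; report failures at the end
                failures[beat] = exc
    return results, failures

results, failures = render_all(range(1, 99))  # beats 1..98
print(len(results), "rendered,", len(failures), "failed")
```

Collecting failures per beat, rather than raising on the first one, makes it cheap to retry only the renders that did not land.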
The process
Ten stages, each a separate slash command or script. The early ones generate a menu of options; the late ones commit to picks and bake the cake. Two stages were skippable for this book — the style was pre-locked by a reference image, and the preview wizard was auto-resolved because the judges agreed by a wide margin.
0 · Capture the seed (/scriptorium-interview)
brief.md and every downstream prompt reads it.
1 + 2 · Wide arc fan-out + judging (/scriptorium-arcs)
3 · Spread the survivor + grafts (/scriptorium-spread)
4 · Visual style: skipped
styles/chosen.md.
5 · Preview wizard: auto-resolved
6 · Lock-in: cover + portraits (/scriptorium-lock)
arc.md, mix-ins → arc_mixins.md, characters → characters.json, style envelope → styles/chosen.md. Then the cover (April calling, three friends responding, festival lanterns at blue-hour dusk; locked on the first render) and 6 character portraits (April, Mango, Fern, Riddle, Dusty, Laurel) rendered in parallel via Grok-Imagine.
7 · Weave the chapters (/scriptorium-chapter N)
8 · Illustrate the beats (/scriptorium-illustrate N)
9 · Compile the novel (/scriptorium-compile)
10 · Audiobook (/scriptorium-audiobook)
multilingual_v2 renders 490 segments using 7 voices (Sarah for April, Jessica for Mango, Lily for Fern, Laura for Riddle, Roger for Dusty, Matilda for Laurel in chapter 9, and George as the constant narrator). Each chapter opens with a synthesized three-note bell chime (ffmpeg sine + aecho, C4 → E4 → G4 with a soft reverb tail), a small audio sting that gives chapter transitions proper audiobook weight. ffmpeg concat stitches per-chapter mp3s (0.4 s between segments, 1.2 s + chime + 1.0 s between chapters); ID3v2.3 chapter markers + embedded APIC cover artwork are baked in so podcast apps surface a clickable chapter list. Audio players are then injected into novel.html as a hero player after the cover plus a per-chapter player above each chapter.
11 · Publish (scripts/publish-book.sh <slug>): new pipeline step
scripts/publish-book.sh codifies the 7 publish steps documented in PROCESS.md §6: copy assets (jpeg → jpg), re-encode the audiobook to 64 kbps mono 22050 Hz for web delivery (54 MB → 28 MB, under the GitHub Pages 100 MB cap), re-tag with ID3v2.3 + import CHAP frames from the source master, rebuild novel.html with retargeted paths and a sean-makes-stuff book-nav, insert a .book-card in books/index.html, and bump the "N books shipped" count on the root page. --minimal (default) is what this book shipped under; --full additionally generates this very detail page.
The tooling
Every step is a shell-out, not a library binding — the pipeline is a stack of bash scripts and Python helpers. Sibling of choir and scriptorium.
choir
Sean's CLI for routing prompts to any model across providers. Single-model: plain text out. Multi-model: JSON. --save persists a comparison run; choir runs compare appends a judge summary later.
Anthropic Opus 4.6 · OpenAI GPT-4.1 · Google Gemini 2.5 Pro
Claude Opus 4.6
Primary writer. Wrote every chapter, identified every beat, and segmented every chapter for TTS. All 5 top-judged arcs and all 5 chosen mix-ins came from Opus runs; GPT-4.1 and Gemini arcs all collapsed against the Opus equivalents in the judging panel.
~$3 / 1M input · $15 / 1M output (with heavy prompt caching across chapter weaves)
Gemini 2.5 Pro
The second judge in the arc panel — different model family for diversity. Both judges ranked Silent Echo #1 by a 6-point margin, and the GPT-4.1 / Gemini arc proposals all collapsed against the Opus ones.
~$1.25 / 1M tokens
xAI grok-imagine
All 106 images: the cover, 6 character portraits, 98 chapter beats, plus 1 style verification render. Grok-Imagine nailed Pixar Storybook 3D on the first try: rainbow gradient fur, gentle subsurface scattering, warm rim lighting, soft painterly post-processing.
~$0.07 per image
ElevenLabs multilingual_v2
7 voices from the premade catalog: George (warm British storyteller, narrative_story) as the narrator; Sarah, Jessica, Lily, Laura, Roger, Matilda for April + 4 friends + Laurel. Voice settings: stability 0.55, similarity_boost 0.78, style 0.15 (a slight theatrical lift for verse meter).
~$0.18 – $0.30 per 1,000 characters
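For reference, the voice settings above map onto the body of an ElevenLabs text-to-speech request roughly as follows. This sketch only builds the URL and JSON body and sends nothing; the voice ID is a placeholder, and the pipeline's actual request code may be structured differently.

```python
import json

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"

# Settings quoted above: a slight theatrical lift for verse meter.
VOICE_SETTINGS = {"stability": 0.55, "similarity_boost": 0.78, "style": 0.15}

def build_tts_request(voice_id: str, text: str) -> tuple[str, dict]:
    """Return (url, json_body) for one voice-tagged segment."""
    url = f"{API_BASE}/{voice_id}"
    body = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": dict(VOICE_SETTINGS),
    }
    return url, body

# "VOICE_ID_PLACEHOLDER" is not a real voice ID.
url, body = build_tts_request("VOICE_ID_PLACEHOLDER", "The dragon staff has three bright wicks,")
print(url)
print(json.dumps(body, indent=2))
# A real call would POST this body with the API key in the xi-api-key header.
```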
ffmpeg
Chapter intro chime synthesized from three sine waves (C4 / E4 / G4 with staggered adelay, exponential afade envelopes, soft aecho reverb tail, 4.5 kHz lowpass for warmth). Per-chapter concat with anullsrc silence files; ID3 chapter markers from a metadata sidecar.
Free
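One way to assemble a chime like this is a single ffmpeg filter graph over three lavfi sine sources. The sketch below builds such a command using the filters named above (afade, adelay, aecho, lowpass) plus amix to combine the notes; the note length, stagger, and echo parameters are guesses for illustration, not the values used for the book.

```python
# Equal-tempered pitches for the three chime notes.
NOTES_HZ = {"C4": 261.63, "E4": 329.63, "G4": 392.00}

def chime_cmd(out_path: str, note_len: float = 1.5, stagger_ms: int = 250) -> list[str]:
    """Build an ffmpeg command that mixes three staggered, fading sine notes."""
    cmd = ["ffmpeg", "-y"]
    for hz in NOTES_HZ.values():
        cmd += ["-f", "lavfi", "-i", f"sine=frequency={hz}:duration={note_len}"]

    chains = []
    for i in range(len(NOTES_HZ)):
        # Exponential fade-out gives each note a bell-like decay.
        chain = f"[{i}:a]afade=t=out:st=0:d={note_len}:curve=exp"
        if i > 0:
            # Delay later notes so they enter one after another (mono input).
            chain += f",adelay={i * stagger_ms}"
        chains.append(chain + f"[n{i}]")

    labels = "".join(f"[n{i}]" for i in range(len(NOTES_HZ)))
    # Mix the notes, add a short echo tail, then soften with a 4.5 kHz lowpass.
    chains.append(
        f"{labels}amix=inputs={len(NOTES_HZ)}:duration=longest,"
        "aecho=0.8:0.7:60:0.3,lowpass=f=4500[out]"
    )
    cmd += ["-filter_complex", ";".join(chains), "-map", "[out]", out_path]
    return cmd

cmd = chime_cmd("chime.mp3")  # placeholder output name
print(" ".join(cmd))
```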
mutagen
Re-encode strips ID3, so a polish pass writes ID3v2.3 (some podcast apps drop CHAP markers in v2.4): APIC cover artwork, TIT2/TALB/TPE1/TCOM/TYER, named CTOC with TOP_LEVEL | ORDERED flags. Per-chapter mp3s get TRCK n/10 so apps show them as a series.
Free (Python lib)
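CHAP frames carry start and end times in milliseconds, so the chapter markers come down to cumulative offsets over the stitched chapter files. The helper below is a minimal sketch of that arithmetic, assuming each chapter's duration already includes its chime and that a fixed gap separates chapters; the function name, durations, and gap value are illustrative.

```python
def chapter_markers(chapters, gap_s: float = 2.2):
    """Compute CHAP-style start/end times in ms from (title, duration_s) pairs.

    Assumes each duration already includes the chapter's opening chime and
    that gap_s seconds of silence separate consecutive chapters.
    """
    gap_ms = round(gap_s * 1000)
    markers, cursor_ms = [], 0
    for title, duration_s in chapters:
        duration_ms = round(duration_s * 1000)
        markers.append(
            {"title": title, "start_ms": cursor_ms, "end_ms": cursor_ms + duration_ms}
        )
        cursor_ms += duration_ms + gap_ms
    return markers

# Made-up durations, for illustration only.
markers = chapter_markers([("Chapter 1", 300.0), ("Chapter 2", 360.5)])
for m in markers:
    print(m["title"], m["start_ms"], m["end_ms"])
```

Working in integer milliseconds from the start avoids floating-point drift accumulating across ten chapters.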
Chrome --headless --print-to-pdf
Print CSS hides the PDF nav bar and audio players; chapter sections page-break-before, beat figures page-break-inside avoid. The PDF is 38 MB at full-resolution beat JPEGs and lives only in the source repo — too big for GitHub Pages.
Free
Read it / hear it
Per-chapter mp3s live inside the novel page — each chapter has its own <audio> player above the verse. The chapter audio files are also in /books/april-and-the-dragon-staff/audio/ if you want to drop them into a podcast app or pull them down individually. The master audiobook.mp3 has ID3v2.3 chapter markers + embedded cover artwork, so apps like Apple Podcasts, Overcast, and Pocket Casts will show a clickable chapter list.