You're reading v2 — natural prose retelling with the inspirer's actual flow-arts safety ritual and Cypher (an older tiger who spins poi) as April's first teacher. Same arc and same 98 illustrations as v1; the prose breathes more, the verse only lands where it earns its keep, and the audiobook is 20 minutes shorter.
Books · v2 retelling · Generated end-to-end with LLMs and image/audio models

April and the Dragon Staff v2

A natural-prose retelling of the same story and arc as v1, with the actual flow-arts safety ritual the inspirer practices (perimeter, eye contact with your safety, close the fuel, spin off, call the body part if you catch fire, smother) and Cypher — an older tiger who spins poi — as April's first teacher. Same 98 Pixar-storybook illustrations; the form follows the moment now rather than the meter.

The numbers

A natural-prose retelling of v1, same arc and same illustrations. 37% shorter in words, 20 minutes shorter in audio. The verse stayed only where it earned its keep — about a third of v1's line count.

6,209 words
356 verse lines
10 chapters
98 chapter beats
38:46 audiobook duration
7 distinct voices
400 TTS segments
~55 LLM calls
33k TTS characters
37% shorter than v1 (in words)

The form follows the moment

v1 was strict AABB throughout — every couplet rhymed. v2 drops the rule and lets the prose breathe. Verse stays only where it earns its keep: April's teaching chants, the friends' echoed responses, the final reversed chorus. The climax in v1 was a blank line where Mango's rhyme should have been. The climax in v2 is Mango's confession the morning after, stripped of every exclamation mark she usually carries.

The friends sat close on the cobblestones — closer than they had sat all week. No one spoke. A lantern across the square guttered and steadied.

Mango's wings were folded so tight they almost disappeared.

She spoke first.

"I didn't close the fuel." Her voice was low, stripped of every exclamation mark. "I knew the step. I've sung it back a hundred times. But my hands were already reaching for the spin, and my mouth said the words and my hands did not do them."

The cast — and their voices

April (voice: Sarah) · Rainbow koala teacher
Mango (voice: Jessica) · Fastest fruit bat
Fern (voice: Lily) · Careful tree frog
Riddle (voice: Laura) · Questioning parrot
Dusty (voice: Roger) · Steady wombat
Cypher (voice: Bill) · April's old teacher (Ch. 9); older tiger, spins poi

Plus George (Narrator). Cypher replaces Laurel from v1 — he is based on the inspirer's real teacher; he spins poi on chains (not staff), and his voice is Bill — a wise mature register.

Cost breakdown

v2 reused all 98 v1 illustrations + the cover, so image-gen cost was almost nothing — just one new Cypher portrait (~$0.07) and zero re-renders. TTS dropped from 490 to 400 segments. The big metered cost line stays ElevenLabs.

~$6 out-of-pocket · ~$28 at full provider rates

| Bucket (model) | Details | Volume | Out-of-pocket | At-full-rates |
| --- | --- | --- | --- | --- |
| xAI grok-imagine (grok-imagine-image) | Just 1 new Cypher portrait (older tiger replacing v1's Laurel); all 98 beats + cover + 5 friend portraits reused from v1 | 1 image | ~$0 (xAI credit grant) | ~$0.07 |
| Claude Code orchestration (claude-opus-4-6) | v2 chapter weave + segmentation; arc/judges/spread reused from v1 | ~3M tokens | $0 (Claude Pro/Code subscription) | ~$10 |
| ElevenLabs TTS (eleven_multilingual_v2) | 400 voice-tagged segments across 7 voices; zero quota issues this time (took 104 s end to end) | 400 requests · 33k chars | ~$6.00 ($0.015/request, Creator subscription) | ~$6 |
| Choir LLM calls (claude-opus-4-6) | v2 chapter weave (10 in parallel) + v2 audio segmentation (10 in parallel) | ~20 calls · ~1.2M tokens | ~$0 | ~$2 |
| Total this version | | | ~$6 | ~$28 |

The v2 savings. v2 didn't run arc fan-out, judging, spread, character generation, factor generation, cover render, or 5 of the 6 portrait renders — all of those were inherited from v1. The only fresh model calls were the 10 chapter weaves (with the new style guide + new safety steps), 10 audio segmentations, and 1 new Cypher portrait. ElevenLabs was the only line that scaled with the new prose, and it dropped from 490 to 400 segments — call-and-response chapters used to require a tag-switch per couplet; natural prose lets the narrator carry longer stretches before a voice swaps in.
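
The ElevenLabs line is plain per-request arithmetic. A minimal sketch, assuming the $0.015-per-request rate quoted in the table applies uniformly to every segment (the function name `tts_cost` is illustrative, not part of the pipeline):

```python
from decimal import Decimal

# Per-request rate quoted in the cost table (Creator subscription).
RATE_PER_REQUEST = Decimal("0.015")


def tts_cost(segments: int, rate: Decimal = RATE_PER_REQUEST) -> Decimal:
    """Out-of-pocket TTS cost when every segment is one billed request."""
    return (rate * segments).quantize(Decimal("0.01"))


v1_cost = tts_cost(490)  # v1: a tag switch per couplet, so more segments
v2_cost = tts_cost(400)  # v2: the narrator carries longer stretches
print(v1_cost, v2_cost, v1_cost - v2_cost)
```

Because billing here is per request rather than per character, the segment count is the figure that moves the out-of-pocket line.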

Timing

v2 was fast because most stages were reused from v1. Active runtime for the new work: about 10 minutes.

| Stage | What runs | Wall clock |
| --- | --- | --- |
| 0 · v2 brief capture | Inspirer notes: drop AABB, real safety steps, Cypher (older tiger) replaces Laurel, compress | ~3 min |
| 1–5 · Arcs / judges / spread / style / preview | Reused from v1 as-is. Same spine arc (Silent Echo), same mix-ins, same factors, same timeline. | |
| 3' · Cast tweak | Replace Laurel with Cypher in characters.json; update style envelope's per-character anchor block | < 5 s |
| 6 · Cypher portrait | 1 new render via Grok-Imagine (other 5 portraits + cover reused) | ~30 s |
| 7 · v2 chapter weave | 10 parallel chapters with the new natural-prose style guide + new safety steps | ~90 s |
| 8 · Beats | Reused all 98 illustrations from v1. The visual moments map cleanly to the prose retelling. | |
| 9 · v2 compile | novel.html + Chrome-headless PDF | ~10 s |
| 10a · v2 audio segmentation | 10 parallel prose-aware segmentations (Cypher tag swapped in for Laurel) | ~60 s |
| 10b · v2 TTS render | 400 ElevenLabs segments, 5 concurrent, zero quota issues | 104 s |
| 10c · v2 stitch + tag + inject | Same chime + stitch + ID3 polish + player injection pipeline as v1 | ~30 s |
| 11 · v2 publish | scripts/publish-book.sh april-and-the-dragon-staff-v2 | ~60 s |
| Total v2-only work (excluding inherited stages) | | ~10 min |

What changed in v2

Same story, same arc, same illustrations. Different prose, different safety steps, different teacher.

Form: AABB → natural prose with selective verse
v1 was anapestic tetrameter with AABB couplets, every page. v2 drops the rule and writes in natural English, letting verse appear only where the moment calls for it — April's teaching chants, the friends' echoed responses, the final reversed chorus in chapter 10. The chapter writer was given an explicit instruction: read your output aloud — if a couplet doesn't sing, write it as prose. Result: 356 verse lines down from 1,083; 6,209 words down from 9,895 (37% shorter); audiobook 38:46 down from 58:41 (20 min shorter).
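
The reductions quoted above can be re-derived from the raw counts. A small sketch using only the figures stated on this page; the helper names are illustrative:

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' duration string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_shorter(new: float, old: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return (1 - new / old) * 100


words = percent_shorter(6_209, 9_895)
verse = percent_shorter(356, 1_083)
audio = percent_shorter(to_seconds("38:46"), to_seconds("58:41"))
saved = to_seconds("58:41") - to_seconds("38:46")

print(f"words: {words:.0f}% shorter")
print(f"verse lines: {verse:.0f}% shorter")
print(f"audio: {audio:.0f}% shorter ({saved // 60}:{saved % 60:02d} saved)")
```

The word count and the running time shrink by slightly different percentages, so the 37% figure describes words rather than audio.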
Safety ritual: invented → real flow-arts steps
v1's ritual was AI-imagined (wet towel, pat-down, breath count, burn-off). v2 uses the inspirer's actual flow-arts safety ritual: perimeter, eye contact with your safety, close the fuel, spin off (or spin slow), call the body part if you catch fire, smother, rotate. The climax skip is now Mango forgetting to close the fuel can — an open fuel can near fire is the most common preventable disaster in real flow arts. Her safety (Dusty) yells "ARM!" — never "left arm", because every spinner is dyslexic in a panic. Three seconds. Yell again. Mango can't put it out. Mango lies down. Dusty smothers. The fire goes out because we take away the air. SMOTHER is treated with reverence — not metaphor but physics.
Teacher: Laurel → Cypher (older tiger, poi on chains)
v1's Laurel was a silver-grey koala who spun dragon staff like April. v2's Cypher is an older tiger who spins poi — twin wicks on chains. Cypher is grounded in the inspirer's real first teacher. The ritual translates across props — Cypher taught April on poi, she carried it to the staff she chose. In the chapter 9 reveal, April retells the memory of Cypher in mostly prose with a few verse moments; his voice (when she retells what he said) is low, unhurried, with strong nouns: "Perimeter. Eye contact. Close the fuel." Cypher's voice in the audiobook is Bill (wise mature mentor register), replacing Matilda from v1.
Length: 59 min → 39 min audiobook (34% shorter; word count 37% shorter)
The inspirer asked for ~30 minutes; the model came in at 38:46 because the prose was too tight to cut further without losing language worth keeping. Compression came from two places: (1) prose is denser than verse for the same image — a stanza of four couplets describing Mango landing can be one sentence in prose, (2) call-and-response chapters in v1 required a voice tag-switch per couplet, multiplying the segment count; natural prose lets the narrator carry longer stretches between voice swaps.
What stayed the same
The Silent Echo arc (call-and-response form, missing-response climax, friends catch the mistake together, reversed chorus closing). The 98 chapter beat illustrations — every one of them. The cover, the rainbow-gradient title, the Pixar storybook aesthetic. The chapter-intro chime (warm three-note bell). ID3v2.3 chapter markers, APIC cover, podcast-app-friendly TOC. The cast — April, Mango, Fern, Riddle, Dusty — and their voice signatures.

The tooling

Same stack as v1. The only voice swap: Matilda → Bill for Cypher (mature mentor register for the older tiger).

Fan-out + judging

choir

Sean's CLI for routing prompts to any model. v2 used it for the 10-chapter weave and 10-chapter audio segmentation only — arc fan-out and judging panel were inherited from v1.

Anthropic Opus 4.6

Text generation

Claude Opus 4.6

Wrote every chapter, segmented every chapter for TTS, identified beats. v2 used the same model as v1 but with a different per-project style guide (drop AABB; verse only where it serves).

~$3 / 1M input · $15 / 1M output

Image generation

xAI grok-imagine

Just 1 new render in v2: Cypher's portrait (older tiger spinning poi). All 98 beats + cover + 5 friend portraits reused from v1's renders.

~$0.07 per image

Text-to-speech

ElevenLabs multilingual_v2

7 voices: George (narrator), Sarah (April + chorus), Jessica (Mango), Lily (Fern), Laura (Riddle), Roger (Dusty), Bill (Cypher). Bill is swapped in for v1's Matilda to give Cypher a mature male mentor register fitting an older tiger.

$0.015 per request on Creator subscription

Audio synth + stitching

ffmpeg

Same three-note bell chime synthesized in v1 (C4 → E4 → G4 with reverb tail). Same per-chapter concat + ID3 metadata sidecar with chapter markers.

Free

ID3 tagging

mutagen

Same ID3v2.3 polish pass as v1: APIC cover, TIT2/TALB/TPE1/TCOM/TYER, named CTOC with TOP_LEVEL+ORDERED flags. Per-chapter mp3s tagged TRCK n/10.

Free

PDF

Chrome --headless --print-to-pdf

Same setup as v1. v2 PDF is 38 MB at full-resolution beat JPEGs and lives only in the source repo.

Free

Publish

scripts/publish-book.sh

The publish pipeline introduced with v1. It copies assets (jpeg → jpg), re-encodes audio to 64 kbps mono at 22050 Hz, re-tags with ID3v2.3 and imports CHAP frames, rebuilds the web novel.html, and inserts the book card. v2 ran it with the slug april-and-the-dragon-staff-v2 in ~60 s.

Free

Read it / hear it

Per-chapter mp3s live inside the novel page. Audio files are also in /books/april-and-the-dragon-staff-v2/audio/. The master audiobook.mp3 has ID3v2.3 chapter markers + embedded cover artwork — Apple Podcasts, Overcast, and Pocket Casts will show a clickable chapter list.


You're reading v1 — the original strict-AABB version. Switch to v2 (current) for the natural-prose retelling.
Books · Generated end-to-end with LLMs and image/audio models

April and the Dragon Staff

A 10-chapter children's picture book in strict AABB rhymed verse — call-and-response between April the rainbow koala and her four friends, with 98 Pixar-storybook illustrations and a 59-minute audiobook narrated with 7 voices. The form does the moral: at the climax, the skipped friend's response couplet becomes a literal silence on the page and in the ear.

The numbers

A four-sentence brief became a 59-minute illustrated audiobook in strict rhymed verse. Here's the count of everything that got generated along the way.

9,895 words
1,083 verse lines
10 chapters
105 images rendered
98 chapter beats
58:41 audiobook duration
7 distinct voices
490 TTS segments
~55 LLM calls
52k TTS characters

The form does the moral

The book is in strict AABB rhymed verse with a call-and-response structure: April issues a call couplet, and the friends issue a response couplet. The whole book sings this back and forth — until chapter 8, when one friend skips a step in the safety ritual and her response couplet simply doesn't come. The page holds two blank lines where the rhyme should land. The audiobook holds 2.8 seconds of silence. Then the troupe catches the mistake together, and from chapter 10 the ritual returns — but inverted: the friends call, and April responds. The form earns the closing.

"The dragon staff has three bright wicks,
and fire is not for careless tricks."
"We hear you, April, loud and true —
now teach us what the dragon can do!"

…

"Now shake off the extra before the fire can grow.
Three flicks from the wrist and the excess falls free —
the burn-off comes FIRST. Mango — answer me."

                              (silence — two blank lines)

The night holds its breath. Not a bell. Not a sound.
Mango's wings are already mid-sweep off the ground.

The cast — and their voices

April (voice: Sarah) · Rainbow koala teacher
Mango (voice: Jessica) · Fastest fruit bat
Fern (voice: Lily) · Careful tree frog
Riddle (voice: Laura) · Questioning parrot
Dusty (voice: Roger) · Steady wombat
Laurel (voice: Matilda) · April's old teacher (Ch. 9)

Plus George (Narrator) — warm British storyteller, the constant voice across all 10 chapters.

Cost breakdown

Two ways to read this. Out-of-pocket is what actually hit a credit card. At-full-rates is what a fresh account with no credits or subscriptions would have paid for the same work.

~$8 out-of-pocket · ~$35 at full provider rates

| Bucket (model) | Details | Volume | Out-of-pocket | At-full-rates |
| --- | --- | --- | --- | --- |
| xAI grok-imagine (grok-imagine-image) | Cover + 6 character portraits + 98 chapter beats + 1 style verification render | 106 images | ~$0 (xAI credit grant) | ~$7 |
| Claude Code orchestration (claude-opus-4-6) | The agent loop running the pipeline; heavy prompt caching brings the effective rate to ~$3/M | ~5M tokens | $0 (Claude Pro/Code subscription) | ~$15 |
| ElevenLabs TTS (eleven_multilingual_v2) | 490 voice-tagged segments across 7 voices; 463 rendered before the monthly Creator quota hit, 27 finished after top-up | 490 requests · 52k chars | ~$7.35 ($0.015/request, Creator subscription) | ~$10 |
| Choir LLM calls (claude-opus-4-6 + gemini-2.5-pro + gpt-4.1) | Arc fan-out (6 shapes × 3 models), 2-judge panel, spread, chapter weave, beats ID, audiobook segmentation | ~55 calls · ~2M tokens | ~$0 | ~$3 |
| Total this book | | | ~$8 | ~$35 |

What absorbed the cost. xAI's image-gen credit grant covers all 106 renders here (this book is xAI-only on images — Grok-Imagine nails the Pixar Storybook 3D aesthetic on the first try, so OpenAI didn't need to run). ElevenLabs lands at $0.015 per request on Sean's Creator subscription — the real out-of-pocket line on this book. 490 requests came from the call-and-response form: every voice switch is its own segment, so the segmentation count is denser than a prose audiobook. Claude Code is a flat monthly subscription that covers all the agent's planning and tool calls.

What a fresh account would pay. ~$35 at-full-rates is what you'd spend if you opened brand-new accounts at every provider and rebuilt this exact book. That is about a sixth less than Can't Have Nice Things (~$42), chiefly because this book renders illustrations only with Grok-Imagine ($0.07/image) instead of OpenAI's gpt-image-1 ($0.04–$0.17/image), and because the audiobook is half the length.

Timing

Wall-clock time for each stage. Everything that could parallelize did. Total active runtime from "go" to a finished audiobook: roughly 20 minutes (plus a brief pause mid-way through chapter 10 to top up ElevenLabs).

| Stage | What runs | Wall clock |
| --- | --- | --- |
| 0 · Interview | Capture the four-sentence seed brief + 4 follow-up choices (rhyme, cast, stakes, setting) | ~3 min |
| 1 + 2 · Arcs + judge | 18 parallel arc generations (6 shapes × 3 models), then a 2-judge panel scoring all 18 against the brief | ~2 min |
| 3 · Spread | Timeline + factors + cover subject in parallel, then character profiles (depends on timeline) | ~2 min |
| 4 · Style gallery | Skipped. Style pre-locked to Pixar Storybook 3D by a reference image during the interview, then verified with a single test render. | |
| 5 · Preview site | Skipped. Auto-selected the highest-judged arc (Silent Echo, both judges agreed by a 6-point margin) and grafted the best mix-ins from the other top arcs. | |
| 6 · Lock-in (cover + portraits) | 1 cover + 6 character portraits in parallel | ~30 s |
| 7 · Chapter weave | 10 parallel chapters; each prompt embeds the full arc + timeline + characters + 4 mix-ins + AABB style guide | ~90 s |
| 8a · Beat identification | 10 parallel beat-ID calls; output 98 distinct beats (8–11 per chapter) | ~90 s |
| 8b · Beat illustration | 98 Grok-Imagine renders, 8 concurrent via Python ThreadPoolExecutor, zero failures | 92 s |
| 9 · Compile | novel.html assembly + Chrome-headless print to PDF | ~10 s |
| 10a · Audio segmentation | 10 parallel verse-aware segmentations (never split mid-couplet; chapter-8 blank-echo preserved as a 2.8 s silence segment) | ~60 s |
| 10b · TTS render | 490 ElevenLabs segments via eleven_multilingual_v2, 5 concurrent (Creator tier). 463 in the first pass; quota cap; ~30 s pause to top up; 27 more in 10 s. | ~2:30 + pause |
| 10c · Chime + stitch + tag + inject | Synth 3-note bell chime (ffmpeg sine + reverb), stitch 10 chapter mp3s + master, embed ID3v2.3 chapter markers + cover art, inject audio players into novel.html | ~45 s |
| Total (active runtime, end to end) | | ~15 – 20 min |
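
Stage 8b in the table above runs 98 renders at 8 concurrent through a Python ThreadPoolExecutor. The following is a rough sketch of that pattern, not the project's actual script: `render_beat` is a hypothetical stand-in that only sleeps briefly instead of calling an image API.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 8  # concurrency quoted for stage 8b


def render_beat(beat_id: int) -> str:
    """Hypothetical stand-in for one image-generation request.

    A real version would call the image API and write a JPEG; this one
    only simulates a short network wait and returns a file name.
    """
    time.sleep(0.01)
    return f"beat_{beat_id:03d}.jpeg"


def render_all(beat_ids: list[int]) -> tuple[dict[int, str], dict[int, Exception]]:
    """Render every beat with bounded concurrency, collecting failures."""
    done: dict[int, str] = {}
    failed: dict[int, Exception] = {}
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = {pool.submit(render_beat, b): b for b in beat_ids}
        for future in as_completed(futures):
            beat_id = futures[future]
            try:
                done[beat_id] = future.result()
            except Exception as exc:  # record the failure and keep going
                failed[beat_id] = exc
    return done, failed


done, failed = render_all(list(range(1, 99)))
print(len(done), "rendered,", len(failed), "failed")
```

Threads suit this stage because each render is dominated by waiting on the network, and collecting failures per beat lets a rerun target only the missing frames.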

The process

Ten stages, each a separate slash command or script. The early ones generate a menu of options; the late ones commit to picks and bake the cake. Two stages were skippable for this book — the style was pre-locked by a reference image, and the preview wizard was auto-resolved because the judges agreed by a wide margin.

0 · Capture the seed (/scriptorium-interview)
A short conversation captures the inspirer's intent. The seed for this book was a single message: "April, the rainbow-colored koala, teaches her friends how to spin the dragon staff. She teaches them about fire safety. She teaches them the difference between knowing a thing and remembering a thing." Then four follow-up choices: strict Seuss meter + AABB rhyme, let the writer invent the friends, scary near-miss with no injury, and a flow-arts festival town setting. The brief lands in brief.md and every downstream prompt reads it.
1 + 2 · Wide arc fan-out + judging (/scriptorium-arcs)
Six structural shapes (tragic, comic, mystery, picaresque, ironic, wildcard) × three premium models (Opus 4.6, GPT-4.1, Gemini 2.5 Pro) = 18 candidate arcs. Two independent judges (Opus 4.6 + Gemini 2.5 Pro) score each on inventiveness, coherence, stakes, illustrability, and voice match — and name the survivors with two-word evocative names. The judges agreed by a 6-point margin: the wildcard winner was Silent Echo, a call-and-response arc where every chapter alternates April's call couplet and the friends' response couplet — and where the climax replaces the skipped friend's response couplet with a literal blank line on the page. Both judges flagged this as the single most inventive idea on the slate. Top 5: Silent Echo, Woven Circle, Flame Patch, Five Doors, Dragon's Sentence.
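
The fan-out and ranking can be pictured roughly as follows. This is a sketch under assumptions: the model labels are plain strings rather than API identifiers, the aggregation rule (sum the five criteria, then average across judges) is a guess at one reasonable scheme, and the scores in the usage lines are placeholders for illustration, not the book's real judging data.

```python
from itertools import product
from statistics import mean

SHAPES = ["tragic", "comic", "mystery", "picaresque", "ironic", "wildcard"]
MODELS = ["opus-4.6", "gpt-4.1", "gemini-2.5-pro"]  # labels only, not API ids
CRITERIA = ["inventiveness", "coherence", "stakes", "illustrability", "voice_match"]


def fan_out() -> list[tuple[str, str]]:
    """One arc-generation job per (shape, model) pair."""
    return list(product(SHAPES, MODELS))


def total_score(judge_scores: list[dict[str, int]]) -> float:
    """Sum each judge's criterion scores, then average across judges."""
    return mean(sum(scores[c] for c in CRITERIA) for scores in judge_scores)


def rank(candidates: dict[str, list[dict[str, int]]]) -> list[tuple[str, float]]:
    """Sort arcs from highest to lowest averaged score."""
    scored = [(name, total_score(js)) for name, js in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


jobs = fan_out()
print(len(jobs))  # 6 shapes x 3 models

# Placeholder scores from two judges for two made-up arcs.
toy = {
    "arc_a": [dict.fromkeys(CRITERIA, 8), dict.fromkeys(CRITERIA, 9)],
    "arc_b": [dict.fromkeys(CRITERIA, 7), dict.fromkeys(CRITERIA, 7)],
}
ranking = rank(toy)
print(ranking)
```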
3 · Spread the survivor + grafts (/scriptorium-spread)
For the chosen spine arc: chapter-by-chapter timeline, named subplots, named pivotal events, recurring motifs ("factors"), full character roster with portrait briefs, and a cover subject — all in parallel. Four mix-ins were grafted from the other top-5 arcs (non-negotiable, must show up in target chapters): from Flame Patch — April's hidden backstory of her own teacher Laurel, threaded as clues across chapters 1–8 and revealed in chapter 9; from Woven Circle — the collective troupe catch at the climax, where all four friends respond in distinct postures; from Five Doors — one chapter per friend in Act 2 (Fern→spotter, Riddle→breath count, Dusty→wet towel, Mango→burn-off); from Dragon's Sentence — the "fire AND frame" refrain and the "wick kisses the grass" climax image.
4 · Visual style — skipped
The style was pre-locked at Stage 0 by a reference image: Pixar Storybook 3D — soft painterly 3D render, big expressive eyes with crisp catchlights, gentle subsurface scattering, warm rim lighting. Verified with a single test render before any chapter art committed (a DeLorean from Back to the Future piloted by a pig next to the Space Needle, just to confirm the renderer hits the target style). The style envelope, per-character anchor descriptions, palette, lighting standards, and guardrails live in styles/chosen.md.
5 · Preview wizard — auto-resolved
The preview/lock-in wizard exists for cases where the judges produce a close call and the inspirer needs to choose between arcs, mix-ins, and cover candidates. Here both judges ranked Silent Echo first by a 6-point margin, and the mix-ins from the other top-5 arcs were unambiguous — so the wizard step was auto-resolved without inspirer review. (Full autonomy was a deliberate choice; the inspirer was offered checkpoints and declined them.)
6 · Lock-in: cover + portraits (/scriptorium-lock)
The chosen arc and mix-ins were committed: spine → arc.md, mix-ins → arc_mixins.md, characters → characters.json, style envelope → styles/chosen.md. Then the cover (April calling, three friends responding, festival lanterns at blue-hour dusk — locked on the first render) and 6 character portraits (April, Mango, Fern, Riddle, Dusty, Laurel) rendered in parallel via Grok-Imagine.
7 · Weave the chapters (/scriptorium-chapter N)
All 10 chapters in parallel. Each prompt embedded the full arc, the locked timeline, the locked character roster (with voice signatures for AABB couplets), the 4 mix-ins, the 6 recurring factors, and the per-project narrative style guide (strict anapestic tetrameter, AABB rhyme, call-and-response structure, no adult lectures, named flow-arts vocabulary). Output: 1,083 verse lines / 9,895 words. Every named friend speaks in their own voice signature: Mango's lines lean forward fast and bright; Fern's clip short and careful; Riddle leads with why; Dusty is monosyllabic and grounded. Chapter 8's blank-echo trick lands as two blank lines on the page, exactly where Mango's response couplet should have been.
8 · Illustrate the beats (/scriptorium-illustrate N)
For each chapter, an illustration director identifies at least 6 distinct narrative beats — opening, early, midchapter, turning, climax, closing — each with a specific scene subject, mood, palette hint, and hand-lettered annotation list. This book averaged 9.8 beats per chapter (denser than the default 6, because the inspirer wanted a heavily illustrated picture book). Total: 98 chapter beats. Each beat is rendered once via Grok-Imagine in the locked Pixar Storybook 3D envelope, with the per-character anchor descriptions inserted into the prompt so the cast stays on-model across all 98 frames. 8 concurrent via a Python ThreadPoolExecutor: 92 seconds, zero failures. Beats are interleaved with the verse in the chapter HTML at proportional positions (opening at 4% of stanzas, climax at 78%, closing at 95%).
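
One way to place beats at proportional positions is sketched below. The rule used here (a beat at p% follows the stanza at index floor(p × count / 100), clamped to the last stanza) is an assumption for illustration; the actual placement logic is not shown on this page.

```python
def stanza_index(position_pct: int, stanza_count: int) -> int:
    """Index of the stanza that a beat at `position_pct` percent follows."""
    if not 0 <= position_pct <= 100:
        raise ValueError("position_pct must be between 0 and 100")
    return min(position_pct * stanza_count // 100, stanza_count - 1)


def interleave(stanzas: list[str], beats: list[tuple[int, str]]) -> list[str]:
    """Return the stanzas with each beat inserted after its target stanza."""
    by_index: dict[int, list[str]] = {}
    for position_pct, image in sorted(beats):
        index = stanza_index(position_pct, len(stanzas))
        by_index.setdefault(index, []).append(image)
    out: list[str] = []
    for i, stanza in enumerate(stanzas):
        out.append(stanza)
        out.extend(by_index.get(i, []))
    return out


stanzas = [f"s{i}" for i in range(10)]
page = interleave(stanzas, [(4, "opening"), (78, "climax"), (95, "closing")])
print(page)
```

Integer percentages keep the index arithmetic exact, which avoids a beat drifting by one stanza because of floating-point rounding.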
9 · Compile the novel (/scriptorium-compile)
Cover + table of contents + cast portrait grid + 10 chapters with beats interleaved + a parent-facing back-matter safety-ritual cheat sheet (the 7 steps in plain prose for kids and parents to talk through together), all in one HTML page with a Pixar-storybook palette (warm cream paper, rainbow-gradient title, Marker Felt display font). A Chrome-headless print pass produces the PDF (~38 MB at full-resolution beat JPEGs; the PDF is archive-only and not published to the web).
10 · Audiobook (/scriptorium-audiobook)
Each chapter is segmented by Opus into voice-tagged JSON, respecting the AABB couplet boundary — segments never split mid-rhyme. Chapter 8's blank-echo lands in the audio too, as a 2.8-second silence segment exactly where Mango's response couplet should have been. ElevenLabs multilingual_v2 renders 490 segments using 7 voices (Sarah for April, Jessica for Mango, Lily for Fern, Laura for Riddle, Roger for Dusty, Matilda for Laurel in chapter 9, and George as the constant narrator). Each chapter opens with a synthesized three-note bell chime (ffmpeg sine + aecho, C4 → E4 → G4 with a soft reverb tail) — a small audio sting that gives chapter transitions proper audiobook weight. ffmpeg concat stitches per-chapter mp3s (0.4 s between segments, 1.2 s + chime + 1.0 s between chapters); ID3v2.3 chapter markers + embedded APIC cover artwork are baked in so podcast apps surface a clickable chapter list. Audio players are then injected into novel.html as a hero player after the cover plus a per-chapter player above each chapter.
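
The stitching gaps can be expressed as an input list for ffmpeg's concat demuxer. A sketch under assumptions: the file names are hypothetical, the silence clips are taken to be pre-rendered at the stated lengths, and "1.2 s + chime + 1.0 s" is read as silence, then chime, then silence ahead of the chapter's first segment.

```python
def chapter_concat_list(segments: list[str], first_chapter: bool = False) -> str:
    """Build concat-demuxer list text for one chapter.

    Layout: [1.2 s gap] + chime + 1.0 s gap, then the voice segments
    with a 0.4 s gap between consecutive segments.
    """
    names: list[str] = []
    if not first_chapter:
        names.append("silence_1200ms.mp3")  # gap after the previous chapter
    names += ["chime.mp3", "silence_1000ms.mp3"]
    for i, segment in enumerate(segments):
        if i > 0:
            names.append("silence_400ms.mp3")  # gap between segments
        names.append(segment)
    return "".join(f"file '{name}'\n" for name in names)


listing = chapter_concat_list(["ch02_seg001.mp3", "ch02_seg002.mp3"])
print(listing)
```

A list like this is typically consumed with `ffmpeg -f concat -safe 0 -i list.txt -c copy chapter.mp3`, which only concatenates cleanly when every input shares the same codec parameters.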
11 · Publish (scripts/publish-book.sh <slug>) — new pipeline step
A new scripts/publish-book.sh codifies the 7 publish steps documented in PROCESS.md §6: copy assets (jpeg → jpg), re-encode the audiobook to 64 kbps mono 22050 Hz for web delivery (54 MB → 28 MB, under the GitHub Pages 100 MB cap), re-tag with ID3v2.3 + import CHAP frames from the source master, rebuild novel.html with retargeted paths and a sean-makes-stuff book-nav, insert a .book-card in books/index.html, and bump the "N books shipped" count on the root page. --minimal (default) is what this book shipped under; --full additionally generates this very detail page.
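
The re-encode target and the resulting file size follow directly from the bitrate. The sketch below builds, but does not run, an ffmpeg command with the stated settings and estimates the size of a constant-bitrate file for the 58:41 master; the file names are hypothetical and the script's real invocation may differ.

```python
import shlex


def reencode_command(src: str, dst: str) -> list[str]:
    """ffmpeg arguments for 64 kbps, mono, 22050 Hz web delivery."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "22050", "-b:a", "64k", dst]


def estimated_size_mb(bitrate_bps: int, seconds: int) -> float:
    """Approximate size of a constant-bitrate file, ignoring tag overhead."""
    return bitrate_bps / 8 * seconds / 1_000_000


duration_s = 58 * 60 + 41  # 58:41 audiobook
size_mb = estimated_size_mb(64_000, duration_s)

print(shlex.join(reencode_command("audiobook_master.mp3", "audiobook.mp3")))
print(f"{size_mb:.1f} MB")
```

The estimate lands close to the 28 MB quoted above, which is a quick sanity check that the 64 kbps setting explains the size drop.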

The tooling

Every step is a shell-out, not a library binding — the pipeline is a stack of bash scripts and Python helpers. Sibling of choir and scriptorium.

Fan-out + judging

choir

Sean's CLI for routing prompts to any model across providers. Single-model: plain text out. Multi-model: JSON. --save persists a comparison run; choir runs compare appends a judge summary later.

Anthropic Opus 4.6 · OpenAI GPT-4.1 · Google Gemini 2.5 Pro

Text generation

Claude Opus 4.6

Primary writer. Wrote every chapter, identified every beat, and segmented every chapter for TTS. All 5 top-judged arcs and all 4 chosen mix-ins came from Opus runs; GPT-4.1 and Gemini arcs all collapsed against the Opus equivalents in the judging panel.

~$3 / 1M input · $15 / 1M output (with heavy prompt caching across chapter weaves)

Text generation (judge)

Gemini 2.5 Pro

The second judge in the arc panel — different model family for diversity. Both judges ranked Silent Echo #1 by a 6-point margin, and the GPT-4.1 / Gemini arc proposals all collapsed against the Opus ones.

~$1.25 / 1M tokens

Image generation

xAI grok-imagine

All 105 book images (the cover, 6 character portraits, and 98 chapter beats), plus 1 style verification render, for 106 renders in total. Grok-Imagine nailed Pixar Storybook 3D on the first try: rainbow gradient fur, gentle subsurface scattering, warm rim lighting, soft painterly post-processing.

~$0.07 per image

Text-to-speech

ElevenLabs multilingual_v2

7 voices from the premade catalog: George (warm British storyteller, narrative_story) as the narrator; Sarah, Jessica, Lily, Laura, Roger, Matilda for April + 4 friends + Laurel. Voice settings: stability 0.55, similarity_boost 0.78, style 0.15 (a slight theatrical lift for verse meter).

~$0.18 – $0.30 per 1,000 characters

Audio synth + stitching

ffmpeg

Chapter intro chime synthesized from three sine waves (C4 / E4 / G4 with staggered adelay, exponential afade envelopes, soft aecho reverb tail, 4.5 kHz lowpass for warmth). Per-chapter concat with anullsrc silence files; ID3 chapter markers from a metadata sidecar.

Free
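
A filter graph along the lines described above might look like this. It is a sketch, not the project's command: the note pitches (equal temperament) and the 4.5 kHz lowpass come from the description, while the note length, stagger, and echo parameters are assumed values. The function only builds the argument list and does not run ffmpeg.

```python
NOTES_HZ = {"C4": 261.63, "E4": 329.63, "G4": 392.00}


def chime_command(out_path: str = "chime.mp3",
                  note_s: float = 1.5,
                  stagger_ms: int = 250) -> list[str]:
    """Build ffmpeg arguments for a three-note bell chime."""
    cmd = ["ffmpeg", "-y"]
    chains: list[str] = []
    labels: list[str] = []
    for i, hz in enumerate(NOTES_HZ.values()):
        # One generated sine source per note.
        cmd += ["-f", "lavfi", "-i", f"sine=frequency={hz}:duration={note_s}"]
        # Exponential fade-out gives each note a bell-like decay.
        chain = f"[{i}:a]afade=t=out:st=0:d={note_s}:curve=exp"
        if i > 0:
            chain += f",adelay={i * stagger_ms}"  # stagger later notes
        chains.append(f"{chain}[n{i}]")
        labels.append(f"[n{i}]")
    mix = (
        "".join(labels)
        + f"amix=inputs={len(labels)}:duration=longest,"
        + "aecho=0.8:0.7:120:0.3,"  # assumed reverb-tail settings
        + "lowpass=f=4500[out]"     # 4.5 kHz lowpass for warmth
    )
    cmd += ["-filter_complex", ";".join(chains + [mix]), "-map", "[out]", out_path]
    return cmd


command = chime_command()
print(" ".join(command))
```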

ID3 tagging

mutagen

Re-encode strips ID3, so a polish pass writes ID3v2.3 (some podcast apps drop CHAP markers in v2.4): APIC cover artwork, TIT2/TALB/TPE1/TCOM/TYER, named CTOC with TOP_LEVEL | ORDERED flags. Per-chapter mp3s get TRCK n/10 so apps show them as a series.

Free (Python lib)
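
A chapter-marker pass with mutagen could be sketched as below. Assumptions are labeled in the comments: the element ids, the "Chapters" TOC title, and the function names are illustrative, and the real polish pass also writes APIC and the text frames listed above. mutagen is imported inside the writer so the timing helper runs without it installed.

```python
def chapter_windows(durations_ms: list[int]) -> list[tuple[int, int]]:
    """Turn per-chapter durations into (start_ms, end_ms) windows."""
    windows: list[tuple[int, int]] = []
    start = 0
    for duration in durations_ms:
        windows.append((start, start + duration))
        start += duration
    return windows


def write_chapters(mp3_path: str, titles: list[str], durations_ms: list[int]) -> None:
    """Write CTOC + CHAP frames and save as ID3v2.3 (requires mutagen)."""
    from mutagen.id3 import CHAP, CTOC, ID3, TIT2, CTOCFlags

    tags = ID3(mp3_path)  # raises if the file has no ID3 header yet
    ids = [f"chp{i + 1}" for i in range(len(titles))]  # illustrative ids
    tags.add(CTOC(
        element_id="toc",
        flags=CTOCFlags.TOP_LEVEL | CTOCFlags.ORDERED,
        child_element_ids=ids,
        sub_frames=[TIT2(text=["Chapters"])],
    ))
    windows = chapter_windows(durations_ms)
    for chapter_id, title, (start, end) in zip(ids, titles, windows):
        tags.add(CHAP(
            element_id=chapter_id,
            start_time=start,
            end_time=end,
            sub_frames=[TIT2(text=[title])],
        ))
    tags.save(mp3_path, v2_version=3)  # write ID3v2.3 rather than v2.4


windows = chapter_windows([1000, 2500, 500])
print(windows)
```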

PDF

Chrome --headless --print-to-pdf

Print CSS hides the PDF nav bar and audio players; chapter sections use page-break-before, and beat figures use page-break-inside: avoid so an illustration is never split across pages. The PDF is 38 MB at full-resolution beat JPEGs and lives only in the source repo, since it is too big for GitHub Pages.

Free

Read it / hear it

Per-chapter mp3s live inside the novel page — each chapter has its own <audio> player above the verse. The chapter audio files are also in /books/april-and-the-dragon-staff/audio/ if you want to drop them into a podcast app or pull them down individually. The master audiobook.mp3 has ID3v2.3 chapter markers + embedded cover artwork, so apps like Apple Podcasts, Overcast, and Pocket Casts will show a clickable chapter list.