Squiggle Story - AI Content Pipeline Architecture
Building production-grade educational content with AI - not as a shortcut, but as a constrained system where every layer enforces quality through research-grounded rules.
Overview
Squiggle Story is an English language acquisition app for Japanese children aged 2 to 4, featuring a daily 2.5-minute pseudo-video-call from an alien character named Zara. I'm building it solo, with AI assistance in creating the entire curriculum and content pipeline.
The interesting part isn't the product concept - it's that the content pipeline is designed so that AI operates within strict pedagogical constraints at every layer, with deterministic validation gates that catch errors before they propagate. The system distinguishes between what AI can judge (probabilistic review) and what must be enforced by hard rules (deterministic validation).
Content Pipeline
The system is a layered architecture where each layer constrains the next:
Planning Layers - Three layers generate curriculum content with increasing specificity. Layer 1 produces the full multi-year scope and sequence. Layer 2 breaks it into weekly blueprints. Layer 3 produces detailed episode specifications that become strict contracts for downstream agents. Each layer's output passes through a deterministic validation gate before flowing to the next.
Downstream Agents - Below a contract boundary, the Script Writer and Bedtime Story Generator operate strictly within the constraints defined above. They cannot improvise beyond what the contract allows.
Probabilistic Reviewers (amber) - LLM-based evaluation that catches subtle issues like tone, pacing, pedagogical quality, and narrative coherence. Valuable but fallible.
Deterministic Validators (coral) - Rule-based, binary pass/fail gates at every stage. A grammar contract check doesn't use judgment - it checks whether a forbidden utterance shape appears in the output. These are grounded in the Research Codex and Decision Codex.
Human Approval Gates (coral) - Human review at strategic checkpoints. The system doesn't self-approve.
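As a rough illustration of the "each layer constrains the next" idea, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names Layer, GateFailure, run_pipeline and require_keys, the dictionary-shaped artifacts, and the placeholder field values are hypothetical, and the lambdas merely stand in for AI generation steps. Probabilistic review and human approval are left out; only the deterministic gate between layers is shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: a validator returns violation messages (empty list = pass).
Validator = Callable[[Dict], List[str]]
Stage = Callable[[Dict], Dict]


class GateFailure(Exception):
    """Raised when a deterministic gate rejects a layer's output."""


@dataclass
class Layer:
    name: str
    generate: Stage              # stands in for an AI generation step
    validators: List[Validator]  # deterministic, binary pass/fail checks


def run_pipeline(layers: List[Layer], seed: Dict) -> Dict:
    """Run each layer in order; stop at the first gate that reports a violation."""
    artifact = seed
    for layer in layers:
        artifact = layer.generate(artifact)
        violations = [msg for check in layer.validators for msg in check(artifact)]
        if violations:
            # Nothing flows downstream once a gate fails.
            raise GateFailure(f"{layer.name}: {violations}")
    return artifact


def require_keys(*keys: str) -> Validator:
    """Build a trivial rule-based check: the artifact must contain these fields."""
    def check(artifact: Dict) -> List[str]:
        return [f"missing field: {k}" for k in keys if k not in artifact]
    return check


# Placeholder layers mirroring the three planning layers described above.
layers = [
    Layer("scope_and_sequence", lambda a: {**a, "scope": "placeholder"},
          [require_keys("scope")]),
    Layer("weekly_blueprint", lambda a: {**a, "blueprint": "placeholder"},
          [require_keys("scope", "blueprint")]),
    Layer("episode_spec", lambda a: {**a, "grammar_contract": {}},
          [require_keys("grammar_contract")]),
]

result = run_pipeline(layers, {})
```

The design point the sketch tries to capture is that a gate failure raises rather than logs, so a bad artifact cannot reach the next layer by accident.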
Four Knowledge Codices
The pipeline draws on four authoritative knowledge stores, each governing a different dimension:
Research Codex
What learning science says. First-language (L1) acquisition research, natural morpheme ordering, developmental grammar stages, formulaic chunk theory. This isn't prompt engineering intuition - it's peer-reviewed developmental psychology encoded into system rules.
Decision Codex
What the system has learned from its own mistakes. 20+ logged, versioned design decisions with full rationale. Schema definitions at v8, treated like API contracts. When the AI makes a mistake, the investigation produces a decision that prevents that class of error from recurring.
Lore Codex
What's happened in Zara's world. Character state, narrative threads, story continuity. Ensures every piece of content is internally consistent across hundreds of episodes.
Prop Codex
What can be rendered. An asset registry of available WordCards, background scenes, character states, and animation constraints. The content pipeline never references assets that don't exist in the animation system.
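A check of this kind could look roughly like the following Python sketch. The function name find_unknown_assets, the registry layout (asset kind mapped to a set of IDs), the "assets" field on the episode spec, and all asset IDs are hypothetical, chosen only to show the shape of a deterministic lookup.

```python
from typing import Dict, List, Set


def find_unknown_assets(episode_spec: Dict, registry: Dict[str, Set[str]]) -> List[str]:
    """Return 'kind:asset_id' for every referenced asset missing from the registry."""
    missing: List[str] = []
    for kind, asset_ids in episode_spec.get("assets", {}).items():
        known = registry.get(kind, set())
        missing.extend(f"{kind}:{asset_id}" for asset_id in asset_ids
                       if asset_id not in known)
    return missing


# Illustrative data only; these IDs are not taken from the real registry.
registry = {
    "word_cards": {"apple", "ball"},
    "backgrounds": {"spaceship_cabin"},
}
episode_spec = {
    "assets": {
        "word_cards": ["apple", "banana"],
        "backgrounds": ["spaceship_cabin"],
    }
}

unknown = find_unknown_assets(episode_spec, registry)
# unknown == ["word_cards:banana"], so this spec would fail the gate
```

Because the check is a plain set lookup, it gives the same answer every time, which is what separates it from an LLM reviewer being asked whether the assets look plausible.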
Grammar Contract System
The most concrete example of deterministic discipline in the pipeline. Rather than letting the AI decide what grammar complexity to use, each episode spec includes an explicit grammar contract:
Allowed utterance shapes (e.g., holophrastic, fixed phrases)
Forbidden structures (e.g., SUBJECT_VERB_OBJECT at early stages)
Maximum words per line
Allowed fixed phrases
Allowed and forbidden sentence frames
A 2-year-old's episode cannot drift into full sentences, because the contract forbids those utterance shapes and any output containing one fails validation. This came from research into developmental grammar stages - first-language (L1) acquisition literature on how children naturally progress from single words to multi-word combinations.
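To make the contract idea concrete, here is a minimal Python sketch of a grammar contract and a deterministic check against it. The class and field names, the shape labels other than SUBJECT_VERB_OBJECT, the word limit of 2, and the sample phrases are all assumptions for illustration. The sketch also assumes each script line already carries a shape label assigned upstream; how that tagging is done is out of scope here.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class GrammarContract:
    allowed_shapes: FrozenSet[str]
    forbidden_shapes: FrozenSet[str]
    max_words_per_line: int
    allowed_fixed_phrases: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class ScriptLine:
    text: str
    shape: str  # assumed to be tagged upstream; the tagger is not shown


def normalize(text: str) -> str:
    """Lowercase, trim edge punctuation, and collapse whitespace."""
    return " ".join(text.lower().strip(" .,!?").split())


def validate_script(lines: List[ScriptLine], contract: GrammarContract) -> List[str]:
    """Return every contract violation; an empty list means the script passes."""
    violations: List[str] = []
    for number, line in enumerate(lines, start=1):
        if line.shape in contract.forbidden_shapes:
            violations.append(f"line {number}: forbidden shape {line.shape}")
        elif line.shape not in contract.allowed_shapes:
            violations.append(f"line {number}: shape {line.shape} is not allowed")
        word_count = len(line.text.split())
        if word_count > contract.max_words_per_line:
            violations.append(
                f"line {number}: {word_count} words exceeds max "
                f"{contract.max_words_per_line}"
            )
        # Assumption: a line tagged FIXED_PHRASE must match the allowed list.
        if (line.shape == "FIXED_PHRASE"
                and normalize(line.text) not in contract.allowed_fixed_phrases):
            violations.append(f"line {number}: fixed phrase not in allowed list")
    return violations


# Illustrative early-stage contract; the values are placeholders.
contract = GrammarContract(
    allowed_shapes=frozenset({"HOLOPHRASTIC", "FIXED_PHRASE"}),
    forbidden_shapes=frozenset({"SUBJECT_VERB_OBJECT"}),
    max_words_per_line=2,
    allowed_fixed_phrases=frozenset({"bye-bye", "thank you"}),
)
script = [
    ScriptLine("Apple!", "HOLOPHRASTIC"),
    ScriptLine("Thank you!", "FIXED_PHRASE"),
    ScriptLine("Zara eats the apple.", "SUBJECT_VERB_OBJECT"),
]

violations = validate_script(script, contract)
# Line 3 is rejected twice: forbidden shape, and 4 words against a max of 2.
```

Collecting all violations instead of stopping at the first one is a deliberate choice in this sketch: a failed script comes back with the full list of what to fix.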
The Through-Line
The hard problem with AI isn't generation - it's reliable generation within constraints. That's a production discipline problem, not an engineering problem. The pipeline architecture, the four codices, the deterministic validation floor, the grammar contracts - these are all expressions of the same conviction: AI-supported scalability in education requires the same rigor as any other production system. Without that rigor, you get slop.