Meadow
Deliverable Review

Text-to-Speech

Every word the child taps becomes speech — instantly, reliably, on every iPad.
M2-005
v1.0 · May 2026
Milestone: M2 Engine Foundation
Design sources: Interaction Design, Onboarding, My Guide
01 What It Does
The foundation of everything

When your child taps a word, the iPad speaks it out loud. Instantly.

This is the single most important feature in Meadow. Every word, every phrase, every follow-up in the SCS conversation loop — it all depends on this. If speech doesn’t work, the entire app is useless.

Meadow uses the speech engine built into every iPad. That means it works without the internet — on airplane mode, in the car, at grandma’s house with no WiFi. It sounds natural, and it responds fast.

Nothing about this feature is flashy. It just has to work, every time, without fail.

🍌 banana: Tap → Speech. That’s it.
Max time to speech: 400ms
Internet required: 0
Works on silent: 100%
From M1-003: triple modality on every tap

Speech is just one of three simultaneous outputs. M1-003 established that every word tap produces picture + speech + ASL sign together. The signing bubble (fixed bottom-right corner, overlapping the companion character) shows a 1–2 second sign animation once per tap. Research from Binger & Light (2007) shows multimodal input accelerates vocabulary acquisition in AAC users.

02 Speed Matters

The 400-millisecond budget

When a child taps a word, speech must start within 400 milliseconds — less than half a second. That’s the total time budget from the moment the child’s finger touches the screen to the moment sound comes out of the speaker.

Why does this matter so much? Because the entire purpose of an AAC device is to teach cause and effect: “I did something, and something happened.” If there’s a noticeable delay between tapping and hearing, the child loses that connection. They stop understanding that they made it happen. And if they stop understanding that, they stop trying.

How the 400ms budget is spent

Touch: 0–50ms. The iPad detects the tap.
Visual: 50–100ms. The word highlights on screen (visual confirmation).
Speech starts: 100–400ms. The speech engine fires and sound begins.

[Figure: The timeline a child experiences. Touch at 0ms, visual feedback by 100ms, speech starts by 400ms. Anything later is too slow and the child loses the connection.]
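
The budget above can be written down as a small check that a test harness might run against recorded timings. This is only a sketch: the type and function names (`TapTimeline`, `checkBudget`) are hypothetical and do not come from the Meadow codebase, and only the 100ms and 400ms figures are taken from this spec.

```swift
import Foundation

/// Timestamps for one tap, in milliseconds since the touch began.
/// Hypothetical type, for illustration only.
struct TapTimeline {
    var visualFeedbackMs: Double   // when the word highlight appears
    var speechStartMs: Double      // when speech begins
}

enum BudgetResult: Equatable {
    case withinBudget
    case visualTooSlow
    case speechTooSlow
}

// Budget figures from the spec: highlight by 100 ms, speech by 400 ms.
let visualBudgetMs = 100.0
let speechBudgetMs = 400.0

/// Reports which part of the budget, if any, a tap exceeded.
/// Speech is checked first because it is the hard acceptance limit.
func checkBudget(_ timeline: TapTimeline) -> BudgetResult {
    if timeline.speechStartMs > speechBudgetMs { return .speechTooSlow }
    if timeline.visualFeedbackMs > visualBudgetMs { return .visualTooSlow }
    return .withinBudget
}
```

A harness could feed each recorded tap through `checkBudget` and fail the run on anything other than `.withinBudget`.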

Why 400ms and not 1 second

Research on cause-and-effect learning in young children shows that the connection between action and result weakens rapidly beyond about half a second. For pre-verbal children who are learning that tapping creates speech, this window is even more critical. AAC professionals consistently cite response latency as one of the top reasons children abandon devices. Meadow’s 400ms budget ensures the child always experiences the tap and the speech as a single, connected event.

03 Smart Interruption

When children tap fast, Meadow keeps up

Children don’t politely wait for one word to finish before tapping the next. They explore. They get excited. They get frustrated and tap rapidly. Meadow handles all of this gracefully.

The rule is simple: the newest tap always wins. If the iPad is speaking “banana” and the child taps “milk,” the speech stops mid-word and immediately says “milk.” No queue. No delay. No confusion about which word the child actually wanted.
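
One way to express "the newest tap always wins" with the built-in iPad speech engine is sketched below. It assumes Meadow uses `AVSpeechSynthesizer` from AVFoundation, which this spec implies but does not name. The wrapper class `WordSpeaker` is hypothetical.

```swift
import AVFoundation

/// Hypothetical wrapper; Meadow's real class names are not given in this spec.
final class WordSpeaker {
    private let synthesizer = AVSpeechSynthesizer()

    /// Speaks `word`, cutting off whatever is currently being spoken.
    func speak(_ word: String) {
        if synthesizer.isSpeaking {
            // .immediate stops mid-word instead of waiting for a word boundary.
            _ = synthesizer.stopSpeaking(at: .immediate)
        }
        // Speaking right after the stop means there is never a queue of
        // pending utterances: at most one word is active at a time.
        synthesizer.speak(AVSpeechUtterance(string: word))
    }
}
```

Calling `speak("banana")` and then `speak("milk")` on the same `WordSpeaker` should cut "banana" short and start "milk", which is the behavior described above.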

What 10 rapid taps look like

Tap 1 🍌 → tap 2 🥣 → tap 3 🤲 → tap 4 😄 → tap 5 👋 → tap 6 🍩 → tap 7 🙌 → tap 8 💕 → tap 9 👨 → tap 10 💦

Each tap interrupts the previous one. Only the last tap is fully spoken. No crashes, no queue backup, no frozen screen. The child is always heard.
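
The expected outcome of ten rapid taps can be modeled without any audio at all. The sketch below uses a hypothetical test double, `FakeSpeaker`, and placeholder words, since the spec shows only emoji for the ten taps. It checks the bookkeeping only: nine interruptions and one word left to finish.

```swift
/// A stand-in for the speech engine that records what happened.
/// Hypothetical test double, not Meadow code.
final class FakeSpeaker {
    private(set) var current: String?          // word being spoken right now
    private(set) var started: [String] = []    // every word that began
    private(set) var interrupted: [String] = [] // words cut off by a newer tap

    func speak(_ word: String) {
        if let previous = current {
            interrupted.append(previous)       // newest tap wins
        }
        current = word
        started.append(word)
    }

    /// Simulates the current word playing to the end; returns that word.
    func finishCurrent() -> String? {
        defer { current = nil }
        return current
    }
}

// Ten placeholder words standing in for the ten taps.
let tappedWords = (1...10).map { "word\($0)" }
let fake = FakeSpeaker()
for word in tappedWords {
    fake.speak(word)
}
let fullySpoken = fake.finishCurrent()
```

In this model every tap starts its word, the first nine are interrupted, and only the tenth is spoken in full.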

Instant interruption

New tap while speaking? Current word stops, new word plays immediately.

💪

Stress-tested for 10+ rapid taps

10 taps in 2 seconds: no crashes, no queue backup, no lag.

💜

Built for real kids

Rapid tapping happens during excitement, frustration, or exploration. All normal. All handled.

04 Voice Configuration

Parent controls for how the voice sounds

All voice settings live behind the parent gate — only adults can change them. Children interact with the voice as configured, but can never accidentally change the settings.

🔊

Speech Rate

How fast or slow the voice speaks. A slower pace helps younger children connect the spoken word to its meaning — similar to how parents naturally slow their speech when talking to toddlers.

Default: 0.85x (slightly slower than normal)
🎤

Voice Selection

Choose from the voices built into the iPad. Different voices suit different children — some respond better to a higher pitch, others to a deeper tone. All voices work offline.

Default: iPad system voice
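
A parent-facing voice picker could be populated roughly as follows. This sketch assumes AVFoundation's `AVSpeechSynthesisVoice` is the source of voices. The function names and the "en-US" default are illustrative assumptions, not part of this spec.

```swift
import AVFoundation

/// Returns the voices available on this device for a language,
/// highest quality first. Hypothetical helper.
func installedVoices(language: String = "en-US") -> [AVSpeechSynthesisVoice] {
    AVSpeechSynthesisVoice.speechVoices()
        .filter { $0.language == language }
        .sorted { $0.quality.rawValue > $1.quality.rawValue }
}

/// Builds an utterance with the parent's saved voice. If the saved
/// identifier is missing or no longer available, the utterance keeps
/// the system default voice.
func makeUtterance(_ word: String, voiceIdentifier: String?) -> AVSpeechUtterance {
    let utterance = AVSpeechUtterance(string: word)
    if let id = voiceIdentifier,
       let voice = AVSpeechSynthesisVoice(identifier: id) {
        utterance.voice = voice
    }
    return utterance
}
```

Falling back to the default voice rather than failing keeps the child's taps audible even if a previously chosen voice has been removed from the iPad.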
🔇

Works on Silent Mode

Even when the iPad is in silent mode, Meadow still speaks. This is essential for classroom use where other notifications should be muted but the child’s voice must always be heard.

Always on — not configurable
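
On iOS and iPadOS, audio that must play regardless of silent mode is normally achieved by choosing the playback audio session category. The sketch below shows one plausible setup. The function name is hypothetical, and the choice of the `.spokenAudio` mode is an assumption rather than something this spec states.

```swift
import AVFoundation

/// Configures the shared audio session so speech is not muted by silent mode.
/// Intended to be called once at launch. Hypothetical helper.
func configureAudioSessionForSpeech() throws {
    let session = AVAudioSession.sharedInstance()
    // .playback is the category whose audio continues when the device is silenced.
    try session.setCategory(.playback, mode: .spokenAudio, options: [])
    try session.setActive(true)
}
```

Because the function throws, the caller can log or surface a failure at launch instead of discovering later that speech is silent.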

Why 0.85x default speed

Speech-language pathologists use a technique called “parentese” — naturally slower, slightly exaggerated speech that helps young children process language. Meadow’s default speech rate mirrors this by speaking approximately 15% slower than normal conversational pace. Parents and therapists can adjust this up or down based on the individual child’s needs.
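
How the 0.85x setting maps onto the speech engine is not specified here. One simple reading, assumed in the sketch below, is to scale the engine's normal rate by the multiplier and clamp the result to the allowed range. The literals stand in for AVFoundation's `AVSpeechUtteranceMinimumSpeechRate`, `AVSpeechUtteranceDefaultSpeechRate`, and `AVSpeechUtteranceMaximumSpeechRate`. Real code should read those constants rather than hard-code values, and should confirm by ear that the scaled rate sounds about 15% slower.

```swift
/// Bounds of the utterance rate scale. The default values are stand-ins
/// for the AVFoundation rate constants; treat them as assumptions.
struct RateScale {
    var minimum: Float = 0.0
    var normal: Float = 0.5
    var maximum: Float = 1.0
}

/// Converts a parent-facing multiplier such as 0.85 into an utterance rate,
/// clamped so that extreme settings never leave the valid range.
func utteranceRate(multiplier: Float, scale: RateScale = RateScale()) -> Float {
    let raw = scale.normal * multiplier
    return min(max(raw, scale.minimum), scale.maximum)
}

// Default from the spec: 0.85x of the normal rate.
let defaultRate = utteranceRate(multiplier: 0.85)
```

The result would then be assigned to an utterance's `rate` property before speaking.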

05 Acceptance Criteria

How we know it’s working

Each criterion below is a specific, testable checkpoint. If Meadow passes all of these, the text-to-speech engine is production-ready.

ID | What We’re Testing | Must Pass
M2-005 | Speech fires within 400ms | Any word tap produces audible speech within 400 milliseconds, tested on the oldest supported iPad (iPad 9). No exceptions: every word, every time.
M2-005 | Interruption works correctly | Tapping a new word while the previous word is still being spoken interrupts the current speech and immediately speaks the new word. No overlap, no queue.
M2-005 | Speech rate matches settings | Speech rate matches the value set in the parent profile. Default is 0.85x. Changing the setting changes the voice speed immediately.
M2-005 | Works on silent mode | When the iPad is in silent mode, Meadow still produces speech audio. The child’s voice is never muted by accident.
M2-005 | Survives rapid tapping | 10 taps in 2 seconds: no crash, no freeze, no audio artifacts. Each tap starts its word (even if the next tap interrupts it). The app remains responsive throughout.

Testing on real hardware

All performance criteria, especially the 400ms speed target and the rapid-tap stress test, must be verified on a physical iPad 9, the oldest and slowest iPad that Meadow supports. The iOS Simulator runs on the development Mac’s hardware, which is usually faster than a real iPad, so passing results there can give false confidence. If it works on iPad 9, it should work on every newer supported iPad.
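
For on-device checks, a rough tap-to-speech figure can be logged from the synthesizer's delegate callback. The sketch below assumes `AVSpeechSynthesizer` and a hypothetical `LatencyProbe` class. Note that the `didStart` callback marks when the engine begins the utterance, not when sound physically leaves the speaker, so this is a lower bound and the acceptance test may still need an external audio measurement.

```swift
import AVFoundation

/// Logs the time from a tap to the synthesizer's didStart callback.
/// Hypothetical helper for device testing, not Meadow production code.
final class LatencyProbe: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()
    private var tapTime: TimeInterval = 0

    /// Called with the elapsed milliseconds each time speech starts.
    var onMeasured: ((Double) -> Void)?

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    /// `touchTimestamp` uses the system-uptime clock, the same clock that
    /// touch event timestamps use, so a real touch time can be passed in.
    func speak(_ word: String,
               touchTimestamp: TimeInterval = ProcessInfo.processInfo.systemUptime) {
        tapTime = touchTimestamp
        _ = synthesizer.stopSpeaking(at: .immediate)
        synthesizer.speak(AVSpeechUtterance(string: word))
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
                           didStart utterance: AVSpeechUtterance) {
        let elapsedMs = (ProcessInfo.processInfo.systemUptime - tapTime) * 1000
        onMeasured?(elapsedMs)
    }
}
```

A test build could set `onMeasured` to record each value and flag any reading above 400.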