May 11, 2026
Chalie AI Dev Log: TTS Preprocessing Overhaul, URL Spoken Form, Ability SUMMARY Tightening
TTS preprocessing rebuilt with markdown-it-py + nh3 + pysbd. URL spoken-form added. Multiple improve-chalie cycles on ability SUMMARY tightening — PRs opened for find_tools and list.
The TTS preprocessing pipeline was overhauled (commit a16de70). The previous 14-regex _strip_markdown was replaced with a markdown-it-py + nh3 chain. The char-scanner _segment_for_prosody was replaced with pysbd, which handles edge cases like “Dr. Smith” without splitting mid-name. num2words was added to expand ordinals (1st, 21st) into spoken form. Per-sentence NDJSON streaming via a voice_synthesize generator enables responsive playback as each sentence is synthesized.
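The per-sentence streaming idea can be sketched roughly as follows. This is a minimal illustration, not the code from a16de70: only the name voice_synthesize comes from the log, while its signature, the record fields (index, text, audio_hex), and the fake_tts stand-in backend are assumptions. Sentences are passed in already segmented, so the pysbd step is left out.

```python
import json
from typing import Callable, Iterable, Iterator


def voice_synthesize(
    sentences: Iterable[str],
    synthesize: Callable[[str], bytes],
) -> Iterator[str]:
    """Yield one NDJSON line per sentence as soon as its audio is ready.

    Assumed signature and record shape; the real generator may differ.
    """
    for index, sentence in enumerate(sentences):
        audio = synthesize(sentence)  # hypothetical TTS backend call
        record = {"index": index, "text": sentence, "audio_hex": audio.hex()}
        # NDJSON: exactly one JSON object per newline-terminated line.
        yield json.dumps(record) + "\n"


def fake_tts(sentence: str) -> bytes:
    """Stand-in synthesizer so the sketch runs without a TTS engine."""
    return sentence.encode("utf-8")


lines = list(
    voice_synthesize(
        ["Dr. Smith arrived.", "He was the twenty-first guest."],
        fake_tts,
    )
)
```

Because the generator yields after each sentence, a client can start playing the first record while later sentences are still being synthesized.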
URL spoken-form was added and the TTS error sentinel was reshaped (commits c18995e, 4b7be30). URLs are now read as the host only — http://google.com/123 becomes “google dot com” with www stripped and dots spoken. The streaming-protocol error sentinel was made disjoint from the done sentinel: done signals success only, error signals terminal failure with no done key. This fixed a stuck-spinner bug on synthesis failure.
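Both behaviours can be illustrated with a short standard-library sketch. The function names spoken_url and terminal_state are hypothetical, and the exact wire format of the sentinels is an assumption; the log states only that a done record signals success and an error record carries no done key. The URL helper also assumes the input includes a scheme such as http://.

```python
import json
from typing import Optional
from urllib.parse import urlsplit


def spoken_url(url: str) -> str:
    """Return a host-only spoken form of a URL (hypothetical helper)."""
    host = urlsplit(url).hostname or ""  # path, query, and port are dropped
    if host.startswith("www."):
        host = host[len("www."):]
    return host.replace(".", " dot ")


def terminal_state(ndjson_line: str) -> Optional[str]:
    """Classify one stream record as 'done', 'error', or None (keep going).

    Assumed record shape: {"done": true} on success, {"error": "..."} on
    terminal failure. The two sentinels are disjoint.
    """
    record = json.loads(ndjson_line)
    if "error" in record:
        return "error"  # terminal failure; no done key is expected here
    if record.get("done"):
        return "done"  # success only
    return None
```

With disjoint sentinels, a client can stop its spinner on either terminal record without having to inspect a done record for a failure flag.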
Several improve-chalie cycles completed after infrastructure recovered from the previous day’s IO errors. TKT-349 (PR #1762) tightened the list.py ability SUMMARY, replacing the directive “(call list_all first)” phrasing with availability phrasing, which resolved a HIGH hallucination anomaly. TKT-355 (PR #1764) shortened the find_tools.py SUMMARY from 191 to 78 chars, which cut wall time 10x and token usage by 7.8k on scenario 051.
Cycles for schedule.py (TKT-352, TKT-357), subagent.py (TKT-356), read.py (TKT-353), and document.py (TKT-350) were cancelled because they either regressed or had gone stale.
- TTS preprocessing rebuilt: markdown-it-py + nh3 + pysbd + num2words (a16de70).
- URL spoken-form added; TTS error/done sentinels made disjoint (c18995e, 4b7be30).
- PR #1762: list.py SUMMARY tightened; HIGH hallucination anomaly resolved (TKT-349).
- PR #1764: find_tools.py SUMMARY tightened; 10x wall-time improvement on scenario 051 (TKT-355).