March 15, 2026

Triage Consolidation and Autonomous Gating

Replaced the multi-stage LLM triage with a deterministic ONNX gate, purged the CLARIFY routing mode, and introduced tiered autonomous execution guardrails.

Triage and Routing Overhaul

This work concludes the migration from a heavyweight LLM-based triage pipeline to a thin, deterministic gate. I replaced the CognitiveTriageService—which previously ran an LLM call followed by six self-evaluation rules—with the MessageGateService. The new service uses an ONNX model for routing (~5ms latency) and collapses the system’s routing logic into three branches: CANCEL (fast exit), RESPOND (high-confidence direct answers), and ACT (universal intent handler).
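Once the ONNX model has produced per-branch scores, the gate reduces to a small dispatch. A minimal sketch (the names here are hypothetical, and the ONNX forward pass is stubbed out as a plain score dict rather than a real inference call):

```python
from enum import Enum


class Route(Enum):
    CANCEL = "cancel"    # fast exit, no response generated
    RESPOND = "respond"  # high-confidence direct answer
    ACT = "act"          # universal intent handler


def route_message(scores: dict[str, float]) -> Route:
    """Pick the branch with the highest score.

    In the real service the scores would come from a single
    ONNX forward pass (~5ms); here they are passed in directly.
    """
    label = max(scores, key=scores.get)
    return Route(label)
```

For example, `route_message({"cancel": 0.1, "respond": 0.7, "act": 0.2})` selects the RESPOND branch.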

As part of this consolidation, the CLARIFY mode has been purged from the codebase. Any need for clarification is now handled naturally within the ACT loop rather than being a distinct routing state. I also removed the IGNORE mode for user messages; Chalie now defaults to responding to all valid user inputs. To support this, I cleaned up dozens of orphaned prompts and configurations that were tied to the old triage logic.

Autonomous Execution Guardrails

To support safer autonomous actions, I introduced a new tiered gating system. The ConsequenceClassifierService now categorizes actions into SAFE, CHECK, or COMMIT tiers. This is paired with a DomainConfidenceService that tracks per-domain scoring for autonomous decisions. These services feed into a unified AutonomousExecutionGate, ensuring that high-stakes actions require stronger internal confidence or explicit user approval before execution.
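A minimal sketch of how the tiers could compose into a single gate. The threshold value and function names are illustrative, not the real AutonomousExecutionGate API:

```python
from enum import Enum


class Tier(Enum):
    SAFE = "safe"      # execute immediately
    CHECK = "check"    # needs sufficient domain confidence
    COMMIT = "commit"  # always needs explicit user approval


# Hypothetical cutoff; the real DomainConfidenceService tracks
# per-domain scores rather than a single global threshold.
CONFIDENCE_THRESHOLD = 0.8


def may_execute(tier: Tier, domain_confidence: float,
                user_approved: bool = False) -> bool:
    """Unified gate: higher-stakes tiers demand stronger evidence."""
    if tier is Tier.SAFE:
        return True
    if tier is Tier.CHECK:
        return user_approved or domain_confidence >= CONFIDENCE_THRESHOLD
    return user_approved  # COMMIT: explicit approval only
```

The key design point is that confidence can substitute for approval only in the middle tier; COMMIT-tier actions never execute on confidence alone.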

I also moved thread exchanges to a write-through SQLite model. This ensures that chat history survives service restarts and provides a more durable foundation for the reasoning loop to reference previous turns.

Durable Image Pipeline

Images uploaded via the chat interface are now first-class citizens in the document system. Instead of being held ephemerally in memory, they are persisted to disk and recorded in the documents table. This allows for long-term recall and enables the ‘Uploads’ tab in the Brain interface.

On the processing side, I optimized the image pipeline to be more memory-efficient. The EXIF-stripping logic was refactored to use a BytesIO round-trip rather than materializing full pixel lists, which reduced peak memory usage by nearly 90% for large uploads. Triage is now also image-aware, receiving textual descriptions of attachments to make better routing decisions.
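The round-trip idea, sketched with Pillow (function name and format handling are illustrative): re-encoding into a fresh buffer without passing metadata drops EXIF, and unlike rebuilding the image from a full `list(img.getdata())`, it never materializes every pixel as a Python object.

```python
from io import BytesIO

from PIL import Image


def strip_exif(raw: bytes) -> bytes:
    """Re-encode an image without metadata via an in-memory round-trip."""
    with Image.open(BytesIO(raw)) as img:
        fmt = img.format or "PNG"
        out = BytesIO()
        # No exif= keyword, so EXIF metadata is simply not written.
        img.save(out, format=fmt)
        return out.getvalue()
```

The trade-off is one re-encode pass (lossy for JPEG), which is acceptable for uploads that are being normalized anyway.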

Cognitive Jobs and UI Refinement

The ‘Brain’ interface’s Jobs UI was overhauled to be more dynamic. Hardcoded job arrays were replaced with a centralized cognitive_jobs.json configuration served via the backend. Each job now displays four capability metrics (reasoning, structured output, creativity, and classification) as color-coded pills. This makes it easier to see which models are assigned to specific cognitive tasks and what their expected performance profile is.
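A hypothetical entry in cognitive_jobs.json might look like the following, assuming each capability is scored on a 1–4 scale (the field names and scale are guesses, not the real schema):

```python
import json

# Illustrative shape for one entry; the real file is served by the backend.
COGNITIVE_JOBS = json.loads("""
[
  {
    "id": "trait_extraction",
    "model": "local-small",
    "capabilities": {
      "reasoning": 2,
      "structured_output": 4,
      "creativity": 1,
      "classification": 3
    }
  }
]
""")


def pill_color(level: int) -> str:
    """Map a capability level to a display color for the UI pill."""
    return {1: "red", 2: "orange", 3: "yellow", 4: "green"}[level]
```

Serving this as data rather than hardcoding arrays means reassigning a model to a job is a config change, not a frontend deploy.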

Finally, I’ve improved the quality of user trait extraction with better prompt guidance and post-processing to strip articles and pronouns, leading to much cleaner ‘Identity’ cards in the user model.
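The post-processing step might look like this (the stop-word list and function name are illustrative, not the actual implementation):

```python
import re

# Hypothetical set of leading articles/pronouns to strip from extracted traits.
_LEADING = re.compile(r"^(?:a|an|the|my|your|their|his|her|its)\s+",
                      re.IGNORECASE)


def clean_trait(raw: str) -> str:
    """Strip leading articles and pronouns so Identity cards read cleanly."""
    trait = raw.strip()
    prev = None
    while prev != trait:  # handle stacked prefixes like "the my"
        prev = trait
        trait = _LEADING.sub("", trait)
    return trait
```

So an extraction like "a software engineer" renders on the Identity card as "software engineer".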