System Architecture
Overview
Chalie is a persistent cognitive agent — a continuously running cognitive runtime, not a request-response service. Intelligence emerges from experience: every interaction flows through a memory pipeline that compresses, abstracts, and decays information over time. Chalie is not a chatbot or assistant wrapper; it forms memories, runs background processes, exercises judgment, and diverges into a unique identity shaped by its interaction history.
Core Architecture
System Type
- Synthetic cognitive brain using LLMs to replicate human brain functions
- Tech Stack: Python backend, SQLite (WAL mode + sqlite-vec + FTS5), MemoryStore (in-memory, thread-safe), Ollama (configurable LLMs), Vanilla JavaScript frontend (Radiant design system)
- Core Pattern: Single-process architecture with daemon threads and PromptQueue, service-oriented design
Communication Pattern
- Client connects to `/ws` (WebSocket via flask-sock)
- Client sends a `{"type": "message", "text": "..."}` JSON frame
- Backend enqueues message → Digest Worker processes → response streamed back as WebSocket frames (`status` → `message` → `done`)
- Drift thoughts, cards, and proactive notifications also arrive over the same `/ws` connection
- Authentication: Session cookie-based (`@require_session` decorator)
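The frame flow above can be sketched as follows. This is a minimal illustration, not the actual backend code: the `status`/`done` field names beyond `type` and `text` are assumptions.

```python
import json

def client_frame(text: str) -> str:
    """Build the JSON frame a client sends over /ws."""
    return json.dumps({"type": "message", "text": text})

def stream_response(chunks):
    """Yield the frames a backend might stream back: status -> message* -> done."""
    yield {"type": "status", "state": "thinking"}  # acknowledge receipt
    for chunk in chunks:
        yield {"type": "message", "text": chunk}  # streamed response pieces
    yield {"type": "done"}  # terminal frame closes the exchange
```

Drift thoughts and proactive notifications would simply be additional frame types multiplexed over the same connection.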
Code Organization
backend/
├── services/ # Business logic (memory, orchestration, routing, embeddings)
├── workers/ # Async workers (digest, memory chunking, consolidation)
├── listeners/ # Input handlers (direct REST API)
├── api/ # REST API blueprints (conversation, memory, proactive, privacy, system)
├── configs/ # Configuration files (connections.json, agent configs, generated/)
├── migrations/ # Database migrations
├── prompts/ # LLM prompt templates (mode-specific)
├── tools/ # First-party tool modules
├── tests/ # Test suite
└── run.py # Single-process entry point
Frontend applications are located separately:
frontend/
├── interface/ # Main chat UI (ES6 modules, Radiant design system)
│ ├── app.js # Thin orchestrator — boot, wiring, small glue (~680 lines)
│ ├── utils.js # Shared utilities (escHtml, lsGet/lsSet, toast, relativeTime)
│ ├── auth.js # Authentication & login dialog
│ ├── chat.js # Message sending orchestration & conversation history
│ ├── image_attach.js # Image upload, preview strip, analysis tracking
│ ├── document_upload.js # Document upload dialog, processing, synthesis
│ ├── task_strip.js # Persistent task strip display & polling
│ ├── apps_panel.js # Interface daemon panel, scope approval, app overlay
│ ├── event_router.js # WebSocket drift/push event dispatcher
│ ├── notifications.js # Audio chime, system notifications, push subscription
│ ├── update_system.js # Update banner & dialog
│ ├── ambient_canvas.js # Animated background gradient orbs
│ ├── api.js # REST client
│ ├── ws.js # WebSocket client
│ ├── renderer.js # Conversation spine DOM renderer
│ ├── presence.js # Presence dot state machine
│ ├── voice.js # Voice I/O (STT + TTS)
│ ├── heartbeat.js # Client context telemetry
│ ├── ambient.js # Behavioral sensor (passive)
│ ├── moment_search.js # Recall search overlay
│ ├── markdown.js # Markdown parser (XSS-safe)
│ └── sw.js # Service worker (caching, push, share target)
├── brain/ # Admin/cognitive dashboard
└── on-boarding/ # Account setup wizard
Module communication: Constructor injection for shared services, callback registration for cross-module events, custom DOM events (chalie:action, chalie:speak, chalie:pin-moment) for loose coupling. Modules never reference each other directly — app.js wires all connections.
Asset versioning: Flask injects a <script type="importmap"> into index.html at serve time, mapping all module imports to ?v=VERSION URLs. The VERSION file is the single source of truth. Service worker uses network-first for JS/CSS (localhost = 0ms latency) with cache fallback for offline/PWA.
IMPORTANT: UI code must exist under /interface/, /brain/, or /on-boarding/ only.
Key Services
Core Services (backend/services/)
Routing & Decision Making
- `mode_router_service.py` — Deterministic mode routing (~5ms) with signal collection + tie-breaker
- `routing_stability_regulator_service.py` — Single authority for router weight mutation (24h cycle, ±0.02/day max)
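The deterministic scoring step can be sketched as a weighted sum per mode, highest score wins. This is a minimal sketch under assumed signal names and weights; the real signals and learned weights live in `configs/generated/mode_router_config.json` and will differ.

```python
MODES = ("UNIFIED", "ACT", "IGNORE")

def route(signals: dict, weights: dict) -> tuple:
    """Score each mode as a weighted sum of observed signals; highest wins."""
    scores = {
        mode: sum(weights[mode].get(name, 0.0) * value
                  for name, value in signals.items())
        for mode in MODES
    }
    best = max(scores, key=scores.get)
    ranked = sorted(scores.values(), reverse=True)
    confidence = ranked[0] - ranked[1]  # gap to runner-up; tie-breaker fires when small
    return best, confidence

# Illustrative signals/weights only (not Chalie's real configuration)
signals = {"context_warmth": 0.8, "question_mark": 1.0, "greeting": 0.0}
weights = {
    "UNIFIED": {"context_warmth": 0.9, "question_mark": 0.5},
    "ACT": {"question_mark": 0.7, "greeting": 0.2},
    "IGNORE": {"greeting": -0.5},
}
mode, conf = route(signals, weights)
```

When the confidence gap is small, the ONNX tie-breaker would be consulted instead of trusting the raw score.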
Response Generation
- `frontal_cortex_service.py` — LLM response generation using mode-specific prompts
- `voice_mapper_service.py` — Translates identity vectors to tone instructions
Memory System
- `context_assembly_service.py` — Unified retrieval from multiple memory layers (transcript + compaction, moments, episodes, knowledge) with weighted budget allocation; working memory now reads from compaction summary + budget-constrained recent transcript entries instead of fixed-size MemoryStore FIFO
- `transcript_service.py` — Persistent, topic-scoped, append-only conversation record (SQLite + sqlite-vec); semantic search, keyword fallback, selective embedding (>50 tokens), 90-day TTL pruning
- `compaction_service.py` — Incremental LLM-powered summarization; fires when total context exceeds 85% of provider budget; stores compacted text with transcript watermark in `topic_compactions`
- `episodic_retrieval_service.py` — Hybrid vector + FTS search for episodes
- `knowledge_service.py` — Unified knowledge store (traits, concepts, procedures, relationships) with RRF hybrid search (exact + FTS5 + vector KNN), decay management, and prompt injection
- `temporal_pattern_service.py` — Mines hour-of-day and day-of-week distributions from `interaction_log` for behavioral pattern detection; stores discoveries as behavioral traits in knowledge table with generalized labels; 24h background worker cycle
- `episodic_storage_service.py` — SQLite CRUD for episodic memories
- `list_service.py` — Deterministic list management (shopping, to-do, chores); perfect recall with full history via `lists`, `list_items`, `list_events` tables
- `moment_service.py` — Pinned message bookmarks with LLM-enriched context, sqlite-vec semantic search, and salience boosting; moments stored as documents with `source_type='moment'`
- `moment_enrichment_service.py` — Background worker (5min poll): collects gists from ±4hr interaction window, generates LLM summaries, seals moments after 4hrs
Autonomous Behavior
- `reasoning_loop_service.py` — Signal-driven continuous reasoning; dispatches signal types through handlers; falls back to salient/insight discovery on idle; attention-gated
- `autonomous_actions/` — Decision routing by priority: CommunicateAction (10), PlanAction (7), AmbientToolAction (6), ReflectAction (5), ReconcileAction (4), NothingAction (-1)
- `decay_engine_service.py` — Periodic decay (episodic 0.05/hr, semantic 0.03/hr)
Ambient Awareness
- `ambient_inference_service.py` — Deterministic inference engine (<1ms, zero LLM): place, attention, energy, mobility, tempo, device_context from browser telemetry + behavioral signals; thresholds loaded from `configs/agents/ambient-inference.json`; emits transition events (place, attention, energy) to event bridge when `emit_events=True`
- `place_learning_service.py` — Accumulates place fingerprints (geohash ~1km, never raw coords) in `place_fingerprints` table; learned patterns override heuristics after 20+ observations
- `client_context_service.py` — Rich client context with location history ring buffer (12 entries), place transition detection, session re-entry detection (>30min absence), demographic trait seeding from locale, and circadian hourly interaction counts; emits session_start/session_resume events to event bridge
- `event_bridge_service.py` — Connects ambient context changes (place, attention, energy, session) to autonomous actions; enforces stabilization windows (90s), per-event cooldowns, confidence gating, aggregation (60s bundle window), and focus gates; config in `configs/agents/event-bridge.json`
ACT Loop & Critic
- `act_orchestrator_service.py` — Unified, parameterized ACT loop runner. Configurable: `critic_enabled`, `smart_repetition` (embedding-based), `escalation_hints`, `persistent_task_exit`, `deferred_card_context`. Fatigue-free termination model: hard cap (30 iterations), cumulative timeout, semantic repetition, type repetition, no-actions signal.
- `act_loop_service.py` — Cognitive iteration manager with action execution, history tracking, and telemetry. Constructor-injected critic and dispatcher. Generic scalar output chaining between sequential actions.
- `act_dispatcher_service.py` — Routes actions to skill handlers with timeout enforcement; returns structured results with confidence and contextual notes
- `critic_service.py` — Post-action verification via lightweight LLM; safe actions get silent correction, consequential actions pause; EMA-based confidence calibration
- `act_reflection_service.py` — Enqueues tool outputs for background experience assimilation
- `persistent_task_service.py` — Multi-session background task management with state machine (ACCEPTED → IN_PROGRESS → COMPLETED/PAUSED/CANCELLED/EXPIRED); duplicate detection via Jaccard similarity; rate limiting (3 cycles/hr, 5 active tasks max)
- `plan_decomposition_service.py` — LLM-powered goal → step DAG decomposition; validates DAG (Kahn’s cycle detection), step quality (4–30 word descriptions, Jaccard dedup), and cost classification (cheap/expensive); plans stored in `persistent_tasks.progress` JSON (stored as TEXT in SQLite); ready-step ordering (shallowest depth, cheapest first)
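The fatigue-free termination model can be sketched as a small gate checked before each iteration. This is an illustrative reduction: the real service's "smart repetition" is embedding-based, whereas the check here is exact-match.

```python
import time

MAX_ITERATIONS = 30        # hard cap from the spec above
CUMULATIVE_TIMEOUT_S = 60.0  # cumulative wall-clock budget

def should_stop(iteration, started_at, recent_actions, proposed):
    """Return a termination reason, or None to keep iterating."""
    if iteration >= MAX_ITERATIONS:
        return "hard_cap"
    if time.monotonic() - started_at > CUMULATIVE_TIMEOUT_S:
        return "timeout"
    if proposed is None:
        return "no_actions"  # LLM produced nothing actionable
    if recent_actions[-2:] == [proposed, proposed]:
        return "repetition"  # same action three times in a row
    return None
```

The orchestrator would call this once per loop turn and surface the reason in telemetry.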
Constants & Registries
- `services/innate_skills/registry.py` — Authoritative frozenset definitions for all skill membership sets (`ALL_SKILL_NAMES`, `PLANNING_SKILLS`, `COGNITIVE_PRIMITIVES`, `CONTEXTUAL_SKILLS`, `TRIAGE_VALID_SKILLS`, etc.). Single source of truth — all consumers import from here.
- `services/act_action_categories.py` — Authoritative frozenset definitions for action behavior categories (`READ_ACTIONS`, `DETERMINISTIC_ACTIONS`, `SAFE_ACTIONS`, `CRITIC_SKIP_READS`, `ACTION_FATIGUE_COSTS`).
- `services/act_memory_keys.py` — Centralized MemoryStore key patterns for the ACT system (deferred cards, tool caches, heartbeat, reflection queue).
Tool Integration
- `tool_registry_service.py` — Tool discovery, metadata management; loads first-party tools from ToolLibraryService, registers interface tools via HTTP; invokes first-party tools directly in-process
- `tool_config_service.py` — Tool configuration persistence; webhook key generation (HMAC-SHA256 + replay protection via `X-Chalie-Signature`/`X-Chalie-Timestamp`)
- `tool_performance_service.py` — Performance metrics tracking; correctness-biased ranking (50% success_rate, 15% speed, 15% reliability, 10% cost, 10% preference); post-triage tool reranking; user correction propagation; 30-day preference decay
- `tool_profile_service.py` — LLM-generated tool capability profiles with `short_summary`, `full_profile`, `triage_triggers`, and `usage_scenarios`; profiles power the `find_tools` innate skill (semantic search against capability embeddings in `tool_capability_profiles_vec`)
- Webhook endpoint (`/api/tools/webhook/<name>`) — External tool triggers with HMAC-SHA256 or simple token auth, 30 req/min rate limit, 512KB payload cap
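The HMAC verification with replay protection can be sketched like this. It matches the headers named above, but the signed-payload layout (`timestamp.body`) and the 5-minute window are assumptions, not the service's actual choices.

```python
import hashlib
import hmac
import time

REPLAY_WINDOW_S = 300  # assumed tolerance for X-Chalie-Timestamp skew

def verify_webhook(secret: bytes, body: bytes, timestamp: str,
                   signature: str, now=None) -> bool:
    """Check X-Chalie-Signature against HMAC-SHA256(secret, timestamp.body)."""
    now = time.time() if now is None else now
    if abs(now - float(timestamp)) > REPLAY_WINDOW_S:
        return False  # stale or future-dated request: possible replay
    expected = hmac.new(secret, timestamp.encode() + b"." + body,
                        hashlib.sha256).hexdigest()
    # constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, signature)
```

Binding the timestamp into the signed material is what makes the replay window enforceable: an attacker cannot re-send an old body with a fresh timestamp.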
Identity & Learning
- `identity_service.py` — 6-dimensional identity vector system with coherence constraints
- `identity_state_service.py` — Tracks identity state changes and evolution
Infrastructure
- `database_service.py` — SQLite connection management (WAL mode) and migrations
- `memory_store.py` — MemoryStore: thread-safe, in-memory key-value store with Redis-compatible API
- `config_service.py` — JSON file config loader (agent configs, connection names); runtime config (port, host) managed by `runtime_config.py` via CLI args
- `output_service.py` — Output queue management for responses
- `event_bus_service.py` — Pub/sub event routing
Topic Classification
- `topic_classifier_service.py` — Embedding-based deterministic topic classification with two-signal boundary detection
- `two_signal_boundary_service.py` — Self-calibrating topic boundary detector: consecutive + window similarity must both drop below adaptive thresholds (K=1.6), with discourse marker fast path; per-thread state in MemoryStore with 24h TTL
Session & Conversation
- `thread_conversation_service.py` — MemoryStore-backed conversation thread persistence
- `thread_service.py` — Manages conversation threads with expiry
- `session_service.py` — Tracks user sessions and topic changes
Documents & File Management
- `document_service.py` — Document CRUD, chunk storage, hybrid search (semantic via sqlite-vec + FTS5 + keyword boost via Reciprocal Rank Fusion), soft delete with 30-day purge window, dual-layer duplicate detection (SHA-256 hash + cosine similarity on summary embeddings)
- `document_processing_service.py` — Full extraction pipeline: text extraction (pdfplumber, python-docx, python-pptx, trafilatura), regex-based metadata extraction (dates, companies, monetary values, reference numbers, document type heuristic), adaptive chunk sizing by document type, SimHash fingerprinting, language detection (langdetect)
- `document_card_service.py` — Inline HTML card emission for document search results (source attribution with type badges, confidence indicators), upload confirmations, document previews, and lifecycle events; cyan `#00F0FF` accent
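Reciprocal Rank Fusion, used above to merge semantic and keyword result lists, can be sketched in a few lines. The `k=60` constant is the common default from the RRF literature; the service's actual constant and any per-list weighting are not specified here.

```python
def rrf(*rankings, k: int = 60) -> list:
    """Fuse ranked lists of doc ids: score(d) = sum over lists of 1/(k + rank(d))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative: one list from vector KNN, one from FTS5 keyword match
semantic = ["d3", "d1", "d2"]
keyword = ["d1", "d4"]
fused = rrf(semantic, keyword)  # documents in both lists rise to the top
```

RRF's appeal is that it fuses rankings without needing the two scoring scales (cosine distance vs. BM25) to be comparable.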
Innate Skills (backend/services/innate_skills/ and backend/skills/)
Built-in cognitive skills for the ACT loop:
- `recall_skill.py` — Unified retrieval across ALL memory layers including user traits (<500ms); supports “what do you know about me?” via `user_traits` layer with broad/specific query modes and confidence labels
- `memorize_skill.py` — Store gists and facts (<50ms)
- `introspect_skill.py` — Comprehensive internal state report: 4 natural-language scopes (memory health, skill/tool usage, reasoning state, identity); supports “why did you do that?” via routing audit trail and autonomous action history
- `associate_skill.py` — Spreading activation through semantic graph (<500ms)
- `scheduler_skill.py` — Create/list/cancel reminders and scheduled tasks (<100ms)
- `autobiography_skill.py` — Retrieve synthesized user narrative with optional section extraction (<500ms)
- `list_skill.py` — Deterministic list management: add/remove/check items, view, history (<50ms)
- `persistent_task_skill.py` — Multi-session background task management: create (with plan decomposition), pause, resume, cancel, check status, show plan, set priority (<100ms; create ~2-5s with LLM decomposition)
- `document_skill.py` — Document search and management via ACT loop: search (hybrid semantic via sqlite-vec + FTS5 + keyword retrieval), list, view, delete, restore; documents are reference material retrieved via skill, not context assembly; search results include `[Source: document_id=...]` markers for frontal cortex citation
- `read_skill.py` — Fetch and read web page content for information gathering and research
- `reflect_skill.py` — On-demand experiential synthesis via lightweight LLM call; retrieves ACT loop outcomes, episodes, concepts, and strategy patterns, then synthesizes into actionable insight (what worked, what didn’t, patterns noticed, connections formed); optionally stores as gist
- `find_tools_skill.py` — Discover registered tools via semantic search against tool capability profiles
- `notes_skill.py` — Search past conversation transcript for on-demand retrieval of older context (renamed to `transcript` skill; `notes` alias preserved for backward compat)
Worker Processes (backend/workers/)
Queue Workers (Daemon Threads)
- Digest Worker — Core pipeline: classify → unified generate → enqueue memory job
- Episodic Memory Worker — Builds episodes from sequences of exchanges
- Semantic Consolidation Worker — Extracts concepts + relationships from episodes
Services/Daemons (Daemon Threads)
- REST API + WebSocket — Flask app with flask-sock on port 8081
- Reasoning Loop — Signal-driven continuous reasoning (see service listing above); attention-gated
- Ambient Inference Service — Deterministic inference of place, attention, energy, mobility, tempo from browser telemetry (<1ms, zero LLM)
- Place Learning Service — Accumulates place fingerprints in SQLite; learned patterns override heuristics after 20+ observations
- Decay Engine — Periodic memory decay cycle; flat rate per decay class; contradicted traits resolve via inline contradiction check at creation time
- Routing Stability Regulator — Single authority for router weight mutation
- Experience Assimilation — Tool results → episodic memory (60s poll)
- Thread Expiry Service — Expires stale threads (5min cycle)
- Scheduler Service — Fires due reminders/tasks (60s poll)
- Autobiography Synthesis — Synthesizes user narrative (6h cycle)
- Triage Calibration — Triage correctness scoring (24h cycle); wires user corrections to tool preferences; learns usage scenarios from clarification→tool resolution chains
- Profile Enrichment — Tool profile enrichment (6h cycle, 3 tools/cycle); preference decay; usage-triggered full profile rebuilds (15 successes or reliability < 50%)
- Curiosity Pursuit — Explores curiosity threads via ACT loop (6h cycle)
- Moment Enrichment — Enriches pinned moments with gists + LLM summary, seals after 4hrs (5min poll)
- Temporal Pattern Service — Mines behavioral patterns from interaction timestamps (24h cycle, 5min warmup); detects hour-of-day peaks, day-of-week peaks, topic-time clusters; stores as `behavioral_pattern` user traits
- Persistent Task Worker — Runs eligible multi-session background tasks via bounded ACT loop (30min cycle with ±30% jitter); plan-aware execution follows step DAG when present (up to 3 steps/cycle with per-step fatigue budgets), falls back to flat loop otherwise; adaptive user surfacing at coverage milestones
- Document Worker — PromptQueue worker for document processing: text extraction → metadata extraction → adaptive chunking → batch embedding → storage; 10min timeout per document
- Document Purge Service — Hard-deletes documents past their 30-day soft-delete window (6h cycle)
- VaultService — AES-256-GCM envelope encryption; PBKDF2-derived KEK wraps a random DEK stored in `vault_config`; unlocked post-login; migrates legacy Fernet data on first unlock
Data Flow Pipeline
User Input → Response Pipeline
[User Input]
→ [run.py] → [PromptQueue] → [Digest Worker]
├─ Classification (embedding-based, two-signal boundary detection)
├─ Context Assembly (transcript + compaction + semantic memories)
├─ Unified LLM Generation (skills + tools discoverable inline)
│ └─ ACT loop runs inline when LLM invokes skills/tools
└─ Enqueue Memory Job
→ [Episodic Memory Queue] → [Episodic Memory Worker]
→ SQLite Episodes Table
→ [Semantic Consolidation Queue] → [Semantic Consolidation Worker]
→ SQLite Concepts Table
Background Processes
[Routing Stability Regulator] ← 24h cycle
→ adjusts configs/generated/mode_router_config.json
[Decay Engine] → runs every 1800s (30min)
├─ Episodic decay (salience-weighted)
├─ Semantic decay (strength-weighted)
└─ User trait decay (category-specific)
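A flat per-class decay pass, using the hourly rates quoted in the service listing (episodic 0.05/hr, semantic 0.03/hr), can be sketched as below. The zero-floor pruning and the salience/strength weighting are simplified away here.

```python
# Assumed flat hourly rates per decay class (from the service listing)
DECAY_PER_HOUR = {"episodic": 0.05, "semantic": 0.03}

def decay(memories, hours: float):
    """Subtract class-specific decay; drop memories whose strength hits zero."""
    survivors = []
    for kind, strength in memories:
        strength -= DECAY_PER_HOUR[kind] * hours
        if strength > 0:
            survivors.append((kind, round(strength, 4)))
    return survivors

# One hour of decay: the weakest episodic memory is forgotten entirely
out = decay([("episodic", 0.2), ("semantic", 0.1), ("episodic", 0.02)], hours=1)
```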
[Cognitive Drift Engine] → during worker idle
├─ Seed selection (weighted random)
├─ Spreading activation (depth 2, decay 0.7/level)
└─ LLM synthesis → stores as drift gist
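The spreading-activation step above (depth 2, activation decaying by 0.7 per level) can be sketched over a plain adjacency dict; the real store keeps concept edges in SQLite.

```python
def spread(graph: dict, seed: str, depth: int = 2, decay: float = 0.7):
    """Propagate activation outward from a seed concept, attenuating per hop."""
    activation = {seed: 1.0}
    frontier = [seed]
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                boost = activation[node] * decay
                # keep the strongest activation path to each concept
                if boost > activation.get(neighbor, 0.0):
                    activation[neighbor] = boost
                    nxt.append(neighbor)
        frontier = nxt
    return activation

graph = {"coffee": ["morning", "caffeine"], "morning": ["routine"]}
act = spread(graph, "coffee")  # coffee 1.0, neighbors 0.7, two hops 0.49
```

The activated neighborhood then becomes the raw material handed to LLM synthesis for a drift gist.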
Key Architectural Decisions
Deterministic Mode Router
- Decoupled: Mode selection (mathematical, ~5ms) separate from response generation (LLM, ~2-15s)
- Signals: ~17 observable signals from context + NLP (context warmth, question marks, greeting patterns, etc.)
- Scores: Each mode gets weighted composite score; highest wins
- Tie-breaker: ONNX classifier for ambiguous cases
- Self-leveling: Router naturally shifts toward UNIFIED as memory accumulates
Single Authority for Weight Mutation
- Routing Stability Regulator is the only service that modifies router weights
- Other services log “pressure signals” but don’t mutate state
- Updates bounded: max ±0.02/day, 48h cooldown per parameter
- Closed-loop control: Verifies adjustments work before persisting
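The bounding rules can be sketched as a clamp plus a cooldown check. The closed-loop verification step is omitted; timekeeping is a bare hours value, whereas the real regulator persists state between cycles.

```python
MAX_DELTA_PER_DAY = 0.02  # hard bound from the spec above
COOLDOWN_HOURS = 48       # per-parameter cooldown

def propose_update(weight: float, proposed: float,
                   last_changed_hours_ago: float) -> float:
    """Apply a proposed weight change, bounded and cooldown-gated."""
    if last_changed_hours_ago < COOLDOWN_HOURS:
        return weight  # parameter still cooling down: no mutation
    delta = max(-MAX_DELTA_PER_DAY, min(MAX_DELTA_PER_DAY, proposed - weight))
    return round(weight + delta, 6)

new_w = propose_update(0.50, 0.60, last_changed_hours_ago=72)  # clamped step
same_w = propose_update(0.50, 0.60, last_changed_hours_ago=10)  # gated
```

Because pressure signals only log and never mutate, all movement funnels through this one bounded path.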
Mode-Specific Prompts
- Each mode (UNIFIED, ACT, IGNORE) has its own focused prompt template
- Replaces old approach: single combined prompt with mode selection embedded
- Focused scope prevents elaboration and improves consistency
Memory Hierarchy
- Topic Transcript (SQLite + sqlite-vec) — Persistent, append-only conversation record per topic; budget-aware filling replaces fixed turn limits
- Compaction (SQLite) — Incremental LLM summarization of older transcript entries; preserves facts/decisions/preferences, discards conversation flow
- Working Memory (MemoryStore, legacy fallback) — FIFO buffer used only when no transcript data exists yet
- Gists (MemoryStore, 30min TTL) — Compressed exchange summaries
- Facts (MemoryStore, 24h TTL) — Atomic key-value assertions
- Episodes (SQLite + sqlite-vec) — Narrative units with decay
- Concepts (SQLite + sqlite-vec) — Knowledge nodes and relationships
- Procedural Memory (SQLite) — Learned action reliability; surfaced in context assembly as reliability hints (≥8 attempts, top 3 skills)
- User Traits (SQLite) — Personal facts with category-specific decay (includes behavioral patterns from temporal mining)
- Lists (SQLite) — Deterministic ground-truth state (shopping, to-do, chores); perfect recall, no decay, full event history
Each layer optimized for its timescale; all integrated via context assembly. Context assembly reads compaction summary + budget-constrained recent transcript entries for working memory context. Lists are injected into all prompts as `` for passive awareness; the ACT loop uses the list skill for mutations.
Configuration Precedence
Environment variables > .env file > JSON config files > hardcoded defaults
See docs/02-PROVIDERS-SETUP.md for provider configuration.
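The precedence chain can be sketched as a first-match lookup across ordered layers. The key names below are illustrative only, not Chalie's real settings.

```python
def resolve(key: str, env: dict, dotenv: dict, json_cfg: dict, defaults: dict):
    """Return the first value found, highest-precedence layer first."""
    for layer in (env, dotenv, json_cfg, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)

# Hypothetical layers demonstrating each level of the chain
env = {"CHALIE_PORT": "9090"}
dotenv = {"CHALIE_PORT": "8081", "CHALIE_HOST": "127.0.0.1"}
json_cfg = {"CHALIE_HOST": "0.0.0.0", "CHALIE_DB": "chalie.db"}
defaults = {"CHALIE_DB": "default.db", "CHALIE_LOG": "info"}

port = resolve("CHALIE_PORT", env, dotenv, json_cfg, defaults)  # env wins
```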
Thread-Safe Worker State
- All workers run as daemon threads within a single Python process
- Shared state managed via thread-safe data structures (locks, queues)
- No multiprocessing overhead — lightweight, in-process coordination
Adaptive Topic Boundary Detection
- Replaces static 0.65 cosine similarity threshold with a 3-layer self-calibrating detector
- Two-Signal Detection: consecutive similarity (cos to previous message) AND window similarity (cos to centroid of last 5 messages) must both drop below self-calibrating thresholds (mean - K*std, K=1.6) for a boundary to fire
- Discourse Markers: 16 regex patterns for explicit topic switch phrases (“by the way”, “speaking of”, etc.) provide a high-precision fast path bypassing the two-signal gate
- All thresholds derived from running conversation statistics; stats only updated on non-boundary messages to keep baseline clean
- State persisted in MemoryStore (`two_signal_boundary:{thread_id}`, 24h TTL); cold-start mode (markers only) during first 6 messages
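The two-signal rule can be sketched as below: a boundary fires only when both the consecutive similarity and the window similarity fall under their self-calibrating thresholds (mean − K·std, K=1.6). Running stats are plain lists here rather than MemoryStore state.

```python
import statistics

K = 1.6  # sensitivity constant from the spec above

def threshold(history):
    """Self-calibrating cutoff: mean - K * std of recent similarities."""
    return statistics.mean(history) - K * statistics.pstdev(history)

def is_boundary(consecutive_sim, window_sim, cons_hist, win_hist):
    """Both signals must drop below their thresholds for a boundary to fire."""
    return (consecutive_sim < threshold(cons_hist)
            and window_sim < threshold(win_hist))

cons_hist = [0.8, 0.82, 0.78, 0.81]   # similarity to previous message
win_hist = [0.75, 0.77, 0.74, 0.76]   # similarity to 5-message centroid
fired = is_boundary(0.2, 0.2, cons_hist, win_hist)      # both collapse
not_fired = is_boundary(0.2, 0.76, cons_hist, win_hist)  # one-off aside
```

Requiring both signals is what suppresses false boundaries on brief asides, which tank consecutive similarity but leave the window centroid intact.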
Topic Confidence Reinforcement
- Topic confidence updated via bounded reinforcement formula: `new = current + (new_confidence - current) * 0.5`
- Ensures gradual adaptation without oscillation
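The update halves the gap to the new estimate on each observation, so confidence converges geometrically without overshooting. A minimal sketch:

```python
def reinforce(current: float, observed: float, rate: float = 0.5) -> float:
    """Move confidence a fixed fraction of the way toward the new estimate."""
    return current + (observed - current) * rate

conf = 0.2
for _ in range(4):
    conf = reinforce(conf, 1.0)
# successive values: 0.6, 0.8, 0.9, 0.95 — always bounded by the target
```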
Error Resilience
- All workers catch JSON decode errors from LLM responses
- Log meaningful messages instead of crashing
- Return status strings for graceful degradation
Safety & Constraints
Hard Boundaries
- Prompt hierarchy immutable (marked as “authoritative and final”)
- Skill registry fixed at startup (no runtime skill registration)
- Data scope parameterized by topic (no cross-topic leakage)
- Speaker confidence gates trait storage (unknown speakers = 0.3 penalty)
Operational Limits
- ACT loop: 60s cumulative timeout, 30 max iterations; post-action critic verification
- Persistent tasks: 5 active max, 3 cycles/hr rate limit, 14-day auto-expiry; plan-decomposed tasks: 3–8 steps per plan, up to 3 steps executed per cycle, 3 ACT iterations per step
- Fatigue budget: 2.5 activation units per 30min
- Per-concept cooldown: 60min (prevents circular rumination)
- Delegation rate: 1 per topic per 30min
Anti-Manipulation
- Identity isolation: 6 vectors with coherence constraints
- No vulnerability simulation: Explicitly forbidden
- Exponential backoff: System retreats on silence (opposite of dependency)
- No flattery optimization: Soul axiom: “Never optimize by misleading”
Configuration Files
Primary Configuration
- `configs/connections.json` — SQLite path and MemoryStore settings
- `configs/agents/*.json` — LLM settings (model, temperature, timeout)
- `configs/generated/mode_router_config.json` — Learned router weights (generated)
Provider Configuration
- Stored in SQLite `providers` table (not JSON files)
- Runtime configurable via REST API (`/api/providers`)
- Supports: Ollama, Anthropic, OpenAI, Google Gemini
See docs/02-PROVIDERS-SETUP.md for detailed setup instructions.
REST API
Available Blueprints
- `user_auth` — Account creation, login, API key management
- `conversation` — Chat endpoint (WebSocket streaming), conversation list/retrieval
- `memory` — Memory search, fact management
- `proactive` — Outreach/notifications, upcoming tasks
- `privacy` — Data deletion, export
- `system` — Health, version, settings, observability (routing, memory, tools, identity, tasks, autobiography, traits)
- `tools` — Tool execution, configuration
- `providers` — LLM provider configuration
- `push` — Push notification subscription
- `scheduler` — Reminders and scheduled tasks
- `lists` — List management
- `stubs` — Placeholder endpoints (calendar, notifications, integrations, permissions) returning 501
Observability Endpoints (/system/observability/*)
- `routing` — Mode router decision distribution and recent activity
- `memory` — Memory layer counts and health indicators
- `tools` — Tool performance stats
- `identity` — Identity vector states
- `tasks` — Active persistent tasks, curiosity threads, triage calibration
- `autobiography` — Current autobiography narrative with delta (changed/unchanged sections)
- `traits` (GET) — User traits grouped by category with confidence scores
- `traits/<key>` (DELETE) — Remove a specific learned trait (user correction)
See API blueprints in backend/api/ for full reference.
Testing Strategy
Test Markers
- `@pytest.mark.unit` — No external dependencies (fast)
- `@pytest.mark.integration` — Requires SQLite/MemoryStore (slower)
Test Organization
backend/tests/
├── test_services/ # Service unit tests
├── test_workers/ # Worker integration tests
└── fixtures/ # Shared test fixtures
Run all tests: `pytest`
Run only unit: `pytest -m unit`
Run with verbose: `pytest -v`
Development Workflow
Setup
cd backend
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
Local Development
# Single command — starts Flask + WebSocket + all daemon threads
python backend/run.py
No external services required. SQLite and MemoryStore are embedded — everything runs in one process.
Deployment Notes
- No Telemetry: Zero external calls except to configured LLM/voice providers
- Local First: All data stored locally unless external providers configured
- Encryption: AES-256-GCM envelope encryption via VaultService (password-derived KEK wraps a random DEK)
- CORS: Defaults to localhost, restrict before production
Interface Layer
External applications can extend Chalie’s capabilities by pairing as interfaces. Interfaces expose tool capabilities that Chalie registers in its normal tool pipeline.
Protocol
Chalie → Interface:
- `GET /health` — periodic liveness check (every 30s)
- `GET /capabilities` — fetch tool manifests
- `POST /execute` — invoke a capability
Interface → Chalie:
- `POST /api/signals` — push events (authenticated via `signal_token`)
Pairing
Bluetooth-style: Chalie generates a one-time pairing key (brain dashboard). User enters it into the interface along with Chalie’s host:port. Interface calls POST /api/interfaces/pair. Both sides exchange connection details.
Health Monitoring
A daemon thread pings all paired interfaces every 30 seconds. After 3 consecutive failures, an interface is marked offline and its tools become invisible to the LLM. Recovery is automatic on the next successful health check.
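The 3-strikes model described above can be sketched as a tiny per-interface state machine; the ping transport is abstracted to a boolean here.

```python
OFFLINE_AFTER = 3  # consecutive failures before marking offline

class InterfaceHealth:
    """Tracks one paired interface's liveness from periodic ping results."""

    def __init__(self):
        self.failures = 0
        self.online = True

    def record_ping(self, ok: bool):
        if ok:
            self.failures = 0
            self.online = True  # automatic recovery on first success
        else:
            self.failures += 1
            if self.failures >= OFFLINE_AFTER:
                self.online = False  # tools become invisible to the LLM

h = InterfaceHealth()
for ok in (False, False, False):
    h.record_ping(ok)   # third strike marks it offline
h.record_ping(True)     # a single healthy ping restores it
```

Resetting the counter on any success means transient network blips never accumulate toward the offline threshold.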
Key Files
- `services/interface_registry_service.py` — Core lifecycle management
- `api/interfaces.py` — REST API for pairing, listing, removal
- `workers/interface_health_worker.py` — Health monitor daemon
- `migrations/012_interfaces.sql` — Database schema
Glossary
- Mode Router: Deterministic mathematical function selecting engagement mode from observable signals
- Tie-Breaker: ONNX classifier consulted when top 2 modes are within effective margin
- Routing Signals: Observable features collected from MemoryStore and NLP analysis (~5ms)
- Router Confidence: Normalized gap between top 2 scores — measures routing certainty
- Pressure Signal: Metric logged by monitors, consumed by the single regulator
- Context Warmth: Signal (0.0-1.0) measuring how much context is available for current topic
- Drift Gist: Spontaneous thought stored during idle periods (DMN)
- Episode: Narrative memory unit with intent, context, action, emotion, outcome, salience
- Concept: Knowledge node with strength decay and spreading activation
- Salience: Computed importance metric (0.1-1.0) based on novelty, emotion, commitment