Chalie is a human-in-the-loop cognitive assistant that combines memory consolidation, semantic reasoning, and proactive assistance. The system processes user prompts through a chain of workers and services, enriching conversations with memory chunks and generating episodic memories for future use.
Input arrives via direct REST (`/chat` with text, `@require_session` decorator).

Backend structure:

backend/
├── services/ # Business logic (memory, orchestration, routing, embeddings)
├── workers/ # Async workers (digest, memory chunking, consolidation)
├── listeners/ # Input handlers (direct REST API)
├── api/ # REST API blueprints (conversation, memory, proactive, privacy, system)
├── configs/ # Configuration files (connections.json, agent configs, generated/)
├── migrations/ # Database migrations
├── prompts/ # LLM prompt templates (mode-specific)
├── tools/ # Skill implementations
├── tests/ # Test suite
└── consumer.py # Main supervisor process
Frontend applications are located separately:
frontend/
├── interface/ # Main chat UI (HTML/CSS/JS, Radiant design system)
├── brain/ # Admin/cognitive dashboard
└── on-boarding/ # Account setup wizard
IMPORTANT: UI code must exist under /interface/, /brain/, or /on-boarding/ only.
Services (backend/services/):

- mode_router_service.py — Deterministic mode routing (~5ms) with signal collection + tie-breaker
- routing_decision_service.py — Routing decision audit trail (PostgreSQL)
- routing_stability_regulator_service.py — Single authority for router weight mutation (24h cycle, ±0.02/day max)
- routing_reflection_service.py — Idle-time peer review of routing decisions via a strong LLM
- cognitive_triage_service.py — LLM-based 4-step triage (social filter → LLM → self-eval → dispatch); routes to RESPOND/ACT/CLARIFY/ACKNOWLEDGE; defers tool selection to the ACT loop when tools exist but none is named
- cognitive_reflex_service.py — Learned fast path via semantic abstraction; heuristic pre-screen (~1ms) + pgvector cluster lookup (~5-20ms) bypasses the full pipeline for self-contained queries; rolling-average centroids generalize from observed examples; self-correcting per cluster via user corrections and shadow validation
- frontal_cortex_service.py — LLM response generation using mode-specific prompts
- voice_mapper_service.py — Translates identity vectors to tone instructions
- context_assembly_service.py — Unified retrieval from the memory layers (working memory, moments, facts, gists, episodes, procedural, concepts) with weighted budget allocation; procedural hints surface learned action reliability (≥8 attempts, top 3, confidence labels)
- episodic_retrieval_service.py — Hybrid vector + FTS search for episodes
- semantic_retrieval_service.py — Vector similarity + spreading activation for concepts
- user_trait_service.py — Per-user trait management with category-specific decay (core, relationship, physical, preference, communication_style, micro_preference, behavioral_pattern)
- temporal_pattern_service.py — Mines hour-of-day and day-of-week distributions from interaction_log for behavioral pattern detection; stores discoveries as behavioral_pattern user traits with generalized labels; 24h background worker cycle
- episodic_storage_service.py — PostgreSQL CRUD for episodic memories
- semantic_storage_service.py — PostgreSQL CRUD for semantic concepts
- gist_storage_service.py — Redis-backed short-term memory with deduplication
- list_service.py — Deterministic list management (shopping, to-do, chores); perfect recall with full history via the lists, list_items, and list_events tables
- moment_service.py — Pinned message bookmarks with LLM-enriched context, pgvector semantic search, and salience boosting; stores user-pinned Chalie responses as permanent, searchable moments via the moments table
- moment_enrichment_service.py — Background worker (5min poll): collects gists from a ±4hr interaction window, generates LLM summaries, seals moments after 4hrs; boosts related episode salience on seal
- moment_card_service.py — Inline HTML card emission for moment display in the conversation spine
- cognitive_drift_engine.py — Default Mode Network (DMN) for spontaneous thoughts during idle; attention-gated (skips when the user is in deep focus)
- autonomous_actions/ — Decision routing (priority 10→6): CommunicateAction, SuggestAction (skill-matched proactive suggestions), NurtureAction (gentle phase-appropriate presence), PlanAction (proactive plan proposals from recurring topics, 7-gate eligibility with signal persistence), ReflectAction, SeedThreadAction
- spark_state_service.py — Tracks relationship phase progression (first_contact → surface → exploratory → connected → graduated)
- spark_welcome_service.py — First-contact welcome message triggered on the first SSE connection; runs once per lifecycle
- curiosity_thread_service.py — Self-directed exploration threads (learning and behavioral) seeded from cognitive drift
- curiosity_pursuit_service.py — Background worker exploring active threads via the ACT loop
- decay_engine_service.py — Periodic decay (episodic 0.05/hr, semantic 0.03/hr)
- ambient_inference_service.py — Deterministic inference engine (<1ms, zero LLM): place, attention, energy, mobility, tempo, device_context from browser telemetry + behavioral signals; thresholds loaded from configs/agents/ambient-inference.json; emits transition events (place, attention, energy) to the event bridge when emit_events=True
- place_learning_service.py — Accumulates place fingerprints (geohash ~1km, never raw coordinates) in the place_fingerprints table; learned patterns override heuristics after 20+ observations
- client_context_service.py — Rich client context with a location history ring buffer (12 entries), place transition detection, session re-entry detection (>30min absence), demographic trait seeding from locale, and circadian hourly interaction counts; emits session_start/session_resume events to the event bridge
- event_bridge_service.py — Connects ambient context changes (place, attention, energy, session) to autonomous actions; enforces stabilization windows (90s), per-event cooldowns, confidence gating, aggregation (60s bundle window), and focus gates; config in configs/agents/event-bridge.json
- act_loop_service.py — Iterative action execution with safety limits (60s timeout)
- act_dispatcher_service.py — Routes actions to skill handlers with timeout enforcement; returns structured results with confidence and contextual notes
- critic_service.py — Post-action verification: evaluates each action result for correctness via a lightweight LLM (reuses the cognitive-triage agent config); safe actions get silent correction, consequential actions pause; EMA-based confidence calibration
- persistent_task_service.py — Multi-session background task management with a state machine (PROPOSED → ACCEPTED → IN_PROGRESS → COMPLETED/PAUSED/CANCELLED/EXPIRED); duplicate detection via Jaccard similarity; rate limiting (3 cycles/hr, 5 active tasks max)
- plan_decomposition_service.py — LLM-powered goal → step DAG decomposition; validates the DAG (Kahn’s cycle detection), step quality (4–30 word descriptions, Jaccard dedup), and cost classification (cheap/expensive); plans stored in persistent_tasks.progress JSONB; ready-step ordering (shallowest depth, cheapest first)
- tool_registry_service.py — Tool discovery, metadata management, and cron execution via run_interactive (bidirectional stdin/stdout dialog protocol)
- tool_container_service.py — Container lifecycle; run() for single-shot execution, run_interactive() for bidirectional tool↔Chalie dialog (JSON-lines stdout, Chalie responses via stdin)
- tool_config_service.py — Tool configuration persistence; webhook key generation (HMAC-SHA256 + replay protection via X-Chalie-Signature/X-Chalie-Timestamp)
- tool_performance_service.py — Performance metrics tracking; correctness-biased ranking (50% success_rate, 15% speed, 15% reliability, 10% cost, 10% preference); post-triage tool reranking; user correction propagation; 30-day preference decay
- tool_profile_service.py — LLM-generated tool capability profiles with triage_triggers (short action verbs injected into the triage prompt for vocabulary bridging), short_summary, full_profile, and usage_scenarios; Redis-cached triage summaries (5min TTL)
- Webhook endpoint (/api/tools/webhook/<name>) — External tool triggers with HMAC-SHA256 or simple token auth, 30 req/min rate limit, 512KB payload cap
- identity_service.py — 6-dimensional identity vector system with coherence constraints
- identity_state_service.py — Tracks identity state changes and evolution
- database_service.py — PostgreSQL connection pool and migrations
- redis_client.py — Redis connection handling
- config_service.py — Environment and JSON file config (precedence: env > .env > json)
- output_service.py — Output queue management for responses
- event_bus_service.py — Pub/sub event routing
- card_renderer_service.py — Card system rendering engine
- topic_classifier_service.py — Embedding-based deterministic topic classification with adaptive boundary detection
- adaptive_boundary_detector.py — 3-layer self-calibrating topic boundary detector (NEWMA + Transient Surprise + Leaky Accumulator); persists per-thread state in Redis; degrades gracefully to a static threshold when Redis is unavailable
- topic_stability_regulator_service.py — 24h adaptive tuning of topic classification and boundary detector parameters
- thread_conversation_service.py — Redis-backed conversation thread persistence
- thread_service.py — Manages conversation threads with expiry
- session_service.py — Tracks user sessions and topic changes

Innate skills (backend/services/innate_skills/ and backend/skills/) — 10 built-in cognitive skills for the ACT loop:
- recall_skill.py — Unified retrieval across ALL memory layers including user traits (<500ms); supports “what do you know about me?” via the user_traits layer with broad/specific query modes and confidence labels
- memorize_skill.py — Store gists and facts (<50ms)
- introspect_skill.py — Self-examination (context warmth, FOK signal, stats, decision explanations, recent autonomous actions) (<100ms); supports “why did you do that?” via the routing audit trail and autonomous action history
- associate_skill.py — Spreading activation through the semantic graph (<500ms)
- scheduler_skill.py — Create/list/cancel reminders and scheduled tasks (<100ms)
- autobiography_skill.py — Retrieve the synthesized user narrative with optional section extraction (<500ms)
- list_skill.py — Deterministic list management: add/remove/check items, view, history (<50ms)
- focus_skill.py — Focus session management: set, check, clear with distraction detection (<50ms)
- moment_skill.py — Natural language moment recall (“Do you remember…”) and listing via pgvector search
- persistent_task_skill.py — Multi-session background task management: create (with plan decomposition), pause, resume, cancel, check status, show plan, set priority (<100ms; create ~2-5s with LLM decomposition)

Workers (backend/workers/) run the processing pipeline:

[User Input]
→ [Consumer] → [Prompt Queue] → [Digest Worker]
├─ Classification (embedding-based, adaptive boundary detection)
├─ Context Assembly (retrieve from all memory layers)
├─ Mode Routing (deterministic ~5ms mathematical router)
├─ Mode-Specific LLM Generation
│ └─ If ACT: action loop → re-route → terminal response
└─ Enqueue Memory Chunking Job
→ [Memory Chunker Queue] → [Memory Chunker Worker]
→ [Conversation JSON] (enriched)
→ [Episodic Memory Queue] → [Episodic Memory Worker]
→ PostgreSQL Episodes Table
→ [Semantic Consolidation Queue] → [Semantic Consolidation Worker]
→ PostgreSQL Concepts Table
[Routing Stability Regulator] ← reads routing_decisions (24h cycle)
→ adjusts configs/generated/mode_router_config.json
[Routing Reflection Service] ← reads reflection-queue (idle-time)
→ writes routing_decisions.reflection → feeds pressure to regulator
[Decay Engine] → runs every 1800s (30min)
├─ Episodic decay (salience-weighted)
├─ Semantic decay (strength-weighted)
└─ User trait decay (category-specific)
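The per-hour decay rates above (episodic 0.05/hr, semantic 0.03/hr) could be applied in a pass like the following sketch. The linear decay form, salience weighting, and all names here are assumptions for illustration, not the engine's actual implementation:

```python
# Hypothetical sketch of one decay pass. The rates come from the source;
# the record shape and the linear, salience-weighted form are assumed.
DECAY_RATES = {"episodic": 0.05, "semantic": 0.03}  # strength lost per hour

def decay_pass(records, layer, hours_elapsed):
    """Reduce each record's strength, floored at 0; salient memories decay slower."""
    rate = DECAY_RATES[layer]
    for rec in records:
        effective = rate * hours_elapsed * (1.0 - rec.get("salience", 0.0))
        rec["strength"] = max(0.0, rec["strength"] - effective)
    return records

episodes = [{"strength": 1.0, "salience": 0.5}, {"strength": 0.2, "salience": 0.0}]
decay_pass(episodes, "episodic", hours_elapsed=0.5)
```

A salience of 0.5 halves the loss, which is one simple way to make the decay "salience-weighted" as described.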
[Cognitive Drift Engine] → during worker idle
├─ Seed selection (weighted random)
├─ Spreading activation (depth 2, decay 0.7/level)
└─ LLM synthesis → stores as drift gist
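The drift engine's spreading-activation step (depth 2, decay 0.7 per level) can be sketched as a breadth-first walk. The graph shape and function name are illustrative assumptions; only the depth and decay constants come from the source:

```python
# Minimal sketch of spreading activation over a concept graph:
# activation halves roughly each hop (decay 0.7/level), out to depth 2.
def spread_activation(graph, seed, depth=2, decay=0.7):
    """Return {concept: activation} for nodes reachable within `depth` hops."""
    activation = {seed: 1.0}
    frontier = [seed]
    for level in range(1, depth + 1):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                a = decay ** level  # activation at this distance from the seed
                if a > activation.get(neighbor, 0.0):
                    activation[neighbor] = a
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation

graph = {"coffee": ["morning", "espresso"], "morning": ["routine"]}
acts = spread_activation(graph, "coffee")
# one hop out gets 0.7; two hops out gets 0.7**2
```

The highest activation wins when a node is reachable by multiple paths, which keeps the result independent of traversal order.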
Each layer is optimized for its timescale; all are integrated via context assembly. Lists are injected into all prompts as `` for passive awareness; the ACT loop uses the list skill for mutations.
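Context assembly's "weighted budget allocation" could look like the sketch below. The layer names appear in the source; the weights, total budget, and proportional-split strategy are assumptions:

```python
# Illustrative sketch: split a token budget across memory layers in
# proportion to per-layer weights. Weights and budget are made up.
def allocate_budget(total_tokens, weights):
    """Return {layer: token_count} summing exactly to total_tokens."""
    total_w = sum(weights.values())
    alloc = {layer: int(total_tokens * w / total_w) for layer, w in weights.items()}
    # Integer truncation can leave a few tokens unassigned; give them
    # to the heaviest layer so the budget is fully spent.
    alloc[max(weights, key=weights.get)] += total_tokens - sum(alloc.values())
    return alloc

weights = {"working_memory": 4, "episodes": 3, "facts": 2, "concepts": 1}
budget = allocate_budget(1000, weights)
```

Proportional allocation with a remainder hand-off is one simple way to keep the total fixed while letting the regulators retune weights over time.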
Configuration precedence: environment variables > .env file > JSON config files > hardcoded defaults
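That precedence chain can be sketched as a first-match lookup. The helper name, argument shapes, and example keys are assumptions; only the source ordering (env > .env > JSON > defaults) is from the document:

```python
import os

# Sketch of the stated precedence: env > .env > JSON > hardcoded default.
# `dotenv` stands for values parsed from the .env file; `json_cfg` for
# values loaded from e.g. configs/connections.json.
def load_setting(key, dotenv, json_cfg, default=None):
    """Resolve one setting by checking each source in precedence order."""
    if key in os.environ:
        return os.environ[key]
    if key in dotenv:
        return dotenv[key]
    if key in json_cfg:
        return json_cfg[key]
    return default

dotenv = {"CHALIE_REDIS_URL": "redis://localhost:6379"}
json_cfg = {"CHALIE_REDIS_URL": "redis://fallback:6379",
            "CHALIE_PG_DSN": "postgres://localhost/chalie"}
value = load_setting("CHALIE_REDIS_URL", dotenv, json_cfg)
```

With both sources defining `CHALIE_REDIS_URL`, the `.env` value wins because nothing is set in the process environment.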
See docs/02-PROVIDERS-SETUP.md for provider configuration.
Shared worker state:

- WorkerManager maintains a shared dictionary via multiprocessing.Manager(); workers call WorkerBase._update_shared_state to merge per-worker metrics.

Adaptive boundary state:

- Per-thread detector state is persisted in Redis (key adaptive_boundary:{thread_id}, 24h TTL); cold-start fallback (0.55 threshold) when Redis is unavailable or there are < 5 messages.
- Detector parameters (accumulator_boundary_base, accumulator_leak_rate, NEWMA windows) are the slow outer loop, controlled by the Topic Stability Regulator.
- Update rule: new = current + (new_confidence - current) * 0.5

Config files:

- configs/connections.json — Redis & PostgreSQL endpoints
- configs/agents/*.json — LLM settings (model, temperature, timeout)
- configs/generated/mode_router_config.json — Learned router weights (generated)
- LLM providers live in the providers table (not JSON files), managed via /api/providers
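The shared-state merge described above can be sketched as below. In the real system the dict would come from `multiprocessing.Manager().dict()`; a plain dict stands in here, and the method shape is an assumption. The merge-then-reassign pattern is the point: managed dict proxies only notice top-level assignment, not in-place mutation of nested values:

```python
# Sketch of per-worker metric merging into a shared dict.
# Names mirror the description (WorkerBase._update_shared_state) but the
# exact signature and entry layout are assumptions.
def update_shared_state(shared, worker_name, metrics):
    """Merge one worker's metrics into its entry, then reassign the entry."""
    entry = dict(shared.get(worker_name, {}))  # copy out of the (proxy) dict
    entry.update(metrics)
    shared[worker_name] = entry  # reassign so a Manager proxy sees the change

shared = {}
update_shared_state(shared, "digest", {"processed": 10})
update_shared_state(shared, "digest", {"errors": 1})
```

Successive calls accumulate metrics per worker rather than overwriting the whole entry.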
API blueprints (backend/api/):

- user_auth — Account creation, login, API key management
- conversation — Chat endpoint (SSE streaming), conversation list/retrieval
- memory — Memory search, fact management
- proactive — Outreach/notifications, upcoming tasks
- privacy — Data deletion, export
- system — Health, version, settings, observability (routing, memory, tools, identity, tasks, autobiography, traits)
- tools — Tool execution, configuration
- providers — LLM provider configuration
- push — Push notification subscription
- scheduler — Reminders and scheduled tasks
- lists — List management
- stubs — Placeholder endpoints (calendar, notifications, integrations, voice, permissions) returning 501

Observability endpoints (/system/observability/*):

- routing — Mode router decision distribution and recent activity
- memory — Memory layer counts and health indicators
- tools — Tool performance stats
- identity — Identity vector states
- tasks — Active persistent tasks, curiosity threads, triage calibration
- autobiography — Current autobiography narrative with delta (changed/unchanged sections)
- traits (GET) — User traits grouped by category with confidence scores
- traits/<key> (DELETE) — Remove a specific learned trait (user correction)

See API blueprints in backend/api/ for full reference.
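The tools webhook's auth scheme (HMAC-SHA256 with replay protection via X-Chalie-Signature/X-Chalie-Timestamp) could be verified along these lines. The exact signed payload, encoding, and replay window are assumptions, not the documented protocol:

```python
import hashlib
import hmac
import time

# Hypothetical verifier for the tools webhook. Assumes the signature is
# hex-encoded HMAC-SHA256 over "<timestamp>.<body>" and a 300s replay window.
REPLAY_WINDOW_S = 300

def verify_webhook(secret, body, timestamp, signature, now=None):
    """Reject stale timestamps first, then compare HMACs in constant time."""
    now = now if now is not None else time.time()
    if abs(now - int(timestamp)) > REPLAY_WINDOW_S:
        return False  # replayed or badly clock-skewed request
    expected = hmac.new(secret, f"{timestamp}.".encode() + body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

secret = b"webhook-key"
ts = "1700000000"
body = b'{"event": "ping"}'
sig = hmac.new(secret, f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
ok = verify_webhook(secret, body, ts, sig, now=1700000100)
stale = verify_webhook(secret, body, ts, sig, now=1700009999)
```

Binding the timestamp into the signed material is what makes the replay check meaningful: an attacker cannot reuse an old signature with a fresh timestamp.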
Test markers:

- @pytest.mark.unit — No external dependencies (fast)
- @pytest.mark.integration — Requires PostgreSQL/Redis (slower)

backend/tests/
├── test_services/ # Service unit tests
├── test_workers/ # Worker integration tests
└── fixtures/ # Shared test fixtures
Run all tests: pytest
Run only unit: pytest -m unit
Run with verbose: pytest -v
cd backend
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Terminal 1: PostgreSQL + Redis
# (ensure postgres + redis running locally)
# Terminal 2: Consumer (all workers)
python consumer.py
# Terminal 3: Test/debug
python -c "from api import create_app; app = create_app(); app.run()"
docker-compose build
docker-compose up -d
docker-compose logs -f backend
chalie — change in production