System Architecture

Overview

Chalie is a human-in-the-loop cognitive assistant that combines memory consolidation, semantic reasoning, and proactive assistance. The system processes user prompts through a chain of workers and services, enriching conversations with memory chunks and generating episodic memories for future use.

Core Architecture

System Type

Communication Pattern

  1. User sends message: POST to /chat with the message text
  2. Backend processes: Mode router selects mode → mode-specific LLM generates response
  3. Response delivered: Via SSE stream (status → message → done events)
  4. Authentication: Session cookie-based authentication (@require_session decorator)
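
The event sequence in step 3 can be consumed with a small client-side parser. The following is a minimal sketch, not the project's client code: it assumes the stream uses standard `event:` / `data:` SSE framing and that the event names are literally `status`, `message`, and `done`. The helper names and the sample payloads are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def _field_value(line: str, name: str) -> str:
    """Return the value of an SSE field line, dropping one leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs; a blank line terminates one event."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # events without data are not dispatched
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


def collect_reply(lines: Iterable[str]) -> str:
    """Concatenate message events until the done event arrives."""
    parts = []
    for event, data in parse_sse(lines):
        if event == "message":
            parts.append(data)
        elif event == "done":
            break
    return "".join(parts)


# Hypothetical stream contents, for illustration only.
sample = [
    "event: status\n", "data: thinking\n", "\n",
    "event: message\n", "data: Hello!\n", "\n",
    "event: done\n", "data: {}\n", "\n",
]
print(collect_reply(sample))  # prints "Hello!"
```

In a real client the `lines` iterable would come from a streaming HTTP response that carries the session cookie described in step 4.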

Code Organization

backend/
├── services/          # Business logic (memory, orchestration, routing, embeddings)
├── workers/           # Async workers (digest, memory chunking, consolidation)
├── listeners/         # Input handlers (direct REST API)
├── api/               # REST API blueprints (conversation, memory, proactive, privacy, system)
├── configs/           # Configuration files (connections.json, agent configs, generated/)
├── migrations/        # Database migrations
├── prompts/           # LLM prompt templates (mode-specific)
├── tools/             # Skill implementations
├── tests/             # Test suite
└── consumer.py        # Main supervisor process

Frontend applications are located separately:

frontend/
├── interface/         # Main chat UI (HTML/CSS/JS, Radiant design system)
├── brain/             # Admin/cognitive dashboard
└── on-boarding/       # Account setup wizard

IMPORTANT: UI code must live only under frontend/interface/, frontend/brain/, or frontend/on-boarding/.

Key Services

Core Services (backend/services/)

Routing & Decision Making

Response Generation

Memory System

Autonomous Behavior

Ambient Awareness

ACT Loop & Critic

Tool Integration

Identity & Learning

Infrastructure

Topic Classification

Session & Conversation

Innate Skills (backend/services/innate_skills/ and backend/skills/)

10 built-in cognitive skills for the ACT loop:

Worker Processes (backend/workers/)

Queue Workers

Services/Daemons

Data Flow Pipeline

User Input → Response Pipeline

[User Input]
  → [Consumer] → [Prompt Queue] → [Digest Worker]
    ├─ Classification (embedding-based, adaptive boundary detection)
    ├─ Context Assembly (retrieve from all 5 memory layers)
    ├─ Mode Routing (deterministic ~5ms mathematical router)
    ├─ Mode-Specific LLM Generation
    │  └─ If ACT: action loop → re-route → terminal response
    └─ Enqueue Memory Chunking Job
      → [Memory Chunker Queue] → [Memory Chunker Worker]
        → [Conversation JSON] (enriched)
      → [Episodic Memory Queue] → [Episodic Memory Worker]
        → PostgreSQL Episodes Table
        → [Semantic Consolidation Queue] → [Semantic Consolidation Worker]
          → PostgreSQL Concepts Table
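
The queue hand-offs in the diagram above can be sketched as a chain of workers. This is an illustrative sketch only: `queue.Queue` and plain lists stand in for the real queues and PostgreSQL tables, all function names and job fields are hypothetical, and it assumes the memory chunker is the stage that enqueues the episodic memory job.

```python
from queue import Queue

# In-memory stand-ins for the four queues named in the diagram.
prompt_q, chunker_q, episodic_q, semantic_q = Queue(), Queue(), Queue(), Queue()
episodes, concepts = [], []  # stand-ins for the Episodes and Concepts tables


def digest_worker():
    job = prompt_q.get()
    # Classification, context assembly, mode routing and LLM generation
    # would happen here; a placeholder reply keeps the sketch runnable.
    job["response"] = f"(reply to: {job['text']})"
    chunker_q.put(job)  # enqueue the memory chunking job
    return job["response"]


def memory_chunker_worker():
    job = chunker_q.get()
    job["chunks"] = job["text"].split(". ")  # placeholder enrichment
    episodic_q.put(job)


def episodic_memory_worker():
    job = episodic_q.get()
    episode = {"gist": job["chunks"][0], "source": job["text"]}
    episodes.append(episode)
    semantic_q.put(episode)


def semantic_consolidation_worker():
    episode = semantic_q.get()
    concepts.append({"concept": episode["gist"].lower()})


prompt_q.put({"text": "I moved to Lisbon. The weather is great"})
reply = digest_worker()
memory_chunker_worker()
episodic_memory_worker()
semantic_consolidation_worker()
```

The user-facing reply is available as soon as the digest step returns; the memory stages run afterwards and do not block it.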

Background Processes

[Routing Stability Regulator] ← reads routing_decisions (24h cycle)
    → adjusts configs/generated/mode_router_config.json

[Routing Reflection Service] ← reads reflection-queue (idle-time)
    → writes routing_decisions.reflection → feeds pressure to regulator

[Decay Engine] → runs every 1800s (30min)
    ├─ Episodic decay (salience-weighted)
    ├─ Semantic decay (strength-weighted)
    └─ User trait decay (category-specific)

[Cognitive Drift Engine] → during worker idle
    ├─ Seed selection (weighted random)
    ├─ Spreading activation (depth 2, decay 0.7/level)
    └─ LLM synthesis → stores as drift gist
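
The first two Cognitive Drift Engine steps can be sketched as follows. This is a minimal illustration under stated assumptions: the concept graph is a plain adjacency dict, the seed starts at activation 1.0, and each node keeps the activation from its first (shortest) path. Only the depth limit of 2 and the 0.7 per-level decay come from the description above; the function names and the sample graph are hypothetical.

```python
import random
from collections import deque


def select_seed(weights, rng):
    """Weighted random choice: higher-weight nodes are picked more often."""
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


def spread_activation(graph, seed, max_depth=2, decay=0.7):
    """Breadth-first spread from seed; activation shrinks by `decay` per level."""
    activation = {seed: 1.0}
    frontier = deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in activation:  # first visit is the shortest path
                activation[neighbor] = activation[node] * decay
                frontier.append((neighbor, depth + 1))
    return activation


# Hypothetical concept graph.
graph = {
    "coffee": ["morning", "espresso"],
    "morning": ["commute"],
    "commute": ["podcast"],  # three hops from "coffee", so never reached
}
seed = select_seed({"coffee": 1.0, "podcast": 0.0}, random.Random(0))
activated = spread_activation(graph, seed)
```

With these inputs, `morning` and `espresso` receive 0.7, `commute` receives roughly 0.49, and `podcast` stays inactive. The activated set is what would then be handed to the LLM synthesis step.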

Key Architectural Decisions

Deterministic Mode Router

Single Authority for Weight Mutation

Mode-Specific Prompts

Memory Hierarchy

Each layer is optimized for its timescale, and all layers are integrated via context assembly. Lists are injected into all prompts as `` for passive awareness; the ACT loop uses the list skill for mutations.

Configuration Precedence

Environment variables > .env file > JSON config files > hardcoded defaults
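
The precedence chain above maps naturally onto a layered lookup. The sketch below uses `collections.ChainMap`, which searches its mappings from left to right. It is an illustration rather than the project's loader: the function name, the keys, and the values are hypothetical, and plain dicts stand in for the parsed .env file and JSON config.

```python
from collections import ChainMap


def build_config(environ, dotenv, json_config, defaults):
    """Layer the sources so that earlier arguments win on key conflicts."""
    return ChainMap(environ, dotenv, json_config, defaults)


# Hypothetical settings at each level.
defaults = {"PORT": "8080", "LOG_LEVEL": "info", "WORKERS": "2"}
json_config = {"PORT": "9000", "WORKERS": "4"}
dotenv = {"PORT": "9100", "LOG_LEVEL": "debug"}
environ = {"PORT": "9200"}

config = build_config(environ, dotenv, json_config, defaults)
print(config["PORT"])       # prints "9200" (environment variable wins)
print(config["LOG_LEVEL"])  # prints "debug" (.env beats the default)
print(config["WORKERS"])    # prints "4" (JSON config beats the default)
```

In practice the first argument would be `os.environ`, so exporting a variable in the shell overrides every other source without editing any file.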

See docs/02-PROVIDERS-SETUP.md for provider configuration.

Thread-Safe Worker State

Adaptive Topic Boundary Detection

Topic Confidence Reinforcement

Error Resilience

Safety & Constraints

Hard Boundaries

Operational Limits

Anti-Manipulation

Configuration Files

Primary Configuration

Provider Configuration

See docs/02-PROVIDERS-SETUP.md for detailed setup instructions.

REST API

Available Blueprints

Observability Endpoints (/system/observability/*)

See API blueprints in backend/api/ for full reference.

Testing Strategy

Test Markers

Test Organization

backend/tests/
├── test_services/         # Service unit tests
├── test_workers/          # Worker integration tests
└── fixtures/              # Shared test fixtures

Run all tests: pytest
Run only unit tests: pytest -m unit
Run with verbose output: pytest -v

Development Workflow

Setup

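cd backend
python -m venv .venv               # create the virtual environment if it does not exist yet
source .venv/bin/activate          # activate first so packages install into the venv
pip install -r requirements.txt
cp .env.example .env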

Local Development (without Docker)

# Terminal 1: PostgreSQL + Redis
# (ensure postgres + redis running locally)

# Terminal 2: Consumer (all workers)
python consumer.py

# Terminal 3: Test/debug
python -c "from api import create_app; app = create_app(); app.run()"

Docker Development

docker-compose build
docker-compose up -d
docker-compose logs -f backend

Deployment Notes

Future Roadmap

Completed

Planned (Priority 1)

Planned (Priority 2)

Planned (Priority 3)

Glossary