Signal Contract — Continuous Reasoning Spine
This document defines the contract that governs Chalie’s transition from independent timer-based services to a unified signal-driven architecture. It is the governing spec for all migration work.
Status: Active — governs all service migration decisions.
1. Governing Principles
1.1 Simplicity Over Cleverness
Every service must be as minimal as possible. Complexity compounds across 40+ services: a 10% increase in complexity per service, compounded over 40 services, is roughly a 45x increase in system debugging difficulty (1.1^40 ≈ 45). When in doubt, do less.
1.2 Graceful Isolation
“Forgetting my name for a split-second doesn’t put me in a vegetative state.”
No service failure may cascade into other services. Every signal consumer must operate under the assumption that any signal source may be dead, delayed, or producing garbage. The system degrades gracefully — individual capabilities may temporarily weaken, but the core reasoning loop never stops.
Concrete rules:
- Every signal consumption is wrapped in try/except at the boundary
- Every service has a fail-open default: if it can’t do its job, it returns a neutral result (empty string, no-op, skip), never raises into the caller
- No service holds locks that other services need
- No service writes state that another service must read to function (MemoryStore state is advisory, never mandatory)
- A service being dead means its signals stop arriving — consumers treat “no signal” as “nothing interesting happened”, not as an error
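The boundary rule above can be sketched as a small decorator. This is a minimal illustration, not Chalie's actual code: `fail_open` and `summarize_ambient_context` are hypothetical names, and the dict-shaped signal is an assumption.

```python
import functools
import logging

logger = logging.getLogger("fail_open_example")


def fail_open(default):
    """Return `default` instead of raising if the wrapped callable fails.

    Hypothetical helper for illustration; not part of Chalie's codebase.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Log and swallow: a failure here must never reach the caller.
                logger.debug("%s failed; returning neutral default",
                             func.__name__, exc_info=True)
                return default
        return wrapper
    return decorator


@fail_open(default="")
def summarize_ambient_context(raw_signal):
    # Assumes raw_signal is a dict; garbage input raises and is absorbed.
    return f"{raw_signal['place']} ({raw_signal['confidence']:.1f})"


good = summarize_ambient_context({"place": "office", "confidence": 0.8})
bad = summarize_ambient_context(None)  # dead or garbage signal source
```

Here `good` holds the formatted summary and `bad` is the neutral empty string, so the caller never sees the exception raised by the malformed input.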
1.3 Independent Testability
Every service must be testable in complete isolation:
- Unit tests use in-memory MemoryStore and :memory: SQLite, with no shared state
- No test may depend on another service being initialized
- Every service is covered by at least one chalie-nightly-test blackbox scenario
- Integration between services is tested by the nightly suite, not by unit tests
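An isolated unit test along these lines might look like the sketch below. `InMemoryStore` and `TraitCounter` are hypothetical stand-ins invented for the example (the real MemoryStore API may differ); only the use of a `:memory:` SQLite database and a per-test store comes from the rules above.

```python
import sqlite3
import unittest


class InMemoryStore:
    """Dict-backed stand-in for MemoryStore (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class TraitCounter:
    """Hypothetical service: counts stored traits and writes an advisory key."""

    def __init__(self, db, store):
        self.db = db
        self.store = store

    def process(self):
        (count,) = self.db.execute("SELECT COUNT(*) FROM traits").fetchone()
        self.store.set("advisory:trait_count", str(count))
        return count


class TraitCounterTest(unittest.TestCase):
    def setUp(self):
        # Fresh database and store per test: nothing is shared between tests.
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE traits (name TEXT)")
        self.store = InMemoryStore()

    def tearDown(self):
        self.db.close()

    def test_counts_traits(self):
        self.db.execute("INSERT INTO traits VALUES ('likes tea')")
        service = TraitCounter(self.db, self.store)
        self.assertEqual(service.process(), 1)
        self.assertEqual(self.store.get("advisory:trait_count"), "1")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TraitCounterTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test constructs everything it needs in `setUp`, so it passes without any other service being initialized.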
1.4 Service Layers (Fault Domains)
Every service belongs to exactly one of three layers. Failures are contained within a layer — they never cascade across layer boundaries.
| Layer | Analogy | What it does | If it fails… |
|---|---|---|---|
| Cognitive | Brain | Reasoning, memory formation, consolidation, decay, planning, reflection | …you stop reasoning well, but you still perceive and can still use tools |
| Embodiment | Body/Senses | Perception, ambient awareness, place learning, context tracking, voice I/O | …you lose awareness of surroundings, but you can still think and act on what you know |
| Capability | Tools/Hands | External tools, document processing, scheduling, list management | …you lose specific abilities, but you find alternatives or report inability |
Cognitive services: DecayEngine, SemanticConsolidation, EpisodicMemoryWorker, MemoryChunker, ReasoningLoopService, ContextAssembly, ModeRouter, PlanDecomposition, CriticService, UncertaintyService, ContradictionClassifier, IdleConsolidation, GrowthPattern, AutobiographySynthesis, GoalInference, SelfModel
Embodiment services: AmbientInference, PlaceLearning, ClientContext, EventBridge, VoiceService, FolderWatcher, TemporalPattern, EpisodicMemoryObserver, ThreadExpiry
Capability services: ToolRegistry, ToolWorker, ToolSubprocess, ToolConfig, ToolProfile, ToolPerformance, ACTLoop, ACTDispatcher, DocumentService, DocumentProcessing, DocumentPurge, SchedulerService, ListService, PersistentTaskWorker, MomentEnrichment, ProfileEnrichment
Cross-layer rules:
- Cognitive services never import embodiment or capability services at module level (lazy imports only)
- Embodiment services write to MemoryStore; cognitive services read from MemoryStore. Never direct calls.
- Capability failures surface as “tool unavailable” — the cognitive layer plans around them, never crashes
- A full embodiment outage means ambient signals stop arriving. The cognitive layer treats this as “nothing interesting is happening” (idle), not as an error
1.5 Minimal Surface Area
Each service exposes the minimum interface needed:
- One public method for its primary job (e.g., process(), consolidate(), decay())
- Signal emission is a side-effect, not the primary interface
- No service exposes internal state to other services except through MemoryStore advisory keys
2. Signal Envelope
All signals flowing through the spine use this format:
import dataclasses

@dataclasses.dataclass
class ReasoningSignal:
    signal_type: str            # What happened (see §2.1)
    source: str                 # Who emitted it (service name)
    concept_id: int | None      # Direct concept reference (fast path)
    concept_name: str | None    # Human-readable label
    topic: str | None           # Domain/topic context
    content: str | None         # Freeform payload (< 200 chars)
    activation_energy: float    # 0.0–1.0, how important/urgent
    timestamp: float            # When emitted (epoch)
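As a rough illustration of how this envelope could round-trip through JSON, the sketch below repeats the dataclass so that it runs on its own. `to_json()` is named in §2.2, but its body here is a guess based on the stated use of `dataclasses.asdict()`; `from_json()` and `is_well_formed()` are hypothetical helpers added for the example.

```python
import dataclasses
import json
import time
from typing import Optional


@dataclasses.dataclass
class ReasoningSignal:
    signal_type: str
    source: str
    concept_id: Optional[int]
    concept_name: Optional[str]
    topic: Optional[str]
    content: Optional[str]
    activation_energy: float
    timestamp: float

    def to_json(self) -> str:
        # Assumed implementation: plain JSON of the dataclass fields.
        return json.dumps(dataclasses.asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ReasoningSignal":
        # Hypothetical inverse of to_json(), not confirmed by the contract.
        return cls(**json.loads(raw))


def is_well_formed(signal: ReasoningSignal) -> bool:
    """Check the envelope limits stated above (energy range, content size)."""
    return 0.0 <= signal.activation_energy <= 1.0 and (
        signal.content is None or len(signal.content) < 200
    )


signal = ReasoningSignal(
    signal_type="new_knowledge",
    source="semantic_consolidation",
    concept_id=42,
    concept_name="tea preference",
    topic="preferences",
    content="User prefers green tea in the morning",
    activation_energy=0.6,
    timestamp=time.time(),
)
restored = ReasoningSignal.from_json(signal.to_json())
```

Because every field is a JSON-native type, the restored signal compares equal to the original.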
2.1 Signal Types (Registered)
| Signal Type | Meaning | Emitter(s) | Energy Range |
|---|---|---|---|
| memory_pressure | Knowledge is fading or contradicted | decay_engine, semantic_consolidation | 0.5–0.7 |
| new_knowledge | New concept formed from experience | semantic_consolidation | 0.6 |
| novel_observation | Surprising tool output stored as episode | experience_assimilation | 0.6 |
| ambient_context | Environment changed (place, attention, energy) | event_bridge | From confidence |
| idle_discovery | Nothing happened, engine self-seeds | reasoning_loop (internal) | 0.4–0.5 |
| episode_created | New narrative episode consolidated | episodic_memory_worker | 0.5 |
| trait_changed | User trait created, updated, or corrected | knowledge_service | 0.3–0.7 |
| task_state_changed | Persistent task state transition | persistent_task_service | 0.5–0.6 |
| schedule_fired | Scheduled reminder/task fired | scheduler_service | 0.5 |
| thread_expired | Conversation thread expired | thread_expiry_service | 0.3 |
| user_message | User sent a chat message | websocket | 1.0 |
| goal_inferred | Recurring topic pattern detected as potential goal | goal_inference_service | 0.6 |
Note: Signal handlers also update the world model cache in MemoryStore (world_model:items).
task_state_changed and schedule_fired trigger incremental cache updates via
WorldStateService.notify_task_changed() / notify_schedule_changed(). The cache
is fully refreshed from DB during idle periods.
New signal types require:
- Addition to this table
- A nightly test scenario
- Documentation of what the consumer should do with it
2.2 Signal Transport
- Priority queue: reasoning:priority (user messages, processed first)
- Background queue: reasoning:signals (all other signal types)
- Pop: blpop([priority, signals], timeout=idle_timeout), which tries priority first
- Push: rpush(key, signal.to_json())
- Max depth: 50 signals (oldest dropped on overflow, background queue only)
- Debounce: 30s minimum between processed background signals (user messages bypass)
- Serialization: JSON via dataclasses.asdict()
- Yield points: Background signal processing checks priority queue before expensive operations (LLM calls); if a user message is waiting, background reasoning aborts and the loop picks up the priority signal
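The ordering and overflow rules above can be modelled in memory as follows. This is only a sketch: `InMemoryTransport` is a hypothetical stand-in for the store-backed queues, and it deliberately leaves out blocking pops, debounce, and yield points.

```python
from collections import deque

MAX_BACKGROUND_DEPTH = 50  # from the contract: oldest dropped on overflow


class InMemoryTransport:
    """Hypothetical in-memory model of the two queues (illustration only)."""

    def __init__(self):
        # The depth cap applies to the background queue only.
        self.priority = deque()
        self.background = deque(maxlen=MAX_BACKGROUND_DEPTH)

    def push(self, payload, user_message=False):
        if user_message:
            self.priority.append(payload)
        else:
            # A full bounded deque discards its oldest entry on append.
            self.background.append(payload)

    def pop(self):
        """Non-blocking stand-in for blpop: the priority queue is tried first."""
        if self.priority:
            return self.priority.popleft()
        if self.background:
            return self.background.popleft()
        return None  # corresponds to the idle timeout expiring


transport = InMemoryTransport()
for i in range(60):
    transport.push(f"background-{i}")
transport.push("user-hello", user_message=True)

first = transport.pop()
second = transport.pop()
```

Although the user message was pushed last, it is popped first; the ten oldest background signals were dropped when the queue exceeded 50 entries, so the next pop yields `background-10`.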
2.3 Emission Rules
- Emission is always fire-and-forget — the emitter never waits for a response
- Emission is always wrapped in try/except — a failed emit is logged at DEBUG, never raised
- Emission uses lazy imports (from services.reasoning_loop_service import emit_reasoning_signal, ReasoningSignal) to avoid import cycles
- Emitters never instantiate the consumer; they push to the queue and forget
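Taken together, these rules suggest an emitter shaped roughly like the sketch below. The function name `emit_memory_pressure` and its boolean return value are assumptions made for the example; the import path and the `emit_reasoning_signal(ReasoningSignal(...))` call shape are taken from this document.

```python
import logging
import time

logger = logging.getLogger("emit_example")


def emit_memory_pressure(concept_id, energy):
    """Fire-and-forget emit. Returns True if queued, False otherwise.

    Hypothetical wrapper; the return value is for illustration only and
    callers are not expected to act on it.
    """
    try:
        # Lazy import: this module stays importable even if the reasoning
        # loop module is missing, broken, or would create an import cycle.
        from services.reasoning_loop_service import (
            ReasoningSignal,
            emit_reasoning_signal,
        )

        emit_reasoning_signal(ReasoningSignal(
            signal_type="memory_pressure",
            source="decay_engine",
            concept_id=concept_id,
            concept_name=None,
            topic=None,
            content=None,
            activation_energy=energy,
            timestamp=time.time(),
        ))
        return True
    except Exception:
        # A failed emit is logged at DEBUG and never raised into the caller.
        logger.debug("memory_pressure emit failed", exc_info=True)
        return False


queued = emit_memory_pressure(concept_id=7, energy=0.5)
```

Run outside the Chalie codebase, the lazy import fails and the function simply returns `False`, which is the fail-open behaviour the rules ask for.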
3. Service Lifecycle Contract
3.1 Registration
Every spine-connected service declares in its module docstring:
Emits: signal_type_1, signal_type_2
Consumes: signal_type_3 (via reasoning:signals queue)
Trigger: <timer Ns | signal-driven | request-driven | one-shot>
Fail mode: <fail-open description>
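One possible way to check that a module carries all four declarations is sketched below. `parse_contract` and `missing_fields` are hypothetical helpers, and the example docstring is illustrative rather than copied from any service.

```python
REQUIRED_FIELDS = ("Emits", "Consumes", "Trigger", "Fail mode")


def parse_contract(docstring):
    """Extract 'Field: value' declarations from a module docstring."""
    found = {}
    for line in (docstring or "").splitlines():
        # Split on the first colon only, so values may contain colons
        # (e.g. "via reasoning:signals queue").
        key, sep, value = line.strip().partition(":")
        if sep and key in REQUIRED_FIELDS:
            found[key] = value.strip()
    return found


def missing_fields(docstring):
    """List the required declarations that the docstring does not contain."""
    found = parse_contract(docstring)
    return [field for field in REQUIRED_FIELDS if field not in found]


EXAMPLE_DOCSTRING = """
Decay engine (illustrative docstring, not copied from the codebase).

Emits: memory_pressure
Consumes: none
Trigger: timer 1800s
Fail mode: skips the cycle and logs at DEBUG
"""

contract = parse_contract(EXAMPLE_DOCSTRING)
gaps = missing_fields("Emits: new_knowledge\nTrigger: signal-driven")
```

A check like this could run against `module.__doc__` in a unit test to keep the migration checklist honest.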
3.2 Health
Every long-running service writes a heartbeat:
store.set(f"health:{service_name}", str(time.time()), ex=ttl)
Where ttl is 2x the expected cycle time. The SelfModelService (30s cycle) reads these heartbeats and includes dead services in its noteworthy[] list. No automated restart — health is observational, not coercive.
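The heartbeat scheme can be simulated with a small expiring store. `ExpiringStore`, `write_heartbeat`, and `dead_services` are hypothetical names, and the `set(..., ex=ttl)` semantics are assumed to mean "expire after ttl seconds"; the cycle times are those listed for SchedulerService and ThreadExpiryService in §5.

```python
import time


class ExpiringStore:
    """Minimal stand-in for MemoryStore with assumed set(..., ex=ttl) semantics."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ex=None):
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)

    def get(self, key):
        value, expires_at = self._data.get(key, (None, None))
        if expires_at is not None and self._clock() >= expires_at:
            return None  # expired entries read as missing
        return value


def write_heartbeat(store, service_name, cycle_seconds, clock=time.time):
    # TTL is 2x the expected cycle time, as the contract specifies.
    store.set(f"health:{service_name}", str(clock()), ex=2 * cycle_seconds)


def dead_services(store, service_names):
    """Observational only: report services whose heartbeat has expired."""
    return [name for name in service_names
            if store.get(f"health:{name}") is None]


now = [1000.0]  # mutable fake clock so the example does not sleep


def fake_clock():
    return now[0]


store = ExpiringStore(clock=fake_clock)
names = ["scheduler_service", "thread_expiry_service"]
write_heartbeat(store, "scheduler_service", cycle_seconds=60, clock=fake_clock)
write_heartbeat(store, "thread_expiry_service", cycle_seconds=300, clock=fake_clock)

alive_check = dead_services(store, names)
now[0] += 121  # scheduler has missed two cycles; thread expiry is within TTL
stale_check = dead_services(store, names)
```

A missing heartbeat is reported, not acted upon, which matches the "observational, not coercive" stance above.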
3.3 Startup Order
Services start in dependency order (managed by run.py), but no service assumes another service is running. If a dependency isn’t ready:
- Queue-based: messages accumulate, processed when consumer starts
- Signal-based: signals accumulate (up to queue cap), processed when consumer starts
- Direct call: try/except, return neutral default
4. Migration Pattern
4.1 Converting a Timer Service to Signal-Responsive
For a service that currently runs on time.sleep(N):
Before:
def run(self):
    while True:
        time.sleep(self.interval)
        self._do_work()
After (Phase 1 — emit signals, keep timer):
def run(self):
    while True:
        time.sleep(self.interval)
        result = self._do_work()
        # NEW: emit signal if something interesting happened
        if result.is_interesting:
            emit_reasoning_signal(ReasoningSignal(...))
After (Phase 2 — consume signals, remove timer):
def run_signal_loop(self):
    while True:
        signal = self.store.blpop("service:signals", timeout=self.max_idle)
        if signal:
            self._process_signal(signal)
        else:
            self._idle_maintenance()
Phase 1 is always safe to ship independently. Phase 2 requires the spine to route signals to the service.
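To exercise the Phase 2 shape without a running store, one iteration of the loop can be pulled out into a testable method. In this sketch `queue.Queue` stands in for the store's blocking pop, and `SignalDrivenService` with its `step()` method is a hypothetical structure, not the actual service layout.

```python
import queue


class SignalDrivenService:
    """Hypothetical Phase 2 service with one loop iteration exposed as step()."""

    def __init__(self, inbox, max_idle=0.01):
        self.inbox = inbox
        self.max_idle = max_idle
        self.processed = []
        self.idle_runs = 0

    def step(self):
        """Wait up to max_idle for a signal, else fall back to idle work."""
        try:
            signal = self.inbox.get(timeout=self.max_idle)
        except queue.Empty:
            # "No signal" means nothing interesting happened, not an error.
            self._idle_maintenance()
            return
        try:
            self._process_signal(signal)
        except Exception:
            pass  # fail-open: a bad signal never stops the loop

    def _process_signal(self, signal):
        self.processed.append(signal["signal_type"])

    def _idle_maintenance(self):
        self.idle_runs += 1


inbox = queue.Queue()
inbox.put({"signal_type": "trait_changed"})
inbox.put("garbage")  # malformed payload: raises inside _process_signal
service = SignalDrivenService(inbox)
for _ in range(3):
    service.step()
```

After three steps the service has processed one valid signal, absorbed one malformed signal, and run idle maintenance once when the queue was empty.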
4.2 Migration Checklist (Per Service)
- [ ] Service docstring updated with Emits/Consumes/Trigger/Fail-mode
- [ ] Signal emission added (Phase 1)
- [ ] Unit tests pass in isolation
- [ ] Nightly scenario created/updated
- [ ] Timer removed, signal consumption added (Phase 2)
- [ ] Fail-open verified (service killed → system continues)
- [ ] Documented in this file’s migration tracker (§5)
5. Migration Tracker
Phase 1 Complete (Emits Signals, Keeps Timer)
| Service | Signals Emitted | Timer | Nightly Scenario |
|---|---|---|---|
| DecayEngineService | memory_pressure | 30min | 966 |
| SemanticConsolidationService | new_knowledge, memory_pressure | Queue-driven | 967 |
| ExperienceAssimilationService | novel_observation | 60s poll | — |
| EventBridgeService | ambient_context | Event-driven | 968 |
| EpisodicMemoryWorker | episode_created | Queue-driven | 971 |
| KnowledgeService | trait_changed | Request-driven | 972 |
| PersistentTaskService | task_state_changed | Request/timer | 973 |
| SchedulerService | schedule_fired | 60s timer | 974 |
| ThreadExpiryService | thread_expired | 5min timer | 975 |
| EpisodicMemoryWorker | goal_emerged | Post-episode clustering + LLM | — |
Phase 2 Complete (Signal-Driven, No Timer)
| Service | Signals Consumed | Idle Fallback | Nightly Scenario |
|---|---|---|---|
| ReasoningLoopService | All signal types | 10min → salient/insight | 965, 968, 969 |
Not Yet Started
| Service | Current Trigger | Priority | Notes |
|---|---|---|---|
| EpisodicMemoryObserver | 60s timer | — | Could react to gist-stored signals |
| IdleConsolidationService | 5min timer | — | Could react to queue-drain signals |
| GrowthPatternService | 30min timer | — | Could react to trait-change signals |
| AutobiographySynthesis | 6h timer | Low | Long cycle, timer is fine for now |
| PersistentTaskWorker | 30min timer | — | Could react to plan-ready signals |
| ProfileEnrichmentService | 6h timer | Low | Long cycle, timer is fine |
| TemporalPatternService | 6h timer | Low | Long cycle, timer is fine |
| SelfModelService | 30s timer | — | Heartbeat aggregator, timer is natural |
| DocumentPurgeService | 6h timer | Low | Maintenance, timer is fine |
| MomentEnrichmentService | 5min timer | Low | Polling for status, timer is fine |
| FolderWatcherService | 30s timer | Low | OS-level polling, timer is natural |
6. Anti-Patterns
6.1 Signal Cascades
Bad: Service A emits signal → Service B processes it and emits signal → Service C processes it and emits signal → Service A processes it.
Rule: No circular signal paths. If A emits to B, B must never emit back to A through any chain. Draw the signal graph before adding a new emission point.
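Checking the signal graph for circular paths can be automated with a depth-first search. The sketch below assumes the graph is written down by hand as a dict mapping each emitter to the services that consume its signals; `find_cycle` and the example graphs are hypothetical.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic.

    `edges` maps an emitter name to the consumers of its signals.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in edges.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a circular signal path.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for start in list(edges):
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


acyclic = {
    "decay_engine": ["reasoning_loop"],
    "event_bridge": ["reasoning_loop"],
}
cyclic = {
    "service_a": ["service_b"],
    "service_b": ["service_c"],
    "service_c": ["service_a"],
}

no_cycle = find_cycle(acyclic)
cycle = find_cycle(cyclic)
```

For the second graph the function returns the path from `service_a` back to itself, which is the cascade described above; for the first it returns `None`.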
6.2 Signal as RPC
Bad: Service A emits a signal and waits for a response.
Rule: Signals are fire-and-forget. If you need a response, use a direct function call or a dedicated result queue (like bg_llm:result:{job_id}).
6.3 Mandatory Signals
Bad: Service B crashes if it doesn’t receive a signal from Service A within N seconds.
Rule: No signal is mandatory. “No signal” means “nothing interesting happened”, never “something is broken”. Timeouts trigger idle/maintenance behavior, not error states.
6.4 Fat Signals
Bad: Signal payload contains the full episode text, embeddings, or large data structures.
Rule: Signals carry references (concept_id, topic) and summaries (content < 200 chars). The consumer looks up full data from SQLite/MemoryStore if needed.
6.5 Signal-Driven Configuration
Bad: Using signals to propagate config changes across services.
Rule: Config is read from files/DB at service init or on a slow reload cycle. Signals carry cognitive events, not infrastructure state.
7. The Spine (Future)
The current architecture has a single consumer (ReasoningLoopService) reading from a single queue (reasoning:signals). The future spine will:
- Route signals to multiple consumers: each service registers interest in specific signal types
- Schedule by priority: user-facing signals preempt background maintenance
- Apply backpressure: slow consumers don’t cause queue overflow for fast consumers
- Provide observability: signal flow is logged and queryable for debugging
This is explicitly not built yet. The current single-queue model is sufficient for Phase 1 (emit signals) and the initial Phase 2 conversions. The spine emerges when enough services are signal-driven that routing becomes necessary.
Build the spine when you need it, not before.