Big Changes: Goal Pursuit & Memory Optimization

The persistent task system is replaced by a simpler goal_pursuit system. GoalPursuitProcessor runs as a daemon thread with a fixed iteration limit and timeout, and communicates over an isolated channel. The previous state machine, DAG planner, and related task tables and prompts are removed.
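As a rough illustration of the pattern described above (not the actual GoalPursuitProcessor code), a bounded daemon worker with its own channel might look like the sketch below. The names pursue_goal and start_goal_thread, the step callback, and the limit values are all assumptions made for the example; the source only states that the limits are fixed.

```python
import queue
import threading
import time

MAX_ITERATIONS = 5      # assumed value; the source only says the limit is fixed
TIMEOUT_SECONDS = 2.0   # assumed value


def pursue_goal(goal, step, channel,
                max_iterations=MAX_ITERATIONS, timeout=TIMEOUT_SECONDS):
    """Call step(goal, i) until it returns True, the iteration limit
    is reached, or the deadline passes. Report the outcome on channel."""
    deadline = time.monotonic() + timeout
    for i in range(max_iterations):
        if time.monotonic() >= deadline:
            channel.put(("timeout", i))
            return
        if step(goal, i):
            channel.put(("done", i + 1))
            return
    channel.put(("iteration_limit", max_iterations))


def start_goal_thread(goal, step, **limits):
    """Start a daemon thread for one goal and return (thread, channel)."""
    channel = queue.Queue()  # private to this goal, not shared with other work
    worker = threading.Thread(
        target=pursue_goal,
        args=(goal, step, channel),
        kwargs=limits,
        daemon=True,  # does not keep the process alive on shutdown
    )
    worker.start()
    return worker, channel
```

A caller would read the single outcome tuple from the returned channel, for example with channel.get(timeout=...). Because both bounds are enforced inside the loop, the worker always terminates on its own, provided each individual step call returns.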

Memory usage is reduced by removing the optimum and PyTorch dependencies from doc2query_service. This service now uses raw ONNX Runtime for T5 generation with KV-cache and top-p sampling.
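For readers unfamiliar with top-p (nucleus) sampling, here is a minimal pure-Python sketch of the sampling step alone. It is not the code in doc2query_service: the ONNX Runtime session, the T5 decoder loop, and the KV-cache handling are deliberately left out, and the function name top_p_sample and the default top_p value are assumptions for illustration.

```python
import math
import random


def top_p_sample(logits, top_p=0.9, rng=random):
    """Sample a token id from the smallest set of tokens whose
    cumulative probability reaches top_p."""
    # Softmax, subtracting the max logit for numerical stability.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Walk token ids from most to least likely and collect the nucleus.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for token_id in order:
        nucleus.append(token_id)
        mass += probs[token_id]
        if mass >= top_p:
            break

    # Draw from the nucleus in proportion to the original probabilities.
    return rng.choices(nucleus, weights=[probs[i] for i in nucleus], k=1)[0]
```

In a generation loop, logits would be the decoder output for the last position at each step, and the sampled id would be fed back as the next decoder input.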

TTL expiration is implemented across all MemoryStore keys, preventing unbounded data growth in various queues and states.
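The idea can be sketched with a small in-memory store in which every key carries an expiry time. This is a hypothetical stand-in, not the MemoryStore implementation: the class name TTLStore, its method names, and the default TTL are assumptions.

```python
import time


class TTLStore:
    """Hypothetical in-memory store where every key has an expiry time."""

    def __init__(self, default_ttl=60.0, clock=time.monotonic):
        self._data = {}  # key -> (value, expires_at)
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl=None):
        ttl = self._default_ttl if ttl is None else ttl
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazy eviction on read
            return default
        return value

    def purge_expired(self):
        """Remove all expired keys and return how many were dropped."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._data.items() if now >= exp]
        for k in expired:
            del self._data[k]
        return len(expired)
```

Lazy eviction in get keeps reads correct, while a periodic purge_expired call is what actually bounds memory for keys that are written once and never read again.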

Thread stack size is reduced to 2MB for all existing threads, saving approximately 140MB of memory.
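In Python, one way to get this effect is the standard threading.stack_size call, shown in the sketch below. Whether the project uses this call or another mechanism is not stated here, so treat it as an assumption; the thread name and the report helper are illustrative only.

```python
import threading

STACK_SIZE_BYTES = 2 * 1024 * 1024  # 2MB, the figure stated above

# Affects only threads created after this call; returns the previous setting
# (0 means the platform default was in use).
previous = threading.stack_size(STACK_SIZE_BYTES)


def report(results):
    # Record the thread name to show the thread ran with the new setting.
    results.append(threading.current_thread().name)


results = []
worker = threading.Thread(target=report, args=(results,), name="small-stack")
worker.start()
worker.join()
```

The call must run before the threads it should affect are started, and some platforms reject certain sizes with ValueError, so it usually belongs at the very top of process startup.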

The ONNX model warm-up inference is no longer performed during system boot; only the model download remains.