June 9, 2026

Delegates converge, browser rebuilt, recall guardrail, and scrub


Web search and browse delegates could not see their own prior actions, so they re-issued the same searches until they hit max_iterations. The root cause was a flag conflation: skip_transcript and skip_input_row were both set to True, which prevented the act-trail uid from being assigned. With both set to False, each delegate writes a per-turn transcript row and renders its own act-trail into the next iteration’s prompt, so it sees its prior work and converges. Documentation and the shared config shape were updated to reflect the new skip flags.
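The convergence effect can be illustrated with a minimal sketch. Everything here is hypothetical: DelegateRun, toy_policy, and run are illustrative names, not the project's classes, and the sketch models only a single skip_transcript flag rather than the real interplay between skip_transcript and skip_input_row.

```python
from dataclasses import dataclass, field


@dataclass
class DelegateRun:
    """Hypothetical stand-in for one delegate run; not the project's real class."""
    skip_transcript: bool = False
    trail: list[str] = field(default_factory=list)

    def record(self, action: str, result: str) -> None:
        # With skip_transcript=True nothing is written, so the next prompt
        # carries no memory of this action.
        if not self.skip_transcript:
            self.trail.append(f"{action} -> {result}")

    def render_prompt(self, task: str) -> str:
        # Render the task plus the act-trail accumulated so far.
        lines = [f"Task: {task}"]
        if self.trail:
            lines.append("Actions so far:")
            lines.extend(f"  {i}. {row}" for i, row in enumerate(self.trail, 1))
        return "\n".join(lines)


def toy_policy(prompt: str) -> str:
    """Toy model: search once, then answer as soon as the prompt shows a search."""
    return "answer" if "search(" in prompt else "search('playwright docs')"


def run(delegate: DelegateRun, task: str, max_iterations: int = 5) -> int:
    """Return how many iterations were used before the toy policy answered."""
    for i in range(1, max_iterations + 1):
        action = toy_policy(delegate.render_prompt(task))
        if action == "answer":
            return i
        delegate.record(action, "3 results")
    return max_iterations
```

In this toy setup, a run with skip_transcript=False answers on the second iteration because the first search appears in its prompt, while a run with skip_transcript=True repeats the same search until max_iterations is exhausted.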

The browser tool was rebuilt from a 4-action, 19-parameter interface into 9 flat verbs (open, read, find, click, fill, select, scroll, back, screenshot) that a small local model can drive. A single persistent Playwright page per delegate run, visible-text-only targeting, and a mechanical post-action diff replace the old interaction DSL. Screenshots now route through ingest_file into a screenshots subdirectory. The web_browse delegate’s tool summary now states explicitly that the delegate takes screenshots and inspects them with its own vision, which closes a gap where the orchestrator refused to delegate screenshot tasks.
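As a rough illustration of the flat-verb shape and the mechanical post-action diff, the sketch below uses only the standard library and does not touch Playwright. The verb list comes from the note above; parse_call, post_action_diff, and the "one verb plus one string argument" call shape are assumptions made for the example, not the tool's actual interface.

```python
import difflib

# The nine verbs named in the release note.
VERBS = ("open", "read", "find", "click", "fill", "select", "scroll", "back", "screenshot")


def parse_call(verb: str, arg: str = "") -> tuple[str, str]:
    """Validate a flat call: one verb and at most one string argument (assumed shape)."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb {verb!r}; expected one of {', '.join(VERBS)}")
    return verb, arg


def post_action_diff(before: str, after: str) -> list[str]:
    """Return the visible-text lines an action added ("+ ") or removed ("- ")."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# A failed login that adds one line of visible text to the page.
changes = post_action_diff("Email\nPassword", "Email\nPassword\nInvalid password")
# changes == ["+ Invalid password"]
```

Returning only the changed lines keeps the observation short, which is one plausible reason a diff suits a small local model better than a full page dump.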

Memory recall no longer fans out to document and schedule searches without the model’s knowledge. An explicit recall now appends a one-line guardrail that steers the model to use the document and schedule search tools on its own judgement, while the turn-0 auto-seed recall stays silent. The turn-0 seed radius was narrowed by 30% (to 0.35) to reduce over-retrieval. Tests drive the real ToolDispatcher chokepoint to confirm the guardrail text, the absence of hidden fan-out tool calls, and the correct radii.
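The split between explicit and turn-0 recall might look roughly like the sketch below. Only the 0.35 seed radius comes from the note; the recall function, the search_memory callable, the explicit-recall radius, and the guardrail wording are all placeholders invented for illustration.

```python
SEED_RADIUS = 0.35     # turn-0 auto-seed radius stated in the note
EXPLICIT_RADIUS = 0.5  # assumption: the note does not give the explicit-recall radius

# Hypothetical wording; the real guardrail text is not quoted in the note.
GUARDRAIL = (
    "Note: recall searched memory only; use the document or schedule search "
    "tools if you judge they are needed."
)


def recall(query: str, search_memory, explicit: bool) -> str:
    """Search memory only; never call document or schedule search on the model's behalf."""
    radius = EXPLICIT_RADIUS if explicit else SEED_RADIUS
    hits = search_memory(query, radius)
    body = "\n".join(hits) if hits else "(no memories found)"
    # Explicit recalls get the one-line guardrail; the turn-0 seed stays silent.
    return f"{body}\n{GUARDRAIL}" if explicit else body
```

Because recall takes the search function as a parameter, a test can pass a recording stub and assert both the radius used and that no other tool was invoked.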

Both compaction system prompts were rewritten as fixed-structure contracts. The history prompt slimmed from 430 to 244 words with a Person/Now/Holding/Open/Voice/Last structure, and the trail prompt became a four-part Goal/Done/Failed/Next handover with a hallucination guard. Feature tests pin the new shape and confirm the legacy tags never return.
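A feature test that pins the trail handover shape could be sketched along these lines. The four section names come from the note; the "Label:" line format, the sample text, and the helper names are assumptions, since the actual prompt output format is not shown here.

```python
import re

# Section order named in the release note for the trail handover.
TRAIL_SECTIONS = ("Goal", "Done", "Failed", "Next")


def section_order(summary: str) -> list[str]:
    """Return the section labels found at line starts, in order of appearance."""
    return re.findall(r"^(Goal|Done|Failed|Next):", summary, flags=re.MULTILINE)


def is_valid_handover(summary: str) -> bool:
    """True only if all four sections appear exactly once, in the fixed order."""
    return tuple(section_order(summary)) == TRAIL_SECTIONS


sample = (
    "Goal: find the vendor's refund policy\n"
    "Done: opened the help center and read the billing page\n"
    "Failed: site search returned no results\n"
    "Next: check the terms of service page\n"
)
```

Checking order as well as presence is what makes the structure a contract: a summary that drops or reorders a section fails the check instead of passing silently.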

Two sweeps scrubbed internal infrastructure references from comments, docstrings, and documentation: ticket IDs, internal scenario names, absolute paths, and private harness references were replaced with neutral prose. No logic was touched, and all outbound MCP-server examples were preserved.

Voice model download hints were repointed from the installer to Settings, matching the runtime-download path introduced earlier. Three docs were updated to describe the on-demand download via RuntimeDepsService, and a feature test asserts no hint references the installer.

Test cleanup removed dead pre-load guard code and de-mocked the last two MagicMock-saturated test files, driving real stack interactions for pattern matching, document upload, and graph budget capping.

  • Web delegates now render their own act-trail, converging instead of re-searching blindly to max_iterations.

  • Browser tool rebuilt as 9 flat verbs on a persistent page, with screenshot+vision affordance in the delegate summary.

  • Memory recall no longer silently fans out; explicit recall gets a guardrail, and the turn-0 seed radius is narrowed by 30% (to 0.35).

  • Compaction system prompts rewritten as fixed-structure contracts, slimming the history prompt from 430 to 244 words.

  • Internal infrastructure references scrubbed from comments, docstrings, and docs across two sweeps.

  • Voice model download hints now point to Settings, not the installer, matching the runtime-download path.