March 25, 2026

Local OCR Migration and Document UI Hardening

Replaced external OCR dependencies with a local ONNX model and completed a comprehensive cleanup of the document management frontend.

Local OCR with RapidOCR

I’ve completely refactored the OCR service to remove its dependencies on external Vision LLM providers and on local system binaries like Tesseract. The primary engine is now rapidocr_onnxruntime. It runs entirely on the CPU with a lightweight (~15MB) model and matches the quality of frontier VLMs for structured text extraction. This allowed a major simplification of ocr_service.py, which shrank from over 270 lines to 86, and let me remove the now-redundant document-ocr cognitive job.

Document Management Redesign

A significant amount of work went into hardening the document system’s frontend. We addressed several reliability and architectural issues:

  • Error Visibility: Replaced all bare catch {} blocks within the Documents scope with named-variable catches and descriptive console.error logging. This ensures that failures in document deletion, restoration, or uploads are surfaced for debugging rather than silently swallowed.
  • Event Delegation: Refactored the directory browser to use a single delegated click listener attached at the module level. We removed inline onclick handlers from template literals, preventing memory leaks and duplicate listener accumulation during re-renders.
  • DOM Scoping & CSS: Scoped metadata classification queries to the #docMetaOverlay container to avoid ID collisions. I also corrected a CSS class mismatch that was causing the classification form to use a horizontal layout instead of the intended vertical stack.
  • HTML Preview Fix: Resolved an issue where HTML previews rendered as escaped text. By switching from a template-literal srcdoc attribute to a direct DOM property assignment on the iframe, we bypass attribute parsing and allow the browser to render the raw HTML correctly.
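
As a rough illustration of the event delegation and HTML preview bullets, here is a minimal sketch. It is not the project's actual code: the function names (attachDirectoryClicks, setPreviewHtml), the data-action and data-path attributes, and the handler map are all assumptions made for the example.

```javascript
// Hypothetical sketch: names and data attributes are illustrative
// assumptions, not the real Documents module.

// Event delegation: register ONE click listener on a stable container.
// Rows are rendered with data-action / data-path attributes instead of
// inline onclick handlers, so re-rendering the rows adds no listeners.
function attachDirectoryClicks(root, handlers) {
  root.addEventListener('click', (event) => {
    // Walk up from the clicked node to the nearest actionable element.
    const target = event.target.closest('[data-action]');
    if (!target || !root.contains(target)) return;

    const handler = handlers[target.dataset.action];
    if (handler) handler(target.dataset.path, event);
  });
}

// HTML preview: set srcdoc as a DOM property rather than interpolating
// the markup into an attribute inside a template literal. The string is
// handed to the iframe as-is, so no attribute escaping is involved.
function setPreviewHtml(iframe, html) {
  iframe.srcdoc = html;
}
```

Calling attachDirectoryClicks once when the module loads, with the directory browser's container element and a map such as { open: openFolder }, would match the "single delegated listener" arrangement described above; openFolder is likewise a placeholder name.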

Dependency Stability

To prevent silent breaking changes in the UI, I pinned our CDN dependencies (Lucide, Marked, and Mammoth) to exact semver versions. Previously they were loaded from @latest or unversioned paths, so a breaking release from any of these libraries could have reached the UI without any change on our side.
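
A small check like the following could guard against unpinned URLs creeping back in. This is only a sketch under assumptions: the helper names are invented, the URLs in the usage note use a placeholder host and placeholder version numbers, and it assumes the CDN uses the common name@version path layout.

```javascript
// Hypothetical helper: reports whether a CDN URL names an exact version.
// Assumes the "package@x.y.z/..." path layout; other layouts need a
// different pattern.
const EXACT_VERSION = /@\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.-]+)?(?=\/|$)/;

function isPinnedCdnUrl(url) {
  // Only inspect the path, so credentials or hosts containing "@" are ignored.
  return EXACT_VERSION.test(new URL(url).pathname);
}

// Returns the URLs that still float (@latest, ranges such as @1, or no version).
function findUnpinned(urls) {
  return urls.filter((url) => !isPinnedCdnUrl(url));
}
```

For example, findUnpinned applied to a list containing https://cdn.example.com/npm/marked@latest/marked.min.js and https://cdn.example.com/npm/marked@1.2.3/marked.min.js would return only the first URL.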