Commit Graph

4 Commits

Author SHA1 Message Date
aaron 465f2f725b Code review fixes: CV pinning, F1 (excluded_sources), F14 (50KB truncation), F37
- api.py: strip CV pinning workaround (parity violation, see architecture doc)
- dream.py: F1 — retrieve_graphiti() now accepts excluded_sources, over-fetches
  3x and filters in-process. Previously the parameter was silently dropped,
  which would have confounded E3 by breaking cross-stage exclusion in the
  Graphiti arm.
- watcher.py + ingest.py: F14 — drop full_text[:50000] truncation. Was
  propagating through entire cascade. Postgres TEXT can hold up to 1GB.
- corpus_integrity.py: F37 — same truncation, third path now clean.

Backups: api.py.bak.*, dream.py.bak.*, watcher.py.bak.*, ingest.py.bak.*,
corpus_integrity.py.bak.* timestamped pre-fix.

Re-cascaded Shop Class as Soulcraft (only already-cascaded source affected
by F14, 414KB).
2026-05-01 02:26:37 +00:00
aaron 25e42c0231 corpus_integrity.py: write unreadables with retry_count=0 so OCR can retry when it ships 2026-04-30 22:03:48 +00:00
aaron 7822fb1cc1 corpus_integrity.py: write unreadable files to ingest_failures for UI visibility 2026-04-30 21:59:06 +00:00
aaron 74e2c34f43 corpus integrity: ingest_failures tracking in watcher, reconciliation script, corpus status/retry/reconcile endpoints 2026-04-30 21:54:39 +00:00
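
Note on F1 (commit 465f2f725b): the "over-fetch 3x and filter in-process" approach can be sketched roughly as below. This is a minimal illustration only, not the code in dream.py. The function name `retrieve_with_exclusions`, the `search` callable, and the `"source"` key on each hit are all hypothetical stand-ins; the only details taken from the commit message are the `excluded_sources` parameter and the 3x factor.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Factor taken from the commit message ("over-fetches 3x").
OVERFETCH_FACTOR = 3

Hit = Dict[str, str]


def retrieve_with_exclusions(
    search: Callable[[str, int], List[Hit]],  # hypothetical backend search function
    query: str,
    limit: int,
    excluded_sources: Optional[Iterable[str]] = None,
) -> List[Hit]:
    """Return up to `limit` hits whose source is not in `excluded_sources`."""
    excluded = set(excluded_sources or ())
    if not excluded:
        # Nothing to exclude, so a plain fetch is enough.
        return search(query, limit)[:limit]

    # Ask the backend for more candidates than needed, because some of them
    # will be thrown away by the in-process filter below.
    candidates = search(query, limit * OVERFETCH_FACTOR)
    kept = [hit for hit in candidates if hit["source"] not in excluded]
    return kept[:limit]


def fake_search(query: str, k: int) -> List[Hit]:
    """Toy backend (assumption): 20 passages cycling through three sources."""
    corpus = [
        {"source": f"book_{i % 3}", "text": f"passage {i}"} for i in range(20)
    ]
    return corpus[:k]


results = retrieve_with_exclusions(fake_search, "craft", 2, ["book_0"])
```

One caveat with this pattern: if more than two thirds of the over-fetched candidates belong to excluded sources, the call returns fewer than `limit` results, so a caller that needs a full page would have to fetch again or raise the factor.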