aaron e7de7fb64b stage3_worker: v2.3 — bulk-vs-single-episode routing on Stage 2 state-type
Reads new routing columns from stage_3_queue (state_type, state_type_confidence,
supersedes_prior_state, state_type_rationale) and dispatches each row to one of
two ingest pathways:

  - BULK pathway (existing, renamed from ingest_to_graphiti to ingest_bulk):
    the safer, cheaper default. Used when supersedes_prior_state=false OR
    state_type_confidence=low OR the routing fields are NULL (legacy rows).
    Skips edge invalidation, per graphiti-core's bulk semantics.

  - SINGLE-EPISODE pathway (new, ingest_single_episode): used only when
    supersedes_prior_state=true AND state_type_confidence in {medium, high}.
    One POST per chunk to /episodes (the singular endpoint), all sharing a
    saga tag. Each call is independent, with its own timeout and retry
    envelope.

Routing decision isolated in should_route_single_episode() with unit-tested
truth table covering all eight (supersedes × confidence) combinations.
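
A minimal sketch of the routing predicate described above, not the
committed code. It assumes the eight combinations are supersedes in
{true, false} crossed with confidence in {low, medium, high, NULL}, and
that a NULL supersedes_prior_state (legacy row) also falls through to bulk.
The parameter names mirror the queue columns; the exact signature is a
guess.

```python
from typing import Optional


def should_route_single_episode(
    supersedes_prior_state: Optional[bool],
    state_type_confidence: Optional[str],
) -> bool:
    """Return True only for the single-episode pathway; bulk is the default."""
    # NULL or false supersedes: stay on the bulk pathway.
    if supersedes_prior_state is not True:
        return False
    # NULL or 'low' confidence: stay on the bulk pathway.
    return state_type_confidence in {"medium", "high"}


# Truth table over the assumed eight (supersedes x confidence) combinations.
TRUTH_TABLE = {
    (True, "high"): True,
    (True, "medium"): True,
    (True, "low"): False,
    (True, None): False,
    (False, "high"): False,
    (False, "medium"): False,
    (False, "low"): False,
    (False, None): False,
}
```

Keeping the decision in one pure function means the table can be checked
without touching the queue or the graph.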

Per-chunk heartbeat (heartbeat_row): the single-episode pathway updates
stage_3_queue.started_at after each successful chunk POST so that a
long-running document doesn't cross the 10-minute stale threshold
mid-process and get re-dequeued. started_at now means 'last activity
timestamp' rather than 'began at'. The heartbeat is best-effort: failures
are logged, not raised.
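
The best-effort heartbeat might look roughly like the sketch below. It uses
sqlite3 as a stand-in for Postgres so that it runs on its own; the connect
callable, the id column, and the boolean return value are assumptions made
for illustration, not the worker's actual interface.

```python
import logging
import sqlite3
from typing import Callable

log = logging.getLogger("stage3_worker")


def heartbeat_row(connect: Callable[[], sqlite3.Connection], row_id: int) -> bool:
    """Bump started_at to 'now'. Never raises; returns False on failure."""
    try:
        conn = connect()
        try:
            # 'id' is a hypothetical primary-key column on stage_3_queue.
            conn.execute(
                "UPDATE stage_3_queue SET started_at = CURRENT_TIMESTAMP "
                "WHERE id = ?",
                (row_id,),
            )
            conn.commit()
        finally:
            conn.close()
        return True
    except Exception:
        # Best-effort: a missed heartbeat must not fail the ingest itself.
        log.warning("heartbeat failed for row %s", row_id, exc_info=True)
        return False
```

Swallowing the exception is deliberate here: the worst case of a missed
heartbeat is a stale-threshold re-dequeue, which the ingest can tolerate.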

Partial success on chunk failure: previously committed chunks stay in the
graph, and the function raises with detail (single_episode_partial: chunk
N/M failed, succeeded K). The row is marked failed (failed_at set) with
that detail. Re-ingestion would re-POST chunks 1..N-1 against the graph;
graphiti's dedup handles them as no-ops.
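
The per-chunk loop and its failure detail can be sketched as follows. The
post_chunk and on_chunk_ok callables are hypothetical stand-ins for the
HTTP POST to /episodes and the heartbeat; the real function's signature and
exception type are not shown in this commit message.

```python
from typing import Callable, Optional, Sequence


def ingest_single_episode(
    chunks: Sequence[str],
    post_chunk: Callable[[str], None],
    on_chunk_ok: Optional[Callable[[int], None]] = None,
) -> int:
    """POST chunks one at a time; stop at the first failure with detail."""
    total = len(chunks)
    succeeded = 0
    for index, chunk in enumerate(chunks, start=1):
        try:
            post_chunk(chunk)
        except Exception as exc:
            # Earlier chunks are already committed to the graph; nothing is
            # rolled back. The message carries enough detail for the row.
            raise RuntimeError(
                f"single_episode_partial: chunk {index}/{total} failed, "
                f"succeeded {succeeded}"
            ) from exc
        succeeded += 1
        if on_chunk_ok is not None:
            on_chunk_ok(index)  # e.g. the per-chunk heartbeat
    return succeeded
```

Chaining with "from exc" keeps the underlying error available to whatever
records the failure on the row.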

DB connection scoping: process_one no longer holds one Postgres connection
across the whole ingest call (which can run an hour for long single-episode
documents). Each DB write gets a short-lived connection.
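
One way to get this scoping is a small context manager that opens, commits,
and closes around each write. The sketch below uses sqlite3 so that it is
self-contained; short_lived_connection, the id and completed_at columns,
and this process_one signature are illustrative assumptions rather than the
worker's real code.

```python
import sqlite3
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def short_lived_connection(dsn: str) -> Iterator[sqlite3.Connection]:
    """Open a connection for one write, then commit (or roll back) and close."""
    conn = sqlite3.connect(dsn)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()


def process_one(dsn: str, row_id: int, ingest: Callable[[], None]) -> None:
    # Write 1: mark the row as active, then release the connection.
    with short_lived_connection(dsn) as conn:
        conn.execute(
            "UPDATE stage_3_queue SET started_at = CURRENT_TIMESTAMP WHERE id = ?",
            (row_id,),
        )

    # The long-running ingest happens with no DB connection held.
    ingest()

    # Write 2: record completion on a fresh connection.
    with short_lived_connection(dsn) as conn:
        conn.execute(
            "UPDATE stage_3_queue SET completed_at = CURRENT_TIMESTAMP WHERE id = ?",
            (row_id,),
        )
```

The trade-off is a little connection churn in exchange for not pinning a
Postgres connection for the full duration of a long ingest.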

Phase A item 3 of three. Closes the mechanical-patches block. Item 4
(custom_extraction_instructions text design) is the remaining intellectual
work; sidecar and worker plumbing is now ready for it.
2026-05-01 19:07:41 +00:00
2026-04-25 02:05:42 +00:00