aaronAI

Author	SHA1	Message	Date
aaron	e7de7fb64b	stage3_worker: v2.3, bulk-vs-single-episode routing on Stage 2 state-type Reads new routing columns from stage_3_queue (state_type, state_type_confidence, supersedes_prior_state, state_type_rationale) and dispatches each row to one of two ingest pathways: - BULK pathway (existing, renamed from ingest_to_graphiti to ingest_bulk): safer-cheaper default. Used when supersedes=false OR confidence=low OR routing fields are NULL (legacy rows). Skips edge invalidation per graphiti-core's bulk semantics. - SINGLE-EPISODE pathway (new, ingest_single_episode): used only when supersedes_prior_state=true AND confidence in {medium, high}. Per-chunk POST to /episodes (singular endpoint) with shared saga tag. Each call independent: own timeout, own retry envelope. Routing decision isolated in should_route_single_episode() with unit-tested truth table covering all eight (supersedes × confidence) combinations. Per-chunk heartbeat (heartbeat_row): single-episode pathway updates stage_3_queue.started_at after each successful chunk POST so a long-running document doesn't cross the 10-minute stale threshold mid-process and get re-dequeued. started_at semantics now: 'last activity timestamp' rather than 'began at'. Best-effort; failures logged not raised. Partial-success on chunk failure: previously-committed chunks stay in the graph; the function raises with detail (single_episode_partial: chunk N/M failed, succeeded K). The row is marked failed_at with that detail. Re-ingestion would re-POST chunks 1..N-1 against the graph; graphiti's dedup handles them as no-ops. DB connection scoping: process_one no longer holds one Postgres connection across the whole ingest call (which can run an hour for long single-episode documents). Each DB write gets a short-lived connection. Phase A item 3 of three. Closes the mechanical-patches block. Item 4 (custom_extraction_instructions text design) is the remaining intellectual work; sidecar and worker plumbing is now ready for it.	2026-05-01 19:07:41 +00:00
aaron	70e87e3ab5	stage2_worker: v2.2 — add state-type classification for Stage 3 routing Mistral pass now produces two concerns in a single flat JSON output: (a) orientation context (existing four fields, unchanged semantics) (b) state-type classification: state_type (current/reference/historical), state_type_confidence (low/medium/high), supersedes_prior_state (bool), state_type_rationale (text) Routing fields written as explicit columns on stage_3_queue (separate ALTER TABLE migration adds them: state_type, state_type_confidence, supersedes_prior_state, state_type_rationale + index on supersedes). Safe-cheap defaults on malformed Mistral output: state_type='reference', confidence='low', supersedes=false. All defaults route to bulk pathway (no temporal invalidation cost) so Mistral parse drift can't accidentally trigger expensive single-episode ingest. Phase A item 2 of three. Sidecar (item 1, commit `8b0a163`) already plumbs custom_extraction_instructions through to /episodes/bulk. Stage 3 routing logic (item 3) follows.	2026-05-01 19:02:11 +00:00
aaron	8b0a163670	graphiti_service: expose custom_extraction_instructions on /episodes/bulk; add saga on /episodes - BulkEpisodeRequest: new optional custom_extraction_instructions field with comment noting graphiti-core inserts it into extract_nodes/extract_edges prompts only, NOT dedupe prompts (verified by reading prompts directory) - EpisodeRequest: new optional saga field, plumbed through to add_episode for upcoming Stage 3 single-episode pathway - Both handlers use conditional kwargs construction so existing callers see no behavioral change Phase A item 1 of three. Items 2 (stage2_worker) and 3 (stage3_worker) follow.	2026-05-01 18:57:31 +00:00
aaron	1a8e0353f5	stage3_worker: v2.2 — absolute sudo/systemctl paths, error logging, reset failure counter on recovery failure Mirrors stage2_worker v2.1 (`da98019`) resilience fixes: - Absolute paths for /usr/bin/sudo and /bin/systemctl - Log stdout/stderr when sidecar restart fails - Reset consecutive_failures even when wedge recovery fails (prevents permanent stuck state if restart itself is broken)	2026-05-01 18:40:25 +00:00
aaron	da980193dd	stage2_worker: v2.1 — terminal failure states + sudo path fix Three classes of silent failure converted to clean terminal states: - Mistral timeout: previously left rows in zombie state (started_at set, failed_at null, attempts incremented past retry threshold, row invisible to selection query). Now sets failed_at with reason 'mistral_timeout_after_300s'. Surfaced 2026-05-01 when 17 documents accumulated in this state during the Stage 3 saga deadlock incident. - Mistral parse failure: run_mistral returns {'error': 'parse_failed'} on JSON decode failure but process_one wasn't checking, so empty orientation ('Active frames: . Frame relationships: ...') was shipped to Stage 3. This is F22 from the 2026-04-30 code review. Now sets failed_at with reason 'mistral_parse_failure'. - Wedge recovery hammering: consecutive_failures was only reset on successful Ollama restart. With the sudo path bug (also fixed here), recovery always failed, so every subsequent failure re-attempted restart. Now resets the counter regardless and logs the failure visibly. Also: subprocess.run now uses absolute paths (/usr/bin/sudo, /bin/systemctl) instead of relying on PATH, fixing the 'No such file or directory: sudo' error that broke Stage 2's recover_wedge() since deployment. F45-adjacent — sudoers entries were added 2026-05-01 but the PATH issue was masking that fix. Worker version bumped to 2.1 to match Stage 3's resilience patch level.	2026-05-01 17:28:53 +00:00
aaron	b936931668	Stage 3 worker v2.1 — saga-size limit + wedge detection + sudoers fixes Production incident 2026-05-01: F14 re-cascade attempt surfaced three compounding issues in cascade resilience. stage3_worker.py changes: - MAX_CHUNKS_PER_SAGA=10 — large documents split into multiple bulk commits, all sharing the same saga tag for Graphiti document linking. Original implementation sent all chunks as one saga; 17-19 chunk sagas deadlocked sidecar's Python-side coordination. - recover_wedge() function — restarts aaronai-graphiti.service when consecutive_failures hits threshold. Mirrors Stage 2 pattern. - run() loop adds consecutive_failures counter with threshold-2 escalation. Resolves F28 + F29 from code review. - Worker version bumped 2.0 -> 2.1. - post_bulk() helper extracts shared HTTP POST + error handling. Outside-repo changes (system config, separately documented): - WatchdogSec=600 commented in stage2 + stage3 systemd unit files. Workers have no sd_notify support; per-request timeouts in code handle the actual failure modes. - /etc/sudoers.d/aaron-aaronai created with NOPASSWD entries for systemctl restart ollama and restart aaronai-graphiti.service. Stage 2's existing recover_wedge() was silently broken since deployment due to this gap. .gitignore — added rules for *.bak files, runtime artifacts (watcher_heartbeat, dreamer_state.json, corpus_integrity_report.json, watcher_state.json, watcher_status.json), Python cruft, virtual env, .env, editor/OS files, and Aaron AI runtime data (conversations.db, sessions.db, memory.md, settings.json). Untracked 11 files that shouldn't have been committed in `465f2f7` (this morning): backup files and runtime artifacts. Re-cascading Shop Class (414KB) and BirdAI-Experiments-Log.md (192KB) through the patched worker after re-extracting full text from disk. Cascade in progress at commit time.	2026-05-01 05:18:09 +00:00
aaron	465f2f725b	Code review fixes: CV pinning, F1 (excluded_sources), F14 (50KB truncation), F37 - api.py: strip CV pinning workaround (parity violation, see architecture doc) - dream.py (F1): retrieve_graphiti() now accepts excluded_sources, over-fetches 3x and filters in-process. Was silently dropping the parameter; would have confounded E3 with broken cross-stage exclusion in Graphiti arm. - watcher.py + ingest.py (F14): drop full_text[:50000] truncation. Was propagating through entire cascade. Postgres TEXT can hold up to 1GB. - corpus_integrity.py (F37): same truncation, third path now clean. Backups: api.py.bak.*, dream.py.bak.*, watcher.py.bak.*, ingest.py.bak.*, corpus_integrity.py.bak.* timestamped pre-fix. Re-cascaded Shop Class as Soulcraft (only already-cascaded source affected by F14, 414KB).	2026-05-01 02:26:37 +00:00
aaron	25e42c0231	corpus_integrity.py: write unreadables with retry_count=0 so OCR can retry when it ships	2026-04-30 22:03:48 +00:00
aaron	7822fb1cc1	corpus_integrity.py: write unreadable files to ingest_failures for UI visibility	2026-04-30 21:59:06 +00:00
aaron	74e2c34f43	corpus integrity: ingest_failures tracking in watcher, reconciliation script, corpus status/retry/reconcile endpoints	2026-04-30 21:54:39 +00:00
aaron	f11cacd9c9	add experiment scripts and results; watcher.py latest changes	2026-04-30 18:06:03 +00:00
aaron	1cf26df450	api.py: return error_type=transcription_failed on Whisper crash, frontend retry logic can now distinguish from network failures	2026-04-30 17:45:47 +00:00
aaron	7cd765146a	stage3_worker.py: log sidecar response body on non-200	2026-04-30 17:37:28 +00:00
aaron	58515ebec0	graphiti_service.py: add traceback logging, log file handler for all endpoints	2026-04-30 17:36:19 +00:00
aaron	91166367fa	E3: add Graphiti retrieval branch to dream.py, E3 experiment script with blinding	2026-04-30 17:17:28 +00:00
aaron	2b3c2380a0	watcher.py: in-process ingest, embedder loaded once at startup, startup recovery, heartbeat, no duplicate logging	2026-04-30 16:42:44 +00:00
aaron	2fb50cce71	ingest.py: guard Stage 2 enqueue behind SKIP_STAGE2_ENQUEUE env var for migration runs	2026-04-30 16:20:11 +00:00
aaron	c08f57a6f2	stage2/3 workers: remove duplicate StreamHandler, stdout captured by systemd	2026-04-30 16:12:51 +00:00
aaron	cae7fb8775	dream.py v1.1: score-band exclusion for Early REM, DREAMER_VERSION constant, manifest versioning	2026-04-30 15:51:11 +00:00
aaron	b53717af5b	dream.py: enrich manifest with retrieval breadth metrics	2026-04-30 06:14:55 +00:00
aaron	2b9a1782c1	feat: stage2/3 pipeline, taxonomy-free cascade, E1.8/E4 experiments, corpus migration state	2026-04-30 04:04:31 +00:00
aaron	62b5b5453a	fix: max_coroutines=2, saga support in sidecar; stage3 chunking; TIMEOUT_MAX 0 persistent in falkordb compose	2026-04-30 04:01:02 +00:00
aaron	95d022ec64	fix: FalkorDriver database=aaron, build indices on correct graph	2026-04-29 21:34:20 +00:00
aaron	d91a5675ff	capture: public SSE endpoint for transcription completion events	2026-04-29 18:00:54 +00:00
aaron	c42d898504	emit capture_saved SSE event when async transcription completes	2026-04-29 17:58:01 +00:00
aaron	a05fcec882	async voice transcription — return immediately, whisper runs in background	2026-04-29 17:48:22 +00:00
aaron	eb7cf3be10	upgrade whisper small -> large-v3, bump cpu_threads to 8	2026-04-29 17:35:03 +00:00
aaron	3f6c435be4	add client_time to chat context — user-supplied, not logged	2026-04-29 17:26:03 +00:00
aaron	21557790d9	capture: return error_type on transcription failure instead of HTTP 500	2026-04-29 17:04:56 +00:00
aaron	794e0aeddd	update whisper prompt: add BirdAI stack terms, remove stale ChromaDB	2026-04-29 16:47:30 +00:00
aaron	d271e17929	add sourcing constraint to system prompt, close hallucination gap	2026-04-29 16:37:39 +00:00
aaron	5d83fb7601	fix: load_dotenv override=True, option b source exclusion	2026-04-29 16:32:09 +00:00
aaron	83d4f60d0d	option b: cross-night source exclusion in dream pipeline	2026-04-29 16:19:52 +00:00
aaron	b6fe350ab2	experiments: add consistency test and briefing generator results + scripts	2026-04-28 02:47:41 +00:00
aaron	037d747573	chore: archive deprecated chromadb and migration scripts	2026-04-28 00:15:46 +00:00
aaron	d5b5c2ec14	Graphiti sidecar service + SentenceTransformer embedder — self-hosted, no OpenAI dependency	2026-04-27 18:21:22 +00:00
aaron	4ee2567400	Add SentenceTransformer embedder for Graphiti — self-hosted, no OpenAI dependency	2026-04-27 18:18:37 +00:00
aaron	a1f732fc9e	Dreamer: manifest writer, Late REM v1.2 (remove coherence pull)	2026-04-27 16:54:18 +00:00
aaron	03b3f012c3	Dreamer: prompt versioning, Early REM v1.1, prompt signature in headers	2026-04-27 16:50:21 +00:00
aaron	6776637178	Remove hardcoded PG password fallbacks — require PG_DSN env var in all scripts	2026-04-27 05:16:37 +00:00
aaron	a1f5c1049a	Fix dreamer status display, watcher excludes Media/, remove NVM debt item	2026-04-27 05:08:01 +00:00
aaron	d3239aba17	Image capture — extend /api/capture for image+voice, Claude vision description, Media/ WebDAV, watcher excludes Media/	2026-04-27 04:28:31 +00:00
aaron	ef2fddc47f	Redesign dreamer — interdependent pipeline, NREM→Early REM→Late REM→Synthesis	2026-04-26 23:41:24 -04:00
aaron	7af246ac01	APScheduler — replace systemd timers, in-process dream and ingest scheduling	2026-04-27 03:04:33 +00:00
aaron	9b312d936f	Add SSE endpoint and dream notify — /api/events and /api/events/notify	2026-04-27 02:20:50 +00:00
aaron	9088b5643d	Add /api/dreamer/status and /api/dreamer/run endpoints	2026-04-27 01:27:09 +00:00
aaron	a07de922df	Add /api/capture and /api/captures endpoints — auth-free, WebDAV delivery to Journal/Captures/	2026-04-26 22:39:55 +00:00
aaron	8c8fba11b8	Add nightly conversation indexing — Aaron AI conversations into pgvector at 2:30AM	2026-04-26 21:28:40 +00:00
aaron	f78b83042b	Migrate to pgvector — remove ChromaDB from api.py, ingest scripts, dream.py	2026-04-26 21:16:04 +00:00
aaron	d2eed98906	Pre-pgvector migration checkpoint — upsert, allow_replace_deleted, maintenance timer	2026-04-26 20:19:49 +00:00
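
The routing rule described in commit e7de7fb64b (single-episode only when supersedes_prior_state=true AND confidence in {medium, high}; bulk otherwise, including NULL legacy rows) can be sketched as below. This is a minimal illustration, not the code from stage3_worker.py: the commit names should_route_single_episode() but does not show its signature, so the parameter names and the truth_table() helper are assumptions. The table crosses two supersedes values with four confidence values (treating NULL as the fourth), which is one possible reading of the "eight combinations" in the commit message.

```python
from typing import List, Optional, Tuple


def should_route_single_episode(
    supersedes_prior_state: Optional[bool],
    state_type_confidence: Optional[str],
) -> bool:
    """Return True for the single-episode pathway, False for bulk.

    Hypothetical signature. Anything that is not an explicit
    supersedes=True with medium/high confidence falls back to bulk,
    which covers NULL routing fields on legacy rows.
    """
    if supersedes_prior_state is not True:
        return False
    return state_type_confidence in {"medium", "high"}


def truth_table() -> List[Tuple[bool, Optional[str], bool]]:
    """Enumerate (supersedes, confidence, routes_single) rows (hypothetical helper)."""
    rows = []
    for supersedes in (True, False):
        for confidence in ("low", "medium", "high", None):
            rows.append(
                (supersedes, confidence,
                 should_route_single_episode(supersedes, confidence))
            )
    return rows


if __name__ == "__main__":
    for supersedes, confidence, single in truth_table():
        pathway = "single-episode" if single else "bulk"
        print(f"supersedes={supersedes!s:5} confidence={confidence!s:6} -> {pathway}")
```

Keeping the decision in one pure function makes the safe default explicit: any unexpected or missing value routes to the cheaper bulk pathway.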


62 Commits
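
The saga-size limit from commit b936931668 (MAX_CHUNKS_PER_SAGA=10, with large documents split into multiple bulk commits that share one saga tag) can be illustrated with a short batching sketch. Only the constant name and its value come from the commit message; the function name split_into_saga_batches and its signature are hypothetical, and the HTTP POST to the sidecar is deliberately left out.

```python
from typing import Iterator, List, Sequence

MAX_CHUNKS_PER_SAGA = 10  # value stated in the commit message


def split_into_saga_batches(
    chunks: Sequence[str],
    max_per_batch: int = MAX_CHUNKS_PER_SAGA,
) -> Iterator[List[str]]:
    """Yield consecutive batches of at most max_per_batch chunks.

    Hypothetical helper: each yielded batch would be sent as one bulk
    commit, with every batch carrying the same saga tag.
    """
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    for start in range(0, len(chunks), max_per_batch):
        yield list(chunks[start:start + max_per_batch])


if __name__ == "__main__":
    # A 19-chunk document, the size range the commit reports as deadlocking.
    document_chunks = [f"chunk-{i}" for i in range(19)]
    for batch in split_into_saga_batches(document_chunks):
        print(len(batch), batch[0], "...", batch[-1])
```

With this split, a 19-chunk document becomes one batch of 10 and one of 9 instead of a single oversized saga.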