aaronAI

Author	SHA1	Message	Date
aaron	c0e6159b5e	graphiti_patches: vendored FalkorDB vector index support for graphiti-core 0.29.0 Adds native FalkorDB vector index support to graphiti-core's FalkorDB driver. Three patched files (graph_queries.py, falkordb_driver.py, falkordb/operations/search_ops.py) plus apply.sh that backs up venv files and copies patches over. Why this exists: graphiti-core 0.29.0 builds similarity queries using interpreted Cypher cosine math (vec.cosineDistance) which produces a full-table scan over Entity/RELATES_TO/Community nodes for every search. At ~4,000+ entities, single-episode add_episode took 8+ minutes for the resolve-against-existing-graph step and bulk ingest hung indefinitely. FalkorDB itself supports db.idx.vector.queryNodes and queryRelationships procedures backed by HNSW indexes; the driver just doesn't use them. Patches: 1. graph_queries.py — adds get_vector_indices() returning CREATE VECTOR INDEX statements for FalkorDB (Entity.name_embedding, RELATES_TO.fact_embedding, Community.name_embedding). HNSW with cosine similarity. Adds VECTOR_INDEX_CANDIDATE_MULTIPLIER for over-fetch when WHERE filters reject some top-k results. Original get_vector_cosine_func_query preserved for fallback. 2. falkordb_driver.py — extends build_indices_and_constraints() to call get_vector_indices() alongside range and fulltext. Adds cache invalidation hook so the search_ops dispatcher re-probes for indexes after they're built. 3. falkordb/operations/search_ops.py — adds vector-index dispatcher helpers (_falkordb_vector_index_exists with module-level cache, _falkordb_vector_node_search_cypher, _falkordb_vector_edge_search_cypher). Rewrites the three vector-similarity call sites (Entity.name_embedding, RELATES_TO.fact_embedding, Community.name_embedding) to use db.idx.vector.queryNodes / queryRelationships when available, fall back to interpreted-Cypher cosine math when not. Index existence probed once per (label, attribute, entity_type) and cached. 
Empirical result: single-episode add_episode against a 4,277-entity graph went from indefinite hang to 8.2 seconds. Bulk re-ingest of already-known content (worst case for entity dedup) committed in 60ms. Activation requires bridging driver._search_ops to driver.search_interface in the sidecar (see graphiti_service.py). graphiti-core declares search_interface as the dispatcher attribute but never assigns the per-driver implementation to it — naming mismatch in their internal refactor. The bridge is one line in our sidecar's lifespan. Upstream candidate: this is a known gap (referenced indirectly in upstream issue #1263 RFC for external vector store overlay). Maintainers' attention is on Milvus/Qdrant/Pinecone overlay; this is the FalkorDB-native alternative for users who don't want to run a separate vector DB. PR after empirical validation in production. Apache-2.0 graphiti-core source is NOT vendored — backups/ is gitignored to keep the upstream source out of this repo.	2026-05-02 05:19:01 +00:00
aaron	d7b2a850c4	stage3_worker: v2.4 — encoder extraction instructions v1.0 Adds EXTRACTION_INSTRUCTIONS_V1 constant passed to the sidecar via custom_extraction_instructions on both bulk and single-episode pathways. graphiti-core inserts the text into entity and edge extraction prompts only; it does NOT enter dedup prompts (that's the encoder-stays-naive commitment). Architectural posture: the encoder is content-naive. It does not draw on prior knowledge of the user, the substrate, or the cycle's accumulated work. Schema and personality live in the cycle's consolidated substrate where the dream phase shapes them. The encoder produces source-grounded ground truth for the cycle to work from. Empirical validation in tonight's smoke test: 30+ verb-shaped predicates from 3 chunks of real content, including IS_AUTOBIOGRAPHICAL_TO, INFORMED_DESIGN_OF, EVALUATED_DOMAIN_PURITY, DISCONFIRMED_HYPOTHESIS_ABOUT. Compare to default extraction's 4 predicate types across 22,289 edges. RELATES_TO appears once as appropriate fallback rather than collapsing everything generic. Bumps WORKER_VERSION to 2.4.	2026-05-02 05:15:17 +00:00
aaron	a0bf280075	Add Pattern 1 async job model migration Adds graphiti_jobs table for sidecar's async ingest queue and external_job_id column on stage_3_queue for worker's polling reference. Tonight's smoke test diagnosed that bulk ingest against the 4,222-entity graph commits successfully but the worker's 600s HTTP read-timeout fires before the sidecar's response returns. Three days of 'saga deadlock' failures were false negatives — the work succeeded; the worker just stopped listening. Pattern 1 separates submission from completion observation so the worker can't false-negative this way. Migration only — sidecar and worker code changes follow in subsequent commits.	2026-05-02 02:22:30 +00:00
aaron	30beeb3a26	migrations: retroactively track stage_3_queue routing columns Adds migrations/ directory with README documenting the convention (timestamped filenames, idempotent SQL, forward-only, single change per file). First migration is the Stage 3 queue routing columns added live during Phase A patches today: state_type, state_type_confidence, supersedes_prior_state, state_type_rationale, plus index on supersedes. Required by stage2_worker.py >= 2.2 and stage3_worker.py >= 2.3. Idempotent (IF NOT EXISTS), safe to re-apply. Verified by re-applying against the live DB — no changes, no errors. Closes a reproducibility gap: a fresh DB provisioned from git would crash on first Stage 2 enqueue without these columns. Now the SQL travels with the code.	2026-05-01 19:11:09 +00:00
aaron	e7de7fb64b	stage3_worker: v2.3 — bulk-vs-single-episode routing on Stage 2 state-type Reads new routing columns from stage_3_queue (state_type, state_type_confidence, supersedes_prior_state, state_type_rationale) and dispatches each row to one of two ingest pathways: - BULK pathway (existing, renamed from ingest_to_graphiti to ingest_bulk): safer-cheaper default. Used when supersedes=false OR confidence=low OR routing fields are NULL (legacy rows). Skips edge invalidation per graphiti-core's bulk semantics. - SINGLE-EPISODE pathway (new, ingest_single_episode): used only when supersedes_prior_state=true AND confidence in {medium, high}. Per-chunk POST to /episodes (singular endpoint) with shared saga tag. Each call independent — own timeout, own retry envelope. Routing decision isolated in should_route_single_episode() with unit-tested truth table covering all eight (supersedes × confidence) combinations. Per-chunk heartbeat (heartbeat_row): single-episode pathway updates stage_3_queue.started_at after each successful chunk POST so a long-running document doesn't cross the 10-minute stale threshold mid-process and get re-dequeued. started_at semantics now: 'last activity timestamp' rather than 'began at'. Best-effort; failures logged not raised. Partial-success on chunk failure: previously-committed chunks stay in the graph; the function raises with detail (single_episode_partial: chunk N/M failed, succeeded K). The row is marked failed_at with that detail. Re-ingestion would re-POST chunks 1..N-1 against the graph; graphiti's dedup handles them as no-ops. DB connection scoping: process_one no longer holds one Postgres connection across the whole ingest call (which can run an hour for long single-episode documents). Each DB write gets a short-lived connection. Phase A item 3 of three. Closes the mechanical-patches block. Item 4 (custom_extraction_instructions text design) is the remaining intellectual work; sidecar and worker plumbing is now ready for it.	2026-05-01 19:07:41 +00:00
aaron	70e87e3ab5	stage2_worker: v2.2 — add state-type classification for Stage 3 routing Mistral pass now produces two concerns in a single flat JSON output: (a) orientation context (existing four fields, unchanged semantics) (b) state-type classification: state_type (current/reference/historical), state_type_confidence (low/medium/high), supersedes_prior_state (bool), state_type_rationale (text) Routing fields written as explicit columns on stage_3_queue (separate ALTER TABLE migration adds them: state_type, state_type_confidence, supersedes_prior_state, state_type_rationale + index on supersedes). Safe-cheap defaults on malformed Mistral output: state_type='reference', confidence='low', supersedes=false. All defaults route to bulk pathway (no temporal invalidation cost) so Mistral parse drift can't accidentally trigger expensive single-episode ingest. Phase A item 2 of three. Sidecar (item 1, commit `8b0a163`) already plumbs custom_extraction_instructions through to /episodes/bulk. Stage 3 routing logic (item 3) follows.	2026-05-01 19:02:11 +00:00
aaron	8b0a163670	graphiti_service: expose custom_extraction_instructions on /episodes/bulk; add saga on /episodes - BulkEpisodeRequest: new optional custom_extraction_instructions field with comment noting graphiti-core inserts it into extract_nodes/extract_edges prompts only, NOT dedupe prompts (verified by reading prompts directory) - EpisodeRequest: new optional saga field, plumbed through to add_episode for upcoming Stage 3 single-episode pathway - Both handlers use conditional kwargs construction so existing callers see no behavioral change Phase A item 1 of three. Items 2 (stage2_worker) and 3 (stage3_worker) follow.	2026-05-01 18:57:31 +00:00
aaron	1a8e0353f5	stage3_worker: v2.2 — absolute sudo/systemctl paths, error logging, reset failure counter on recovery failure Mirrors stage2_worker v2.1 (`da98019`) resilience fixes: - Absolute paths for /usr/bin/sudo and /bin/systemctl - Log stdout/stderr when sidecar restart fails - Reset consecutive_failures even when wedge recovery fails (prevents permanent stuck state if restart itself is broken)	2026-05-01 18:40:25 +00:00
aaron	da980193dd	stage2_worker: v2.1 — terminal failure states + sudo path fix Three classes of silent failure converted to clean terminal states: - Mistral timeout: previously left rows in zombie state (started_at set, failed_at null, attempts incremented past retry threshold, row invisible to selection query). Now sets failed_at with reason 'mistral_timeout_after_300s'. Surfaced 2026-05-01 when 17 documents accumulated in this state during the Stage 3 saga deadlock incident. - Mistral parse failure: run_mistral returns {'error': 'parse_failed'} on JSON decode failure but process_one wasn't checking, so empty orientation ('Active frames: . Frame relationships: ...') was shipped to Stage 3. This is F22 from the 2026-04-30 code review. Now sets failed_at with reason 'mistral_parse_failure'. - Wedge recovery hammering: consecutive_failures was only reset on successful Ollama restart. With the sudo path bug (also fixed here), recovery always failed, so every subsequent failure re-attempted restart. Now resets the counter regardless and logs the failure visibly. Also: subprocess.run now uses absolute paths (/usr/bin/sudo, /bin/systemctl) instead of relying on PATH, fixing the 'No such file or directory: sudo' error that broke Stage 2's recover_wedge() since deployment. F45-adjacent — sudoers entries were added 2026-05-01 but the PATH issue was masking that fix. Worker version bumped to 2.1 to match Stage 3's resilience patch level.	2026-05-01 17:28:53 +00:00
aaron	b936931668	Stage 3 worker v2.1 — saga-size limit + wedge detection + sudoers fixes Production incident 2026-05-01: F14 re-cascade attempt surfaced three compounding issues in cascade resilience. stage3_worker.py changes: - MAX_CHUNKS_PER_SAGA=10 — large documents split into multiple bulk commits, all sharing the same saga tag for Graphiti document linking. Original implementation sent all chunks as one saga; 17-19 chunk sagas deadlocked sidecar's Python-side coordination. - recover_wedge() function — restarts aaronai-graphiti.service when consecutive_failures hits threshold. Mirrors Stage 2 pattern. - run() loop adds consecutive_failures counter with threshold-2 escalation. Resolves F28 + F29 from code review. - Worker version bumped 2.0 -> 2.1. - post_bulk() helper extracts shared HTTP POST + error handling. Outside-repo changes (system config, separately documented): - WatchdogSec=600 commented in stage2 + stage3 systemd unit files. Workers have no sd_notify support; per-request timeouts in code handle the actual failure modes. - /etc/sudoers.d/aaron-aaronai created with NOPASSWD entries for systemctl restart ollama and restart aaronai-graphiti.service. Stage 2's existing recover_wedge() was silently broken since deployment due to this gap. .gitignore — added rules for *.bak files, runtime artifacts (watcher_heartbeat, dreamer_state.json, corpus_integrity_report.json, watcher_state.json, watcher_status.json), Python cruft, virtual env, .env, editor/OS files, and Aaron AI runtime data (conversations.db, sessions.db, memory.md, settings.json). Untracked 11 files that shouldn't have been committed in `465f2f7` (this morning): backup files and runtime artifacts. Re-cascading Shop Class (414KB) and BirdAI-Experiments-Log.md (192KB) through the patched worker after re-extracting full text from disk. Cascade in progress at commit time.	2026-05-01 05:18:09 +00:00
aaron	465f2f725b	Code review fixes: CV pinning, F1 (excluded_sources), F14 (50KB truncation), F37 - api.py: strip CV pinning workaround (parity violation, see architecture doc) - dream.py: F1 — retrieve_graphiti() now accepts excluded_sources, over-fetches 3x and filters in-process. Was silently dropping the parameter; would have confounded E3 with broken cross-stage exclusion in Graphiti arm. - watcher.py + ingest.py: F14 — drop full_text[:50000] truncation. Was propagating through entire cascade. Postgres TEXT can hold up to 1GB. - corpus_integrity.py: F37 — same truncation, third path now clean. Backups: api.py.bak.*, dream.py.bak.*, watcher.py.bak.*, ingest.py.bak.*, corpus_integrity.py.bak.* timestamped pre-fix. Re-cascaded Shop Class as Soulcraft (only already-cascaded source affected by F14, 414KB).	2026-05-01 02:26:37 +00:00
aaron	25e42c0231	corpus_integrity.py: write unreadables with retry_count=0 so OCR can retry when it ships	2026-04-30 22:03:48 +00:00
aaron	7822fb1cc1	corpus_integrity.py: write unreadable files to ingest_failures for UI visibility	2026-04-30 21:59:06 +00:00
aaron	74e2c34f43	corpus integrity: ingest_failures tracking in watcher, reconciliation script, corpus status/retry/reconcile endpoints	2026-04-30 21:54:39 +00:00
aaron	655dea6ae5	add remaining experiment result files	2026-04-30 18:06:52 +00:00
aaron	f11cacd9c9	add experiment scripts and results; watcher.py latest changes	2026-04-30 18:06:03 +00:00
aaron	1cf26df450	api.py: return error_type=transcription_failed on Whisper crash, frontend retry logic can now distinguish from network failures	2026-04-30 17:45:47 +00:00
aaron	7cd765146a	stage3_worker.py: log sidecar response body on non-200	2026-04-30 17:37:28 +00:00
aaron	58515ebec0	graphiti_service.py: add traceback logging, log file handler for all endpoints	2026-04-30 17:36:19 +00:00
aaron	91166367fa	E3: add Graphiti retrieval branch to dream.py, E3 experiment script with blinding	2026-04-30 17:17:28 +00:00
aaron	2b3c2380a0	watcher.py: in-process ingest, embedder loaded once at startup, startup recovery, heartbeat, no duplicate logging	2026-04-30 16:42:44 +00:00
aaron	2fb50cce71	ingest.py: guard Stage 2 enqueue behind SKIP_STAGE2_ENQUEUE env var for migration runs	2026-04-30 16:20:11 +00:00
aaron	c08f57a6f2	stage2/3 workers: remove duplicate StreamHandler, stdout captured by systemd	2026-04-30 16:12:51 +00:00
aaron	cae7fb8775	dream.py v1.1: score-band exclusion for Early REM, DREAMER_VERSION constant, manifest versioning	2026-04-30 15:51:11 +00:00
aaron	b53717af5b	dream.py: enrich manifest with retrieval breadth metrics	2026-04-30 06:14:55 +00:00
aaron	2b9a1782c1	feat: stage2/3 pipeline, taxonomy-free cascade, E1.8/E4 experiments, corpus migration state	2026-04-30 04:04:31 +00:00
aaron	62b5b5453a	fix: max_coroutines=2, saga support in sidecar; stage3 chunking; TIMEOUT_MAX 0 persistent in falkordb compose	2026-04-30 04:01:02 +00:00
aaron	95d022ec64	fix: FalkorDriver database=aaron, build indices on correct graph	2026-04-29 21:34:20 +00:00
aaron	d91a5675ff	capture: public SSE endpoint for transcription completion events	2026-04-29 18:00:54 +00:00
aaron	c42d898504	emit capture_saved SSE event when async transcription completes	2026-04-29 17:58:01 +00:00
aaron	a05fcec882	async voice transcription — return immediately, whisper runs in background	2026-04-29 17:48:22 +00:00
aaron	eb7cf3be10	upgrade whisper small -> large-v3, bump cpu_threads to 8	2026-04-29 17:35:03 +00:00
aaron	3f6c435be4	add client_time to chat context — user-supplied, not logged	2026-04-29 17:26:03 +00:00
aaron	21557790d9	capture: return error_type on transcription failure instead of HTTP 500	2026-04-29 17:04:56 +00:00
aaron	794e0aeddd	update whisper prompt: add BirdAI stack terms, remove stale ChromaDB	2026-04-29 16:47:30 +00:00
aaron	d271e17929	add sourcing constraint to system prompt, close hallucination gap	2026-04-29 16:37:39 +00:00
aaron	5d83fb7601	fix: load_dotenv override=True, option b source exclusion	2026-04-29 16:32:09 +00:00
aaron	83d4f60d0d	option b: cross-night source exclusion in dream pipeline	2026-04-29 16:19:52 +00:00
aaron	b6fe350ab2	experiments: add consistency test and briefing generator results + scripts	2026-04-28 02:47:41 +00:00
aaron	9937abbe27	chore: ignore state files	2026-04-28 00:25:19 +00:00
aaron	3121f85c87	chore: ignore state files	2026-04-28 00:23:06 +00:00
aaron	037d747573	chore: archive deprecated chromadb and migration scripts	2026-04-28 00:15:46 +00:00
aaron	d5b5c2ec14	Graphiti sidecar service + SentenceTransformer embedder — self-hosted, no OpenAI dependency	2026-04-27 18:21:22 +00:00
aaron	4ee2567400	Add SentenceTransformer embedder for Graphiti — self-hosted, no OpenAI dependency	2026-04-27 18:18:37 +00:00
aaron	a1f732fc9e	Dreamer: manifest writer, Late REM v1.2 (remove coherence pull)	2026-04-27 16:54:18 +00:00
aaron	03b3f012c3	Dreamer: prompt versioning, Early REM v1.1, prompt signature in headers	2026-04-27 16:50:21 +00:00
aaron	6776637178	Remove hardcoded PG password fallbacks — require PG_DSN env var in all scripts	2026-04-27 05:16:37 +00:00
aaron	a1f5c1049a	Fix dreamer status display, watcher excludes Media/, remove NVM debt item	2026-04-27 05:08:01 +00:00
aaron	d3239aba17	Image capture — extend /api/capture for image+voice, Claude vision description, Media/ WebDAV, watcher excludes Media/	2026-04-27 04:28:31 +00:00
aaron	ef2fddc47f	Redesign dreamer — interdependent pipeline, NREM→Early REM→Late REM→Synthesis	2026-04-26 23:41:24 -04:00
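
Note on e7de7fb64b (stage3_worker v2.3): the routing rule described in that commit message can be sketched as a small predicate plus its eight-row truth table. This is a minimal illustration reconstructed from the message text only, not the code in stage3_worker.py. The function name matches the one named in the commit, but the signature, the SINGLE_EPISODE_CONFIDENCES constant, and the truth_table() helper are assumptions made for this sketch.

```python
from typing import List, Optional, Tuple

# Assumption: confidence values are the strings named in the commit message.
SINGLE_EPISODE_CONFIDENCES = {"medium", "high"}


def should_route_single_episode(
    supersedes_prior_state: Optional[bool],
    state_type_confidence: Optional[str],
) -> bool:
    """Return True only for the single-episode pathway.

    Hypothetical signature. Per the commit message, bulk is the default:
    it applies when supersedes is false, confidence is low, or the routing
    fields are NULL (legacy rows).
    """
    if supersedes_prior_state is not True:
        return False  # false or NULL -> bulk
    return state_type_confidence in SINGLE_EPISODE_CONFIDENCES


def truth_table() -> List[Tuple[bool, Optional[str], bool]]:
    """Enumerate the eight (supersedes x confidence) combinations."""
    rows = []
    for supersedes in (True, False):
        for confidence in ("low", "medium", "high", None):
            rows.append(
                (supersedes, confidence,
                 should_route_single_episode(supersedes, confidence))
            )
    return rows


if __name__ == "__main__":
    for supersedes, confidence, single in truth_table():
        pathway = "single-episode" if single else "bulk"
        print(f"supersedes={supersedes!s:<5} confidence={confidence!s:<6} -> {pathway}")
```

Under these assumptions only two of the eight combinations reach the single-episode pathway, which is consistent with the message's description of bulk as the safer, cheaper default.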

70 Commits