aaronAI/scripts at 84994f92827e411e7be9d10ce6a1b85e65046f56 - aaronAI - Aaron's Code

aaron/aaronAI

Files

T

History

aaron 84994f9282 api.py: prompt-cache system prompt and memory across tool_use round-trip

Move persistent memory from the user message into system blocks with
cache_control: ephemeral on the last block. The static prefix (system prompt +
memory, ~3-5K tokens typically) is identical between the two LLM calls of a
tool_use round-trip and stable across turns within the 5-minute cache TTL.

Without this, the tool-call retrieval architecture roughly doubled input
token cost on retrieval-needed turns (full context billed twice). With cache
reads at ~10% of standard input, the duplication cost drops by ~90% — the
"twice as expensive" hit becomes "slightly more expensive plus tool overhead."

client_time stays in the user message (per-turn dynamic, should not be in the
cached prefix).

2026-05-19 23:13:43 +00:00

..

scripts/: separate production from experimental and deprecated

2026-05-02 23:28:24 +00:00

embeddings: backfill type and created_at (Improvement #2 part A)

2026-05-03 23:58:53 +00:00

api.py

api.py: prompt-cache system prompt and memory across tool_use round-trip

2026-05-19 23:13:43 +00:00

backup.sh

conversations.db, sessions.db: enable WAL, add message index; update backup.sh

2026-05-04 03:24:51 +00:00

corpus_integrity.py

scripts/encoding.py: Stage 1 dual-implementation consolidation (Track 1 Finding 11)

2026-05-03 01:40:47 +00:00

dream.py

dream.py: raise_for_status on manifest writes; total_chunks as actual corpus count

2026-05-04 16:29:04 +00:00

encoding.py

encoding: per-slide pptx chunking + extract_blocks API; api: recency tiebreak

2026-05-19 21:58:25 +00:00

failures.py

scripts/encoding.py: Stage 1 dual-implementation consolidation (Track 1 Finding 11)

2026-05-03 01:40:47 +00:00

graphiti_service.py

graphiti_service.py: add traceback logging, log file handler for all endpoints

2026-04-30 17:36:19 +00:00

ingest_conversations.py

ingest_conversations.py: lazy-load embedder to match ingest.py pattern

2026-05-04 03:13:45 +00:00

ingest.py

encoding: per-slide pptx chunking + extract_blocks API; api: recency tiebreak

2026-05-19 21:58:25 +00:00

reindex_docx_pptx.py

encoding: per-slide pptx chunking + extract_blocks API; api: recency tiebreak

2026-05-19 21:58:25 +00:00

st_embedder.py

Add SentenceTransformer embedder for Graphiti — self-hosted, no OpenAI dependency

2026-04-27 18:18:37 +00:00

stage2_worker.py

stage2_worker: v2.1 — terminal failure states + sudo path fix

2026-05-01 17:28:53 +00:00

stage3_worker.py

stage3_worker: v2.2 — absolute sudo/systemctl paths, error logging, reset failure counter on recovery failure

2026-05-01 18:40:25 +00:00

test_retrieval.py

api.py: tool-call retrieval, drop the keyword intent classifier

2026-05-19 23:05:25 +00:00

watcher.py

encoding: per-slide pptx chunking + extract_blocks API; api: recency tiebreak

2026-05-19 21:58:25 +00:00