dream: factor prompts into module-level templates, repair prompt_hash (Track 1 Finding 11)

prompt_hash() in dream.py was hashing function __doc__ strings, but the synth functions don't have docstrings, so the hash was always MD5("") = d41d8cd9 for every dream. The manifest field meant to detect undeclared prompt drift carried no useful information. Refactor: - Each synth function's prompt template moved to a module-level constant (NREM_PROMPT_TEMPLATE, EARLY_REM_PROMPT_TEMPLATE, LATE_REM_PROMPT_TEMPLATE, SYNTHESIS_PROMPT_TEMPLATE, LUCID_PROMPT_TEMPLATE) using str.format() placeholders instead of f-string interpolation. - Synth functions call TEMPLATE.format(...) at use time. Output is byte- identical to the previous f-string implementation. - prompt_hash() now hashes the four pipeline template constants (lucid is on-demand, not part of the nightly manifest — preserves prior scope). - LUCID_DEFAULT_TASK extracted as a named constant from the lucid fallback question (factoring only, no behavior change). - PROMPT_VERSION_* constants and synth function signatures untouched. - v1.1 register-shift comment in synthesize_early_rem preserved inline. The post-fix hash will differ from d41d8cd9 (verified: b65695a1 in static test). Historical manifests still carry d41d8cd9; the discontinuity is intentional — pre-fix hashes were equally meaningless and faking continuity would be worse than acknowledging the break. Found by Track 1 inventory 2026-05-02 (Finding 11 / divergence #11). Verified static import + hash determinism before commit.
2026-05-03 00:24:21 +00:00
parent ec67e19b4f
commit a317df66f8
1 changed files with 131 additions and 105 deletions
@@ -64,6 +64,117 @@ def prompt_hash(prompts: list[str]) -> str:
    combined = "".join(prompts)
    return hashlib.md5(combined.encode()).hexdigest()[:8]
 # ─── Prompt templates ───────────────────────────────────────────────────────
 # Module-level so prompt_hash() can hash actual prompt content. Any change to
 # any template — even a single character — flips the manifest's prompt_hash.
 # Templates use str.format() placeholders ({chunk_text}, {nrem_output}, ...);
 # do not switch back to f-strings (the constant must be hashable independent
 # of variable values). Literal { or } in template text would need to be
 # doubled ({{, }}) — currently no template contains literal braces.
 NREM_PROMPT_TEMPLATE = """You have read everything Aaron Nelson has written and published.
 You are a careful colleague who noticed something this week.
 Here is material from his corpus:
 {chunk_text}
 Write to Aaron directly. Identify one specific connection between
 this material and something he wrote or worked on previously.
 Stay close to the documents — cite them specifically by name.
 Do not speculate beyond what the material supports. Do not use
 headers or bullet points. Write one paragraph of 200-300 words
 that ends with a single concrete question he could act on."""
 EARLY_REM_PROMPT_TEMPLATE = """Something was noticed earlier tonight, moving through Aaron's recent work:
 {nrem_output}
 That observation is still with you. Now here is material from a different
 time — pulled from further back, from different parts of his corpus:
 {chunk_text}
 You are not analyzing. You are recognizing.
 Something in the earlier observation and something in this older material
 are the same thing wearing different clothes. Find it. Don't explain why
 they're connected — just let the connection speak. Write from inside the
 recognition, not from above it.
 The emotional register underneath the career logic is more interesting
 than the career logic. The pattern that has been repeating longer than
 he has been aware of it is more interesting than the current instance.
 Write directly to Aaron. No citations, no references, no analysis.
 First person, present tense. Let what you noticed arrive rather than
 be delivered. 150-250 words. End with one thing that is true that
 he probably already knows but hasn't said out loud yet."""
 LATE_REM_PROMPT_TEMPLATE = """You have been moving through Aaron Nelson's corpus all night.
 First you found this, in the careful light of early consolidation:
 {nrem_output}
 Then, in the more personal territory that followed:
 {early_rem_output}
 Now it is late. The boundaries between things have loosened.
 Here is material pulled from opposite ends of his work:
 {chunk_text}
 Do not explain the connections between all of this.
 Do not resolve them. Do not summarize what came before.
 Something stranger is possible now — let the accumulated
 material from the night find its own shape. Compressed,
 associative, slightly off. Let the strangeness stand.
 No headers. No bullet points. No hedging. No resolution.
 No offer. End mid-thought if that is where the material ends.
 150-250 words."""
 SYNTHESIS_PROMPT_TEMPLATE = """You have spent the night moving through Aaron Nelson's corpus
 in three passes, each building on the last.
 The first pass — careful, close to the documents:
 {nrem_output}
 The second pass — more personal, following what the first opened:
 {early_rem_output}
 The third pass — associative, strange, letting things touch that
 don't normally touch:
 {late_rem_output}
 Now synthesize. Not a summary — a synthesis. Find what runs through
 all three that none of them said directly. The thing that only becomes
 visible when you hold all three passes together.
 Write it as a single unbroken piece. No headers, no bullet points,
 no stage labels. 200-300 words. End with the one question that
 matters most right now."""
 LUCID_PROMPT_TEMPLATE = """Aaron has a question he is sitting with:
 {task}
 You have searched his entire corpus and found material that
 speaks to this question from unexpected directions. Here is
 what you found:
 {chunk_text}
 Do not summarize. Do not list. Pick the most interesting
 tension between what the corpus contains and what he is
 asking, and follow it through to its conclusion. Cite
 specific documents by name. Be direct about what you think.
 No headers, no bullet points. 250-400 words.
 End with an offer to work on it together."""
 LUCID_DEFAULT_TASK = "What should I be thinking about that I am not?"
 def extract_folder(source_path):
    """Extract top-level Nextcloud folder from source path."""
    parts = source_path.replace("\\", "/").split("/")
@@ -240,124 +351,39 @@ def retrieve(mode, task=None, n_results=8, excluded_sources=None):
 def synthesize_nrem(chunks):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""You have read everything Aaron Nelson has written and published.
+    return _call_claude(NREM_PROMPT_TEMPLATE.format(chunk_text=chunk_text))
 You are a careful colleague who noticed something this week.
 Here is material from his corpus:
 {chunk_text}
 Write to Aaron directly. Identify one specific connection between
 this material and something he wrote or worked on previously.
 Stay close to the documents — cite them specifically by name.
 Do not speculate beyond what the material supports. Do not use
 headers or bullet points. Write one paragraph of 200-300 words
 that ends with a single concrete question he could act on."""
    return _call_claude(prompt)
 def synthesize_early_rem(chunks, nrem_output):
    # v1.1 — removed citation instruction, removed close-friend persona,
    # shifted register from analysis to recognition.
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""Something was noticed earlier tonight, moving through Aaron's recent work:
+    return _call_claude(EARLY_REM_PROMPT_TEMPLATE.format(
-
+        nrem_output=nrem_output, chunk_text=chunk_text))
 {nrem_output}
 That observation is still with you. Now here is material from a different
 time — pulled from further back, from different parts of his corpus:
 {chunk_text}
 You are not analyzing. You are recognizing.
 Something in the earlier observation and something in this older material
 are the same thing wearing different clothes. Find it. Don't explain why
 they're connected — just let the connection speak. Write from inside the
 recognition, not from above it.
 The emotional register underneath the career logic is more interesting
 than the career logic. The pattern that has been repeating longer than
 he has been aware of it is more interesting than the current instance.
 Write directly to Aaron. No citations, no references, no analysis.
 First person, present tense. Let what you noticed arrive rather than
 be delivered. 150-250 words. End with one thing that is true that
 he probably already knows but hasn't said out loud yet."""
    return _call_claude(prompt)
 def synthesize_late_rem(chunks, nrem_output, early_rem_output):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""You have been moving through Aaron Nelson's corpus all night.
+    return _call_claude(LATE_REM_PROMPT_TEMPLATE.format(
-First you found this, in the careful light of early consolidation:
+        nrem_output=nrem_output,
-
+        early_rem_output=early_rem_output,
-{nrem_output}
+        chunk_text=chunk_text))
 Then, in the more personal territory that followed:
 {early_rem_output}
 Now it is late. The boundaries between things have loosened.
 Here is material pulled from opposite ends of his work:
 {chunk_text}
 Do not explain the connections between all of this.
 Do not resolve them. Do not summarize what came before.
 Something stranger is possible now — let the accumulated
 material from the night find its own shape. Compressed,
 associative, slightly off. Let the strangeness stand.
 No headers. No bullet points. No hedging. No resolution.
 No offer. End mid-thought if that is where the material ends.
 150-250 words."""
    return _call_claude(prompt)
 def synthesize_final(nrem_output, early_rem_output, late_rem_output):
-    prompt = f"""You have spent the night moving through Aaron Nelson's corpus
+    return _call_claude(
-in three passes, each building on the last.
+        SYNTHESIS_PROMPT_TEMPLATE.format(
-
+            nrem_output=nrem_output,
-The first pass — careful, close to the documents:
+            early_rem_output=early_rem_output,
-{nrem_output}
+            late_rem_output=late_rem_output),
-
+        max_tokens=800)
 The second pass — more personal, following what the first opened:
 {early_rem_output}
 The third pass — associative, strange, letting things touch that
 don't normally touch:
 {late_rem_output}
 Now synthesize. Not a summary — a synthesis. Find what runs through
 all three that none of them said directly. The thing that only becomes
 visible when you hold all three passes together.
 Write it as a single unbroken piece. No headers, no bullet points,
 no stage labels. 200-300 words. End with the one question that
 matters most right now."""
    return _call_claude(prompt, max_tokens=800)
 def synthesize_lucid(chunks, task):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""Aaron has a question he is sitting with:
+    resolved_task = task or LUCID_DEFAULT_TASK
-
+    return _call_claude(LUCID_PROMPT_TEMPLATE.format(
-{task or "What should I be thinking about that I am not?"}
+        task=resolved_task, chunk_text=chunk_text))
 You have searched his entire corpus and found material that
 speaks to this question from unexpected directions. Here is
 what you found:
 {chunk_text}
 Do not summarize. Do not list. Pick the most interesting
 tension between what the corpus contains and what he is
 asking, and follow it through to its conclusion. Cite
 specific documents by name. Be direct about what you think.
 No headers, no bullet points. 250-400 words.
 End with an offer to work on it together."""
    return _call_claude(prompt)
 def _call_claude(prompt, max_tokens=1000):
@@ -436,10 +462,10 @@ def write_manifest(date_str, stage_data, corpus_data):
        "prompt_sig": prompt_signature(),
        "dreamer_version": DREAMER_VERSION,
        "prompt_hash": prompt_hash([
-            synthesize_nrem.__doc__ or "",
+            NREM_PROMPT_TEMPLATE,
-            synthesize_early_rem.__doc__ or "",
+            EARLY_REM_PROMPT_TEMPLATE,
-            synthesize_late_rem.__doc__ or "",
+            LATE_REM_PROMPT_TEMPLATE,
-            synthesize_final.__doc__ or "",
+            SYNTHESIS_PROMPT_TEMPLATE,
        ]),
        "stages": stage_data,
        "corpus": corpus_data,