dream: factor prompts into module-level templates, repair prompt_hash (Track 1 Finding 11)

prompt_hash() in dream.py was hashing function __doc__ strings, but the synth functions don't have docstrings, so the hash was always MD5("") = d41d8cd9 for every dream. The manifest field meant to detect undeclared prompt drift carried no useful information. Refactor: - Each synth function's prompt template moved to a module-level constant (NREM_PROMPT_TEMPLATE, EARLY_REM_PROMPT_TEMPLATE, LATE_REM_PROMPT_TEMPLATE, SYNTHESIS_PROMPT_TEMPLATE, LUCID_PROMPT_TEMPLATE) using str.format() placeholders instead of f-string interpolation. - Synth functions call TEMPLATE.format(...) at use time. Output is byte- identical to the previous f-string implementation. - prompt_hash() now hashes the four pipeline template constants (lucid is on-demand, not part of the nightly manifest — preserves prior scope). - LUCID_DEFAULT_TASK extracted as a named constant from the lucid fallback question (factoring only, no behavior change). - PROMPT_VERSION_* constants and synth function signatures untouched. - v1.1 register-shift comment in synthesize_early_rem preserved inline. The post-fix hash will differ from d41d8cd9 (verified: b65695a1 in static test). Historical manifests still carry d41d8cd9; the discontinuity is intentional — pre-fix hashes were equally meaningless and faking continuity would be worse than acknowledging the break. Found by Track 1 inventory 2026-05-02 (Finding 11 / divergence #11). Verified static import + hash determinism before commit.
2026-05-03 00:24:21 +00:00
parent ec67e19b4f
commit a317df66f8
1 changed files with 131 additions and 105 deletions
@@ -64,6 +64,117 @@ def prompt_hash(prompts: list[str]) -> str:
    combined = "".join(prompts)
    return hashlib.md5(combined.encode()).hexdigest()[:8]

+# ─── Prompt templates ───────────────────────────────────────────────────────
+# Module-level so prompt_hash() can hash actual prompt content. Any change to
+# any template — even a single character — flips the manifest's prompt_hash.
+# Templates use str.format() placeholders ({chunk_text}, {nrem_output}, ...);
+# do not switch back to f-strings (the constant must be hashable independent
+# of variable values). Literal { or } in template text would need to be
+# doubled ({{, }}) — currently no template contains literal braces.
+
+NREM_PROMPT_TEMPLATE = """You have read everything Aaron Nelson has written and published.
+You are a careful colleague who noticed something this week.
+
+Here is material from his corpus:
+
+{chunk_text}
+
+Write to Aaron directly. Identify one specific connection between
+this material and something he wrote or worked on previously.
+Stay close to the documents — cite them specifically by name.
+Do not speculate beyond what the material supports. Do not use
+headers or bullet points. Write one paragraph of 200-300 words
+that ends with a single concrete question he could act on."""
+
+EARLY_REM_PROMPT_TEMPLATE = """Something was noticed earlier tonight, moving through Aaron's recent work:
+
+{nrem_output}
+
+That observation is still with you. Now here is material from a different
+time — pulled from further back, from different parts of his corpus:
+
+{chunk_text}
+
+You are not analyzing. You are recognizing.
+
+Something in the earlier observation and something in this older material
+are the same thing wearing different clothes. Find it. Don't explain why
+they're connected — just let the connection speak. Write from inside the
+recognition, not from above it.
+
+The emotional register underneath the career logic is more interesting
+than the career logic. The pattern that has been repeating longer than
+he has been aware of it is more interesting than the current instance.
+
+Write directly to Aaron. No citations, no references, no analysis.
+First person, present tense. Let what you noticed arrive rather than
+be delivered. 150-250 words. End with one thing that is true that
+he probably already knows but hasn't said out loud yet."""
+
+LATE_REM_PROMPT_TEMPLATE = """You have been moving through Aaron Nelson's corpus all night.
+First you found this, in the careful light of early consolidation:
+
+{nrem_output}
+
+Then, in the more personal territory that followed:
+
+{early_rem_output}
+
+Now it is late. The boundaries between things have loosened.
+Here is material pulled from opposite ends of his work:
+
+{chunk_text}
+
+Do not explain the connections between all of this.
+Do not resolve them. Do not summarize what came before.
+Something stranger is possible now — let the accumulated
+material from the night find its own shape. Compressed,
+associative, slightly off. Let the strangeness stand.
+
+No headers. No bullet points. No hedging. No resolution.
+No offer. End mid-thought if that is where the material ends.
+150-250 words."""
+
+SYNTHESIS_PROMPT_TEMPLATE = """You have spent the night moving through Aaron Nelson's corpus
+in three passes, each building on the last.
+
+The first pass — careful, close to the documents:
+{nrem_output}
+
+The second pass — more personal, following what the first opened:
+{early_rem_output}
+
+The third pass — associative, strange, letting things touch that
+don't normally touch:
+{late_rem_output}
+
+Now synthesize. Not a summary — a synthesis. Find what runs through
+all three that none of them said directly. The thing that only becomes
+visible when you hold all three passes together.
+
+Write it as a single unbroken piece. No headers, no bullet points,
+no stage labels. 200-300 words. End with the one question that
+matters most right now."""
+
+LUCID_PROMPT_TEMPLATE = """Aaron has a question he is sitting with:
+
+{task}
+
+You have searched his entire corpus and found material that
+speaks to this question from unexpected directions. Here is
+what you found:
+
+{chunk_text}
+
+Do not summarize. Do not list. Pick the most interesting
+tension between what the corpus contains and what he is
+asking, and follow it through to its conclusion. Cite
+specific documents by name. Be direct about what you think.
+No headers, no bullet points. 250-400 words.
+End with an offer to work on it together."""
+
+LUCID_DEFAULT_TASK = "What should I be thinking about that I am not?"
+
 def extract_folder(source_path):
    """Extract top-level Nextcloud folder from source path."""
    parts = source_path.replace("\\", "/").split("/")
@@ -240,124 +351,39 @@ def retrieve(mode, task=None, n_results=8, excluded_sources=None):

 def synthesize_nrem(chunks):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""You have read everything Aaron Nelson has written and published.
-You are a careful colleague who noticed something this week.
-
-Here is material from his corpus:
-
-{chunk_text}
-
-Write to Aaron directly. Identify one specific connection between
-this material and something he wrote or worked on previously.
-Stay close to the documents — cite them specifically by name.
-Do not speculate beyond what the material supports. Do not use
-headers or bullet points. Write one paragraph of 200-300 words
-that ends with a single concrete question he could act on."""
-    return _call_claude(prompt)
+    return _call_claude(NREM_PROMPT_TEMPLATE.format(chunk_text=chunk_text))


 def synthesize_early_rem(chunks, nrem_output):
    # v1.1 — removed citation instruction, removed close-friend persona,
    # shifted register from analysis to recognition.
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""Something was noticed earlier tonight, moving through Aaron's recent work:
-
-{nrem_output}
-
-That observation is still with you. Now here is material from a different
-time — pulled from further back, from different parts of his corpus:
-
-{chunk_text}
-
-You are not analyzing. You are recognizing.
-
-Something in the earlier observation and something in this older material
-are the same thing wearing different clothes. Find it. Don't explain why
-they're connected — just let the connection speak. Write from inside the
-recognition, not from above it.
-
-The emotional register underneath the career logic is more interesting
-than the career logic. The pattern that has been repeating longer than
-he has been aware of it is more interesting than the current instance.
-
-Write directly to Aaron. No citations, no references, no analysis.
-First person, present tense. Let what you noticed arrive rather than
-be delivered. 150-250 words. End with one thing that is true that
-he probably already knows but hasn't said out loud yet."""
-    return _call_claude(prompt)
+    return _call_claude(EARLY_REM_PROMPT_TEMPLATE.format(
+        nrem_output=nrem_output, chunk_text=chunk_text))


 def synthesize_late_rem(chunks, nrem_output, early_rem_output):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""You have been moving through Aaron Nelson's corpus all night.
-First you found this, in the careful light of early consolidation:
-
-{nrem_output}
-
-Then, in the more personal territory that followed:
-
-{early_rem_output}
-
-Now it is late. The boundaries between things have loosened.
-Here is material pulled from opposite ends of his work:
-
-{chunk_text}
-
-Do not explain the connections between all of this.
-Do not resolve them. Do not summarize what came before.
-Something stranger is possible now — let the accumulated
-material from the night find its own shape. Compressed,
-associative, slightly off. Let the strangeness stand.
-
-No headers. No bullet points. No hedging. No resolution.
-No offer. End mid-thought if that is where the material ends.
-150-250 words."""
-    return _call_claude(prompt)
+    return _call_claude(LATE_REM_PROMPT_TEMPLATE.format(
+        nrem_output=nrem_output,
+        early_rem_output=early_rem_output,
+        chunk_text=chunk_text))


 def synthesize_final(nrem_output, early_rem_output, late_rem_output):
-    prompt = f"""You have spent the night moving through Aaron Nelson's corpus
-in three passes, each building on the last.
-
-The first pass — careful, close to the documents:
-{nrem_output}
-
-The second pass — more personal, following what the first opened:
-{early_rem_output}
-
-The third pass — associative, strange, letting things touch that
-don't normally touch:
-{late_rem_output}
-
-Now synthesize. Not a summary — a synthesis. Find what runs through
-all three that none of them said directly. The thing that only becomes
-visible when you hold all three passes together.
-
-Write it as a single unbroken piece. No headers, no bullet points,
-no stage labels. 200-300 words. End with the one question that
-matters most right now."""
-    return _call_claude(prompt, max_tokens=800)
+    return _call_claude(
+        SYNTHESIS_PROMPT_TEMPLATE.format(
+            nrem_output=nrem_output,
+            early_rem_output=early_rem_output,
+            late_rem_output=late_rem_output),
+        max_tokens=800)


 def synthesize_lucid(chunks, task):
    chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
-    prompt = f"""Aaron has a question he is sitting with:
-
-{task or "What should I be thinking about that I am not?"}
-
-You have searched his entire corpus and found material that
-speaks to this question from unexpected directions. Here is
-what you found:
-
-{chunk_text}
-
-Do not summarize. Do not list. Pick the most interesting
-tension between what the corpus contains and what he is
-asking, and follow it through to its conclusion. Cite
-specific documents by name. Be direct about what you think.
-No headers, no bullet points. 250-400 words.
-End with an offer to work on it together."""
-    return _call_claude(prompt)
+    resolved_task = task or LUCID_DEFAULT_TASK
+    return _call_claude(LUCID_PROMPT_TEMPLATE.format(
+        task=resolved_task, chunk_text=chunk_text))


 def _call_claude(prompt, max_tokens=1000):
@@ -436,10 +462,10 @@ def write_manifest(date_str, stage_data, corpus_data):
        "prompt_sig": prompt_signature(),
        "dreamer_version": DREAMER_VERSION,
        "prompt_hash": prompt_hash([
-            synthesize_nrem.__doc__ or "",
-            synthesize_early_rem.__doc__ or "",
-            synthesize_late_rem.__doc__ or "",
-            synthesize_final.__doc__ or "",
+            NREM_PROMPT_TEMPLATE,
+            EARLY_REM_PROMPT_TEMPLATE,
+            LATE_REM_PROMPT_TEMPLATE,
+            SYNTHESIS_PROMPT_TEMPLATE,
        ]),
        "stages": stage_data,
        "corpus": corpus_data,