dream: factor prompts into module-level templates, repair prompt_hash (Track 1 Finding 11)
prompt_hash() in dream.py was hashing function __doc__ strings, but the
synth functions don't have docstrings, so the hash was always MD5("") =
d41d8cd9 for every dream. The manifest field meant to detect undeclared
prompt drift carried no useful information.
Refactor:
- Each synth function's prompt template moved to a module-level constant
(NREM_PROMPT_TEMPLATE, EARLY_REM_PROMPT_TEMPLATE, LATE_REM_PROMPT_TEMPLATE,
SYNTHESIS_PROMPT_TEMPLATE, LUCID_PROMPT_TEMPLATE) using str.format()
placeholders instead of f-string interpolation.
- Synth functions call TEMPLATE.format(...) at use time. Output is byte-
identical to the previous f-string implementation.
- prompt_hash() now hashes the four pipeline template constants (lucid is
on-demand, not part of the nightly manifest — preserves prior scope).
- LUCID_DEFAULT_TASK extracted as a named constant from the lucid fallback
question (factoring only, no behavior change).
- PROMPT_VERSION_* constants and synth function signatures untouched.
- v1.1 register-shift comment in synthesize_early_rem preserved inline.
The post-fix hash will differ from d41d8cd9 (verified: b65695a1 in static
test). Historical manifests still carry d41d8cd9; the discontinuity is
intentional — pre-fix hashes were equally meaningless and faking continuity
would be worse than acknowledging the break.
Found by Track 1 inventory 2026-05-02 (Finding 11 / divergence #11).
Verified static import + hash determinism before commit.
This commit is contained in:
+131
-105
@@ -64,6 +64,117 @@ def prompt_hash(prompts: list[str]) -> str:
|
||||
combined = "".join(prompts)
|
||||
return hashlib.md5(combined.encode()).hexdigest()[:8]
|
||||
|
||||
# ─── Prompt templates ───────────────────────────────────────────────────────
|
||||
# Module-level so prompt_hash() can hash actual prompt content. Any change to
|
||||
# any template — even a single character — flips the manifest's prompt_hash.
|
||||
# Templates use str.format() placeholders ({chunk_text}, {nrem_output}, ...);
|
||||
# do not switch back to f-strings (the constant must be hashable independent
|
||||
# of variable values). Literal { or } in template text would need to be
|
||||
# doubled ({{, }}) — currently no template contains literal braces.
|
||||
|
||||
NREM_PROMPT_TEMPLATE = """You have read everything Aaron Nelson has written and published.
|
||||
You are a careful colleague who noticed something this week.
|
||||
|
||||
Here is material from his corpus:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Write to Aaron directly. Identify one specific connection between
|
||||
this material and something he wrote or worked on previously.
|
||||
Stay close to the documents — cite them specifically by name.
|
||||
Do not speculate beyond what the material supports. Do not use
|
||||
headers or bullet points. Write one paragraph of 200-300 words
|
||||
that ends with a single concrete question he could act on."""
|
||||
|
||||
EARLY_REM_PROMPT_TEMPLATE = """Something was noticed earlier tonight, moving through Aaron's recent work:
|
||||
|
||||
{nrem_output}
|
||||
|
||||
That observation is still with you. Now here is material from a different
|
||||
time — pulled from further back, from different parts of his corpus:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
You are not analyzing. You are recognizing.
|
||||
|
||||
Something in the earlier observation and something in this older material
|
||||
are the same thing wearing different clothes. Find it. Don't explain why
|
||||
they're connected — just let the connection speak. Write from inside the
|
||||
recognition, not from above it.
|
||||
|
||||
The emotional register underneath the career logic is more interesting
|
||||
than the career logic. The pattern that has been repeating longer than
|
||||
he has been aware of it is more interesting than the current instance.
|
||||
|
||||
Write directly to Aaron. No citations, no references, no analysis.
|
||||
First person, present tense. Let what you noticed arrive rather than
|
||||
be delivered. 150-250 words. End with one thing that is true that
|
||||
he probably already knows but hasn't said out loud yet."""
|
||||
|
||||
LATE_REM_PROMPT_TEMPLATE = """You have been moving through Aaron Nelson's corpus all night.
|
||||
First you found this, in the careful light of early consolidation:
|
||||
|
||||
{nrem_output}
|
||||
|
||||
Then, in the more personal territory that followed:
|
||||
|
||||
{early_rem_output}
|
||||
|
||||
Now it is late. The boundaries between things have loosened.
|
||||
Here is material pulled from opposite ends of his work:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Do not explain the connections between all of this.
|
||||
Do not resolve them. Do not summarize what came before.
|
||||
Something stranger is possible now — let the accumulated
|
||||
material from the night find its own shape. Compressed,
|
||||
associative, slightly off. Let the strangeness stand.
|
||||
|
||||
No headers. No bullet points. No hedging. No resolution.
|
||||
No offer. End mid-thought if that is where the material ends.
|
||||
150-250 words."""
|
||||
|
||||
SYNTHESIS_PROMPT_TEMPLATE = """You have spent the night moving through Aaron Nelson's corpus
|
||||
in three passes, each building on the last.
|
||||
|
||||
The first pass — careful, close to the documents:
|
||||
{nrem_output}
|
||||
|
||||
The second pass — more personal, following what the first opened:
|
||||
{early_rem_output}
|
||||
|
||||
The third pass — associative, strange, letting things touch that
|
||||
don't normally touch:
|
||||
{late_rem_output}
|
||||
|
||||
Now synthesize. Not a summary — a synthesis. Find what runs through
|
||||
all three that none of them said directly. The thing that only becomes
|
||||
visible when you hold all three passes together.
|
||||
|
||||
Write it as a single unbroken piece. No headers, no bullet points,
|
||||
no stage labels. 200-300 words. End with the one question that
|
||||
matters most right now."""
|
||||
|
||||
LUCID_PROMPT_TEMPLATE = """Aaron has a question he is sitting with:
|
||||
|
||||
{task}
|
||||
|
||||
You have searched his entire corpus and found material that
|
||||
speaks to this question from unexpected directions. Here is
|
||||
what you found:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Do not summarize. Do not list. Pick the most interesting
|
||||
tension between what the corpus contains and what he is
|
||||
asking, and follow it through to its conclusion. Cite
|
||||
specific documents by name. Be direct about what you think.
|
||||
No headers, no bullet points. 250-400 words.
|
||||
End with an offer to work on it together."""
|
||||
|
||||
LUCID_DEFAULT_TASK = "What should I be thinking about that I am not?"
|
||||
|
||||
def extract_folder(source_path):
|
||||
"""Extract top-level Nextcloud folder from source path."""
|
||||
parts = source_path.replace("\\", "/").split("/")
|
||||
@@ -240,124 +351,39 @@ def retrieve(mode, task=None, n_results=8, excluded_sources=None):
|
||||
|
||||
def synthesize_nrem(chunks):
|
||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||
prompt = f"""You have read everything Aaron Nelson has written and published.
|
||||
You are a careful colleague who noticed something this week.
|
||||
|
||||
Here is material from his corpus:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Write to Aaron directly. Identify one specific connection between
|
||||
this material and something he wrote or worked on previously.
|
||||
Stay close to the documents — cite them specifically by name.
|
||||
Do not speculate beyond what the material supports. Do not use
|
||||
headers or bullet points. Write one paragraph of 200-300 words
|
||||
that ends with a single concrete question he could act on."""
|
||||
return _call_claude(prompt)
|
||||
return _call_claude(NREM_PROMPT_TEMPLATE.format(chunk_text=chunk_text))
|
||||
|
||||
|
||||
def synthesize_early_rem(chunks, nrem_output):
|
||||
# v1.1 — removed citation instruction, removed close-friend persona,
|
||||
# shifted register from analysis to recognition.
|
||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||
prompt = f"""Something was noticed earlier tonight, moving through Aaron's recent work:
|
||||
|
||||
{nrem_output}
|
||||
|
||||
That observation is still with you. Now here is material from a different
|
||||
time — pulled from further back, from different parts of his corpus:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
You are not analyzing. You are recognizing.
|
||||
|
||||
Something in the earlier observation and something in this older material
|
||||
are the same thing wearing different clothes. Find it. Don't explain why
|
||||
they're connected — just let the connection speak. Write from inside the
|
||||
recognition, not from above it.
|
||||
|
||||
The emotional register underneath the career logic is more interesting
|
||||
than the career logic. The pattern that has been repeating longer than
|
||||
he has been aware of it is more interesting than the current instance.
|
||||
|
||||
Write directly to Aaron. No citations, no references, no analysis.
|
||||
First person, present tense. Let what you noticed arrive rather than
|
||||
be delivered. 150-250 words. End with one thing that is true that
|
||||
he probably already knows but hasn't said out loud yet."""
|
||||
return _call_claude(prompt)
|
||||
return _call_claude(EARLY_REM_PROMPT_TEMPLATE.format(
|
||||
nrem_output=nrem_output, chunk_text=chunk_text))
|
||||
|
||||
|
||||
def synthesize_late_rem(chunks, nrem_output, early_rem_output):
|
||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||
prompt = f"""You have been moving through Aaron Nelson's corpus all night.
|
||||
First you found this, in the careful light of early consolidation:
|
||||
|
||||
{nrem_output}
|
||||
|
||||
Then, in the more personal territory that followed:
|
||||
|
||||
{early_rem_output}
|
||||
|
||||
Now it is late. The boundaries between things have loosened.
|
||||
Here is material pulled from opposite ends of his work:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Do not explain the connections between all of this.
|
||||
Do not resolve them. Do not summarize what came before.
|
||||
Something stranger is possible now — let the accumulated
|
||||
material from the night find its own shape. Compressed,
|
||||
associative, slightly off. Let the strangeness stand.
|
||||
|
||||
No headers. No bullet points. No hedging. No resolution.
|
||||
No offer. End mid-thought if that is where the material ends.
|
||||
150-250 words."""
|
||||
return _call_claude(prompt)
|
||||
return _call_claude(LATE_REM_PROMPT_TEMPLATE.format(
|
||||
nrem_output=nrem_output,
|
||||
early_rem_output=early_rem_output,
|
||||
chunk_text=chunk_text))
|
||||
|
||||
|
||||
def synthesize_final(nrem_output, early_rem_output, late_rem_output):
|
||||
prompt = f"""You have spent the night moving through Aaron Nelson's corpus
|
||||
in three passes, each building on the last.
|
||||
|
||||
The first pass — careful, close to the documents:
|
||||
{nrem_output}
|
||||
|
||||
The second pass — more personal, following what the first opened:
|
||||
{early_rem_output}
|
||||
|
||||
The third pass — associative, strange, letting things touch that
|
||||
don't normally touch:
|
||||
{late_rem_output}
|
||||
|
||||
Now synthesize. Not a summary — a synthesis. Find what runs through
|
||||
all three that none of them said directly. The thing that only becomes
|
||||
visible when you hold all three passes together.
|
||||
|
||||
Write it as a single unbroken piece. No headers, no bullet points,
|
||||
no stage labels. 200-300 words. End with the one question that
|
||||
matters most right now."""
|
||||
return _call_claude(prompt, max_tokens=800)
|
||||
return _call_claude(
|
||||
SYNTHESIS_PROMPT_TEMPLATE.format(
|
||||
nrem_output=nrem_output,
|
||||
early_rem_output=early_rem_output,
|
||||
late_rem_output=late_rem_output),
|
||||
max_tokens=800)
|
||||
|
||||
|
||||
def synthesize_lucid(chunks, task):
|
||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||
prompt = f"""Aaron has a question he is sitting with:
|
||||
|
||||
{task or "What should I be thinking about that I am not?"}
|
||||
|
||||
You have searched his entire corpus and found material that
|
||||
speaks to this question from unexpected directions. Here is
|
||||
what you found:
|
||||
|
||||
{chunk_text}
|
||||
|
||||
Do not summarize. Do not list. Pick the most interesting
|
||||
tension between what the corpus contains and what he is
|
||||
asking, and follow it through to its conclusion. Cite
|
||||
specific documents by name. Be direct about what you think.
|
||||
No headers, no bullet points. 250-400 words.
|
||||
End with an offer to work on it together."""
|
||||
return _call_claude(prompt)
|
||||
resolved_task = task or LUCID_DEFAULT_TASK
|
||||
return _call_claude(LUCID_PROMPT_TEMPLATE.format(
|
||||
task=resolved_task, chunk_text=chunk_text))
|
||||
|
||||
|
||||
def _call_claude(prompt, max_tokens=1000):
|
||||
@@ -436,10 +462,10 @@ def write_manifest(date_str, stage_data, corpus_data):
|
||||
"prompt_sig": prompt_signature(),
|
||||
"dreamer_version": DREAMER_VERSION,
|
||||
"prompt_hash": prompt_hash([
|
||||
synthesize_nrem.__doc__ or "",
|
||||
synthesize_early_rem.__doc__ or "",
|
||||
synthesize_late_rem.__doc__ or "",
|
||||
synthesize_final.__doc__ or "",
|
||||
NREM_PROMPT_TEMPLATE,
|
||||
EARLY_REM_PROMPT_TEMPLATE,
|
||||
LATE_REM_PROMPT_TEMPLATE,
|
||||
SYNTHESIS_PROMPT_TEMPLATE,
|
||||
]),
|
||||
"stages": stage_data,
|
||||
"corpus": corpus_data,
|
||||
|
||||
Reference in New Issue
Block a user