dream: factor prompts into module-level templates, repair prompt_hash (Track 1 Finding 11)
prompt_hash() in dream.py was hashing function __doc__ strings, but the
synth functions don't have docstrings, so the hash was always MD5("") =
d41d8cd9 for every dream. The manifest field meant to detect undeclared
prompt drift carried no useful information.
Refactor:
- Each synth function's prompt template moved to a module-level constant
(NREM_PROMPT_TEMPLATE, EARLY_REM_PROMPT_TEMPLATE, LATE_REM_PROMPT_TEMPLATE,
SYNTHESIS_PROMPT_TEMPLATE, LUCID_PROMPT_TEMPLATE) using str.format()
placeholders instead of f-string interpolation.
- Synth functions call TEMPLATE.format(...) at use time. Output is byte-
identical to the previous f-string implementation.
- prompt_hash() now hashes the four pipeline template constants (lucid is
on-demand, not part of the nightly manifest — preserves prior scope).
- LUCID_DEFAULT_TASK extracted as a named constant from the lucid fallback
question (factoring only, no behavior change).
- PROMPT_VERSION_* constants and synth function signatures untouched.
- v1.1 register-shift comment in synthesize_early_rem preserved inline.
The post-fix hash will differ from d41d8cd9 (verified: b65695a1 in static
test). Historical manifests still carry d41d8cd9; the discontinuity is
intentional — pre-fix hashes were equally meaningless and faking continuity
would be worse than acknowledging the break.
Found by Track 1 inventory 2026-05-02 (Finding 11 / divergence #11).
Verified static import + hash determinism before commit.
This commit is contained in:
+131
-105
@@ -64,6 +64,117 @@ def prompt_hash(prompts: list[str]) -> str:
|
|||||||
combined = "".join(prompts)
|
combined = "".join(prompts)
|
||||||
return hashlib.md5(combined.encode()).hexdigest()[:8]
|
return hashlib.md5(combined.encode()).hexdigest()[:8]
|
||||||
|
|
||||||
|
# ─── Prompt templates ───────────────────────────────────────────────────────
|
||||||
|
# Module-level so prompt_hash() can hash actual prompt content. Any change to
|
||||||
|
# any template — even a single character — flips the manifest's prompt_hash.
|
||||||
|
# Templates use str.format() placeholders ({chunk_text}, {nrem_output}, ...);
|
||||||
|
# do not switch back to f-strings (the constant must be hashable independent
|
||||||
|
# of variable values). Literal { or } in template text would need to be
|
||||||
|
# doubled ({{, }}) — currently no template contains literal braces.
|
||||||
|
|
||||||
|
NREM_PROMPT_TEMPLATE = """You have read everything Aaron Nelson has written and published.
|
||||||
|
You are a careful colleague who noticed something this week.
|
||||||
|
|
||||||
|
Here is material from his corpus:
|
||||||
|
|
||||||
|
{chunk_text}
|
||||||
|
|
||||||
|
Write to Aaron directly. Identify one specific connection between
|
||||||
|
this material and something he wrote or worked on previously.
|
||||||
|
Stay close to the documents — cite them specifically by name.
|
||||||
|
Do not speculate beyond what the material supports. Do not use
|
||||||
|
headers or bullet points. Write one paragraph of 200-300 words
|
||||||
|
that ends with a single concrete question he could act on."""
|
||||||
|
|
||||||
|
EARLY_REM_PROMPT_TEMPLATE = """Something was noticed earlier tonight, moving through Aaron's recent work:
|
||||||
|
|
||||||
|
{nrem_output}
|
||||||
|
|
||||||
|
That observation is still with you. Now here is material from a different
|
||||||
|
time — pulled from further back, from different parts of his corpus:
|
||||||
|
|
||||||
|
{chunk_text}
|
||||||
|
|
||||||
|
You are not analyzing. You are recognizing.
|
||||||
|
|
||||||
|
Something in the earlier observation and something in this older material
|
||||||
|
are the same thing wearing different clothes. Find it. Don't explain why
|
||||||
|
they're connected — just let the connection speak. Write from inside the
|
||||||
|
recognition, not from above it.
|
||||||
|
|
||||||
|
The emotional register underneath the career logic is more interesting
|
||||||
|
than the career logic. The pattern that has been repeating longer than
|
||||||
|
he has been aware of it is more interesting than the current instance.
|
||||||
|
|
||||||
|
Write directly to Aaron. No citations, no references, no analysis.
|
||||||
|
First person, present tense. Let what you noticed arrive rather than
|
||||||
|
be delivered. 150-250 words. End with one thing that is true that
|
||||||
|
he probably already knows but hasn't said out loud yet."""
|
||||||
|
|
||||||
|
LATE_REM_PROMPT_TEMPLATE = """You have been moving through Aaron Nelson's corpus all night.
|
||||||
|
First you found this, in the careful light of early consolidation:
|
||||||
|
|
||||||
|
{nrem_output}
|
||||||
|
|
||||||
|
Then, in the more personal territory that followed:
|
||||||
|
|
||||||
|
{early_rem_output}
|
||||||
|
|
||||||
|
Now it is late. The boundaries between things have loosened.
|
||||||
|
Here is material pulled from opposite ends of his work:
|
||||||
|
|
||||||
|
{chunk_text}
|
||||||
|
|
||||||
|
Do not explain the connections between all of this.
|
||||||
|
Do not resolve them. Do not summarize what came before.
|
||||||
|
Something stranger is possible now — let the accumulated
|
||||||
|
material from the night find its own shape. Compressed,
|
||||||
|
associative, slightly off. Let the strangeness stand.
|
||||||
|
|
||||||
|
No headers. No bullet points. No hedging. No resolution.
|
||||||
|
No offer. End mid-thought if that is where the material ends.
|
||||||
|
150-250 words."""
|
||||||
|
|
||||||
|
SYNTHESIS_PROMPT_TEMPLATE = """You have spent the night moving through Aaron Nelson's corpus
|
||||||
|
in three passes, each building on the last.
|
||||||
|
|
||||||
|
The first pass — careful, close to the documents:
|
||||||
|
{nrem_output}
|
||||||
|
|
||||||
|
The second pass — more personal, following what the first opened:
|
||||||
|
{early_rem_output}
|
||||||
|
|
||||||
|
The third pass — associative, strange, letting things touch that
|
||||||
|
don't normally touch:
|
||||||
|
{late_rem_output}
|
||||||
|
|
||||||
|
Now synthesize. Not a summary — a synthesis. Find what runs through
|
||||||
|
all three that none of them said directly. The thing that only becomes
|
||||||
|
visible when you hold all three passes together.
|
||||||
|
|
||||||
|
Write it as a single unbroken piece. No headers, no bullet points,
|
||||||
|
no stage labels. 200-300 words. End with the one question that
|
||||||
|
matters most right now."""
|
||||||
|
|
||||||
|
LUCID_PROMPT_TEMPLATE = """Aaron has a question he is sitting with:
|
||||||
|
|
||||||
|
{task}
|
||||||
|
|
||||||
|
You have searched his entire corpus and found material that
|
||||||
|
speaks to this question from unexpected directions. Here is
|
||||||
|
what you found:
|
||||||
|
|
||||||
|
{chunk_text}
|
||||||
|
|
||||||
|
Do not summarize. Do not list. Pick the most interesting
|
||||||
|
tension between what the corpus contains and what he is
|
||||||
|
asking, and follow it through to its conclusion. Cite
|
||||||
|
specific documents by name. Be direct about what you think.
|
||||||
|
No headers, no bullet points. 250-400 words.
|
||||||
|
End with an offer to work on it together."""
|
||||||
|
|
||||||
|
LUCID_DEFAULT_TASK = "What should I be thinking about that I am not?"
|
||||||
|
|
||||||
def extract_folder(source_path):
|
def extract_folder(source_path):
|
||||||
"""Extract top-level Nextcloud folder from source path."""
|
"""Extract top-level Nextcloud folder from source path."""
|
||||||
parts = source_path.replace("\\", "/").split("/")
|
parts = source_path.replace("\\", "/").split("/")
|
||||||
@@ -240,124 +351,39 @@ def retrieve(mode, task=None, n_results=8, excluded_sources=None):
|
|||||||
|
|
||||||
def synthesize_nrem(chunks):
|
def synthesize_nrem(chunks):
|
||||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||||
prompt = f"""You have read everything Aaron Nelson has written and published.
|
return _call_claude(NREM_PROMPT_TEMPLATE.format(chunk_text=chunk_text))
|
||||||
You are a careful colleague who noticed something this week.
|
|
||||||
|
|
||||||
Here is material from his corpus:
|
|
||||||
|
|
||||||
{chunk_text}
|
|
||||||
|
|
||||||
Write to Aaron directly. Identify one specific connection between
|
|
||||||
this material and something he wrote or worked on previously.
|
|
||||||
Stay close to the documents — cite them specifically by name.
|
|
||||||
Do not speculate beyond what the material supports. Do not use
|
|
||||||
headers or bullet points. Write one paragraph of 200-300 words
|
|
||||||
that ends with a single concrete question he could act on."""
|
|
||||||
return _call_claude(prompt)
|
|
||||||
|
|
||||||
|
|
||||||
def synthesize_early_rem(chunks, nrem_output):
|
def synthesize_early_rem(chunks, nrem_output):
|
||||||
# v1.1 — removed citation instruction, removed close-friend persona,
|
# v1.1 — removed citation instruction, removed close-friend persona,
|
||||||
# shifted register from analysis to recognition.
|
# shifted register from analysis to recognition.
|
||||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||||
prompt = f"""Something was noticed earlier tonight, moving through Aaron's recent work:
|
return _call_claude(EARLY_REM_PROMPT_TEMPLATE.format(
|
||||||
|
nrem_output=nrem_output, chunk_text=chunk_text))
|
||||||
{nrem_output}
|
|
||||||
|
|
||||||
That observation is still with you. Now here is material from a different
|
|
||||||
time — pulled from further back, from different parts of his corpus:
|
|
||||||
|
|
||||||
{chunk_text}
|
|
||||||
|
|
||||||
You are not analyzing. You are recognizing.
|
|
||||||
|
|
||||||
Something in the earlier observation and something in this older material
|
|
||||||
are the same thing wearing different clothes. Find it. Don't explain why
|
|
||||||
they're connected — just let the connection speak. Write from inside the
|
|
||||||
recognition, not from above it.
|
|
||||||
|
|
||||||
The emotional register underneath the career logic is more interesting
|
|
||||||
than the career logic. The pattern that has been repeating longer than
|
|
||||||
he has been aware of it is more interesting than the current instance.
|
|
||||||
|
|
||||||
Write directly to Aaron. No citations, no references, no analysis.
|
|
||||||
First person, present tense. Let what you noticed arrive rather than
|
|
||||||
be delivered. 150-250 words. End with one thing that is true that
|
|
||||||
he probably already knows but hasn't said out loud yet."""
|
|
||||||
return _call_claude(prompt)
|
|
||||||
|
|
||||||
|
|
||||||
def synthesize_late_rem(chunks, nrem_output, early_rem_output):
|
def synthesize_late_rem(chunks, nrem_output, early_rem_output):
|
||||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||||
prompt = f"""You have been moving through Aaron Nelson's corpus all night.
|
return _call_claude(LATE_REM_PROMPT_TEMPLATE.format(
|
||||||
First you found this, in the careful light of early consolidation:
|
nrem_output=nrem_output,
|
||||||
|
early_rem_output=early_rem_output,
|
||||||
{nrem_output}
|
chunk_text=chunk_text))
|
||||||
|
|
||||||
Then, in the more personal territory that followed:
|
|
||||||
|
|
||||||
{early_rem_output}
|
|
||||||
|
|
||||||
Now it is late. The boundaries between things have loosened.
|
|
||||||
Here is material pulled from opposite ends of his work:
|
|
||||||
|
|
||||||
{chunk_text}
|
|
||||||
|
|
||||||
Do not explain the connections between all of this.
|
|
||||||
Do not resolve them. Do not summarize what came before.
|
|
||||||
Something stranger is possible now — let the accumulated
|
|
||||||
material from the night find its own shape. Compressed,
|
|
||||||
associative, slightly off. Let the strangeness stand.
|
|
||||||
|
|
||||||
No headers. No bullet points. No hedging. No resolution.
|
|
||||||
No offer. End mid-thought if that is where the material ends.
|
|
||||||
150-250 words."""
|
|
||||||
return _call_claude(prompt)
|
|
||||||
|
|
||||||
|
|
||||||
def synthesize_final(nrem_output, early_rem_output, late_rem_output):
|
def synthesize_final(nrem_output, early_rem_output, late_rem_output):
|
||||||
prompt = f"""You have spent the night moving through Aaron Nelson's corpus
|
return _call_claude(
|
||||||
in three passes, each building on the last.
|
SYNTHESIS_PROMPT_TEMPLATE.format(
|
||||||
|
nrem_output=nrem_output,
|
||||||
The first pass — careful, close to the documents:
|
early_rem_output=early_rem_output,
|
||||||
{nrem_output}
|
late_rem_output=late_rem_output),
|
||||||
|
max_tokens=800)
|
||||||
The second pass — more personal, following what the first opened:
|
|
||||||
{early_rem_output}
|
|
||||||
|
|
||||||
The third pass — associative, strange, letting things touch that
|
|
||||||
don't normally touch:
|
|
||||||
{late_rem_output}
|
|
||||||
|
|
||||||
Now synthesize. Not a summary — a synthesis. Find what runs through
|
|
||||||
all three that none of them said directly. The thing that only becomes
|
|
||||||
visible when you hold all three passes together.
|
|
||||||
|
|
||||||
Write it as a single unbroken piece. No headers, no bullet points,
|
|
||||||
no stage labels. 200-300 words. End with the one question that
|
|
||||||
matters most right now."""
|
|
||||||
return _call_claude(prompt, max_tokens=800)
|
|
||||||
|
|
||||||
|
|
||||||
def synthesize_lucid(chunks, task):
|
def synthesize_lucid(chunks, task):
|
||||||
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
chunk_text = "\n\n---\n\n".join([f"[{c['source']}]\n{c['content']}" for c in chunks])
|
||||||
prompt = f"""Aaron has a question he is sitting with:
|
resolved_task = task or LUCID_DEFAULT_TASK
|
||||||
|
return _call_claude(LUCID_PROMPT_TEMPLATE.format(
|
||||||
{task or "What should I be thinking about that I am not?"}
|
task=resolved_task, chunk_text=chunk_text))
|
||||||
|
|
||||||
You have searched his entire corpus and found material that
|
|
||||||
speaks to this question from unexpected directions. Here is
|
|
||||||
what you found:
|
|
||||||
|
|
||||||
{chunk_text}
|
|
||||||
|
|
||||||
Do not summarize. Do not list. Pick the most interesting
|
|
||||||
tension between what the corpus contains and what he is
|
|
||||||
asking, and follow it through to its conclusion. Cite
|
|
||||||
specific documents by name. Be direct about what you think.
|
|
||||||
No headers, no bullet points. 250-400 words.
|
|
||||||
End with an offer to work on it together."""
|
|
||||||
return _call_claude(prompt)
|
|
||||||
|
|
||||||
|
|
||||||
def _call_claude(prompt, max_tokens=1000):
|
def _call_claude(prompt, max_tokens=1000):
|
||||||
@@ -436,10 +462,10 @@ def write_manifest(date_str, stage_data, corpus_data):
|
|||||||
"prompt_sig": prompt_signature(),
|
"prompt_sig": prompt_signature(),
|
||||||
"dreamer_version": DREAMER_VERSION,
|
"dreamer_version": DREAMER_VERSION,
|
||||||
"prompt_hash": prompt_hash([
|
"prompt_hash": prompt_hash([
|
||||||
synthesize_nrem.__doc__ or "",
|
NREM_PROMPT_TEMPLATE,
|
||||||
synthesize_early_rem.__doc__ or "",
|
EARLY_REM_PROMPT_TEMPLATE,
|
||||||
synthesize_late_rem.__doc__ or "",
|
LATE_REM_PROMPT_TEMPLATE,
|
||||||
synthesize_final.__doc__ or "",
|
SYNTHESIS_PROMPT_TEMPLATE,
|
||||||
]),
|
]),
|
||||||
"stages": stage_data,
|
"stages": stage_data,
|
||||||
"corpus": corpus_data,
|
"corpus": corpus_data,
|
||||||
|
|||||||
Reference in New Issue
Block a user