chat: cap retrieve_documents per turn, truncate displayed citations, broaden lock-file skip

- MAX_RETRIEVALS_PER_TURN (5): after five retrieve_documents calls in a single
  turn, further calls return a budget-exhausted message instead of executing.
  Caps cost on runaway multi-query loops without forbidding compound questions.
- MAX_CITED_SOURCES (5): accumulated_sources was growing to 14+ entries across
  multiple tool calls and showing chunks Claude never actually used. Cap the
  list returned to the UI at 5, preserving insertion order so the
  highest-relevance early-call results survive. Proper fix (Claude-driven
  inline citations) is bigger work, noted for later.
- ingest.py lock-file skip: changed prefix tuple from ("~$", ".") to ("~", ".")
  so it catches Office lock files even when Nextcloud's filesystem encoding has
  mangled the "$" into a Unicode replacement char. Matches what watcher.py
  already does.
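
The two caps described above could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: only MAX_RETRIEVALS_PER_TURN, MAX_CITED_SOURCES, retrieve_documents and accumulated_sources come from the message. TurnState, run_retrieval, sources_for_ui and the wording of the budget-exhausted message are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field

MAX_RETRIEVALS_PER_TURN = 5
MAX_CITED_SOURCES = 5


@dataclass
class TurnState:
    """Hypothetical per-turn bookkeeping; the real chat code may differ."""
    retrievals: int = 0
    accumulated_sources: list = field(default_factory=list)


def run_retrieval(state, query, retrieve):
    """Run one retrieve_documents-style call unless the turn budget is spent.

    `retrieve` stands in for the real retrieval function (an assumption).
    """
    if state.retrievals >= MAX_RETRIEVALS_PER_TURN:
        # Returned to the model in place of tool output; wording is illustrative.
        return "Retrieval budget exhausted for this turn; answer from what you have."
    state.retrievals += 1
    chunks = retrieve(query)
    state.accumulated_sources.extend(chunks)
    return chunks


def sources_for_ui(state):
    """Keep the first MAX_CITED_SOURCES entries, preserving insertion order."""
    return state.accumulated_sources[:MAX_CITED_SOURCES]
```

Slicing from the front is what lets results from the earliest calls survive. Under this sketch a sixth call in the same turn gets the message string back and adds nothing to accumulated_sources.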
+3 -1
@@ -82,7 +82,9 @@ IGNORED_TOP_FOLDERS = {"Drafts"}
 def _ingest_one(filepath: Path, embedder, root: Path = None) -> int:
     """Ingest a single file. Returns chunk count, 0 on skip/failure."""
-    if filepath.name.startswith(("~$", ".")):
+    # "~" catches Office lock files (~$) including the case where Nextcloud
+    # filesystem encoding has mangled the "$" to a unicode replacement char.
+    if filepath.name.startswith(("~", ".")):
         return 0
     if filepath.suffix.lower() not in SUPPORTED:
         return 0
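
To see what the broader prefix changes, the old and new tuples can be compared on a few sample names. This is a small sketch: is_skipped and the sample file names are hypothetical, and the mangled name assumes the "$" arrives as U+FFFD, which the commit message suggests but does not spell out.

```python
from pathlib import Path

OLD_PREFIXES = ("~$", ".")
NEW_PREFIXES = ("~", ".")


def is_skipped(path: Path, prefixes: tuple) -> bool:
    """Mirror the startswith check in _ingest_one for a given prefix tuple."""
    return path.name.startswith(prefixes)


# Hypothetical file names; the second assumes "$" was replaced by U+FFFD.
samples = [
    Path("docs/~$report.docx"),
    Path("docs/~\ufffdreport.docx"),
    Path("docs/.DS_Store"),
    Path("docs/report.docx"),
    Path("docs/~notes.txt"),
]

for p in samples:
    print(repr(p.name), is_skipped(p, OLD_PREFIXES), is_skipped(p, NEW_PREFIXES))
```

The mangled lock file is missed by the old tuple and caught by the new one. The trade-off is that any file whose name starts with "~", such as ~notes.txt here, is now skipped as well.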