aaron d2ec20e373 graphiti_patches: vendored FalkorDB vector index support for graphiti-core 0.29.0

Adds native FalkorDB vector index support to graphiti-core's FalkorDB
driver. Three patched files (graph_queries.py, falkordb_driver.py,
falkordb/operations/search_ops.py) plus apply.sh that backs up venv
files and copies patches over.

Why this exists: graphiti-core 0.29.0 builds similarity queries using
interpreted Cypher cosine math (vec.cosineDistance) which produces a
full-table scan over Entity/RELATES_TO/Community nodes for every search.
At ~4,000+ entities, single-episode add_episode took 8+ minutes for the
resolve-against-existing-graph step and bulk ingest hung indefinitely.
FalkorDB itself supports db.idx.vector.queryNodes and queryRelationships
procedures backed by HNSW indexes; the driver just doesn't use them.

Patches:

1. graph_queries.py — adds get_vector_indices() returning CREATE VECTOR
   INDEX statements for FalkorDB (Entity.name_embedding,
   RELATES_TO.fact_embedding, Community.name_embedding). HNSW with
   cosine similarity. Adds VECTOR_INDEX_CANDIDATE_MULTIPLIER for
   over-fetch when WHERE filters reject some top-k results. Original
   get_vector_cosine_func_query preserved for fallback.

2. falkordb_driver.py — extends build_indices_and_constraints() to call
   get_vector_indices() alongside range and fulltext. Adds cache
   invalidation hook so the search_ops dispatcher re-probes for indexes
   after they're built.

3. falkordb/operations/search_ops.py — adds vector-index dispatcher
   helpers (_falkordb_vector_index_exists with module-level cache,
   _falkordb_vector_node_search_cypher, _falkordb_vector_edge_search_cypher).
   Rewrites the three vector-similarity call sites (Entity.name_embedding,
   RELATES_TO.fact_embedding, Community.name_embedding) to use
   db.idx.vector.queryNodes / queryRelationships when available, fall
   back to interpreted-Cypher cosine math when not. Index existence
   probed once per (label, attribute, entity_type) and cached.

Empirical result: single-episode add_episode against a 4,277-entity
graph went from indefinite hang to 8.2 seconds. Bulk re-ingest of
already-known content (worst case for entity dedup) committed in 60ms.

Activation requires bridging driver._search_ops to driver.search_interface
in the sidecar (see graphiti_service.py). graphiti-core declares
search_interface as the dispatcher attribute but never assigns the
per-driver implementation to it — naming mismatch in their internal
refactor. The bridge is one line in our sidecar's lifespan.

Upstream candidate: this is a known gap (referenced indirectly in
upstream issue #1263 RFC for external vector store overlay). Maintainers'
attention is on Milvus/Qdrant/Pinecone overlay; this is the FalkorDB-
native alternative for users who don't want to run a separate vector DB.
PR after empirical validation in production. Apache-2.0 graphiti-core
source is NOT vendored — backups/ is gitignored to keep the upstream
source out of this repo.

2026-05-20 04:04:24 +00:00

graphiti-core Patches — FalkorDB Vector Index Support

Vendored patches against graphiti-core 0.29.0 adding native FalkorDB vector index support. Three files are modified: graphiti_core/graph_queries.py and two files under graphiti_core/driver/falkordb/. No changes to Neo4j or Kuzu code paths.

Why this exists

graphiti-core's FalkorDB driver uses interpreted Cypher cosine math (vec.cosineDistance(...)) for similarity search. Each query becomes a full scan over every Entity and Community node or every RELATES_TO edge. At ~4,000+ entities, the resolve-against-existing-graph step of a single-episode ingest takes more than 8 minutes, and bulk ingest hangs FalkorDB. FalkorDB itself supports the db.idx.vector.queryNodes and db.idx.vector.queryRelationships procedures, backed by HNSW indexes; graphiti-core's driver doesn't use them.
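To make the difference concrete, here is a minimal sketch of the two query shapes for a node search. The helper names, the `$search_vector`, `$min_score`, `$limit`, and `$candidates` parameters, and the `(2 - distance) / 2` normalization are illustrative assumptions, not the exact Cypher that graphiti-core or these patches emit. It also assumes the `score` yielded by the index procedure is a cosine distance comparable to `vec.cosineDistance`.

```python
def brute_force_node_query(label: str, attribute: str) -> str:
    """Full-scan shape: every node with this label is visited and scored."""
    return (
        f"MATCH (n:{label}) "
        f"WITH n, (2 - vec.cosineDistance(n.{attribute}, vecf32($search_vector))) / 2 AS score "
        "WHERE score > $min_score "
        "RETURN n.uuid AS uuid, score ORDER BY score DESC LIMIT $limit"
    )


def indexed_node_query(label: str, attribute: str) -> str:
    """Index-backed shape: the HNSW index returns $candidates nearest nodes,
    and only those are filtered and ranked.

    Assumption: the yielded score is a cosine distance, so it is normalized
    the same way as in the brute-force query.
    """
    return (
        f"CALL db.idx.vector.queryNodes('{label}', '{attribute}', "
        "$candidates, vecf32($search_vector)) "
        "YIELD node AS n, score AS distance "
        "WITH n, (2 - distance) / 2 AS score "
        "WHERE score > $min_score "
        "RETURN n.uuid AS uuid, score ORDER BY score DESC LIMIT $limit"
    )


if __name__ == "__main__":
    print(brute_force_node_query("Entity", "name_embedding"))
    print(indexed_node_query("Entity", "name_embedding"))
```

The edge case would use db.idx.vector.queryRelationships with `YIELD relationship` in place of `YIELD node`.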

These patches:

1. Add get_vector_indices() to graph_queries.py, returning CREATE VECTOR INDEX statements for FalkorDB on Entity.name_embedding, RELATES_TO.fact_embedding, and Community.name_embedding.
2. Extend falkordb_driver.py:build_indices_and_constraints() to create the vector indexes alongside the range and fulltext indexes.
3. Rewrite the three vector-similarity call sites in falkordb/operations/search_ops.py to use db.idx.vector.queryNodes and db.idx.vector.queryRelationships when the index exists, falling back to full-scan cosine math when it does not. The index query over-fetches by a configurable multiplier (VECTOR_INDEX_CANDIDATE_MULTIPLIER) so that enough results remain after WHERE filters reject some of the top-k candidates.
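The three pieces above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patched code: the function names, the multiplier value of 4, the embedding dimension of 1024, and the injected `probe` callable are all hypothetical. Only the three label/attribute pairs, the cosine similarity function, and the per-(label, attribute, entity_type) cache come from the patch description.

```python
from typing import Callable

# Assumed values; the real constants live in the patched graph_queries.py.
VECTOR_INDEX_CANDIDATE_MULTIPLIER = 4
EMBEDDING_DIM = 1024

# (entity_type, label, attribute) for the three indexed embeddings.
VECTOR_INDEX_TARGETS = [
    ("NODE", "Entity", "name_embedding"),
    ("RELATIONSHIP", "RELATES_TO", "fact_embedding"),
    ("NODE", "Community", "name_embedding"),
]


def vector_index_statements(dim: int = EMBEDDING_DIM) -> list[str]:
    """Build one CREATE VECTOR INDEX statement per target."""
    statements = []
    for entity_type, label, attribute in VECTOR_INDEX_TARGETS:
        pattern = f"(e:{label})" if entity_type == "NODE" else f"()-[e:{label}]->()"
        statements.append(
            f"CREATE VECTOR INDEX FOR {pattern} ON (e.{attribute}) "
            f"OPTIONS {{dimension: {dim}, similarityFunction: 'cosine'}}"
        )
    return statements


def candidate_limit(limit: int, multiplier: int = VECTOR_INDEX_CANDIDATE_MULTIPLIER) -> int:
    """Over-fetch so that post-index WHERE filters still leave `limit` rows."""
    return limit * max(1, multiplier)


# Module-level cache keyed by (label, attribute, entity_type).
_index_cache: dict[tuple[str, str, str], bool] = {}


def vector_index_exists(
    label: str,
    attribute: str,
    entity_type: str,
    probe: Callable[[str, str, str], bool],
) -> bool:
    """Ask the database once per key, then answer from the cache."""
    key = (label, attribute, entity_type)
    if key not in _index_cache:
        _index_cache[key] = probe(label, attribute, entity_type)
    return _index_cache[key]


def invalidate_index_cache() -> None:
    """Call after building indexes so the next search re-probes."""
    _index_cache.clear()
```

Keeping the probe behind a cache matters because the dispatcher runs on every search; without it, each similarity query would pay for an extra index lookup. The invalidation hook covers the case where a search ran (and cached "no index") before build_indices_and_constraints() created the indexes.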

Files

| Patched file | Change |
| --- | --- |
| `graphiti_core/graph_queries.py` | Adds `get_vector_indices()` |
| `graphiti_core/driver/falkordb/falkordb_driver.py` | Extends `build_indices_and_constraints` |
| `graphiti_core/driver/falkordb/operations/search_ops.py` | Three query rewrites |

How to apply

Run ./apply.sh. It backs up the originals from the venv into ./backups/<timestamp>/ and then copies the patched files over them.

How to revert

Copy each file from the timestamped backup back over its installed copy in the venv:

cp backups/<ts>/graph_queries.py /home/aaron/aaronai/venv/lib/python3.12/site-packages/graphiti_core/graph_queries.py
# ...etc
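If you would rather restore all three files in one step, a small helper along these lines could do it. This is a sketch, not part of the repo: `restore_backup` is a hypothetical name, and it assumes the backup directory holds the three files flat under their base names, which matches the graph_queries.py example above but is not confirmed for the other two files. Adjust the mapping if apply.sh lays them out differently.

```python
import shutil
from pathlib import Path

# Assumed backup layout: base name in backups/<ts>/ -> path under site-packages.
PATCHED_FILES = {
    "graph_queries.py": "graphiti_core/graph_queries.py",
    "falkordb_driver.py": "graphiti_core/driver/falkordb/falkordb_driver.py",
    "search_ops.py": "graphiti_core/driver/falkordb/operations/search_ops.py",
}


def restore_backup(backup_dir: Path, site_packages: Path) -> list[Path]:
    """Copy every backed-up file over its installed counterpart.

    All backup files are checked first so a partial backup never leaves
    the venv with a mix of patched and original files.
    """
    missing = [name for name in PATCHED_FILES if not (backup_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"backup is incomplete, missing: {missing}")

    restored = []
    for name, relative_target in PATCHED_FILES.items():
        target = site_packages / relative_target
        shutil.copy2(backup_dir / name, target)
        restored.append(target)
    return restored
```

Call it with the timestamped backup directory and the venv's site-packages directory, for example `restore_backup(Path("backups/<ts>"), Path(".../site-packages"))`.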

Upstream candidate

This is a documented gap: upstream issue #1263, the RFC for an external vector store overlay, references it indirectly. The maintainers' attention is on the Milvus/external vector DB overlay; this patch is the FalkorDB-native alternative for users who don't want to run a separate vector DB. Consider opening a PR after empirical validation in production.
