Changelog
Reverse-chronological. Versions follow SemVer and correspond to git tags. The "Unreleased" section accumulates between tags.
v0.13.0 — Vector search & models split
Added
- models.py split into package. The 3641-line models.py is now a models/ package with focused modules: orm.py, query.py, schema.py, session.py, agents.py. All public imports are unchanged via PEP 562 lazy __getattr__ re-exports.
- Full-stack vector search support. VectorProperty descriptor on NodeModel. __vector_indexes__ for declaring vector indexes. SchemaDDL.vector_indexes() generates CREATE VECTOR INDEX DDL (Neo4j 5.11+). Query.vector_search() / vector_search_model() for the query builder. GraphSession.vector_search() / semantic_search() and async equivalents on AsyncGraphSession.
- Embedding adapters. cypher_validator.embeddings module with OpenAIEmbeddings, SentenceTransformerEmbeddings, CohereEmbeddings adapters. EmbeddingFn / BatchEmbeddingFn runtime-checkable protocols.
- CLI vector search. cypher vector-search subcommand with --vector, --text, --provider, --top-k options.
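The PEP 562 re-export mechanism mentioned in the first item can be illustrated with a small, self-contained sketch. It is not the package's actual models/__init__.py: the helper name build_lazy_module and the export table are hypothetical, and standard-library modules stand in for the real submodules.

```python
import importlib
import types

# Illustrative mapping only: public name -> module that really defines it.
# Standard-library modules stand in for the package's own submodules.
_LAZY_EXPORTS = {"OrderedDict": "collections", "dumps": "json"}


def build_lazy_module(name, exports):
    """Return a module object that resolves `exports` on first access (PEP 562)."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        try:
            target = exports[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(target), attr)
        # Cache on the module so __getattr__ is not hit again for this name.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    module.__all__ = sorted(exports)
    return module


lazy = build_lazy_module("lazy_models_demo", _LAZY_EXPORTS)
print("dumps" in vars(lazy))  # not resolved before first access
print(lazy.dumps({"a": 1}))   # first access triggers the import
print("dumps" in vars(lazy))  # cached after first access
```

In a real package the same `__getattr__` function would simply live at module level in `__init__.py`; the factory here only exists to keep the example runnable in one file.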
Fixed
- Vector index name injection. Query.vector_search() validates index names against ^[A-Za-z_][A-Za-z0-9_]*$ to prevent Cypher injection.
- NLToCypher returns parameterized Cypher. The GLiNER2 pipeline no longer inlines entity literals via _inline_params before returning the query string. Callers receive (cypher, params) with $param placeholders intact, which is safe to log and route through any Cypher-validating layer.
- py_parser.collect_expr threads labels and rel_types correctly. Subqueries, pattern comprehensions, shortestPath, and reduce now populate PyQueryInfo.labels_used and PyQueryInfo.rel_types_used. Before this fix, parse_query could under-report what a query referenced, which in turn broke _build_provenance_cypher for any query that used a subquery to discover its domain labels.
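The index-name check from the first fix can be sketched as a standalone function. This is only an illustration of the pattern quoted above, not the library's code: check_index_name is a hypothetical name, and re.fullmatch is used here (an assumption about intent) because `$` with re.match would also accept a trailing newline.

```python
import re

# Pattern quoted in the changelog entry; the anchors are kept for readability.
_INDEX_NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_index_name(name: str) -> str:
    """Return `name` unchanged if it is a plain identifier, else raise ValueError."""
    # fullmatch rejects "abc\n", which match() with a trailing `$` would let through.
    if not _INDEX_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid vector index name: {name!r}")
    return name


print(check_index_name("movie_embeddings"))
try:
    check_index_name("idx') YIELD node DETACH DELETE node //")
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting anything that is not a bare identifier is stricter than escaping, but it leaves no quoting rules to get wrong when the name is spliced into a query string.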
Performance
- ORM metaclass: single-pass __init__ replaces the two-pass __init__ + __init_subclass__ pattern. Regex compiled at module level.
- Lazy imports: models/__init__.py uses PEP 562 __getattr__ so import cypher_validator.models only loads submodules on first access.
- Agent tool lookups: AgentTools / ExtendedAgentTools build label→model and rel_type→model dicts at init for O(1) dispatch instead of linear scans.
- Batch DB introspection: GraphSchema.from_neo4j_db() uses 2 batch queries (nodes + rels) instead of N+1 individual queries.
- closest_match: shrinking cap on each hit + length-delta pre-filter + early return on exact match.
- compute_fixed_query: HashSet dedup replaces an O(n²) Vec::contains inner loop.
- collect_node_bindings / collect_rel_bindings: split entry API into get_mut + insert so the labels Vec isn't cloned when a variable is already bound.
- levenshtein_capped: 1-D rolling array (O(n) space), length-delta early exit, row-min early exit.
- Regex hoisting in llm_utils.py and llm_pipeline.py: _RE_FENCED_TAGGED, _RE_FENCED_ANY, _RE_BACKTICK, _RE_CYPHER_LINE, _RE_JSON_BLOCK, _RE_CYPHER_BLOCK, _RE_SENTENCE_BOUNDARY.
- Schema: HashSet<String> for properties; has_property is now O(1).
- CypherGenerator::new: precomputes labels / rel_types / props_by_label Vecs once at construction.
- validate_batch: Rayon parallel iteration with GIL release via Python::allow_threads.
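The set-based dedup idea behind the compute_fixed_query item can be shown with a Python analogue. The real change is in Rust; the function below is a hypothetical illustration of the same technique, where a set answers "seen before?" in constant time instead of rescanning the output list.

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def dedup_preserving_order(items: Iterable[T]) -> List[T]:
    """Drop repeats while keeping first-seen order, using a set for O(1) membership."""
    seen = set()
    out: List[T] = []
    for item in items:
        if item not in seen:  # set lookup, not a scan of `out`
            seen.add(item)
            out.append(item)
    return out


print(dedup_preserving_order(["Person", "Movie", "Person", "Genre", "Movie"]))
```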
See Performance for context and numbers.
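The three optimisations listed for levenshtein_capped (rolling 1-D array, length-delta early exit, row-min early exit) can be sketched in Python. The shipped function is Rust and its exact signature and return convention are not given here, so returning None when the cap is exceeded is an assumption made for this example.

```python
from typing import Optional


def levenshtein_capped(a: str, b: str, cap: int) -> Optional[int]:
    """Edit distance between a and b, or None if it exceeds cap."""
    # Length-delta early exit: the distance is at least the length difference.
    if abs(len(a) - len(b)) > cap:
        return None
    if len(a) < len(b):
        a, b = b, a  # keep the rolling row as short as possible
    row = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        diag = row[0]  # previous row, previous column
        row[0] = i
        row_min = row[0]
        for j, cb in enumerate(b, start=1):
            above = row[j]  # previous row, same column
            row[j] = min(above + 1, row[j - 1] + 1, diag + (ca != cb))
            diag = above
            row_min = min(row_min, row[j])
        # Row-min early exit: later rows can never drop below this row's minimum.
        if row_min > cap:
            return None
    return row[-1] if row[-1] <= cap else None


print(levenshtein_capped("kitten", "sitting", 3))
print(levenshtein_capped("kitten", "sitting", 2))
```

A small cap keeps "did you mean?" lookups cheap because most candidate names are rejected by the length check before any row is computed.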
Documentation
- New MkDocs Material site: schema / validator / generator / parser / error-codes pages, the full Pydantic ORM reference (models, query builder, repository, sessions, bulk ops, traversal, DDL, agent tools, caveats), the LLM section (pipeline, RAG, tools, async), GLiNER2 integration, and the developer-facing architecture / testing / performance / contributing pages.
Tests
- tests/test_orm_api_contracts.py: 21 driver-free contract tests pinning the gotchas in API caveats (Cond literal inlining, Query.where single-arg, Traversal.path_exists column name, GraphSession / Repository constructor signatures, BulkOps @staticmethod shape, CypherFn returns plain strings).
- tests/test_orm_neo4j.py: 23 live integration tests round-tripping every ORM CRUD path against Neo4j 5.26-community.
- tests/conftest.py: mirrors NEO4J_USERNAME / NEO4J_PASSWORD and NEO4J_USER / NEO4J_PASS so either env-var convention works.
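The env-var mirroring described for tests/conftest.py could look roughly like the sketch below. The function name mirror_neo4j_env is hypothetical, and leaving both variables untouched when both are already set is an assumption; the actual conftest may resolve that case differently.

```python
import os
from typing import MutableMapping, Optional

# Pairs named in the changelog entry; each pair holds two spellings of one setting.
_ALIASES = [("NEO4J_USERNAME", "NEO4J_USER"), ("NEO4J_PASSWORD", "NEO4J_PASS")]


def mirror_neo4j_env(env: Optional[MutableMapping[str, str]] = None) -> None:
    """Copy whichever spelling is set onto the one that is missing."""
    if env is None:
        env = os.environ
    for first, second in _ALIASES:
        if first in env and second not in env:
            env[second] = env[first]
        elif second in env and first not in env:
            env[first] = env[second]
        # If both or neither are set, nothing is changed (assumed behaviour).


demo = {"NEO4J_USER": "neo4j", "NEO4J_PASSWORD": "secret"}
mirror_neo4j_env(demo)
print(sorted(demo.items()))
```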
v0.12.0 — Pydantic Cypher ORM
The biggest single release since v0.9.0.
Added
- Pydantic ORM layer. NodeModel / RelationshipModel with registry-aware metaclasses, GraphSchema.from_models / from_registry / from_neo4j_db, the Query fluent builder (with Cond, CondGroup, RawExpr, PropExpr, NodeRef, RelRef, PathBuilder).
- Repository(model, db, var="n"): typed CRUD wrapper.
- BulkOps.bulk_create_nodes / bulk_merge_nodes / bulk_create_relationships / bulk_merge_relationships / bulk_delete_nodes: all @staticmethod returning (cypher, params).
- Traversal: neighbors, shortest_path, subgraph, degree, common_neighbors, path_exists.
- SchemaDDL + SchemaDiff: constraints, indexes, migration DDL.
- GraphSession / AsyncGraphSession: execute Cypher with hydration into Pydantic instances.
- AI agent tools. AgentTools / ExtendedAgentTools produce OpenAI and Anthropic function-call specs and dispatch them back to (cypher, params) via handle_tool_call.
- QueryHistory, QueryPlan, QueryResult, CypherFn (type-safe wrappers for common Cypher functions), schema_to_pipeline_kwargs.
See docs/orm/ for the full reference.
v0.11.0 — Subqueries & pattern comprehensions
Added
- CALL { ... } subqueries: both read and write forms.
- EXISTS { ... }, COUNT { ... }, COLLECT { ... } subquery expressions.
- Pattern comprehensions: [(n)-[:REL]-(m) WHERE ... | expr].
Tests
- tests/test_subqueries.py covers every new construct against the validator and the parser.
v0.10.0 — initial Cypher ORM scaffold
First cut at the Pydantic ORM. Established the NodeModel /
RelationshipModel registration mechanism and the GraphSchema bridge into
the Rust validator. The full surface area landed in v0.12.0.
v0.9.1 — strict NER mode
Fixed
- _collect_entity_status now operates in strict NER mode when an EntityNERExtractor is supplied: relation triples with at least one unconfirmed endpoint are silently dropped, which prevents schema endpoint labels from being stamped onto non-entity spans (e.g. "doctor" → Drug).
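The strict-mode rule above (drop any triple with an unconfirmed endpoint) amounts to a simple filter. The sketch below is a hypothetical illustration only: the triple shape, the function name, and the representation of confirmed entities as a set of strings are assumptions, not the internals of _collect_entity_status.

```python
from typing import Iterable, List, Set, Tuple

# (head span, relation type, tail span); a hypothetical shape for this example.
Triple = Tuple[str, str, str]


def filter_confirmed_triples(triples: Iterable[Triple], confirmed: Set[str]) -> List[Triple]:
    """Keep only triples whose head and tail were both confirmed by NER."""
    return [t for t in triples if t[0] in confirmed and t[2] in confirmed]


triples = [
    ("aspirin", "TREATS", "headache"),
    ("doctor", "TREATS", "headache"),  # "doctor" was not confirmed as an entity
]
print(filter_confirmed_triples(triples, confirmed={"aspirin", "headache"}))
```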
v0.9.0 — warnings, REDUCE, shortestPath, FOREACH
Added
- Warning diagnostics. Codes W101–W2xx for stylistic issues that don't block execution.
- REDUCE, shortestPath / allShortestPaths, FOREACH: full validator coverage.
- Aggregate scope checks (no aggregate inside a WHERE, no nested aggregates).
- WITH scope fix: variables not carried forward through WITH are correctly reported as out-of-scope.
v0.6.1 — inline entity values
Fixed
- The GLiNER2 pipeline used to inline entity values into returned Cypher strings via _inline_params. That behaviour was reverted to keep $param placeholders, but the v0.6.1 fix re-enabled inlining as an option for callers that needed it. The v0.13.0 entry above ("NLToCypher returns parameterized Cypher") reverses this: queries are parameterised by default.
v0.6.0 — EntityNERExtractor + db_aware
Added
- EntityNERExtractor: spaCy and HuggingFace backends.
- NLToCypher(..., db_aware=True): MATCH existing entities, CREATE new ones in a single round-trip.
- 11 worked examples in examples/ using real models, no mocks.
v0.5.0 — Graph RAG + did-you-mean
Added
- GraphRAGPipeline: full NL question → Cypher → execute → format → answer chain.
- cypher_validator.llm_utils: extract_cypher_from_text, repair_cypher, cypher_tool_spec, format_records, few_shot_examples.
- Levenshtein "did you mean?" suggestions in validator diagnostics + result.fixed_query auto-fix.
- Parameterised query support throughout the validator and generator.
Fixed
- IS NULL / IS NOT NULL validator handling.
v0.4.0 — Schema APIs + generation batch
Added
- Schema.merge, Schema.to_json, Schema.from_json.
- CypherGenerator.generate_batch(n, query_type=None): bulk query generation.
- NLToCypher.from_env(): pick up Neo4j credentials from env vars.
- Neo4jDatabase.execute_many: sequential multi-query execution.
v0.3.0 — Neo4jDatabase + db-aware
Added
- Neo4jDatabase: context-managed wrapper around the official driver.
- NLToCypher(..., db=db): generate Cypher and execute it.
v0.2.0 — GLiNER2 hard dep
Changed
- gliner2 is now a required dependency. Previously it was optional but every published example relied on it.
v0.1.0 — initial release
- Rust parser + validator + generator via PyO3.
- Pest grammar at src/grammar/cypher.pest.
- Schema, CypherValidator, CypherGenerator, parse_query Python API.
- Basic GLiNER2 integration.
- 13 query-type templates in the generator.