In this blog post, Use Text2Cypher with RAG for dependable graph-based answers, we will show how to turn natural-language questions into precise Cypher queries and reliable answers over your graph data.
Before diving into code, let’s clarify the idea. Text2Cypher uses a language model to translate a user question into a Cypher query for Neo4j (or any Cypher-compatible graph). Retrieval-augmented generation (RAG) feeds the model the most relevant context—like your graph schema, allowed operations, and example queries—so it generates safer, more accurate Cypher and better final answers. Put simply: RAG gives the model the right map; Text2Cypher drives the car.
What is Text2Cypher and why pair it with RAG
Cypher is a declarative query language for graphs. It’s expressive and readable, but users still need to know labels, relationship types, properties, and how to pattern-match paths. Text2Cypher lets them ask “Which customers bought product X after July?” and have a model produce the correct MATCH/WHERE/RETURN.
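For instance, that question might translate into something like the following (an illustrative sketch: the :Customer, :Product, and :BOUGHT names and the purchase-date property are hypothetical, not part of the movie schema we use later):
MATCH (c:Customer)-[b:BOUGHT]->(p:Product {name: $product})
WHERE b.date >= date($since)  // e.g. $since = "2024-07-01"
RETURN c.name AS customer, b.date AS purchased
ORDER BY purchased
LIMIT 50;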
RAG improves Text2Cypher by grounding the model with:
- Schema snippets (node labels, relationships, indexed properties)
- Business vocabulary mappings (e.g., “client” means :Customer)
- Query patterns and constraints (read-only, no deletes)
- Representative examples with input-output pairs
The result: fewer hallucinations, fewer dangerous queries, more precise and explainable answers.
Solution architecture at a glance
- Intent understanding: classify if the question is graph-queryable.
- Context retrieval (RAG): fetch the minimal relevant schema, synonyms, and examples.
- Text2Cypher: prompt the LLM to produce a Cypher statement with parameters.
- Safety checks: enforce read-only, validate syntax with EXPLAIN, limit result size.
- Execution: run parameterized Cypher via Neo4j driver.
- Answer synthesis: summarize results and include lightweight citations.
- Observability: track query success, correctness, and token usage.
Prerequisites
- A Neo4j instance (local or managed) and the Python neo4j driver
- Access to an LLM API (e.g., OpenAI or Azure OpenAI)
- A small set of schema docs and example queries
Sample domain
We’ll use a movie graph with nodes and relationships:
- Labels: Movie(title, released, tagline), Person(name, born)
- Relationships: (:Person)-[:ACTED_IN]->(:Movie), (:Person)-[:DIRECTED]->(:Movie)
Example business synonyms: “actor” → :Person who ACTED_IN; “film” → :Movie.
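If you want to follow along locally, you can seed a tiny version of this graph (a minimal sketch using the classic movies shape; the values are illustrative):
CREATE (hanks:Person {name: "Tom Hanks", born: 1956})
CREATE (howard:Person {name: "Ron Howard", born: 1954})
CREATE (apollo:Movie {title: "Apollo 13", released: 1995, tagline: "Houston, we have a problem"})
CREATE (hanks)-[:ACTED_IN]->(apollo)
CREATE (howard)-[:DIRECTED]->(apollo);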
Preparing your RAG context
1) Curate schema and examples
Create short, copy-pastable documents:
- Schema.md – labels, relationships, properties, indexes
- Synonyms.md – term-to-schema mapping
- Examples.md – a handful of natural-language to Cypher pairs
2) Embed and index the context
Use any vector store. Here’s a concise example with LangChain + FAISS. Swap providers as needed.
pip install langchain langchain-openai langchain-community faiss-cpu neo4j
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.docstore.document import Document
schema = """
Labels:
- Movie(title STRING, released INT, tagline STRING)
- Person(name STRING, born INT)
Relationships:
- (Person)-[:ACTED_IN]->(Movie)
- (Person)-[:DIRECTED]->(Movie)
Indexes: INDEX ON :Movie(title), :Person(name)
"""
synonyms = """
Synonyms:
- actor/actress => Person who ACTED_IN
- film/movie => Movie
- directed by => (Person)-[:DIRECTED]->(Movie)
"""
examples = """
Q: movies starring Tom Hanks after 1990
Cypher:
MATCH (p:Person {name: $name})-[:ACTED_IN]->(m:Movie)
WHERE m.released > $year
RETURN m.title AS title, m.released AS year
ORDER BY year DESC LIMIT 20;
Params: {"name": "Tom Hanks", "year": 1990}
Q: who directed Apollo 13
Cypher:
MATCH (p:Person)-[:DIRECTED]->(m:Movie {title: $title})
RETURN p.name AS director LIMIT 5;
Params: {"title": "Apollo 13"}
"""
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=50)
docs = [Document(page_content=t) for t in splitter.split_text("\n\n".join([schema, synonyms, examples]))]
vec = FAISS.from_documents(docs, OpenAIEmbeddings())
Prompting the model to produce safe Cypher
The LLM should produce only read-only, parameterized Cypher. We'll instruct it strictly and then validate.
# Keywords that indicate a write or admin operation; any match causes rejection.
WRITE_KEYWORDS = ["CREATE", "MERGE", "DELETE", "DETACH", "SET", "DROP", "LOAD CSV", "CALL dbms."]
SYSTEM_PROMPT = """
You convert natural language questions into Cypher for a Neo4j database.
Rules:
- Only use MATCH/WHERE/RETURN/ORDER/LIMIT/OPTIONAL MATCH
- Use parameters, never literal user input
- Prefer indexed lookups on :Person(name) or :Movie(title) when relevant
- Limit results to 50 unless the user asks otherwise
- Output only a JSON object with fields: cypher, params
- Do not explain yourself
Context:
{context}
"""
End-to-end function
import json
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from neo4j import GraphDatabase
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    ("user", "Question: {question}\nReturn JSON with cypher and params."),
])
parser = StrOutputParser()
# Neo4j driver
NEO4J_URI = "bolt://localhost:7687"
NEO4J_USER = "neo4j"
NEO4J_PASS = "password"
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASS))
def retrieve_context(question: str) -> str:
    """RAG step: pull a few relevant chunks."""
    hits = vec.similarity_search(question, k=4)
    return "\n---\n".join(h.page_content for h in hits)
def looks_read_only(cypher: str) -> bool:
    # Naive substring check; pair it with EXPLAIN and read-only database
    # credentials rather than relying on it alone.
    up = cypher.upper()
    return not any(kw in up for kw in WRITE_KEYWORDS)
def explain_ok(tx, cypher: str, params: dict) -> bool:
    try:
        list(tx.run("EXPLAIN " + cypher, **params))
        return True
    except Exception as e:
        print("Explain failed:", e)
        return False
def text2cypher(question: str):
    context = retrieve_context(question)
    chain = prompt | llm | parser
    raw = chain.invoke({"context": context, "question": question})
    # Strip accidental markdown fences before parsing the JSON payload
    raw = raw.strip().removeprefix("```json").removeprefix("```").removesuffix("```")
    data = json.loads(raw)
    cypher = data.get("cypher", "").strip()
    params = data.get("params", {})
    assert looks_read_only(cypher), "Refused: non read-only Cypher"
    with driver.session() as session:
        ok = session.execute_read(lambda tx: explain_ok(tx, cypher, params))
        assert ok, "Cypher failed EXPLAIN"
        # Execute for real after EXPLAIN passes
        records = session.run(cypher, **params)
        rows = [r.data() for r in records]
    return cypher, params, rows, context
Turn rows into a friendly answer
After execution, use a lightweight prompt to summarize results and keep citations. We provide the rows and a pointer to what each column means.
ANSWER_PROMPT = """
You are a helpful assistant. Summarize the query results clearly.
- Use plain language
- If there is a 'title' or 'name' column, list top items
- Mention constraints (e.g., after 1990) if evident
Return 2-5 sentences.
"""
def synthesize_answer(question: str, rows: list[dict]) -> str:
    sample = rows[:10]
    content = {
        "question": question,
        "rows": sample,
        "note": "Only summarize the data; do not invent new facts."
    }
    msg = [
        {"role": "system", "content": ANSWER_PROMPT},
        # default=str guards against non-JSON types (e.g. Neo4j dates)
        {"role": "user", "content": json.dumps(content, default=str)}
    ]
    resp = llm.invoke(msg)
    return resp.content
# Example usage
q = "movies starring Tom Hanks after 1990"
cypher, params, rows, ctx = text2cypher(q)
answer = synthesize_answer(q, rows)
print(cypher, "\n", params, "\n", answer)
Key design choices that raise reliability
- Schema-scoped RAG: retrieve only the relevant labels/relationships for the question, not the entire schema. Smaller context, better precision.
- Strict output format: force JSON with cypher and params. Easier to validate.
- Read-only guardrails: both in prompt and in code using keyword checks.
- Dry-run with EXPLAIN: catches syntax and semantic errors early.
- Parameterization: never interpolate raw user strings.
- Result limits: protect performance and cost with LIMIT and pagination.
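As a belt-and-braces complement to the prompt rule, you can also enforce the limit in code (a minimal sketch; enforce_limit is a hypothetical helper you would call before execution):
def enforce_limit(cypher: str, max_rows: int = 50) -> str:
    # Append a LIMIT only when the model did not produce one itself
    if "LIMIT" not in cypher.upper():
        return f"{cypher.rstrip().rstrip(';')} LIMIT {max_rows}"
    return cypher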
Handling ambiguity and follow-ups
Not every question maps cleanly to your graph. Add a classification step:
- If classification says “needs more info,” ask a clarifying question (e.g., “Which Tom Hanks? Actor or director role?”).
- If unanswerable from the graph, gracefully hand off to a different data source or return a helpful message.
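A minimal classifier can reuse the same LLM (a sketch; the three labels are our own convention, not a library feature):
INTENT_PROMPT = """Classify the user question as exactly one of:
graph_query | needs_clarification | out_of_scope
Respond with the label only."""

def classify_intent(question: str) -> str:
    resp = llm.invoke([
        {"role": "system", "content": INTENT_PROMPT},
        {"role": "user", "content": question},
    ])
    return resp.content.strip()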
Advanced patterns
Few-shot query plans
Give the model patterns like “find people by name, then expand via ACTED_IN.” This nudges reuse of efficient structures and indexed properties.
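For example, a plan-hint chunk added to the vector store might look like this (illustrative content):
PLAN_HINT = """
Pattern: anchor-then-expand
1. MATCH the anchor node via an indexed property, e.g. (p:Person {name: $name})
2. Expand one hop via ACTED_IN or DIRECTED
3. RETURN scalar properties only, with ORDER BY and LIMIT
"""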
Dynamic schema retrieval
Periodically introspect the database (e.g., CALL db.schema.visualization in Neo4j) to regenerate schema chunks for the vector store. Automate nightly updates so RAG stays current.
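A nightly job might rebuild the schema chunk with the driver (a sketch using the built-in db.labels() and db.relationshipTypes() procedures), then re-embed it with FAISS.from_documents as before:
def refresh_schema_doc(driver) -> str:
    # Introspect the live database so the RAG context never drifts from reality
    with driver.session() as session:
        labels = [r["label"] for r in session.run("CALL db.labels()")]
        rel_types = [r["relationshipType"] for r in session.run("CALL db.relationshipTypes()")]
    return "Labels: " + ", ".join(labels) + "\nRelationships: " + ", ".join(rel_types)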
Constraint-aware generation
Pass in business constraints: maximum depth, allowed relationship types, or tenant filters. You can inject a WHERE clause (e.g., tenantId = $tenant) into every generated query.
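One low-risk way to do this is at the prompt and parameter level rather than by rewriting Cypher strings (a sketch; TENANT_RULE would be appended to the system prompt, and server-side code, not the model, supplies the tenant value):
TENANT_RULE = "- Every matched node must include the property filter tenantId: $tenant"

def with_tenant(params: dict, tenant_id: str) -> dict:
    merged = dict(params)
    merged["tenant"] = tenant_id
    return merged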
Result-grounded answers
When summarizing, expose which node properties were used. For example: “Based on Movie.title and Movie.released.” This gives users confidence and aids troubleshooting.
Testing and evaluation
- Build a test set of question → expected Cypher and top-3 result rows.
- Measure: generation success rate, EXPLAIN pass rate, execution success, and exact match/precision@k on results.
- Track latency per stage (RAG retrieval, generation, EXPLAIN, execution).
- Log rejected queries (e.g., contained CREATE) to improve prompts and filters.
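A minimal harness over such a test set could look like this (a sketch; the test-case fields are our own convention):
TEST_CASES = [
    {"q": "movies starring Tom Hanks after 1990",
     "expected_titles": {"Apollo 13", "Forrest Gump"}},
]

def evaluate(cases: list[dict]) -> dict:
    passed = 0
    for case in cases:
        try:
            _, _, rows, _ = text2cypher(case["q"])
            titles = {r.get("title") for r in rows}
            passed += case["expected_titles"] <= titles
        except Exception:
            pass  # counts as a failure; log it in practice
    return {"pass_rate": passed / len(cases)}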
Production hardening
- Caching: memoize RAG retrieval for common questions and cache successful Cypher for fast repeats.
- Rate limits and backoff: protect your LLM and database under load.
- Observability: store the question, retrieved context IDs, generated Cypher, params, EXPLAIN plan summary, and outcome.
- Permissions: add a policy layer that injects row-level filters or denies certain labels per user role.
- Payload hygiene: trim long user inputs, strip markdown, normalize whitespace.
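The hygiene step can be a single helper applied before retrieval and generation (a minimal sketch):
import re

def sanitize_question(text: str, max_len: int = 500) -> str:
    text = re.sub(r"```.*?```", "", text, flags=re.DOTALL)  # strip fenced markdown
    text = " ".join(text.split())                           # normalize whitespace
    return text[:max_len]                                   # trim long inputs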
Common pitfalls and fixes
- Hallucinated labels/relationships: extend synonyms and examples; add a reject rule if unknown labels appear.
- Slow queries: prefer name/title lookups; encourage bounded traversals and LIMIT.
- Over-long prompts: retrieve fewer, denser chunks; merge schema into concise tables.
- Brittle outputs: enforce JSON schema and retry on parse failure.
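For the last point, a simple retry loop around JSON parsing goes a long way (a sketch):
def generate_with_retry(chain, inputs: dict, attempts: int = 3) -> dict:
    for _ in range(attempts):
        raw = chain.invoke(inputs)
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            continue  # re-ask; optionally feed the error back into the prompt
    raise ValueError("Model never returned valid JSON")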
When to consider GraphRAG
If your goal is long-form answers from the graph (not just structured results), consider GraphRAG patterns that retrieve subgraphs as context and let the model reason over them. You can combine both: Text2Cypher to fetch the precise subgraph, then feed that subgraph as RAG context for the final narrative answer.
Wrap-up
By pairing Text2Cypher with RAG, you turn natural questions into precise, safe Cypher and explainable answers. Start small: index your schema, add a handful of examples, enforce read-only rules, and iterate with metrics. As your catalog of examples and constraints grows, the system becomes both more capable and more predictable: exactly what technical teams and managers need for production-grade graph intelligence.