CodeGraphContext PR #882 rejects write Cypher on /api/graph
CodeGraphContext's visualisation endpoint accepted arbitrary Cypher through /api/graph and passed it directly to Neo4j. PR #882 adds the missing read-only guard.

CodeGraphContext's visualisation server exposed arbitrary write Cypher through GET /api/graph. The route accepted a cypher_query parameter, passed it straight to Neo4j with session.run(cypher_query) and relied on callers to behave. PR #882 closes that gap by enforcing the same read-only contract that already existed in the MCP query tool.
I identified the vulnerability, submitted the fix and validated it against both destructive and benign queries. The change was merged on 7 May 2026.
The vulnerable path
CodeGraphContext is an MCP server and CLI tool that indexes local source code into a graph database so AI assistants can query project structure. That graph is exposed through two main surfaces: the MCP tool path and the visualisation HTTP server.
The MCP path already had a guard. Its execute_cypher_query handler blocked write-oriented Cypher keywords before passing a query to the database. The visualisation server did not.
In src/codegraphcontext/viz/server.py, the get_graph() route accepts cypher_query from GET /api/graph. Before the patch, the relevant control flow was direct:

    @app.get("/api/graph")
    async def get_graph(repo_path: Optional[str] = None, cypher_query: Optional[str] = None):
        ...
        with db_manager.get_driver().session() as session:
            if cypher_query:
                print(f"DEBUG: Executing custom query: {cypher_query}", flush=True)
                result = session.run(cypher_query)

The data flow is the vulnerability:
- A client supplies cypher_query in the query string.
- FastAPI binds it to the route argument.
- The value reaches session.run() unchanged.
- Neo4j executes it with the application's database privileges.
That is CWE-943: special elements in query logic were not neutralised before execution. In practice, it meant the visualiser could be asked to run CREATE, MERGE, DELETE, SET, REMOVE, DROP or CALL apoc.* operations even though the endpoint was supposed to be a graph viewer.
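The first two steps of that flow can be approximated with the standard library. This is a minimal sketch, not FastAPI's actual binding code: `bound_cypher_query` is a hypothetical helper, and the URL is an illustrative request against a locally running visualiser. It shows that percent-decoding is the only transformation applied before the string would reach the driver.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def bound_cypher_query(url: str) -> Optional[str]:
    """Return the cypher_query value a handler would receive, or None.

    Hypothetical helper: it only mimics query-string decoding and is not
    part of CodeGraphContext or FastAPI.
    """
    values = parse_qs(urlsplit(url).query).get("cypher_query")
    return values[0] if values else None


# Illustrative URL (assumption: visualiser on its local default address).
URL = "http://127.0.0.1:8000/api/graph?cypher_query=CREATE%20(n:Pwn)%20RETURN%20n"

if __name__ == "__main__":
    # The decoded text is a complete Cypher statement; nothing has
    # inspected or constrained it at this point.
    print(bound_cypher_query(URL))
```

Percent-decoding is a transport detail rather than a security control, so the statement arrives as valid Cypher with no validation step in between.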
Why localhost still mattered
The visualiser is not normally deployed as a public internet service. That does not make the bug academic.
The normal usage model is a developer running CodeGraphContext locally with the visualiser listening on 127.0.0.1:8000. The server also installs permissive CORS with allow_origins=["*"]. Under that combination, any web page the developer visits can send a browser request to the local visualiser: the browser delivers the cross-origin GET, and the wildcard policy lets the attacker's JavaScript read the response as well.
The preconditions are ordinary:
| Condition | Why it is realistic |
|---|---|
| Visualiser running locally | This is the tool's normal development workflow |
| Browser on the same machine | The developer is using the same workstation |
| Wildcard CORS | The server explicitly permits cross-origin access |
| Writable Neo4j user | The default graph database user is not read-only |
Before the patch, a request equivalent to this could create a node in the local graph:

    GET /api/graph?cypher_query=CREATE%20(n:Pwn)%20RETURN%20n HTTP/1.1
    Host: 127.0.0.1:8000

The same path could delete or mutate the indexed code graph. That is not remote code execution, but it is still a meaningful compromise of the tool's data model. For an assistant that uses graph context to reason about code, corrupting the graph is also a way to corrupt the assistant's view of the repository.
What PR #882 changes
The fix adds a small read-only validator near the top of viz/server.py, at lines 32 to 48 in the PR diff:

    _FORBIDDEN_CYPHER_KEYWORDS = (
        'CREATE', 'MERGE', 'DELETE', 'SET', 'REMOVE', 'DROP', 'CALL apoc'
    )
    _STRING_LITERAL_RE = re.compile(r'''"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'''')

    def _is_read_only_cypher(query: str) -> bool:
        """Return True if *query* contains no write keywords (outside string literals)."""
        stripped = _STRING_LITERAL_RE.sub('', query)
        for keyword in _FORBIDDEN_CYPHER_KEYWORDS:
            if re.search(r'\b' + keyword + r'\b', stripped, re.IGNORECASE):
                return False
        return True

The helper strips string literals before looking for forbidden keywords. That matters because a query such as this should remain valid:

    MATCH (n) WHERE n.name = "CREATE" RETURN n

The word CREATE appears in the query text, but it is data rather than syntax. Removing quoted strings before keyword matching avoids that false positive and matches the behaviour already used by the MCP handler.
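The accept and reject behaviour can be checked in isolation. The sketch below copies the validator as quoted in this post and runs it over a handful of queries. The `classify` wrapper and the `SAMPLE_QUERIES` list are illustrative additions, not part of the PR or its tests.

```python
import re

# Validator copied from the PR excerpt quoted in this post.
_FORBIDDEN_CYPHER_KEYWORDS = (
    'CREATE', 'MERGE', 'DELETE', 'SET', 'REMOVE', 'DROP', 'CALL apoc'
)
_STRING_LITERAL_RE = re.compile(r'''"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'''')


def _is_read_only_cypher(query: str) -> bool:
    """Return True if *query* contains no write keywords (outside string literals)."""
    stripped = _STRING_LITERAL_RE.sub('', query)
    for keyword in _FORBIDDEN_CYPHER_KEYWORDS:
        if re.search(r'\b' + keyword + r'\b', stripped, re.IGNORECASE):
            return False
    return True


def classify(query: str) -> str:
    """Illustrative wrapper: label a query the way the endpoint would treat it."""
    return "allowed" if _is_read_only_cypher(query) else "rejected"


# Illustrative inputs (assumption: chosen for this sketch, not taken from the PR).
SAMPLE_QUERIES = [
    'CREATE (n:Pwn) RETURN n',                        # write keyword in syntax
    'MATCH (n) RETURN n LIMIT 5',                     # plain read
    'MATCH (n) WHERE n.name = "CREATE" RETURN n',     # keyword only inside a literal
    "match (n) where n.name = 'x' set n.name = 'y'",  # lower-case write keyword
]

if __name__ == "__main__":
    for q in SAMPLE_QUERIES:
        print(f"{classify(q):8}  {q}")
```

The last sample is there because the match is case-insensitive: a lower-case `set` outside a literal is rejected even though both quoted values are stripped first.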
The enforcement point is added immediately before session.run() at lines 94 to 102 of the diff:

    if cypher_query:
        if not _is_read_only_cypher(cypher_query):
            raise HTTPException(
                status_code=400,
                detail=(
                    "This endpoint only supports read-only Cypher queries. "
                    "Prohibited keywords like CREATE, MERGE, DELETE, SET, REMOVE, "
                    "DROP, or CALL apoc are not allowed."
                ),
            )
        result = session.run(cypher_query)

A final small change at line 265 preserves intentional HTTPException responses. Without that, the route's broad except Exception block would catch the deliberate 400 response and convert it into a generic 500. The fix therefore changes both the security decision and the HTTP behaviour around that decision.
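The code at line 265 is not quoted in this post, so the following is only a sketch of one common shape for that kind of change: re-raising the deliberate exception before the broad handler sees it. The `HTTPException` class here is a minimal stand-in so the example runs without FastAPI, and `reject_write`, `route_broad_except`, `route_with_passthrough` and `status_of` are hypothetical names.

```python
class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption: same two fields)."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def reject_write(query: str) -> None:
    # Hypothetical placeholder for the read-only check.
    if "CREATE" in query.upper():
        raise HTTPException(400, "read-only queries only")


def route_broad_except(query: str) -> str:
    """Pre-fix shape: the broad handler also catches the deliberate 400."""
    try:
        reject_write(query)
        return "ok"
    except Exception as exc:
        raise HTTPException(500, str(exc))


def route_with_passthrough(query: str) -> str:
    """Post-fix shape: HTTPException is re-raised before the broad handler."""
    try:
        reject_write(query)
        return "ok"
    except HTTPException:
        raise
    except Exception as exc:
        raise HTTPException(500, str(exc))


def status_of(route, query: str) -> int:
    """Return the status code a client would observe for this call."""
    try:
        route(query)
        return 200
    except HTTPException as exc:
        return exc.status_code
```

With these definitions, `status_of(route_broad_except, "CREATE (n) RETURN n")` yields 500, while the same query through `route_with_passthrough` yields 400. The ordering of the `except` clauses is what makes the difference, because Python picks the first clause that matches.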
The patch is deliberately narrow: one file, about 30 added lines. It does not redesign the visualiser, change the database driver or remove custom queries. It makes the HTTP endpoint enforce the contract it already claimed to have.
What testing showed
The regression test was behavioural rather than theoretical.
Before the patch, a request containing CREATE (n:Pwn) RETURN n reached the driver and wrote a node to Neo4j. After the patch, the same request returned HTTP 400 with the read-only error message and did not reach session.run().
Read queries still worked. MATCH (n) RETURN n LIMIT 5 continued to return graph results. The string-literal case was also checked with MATCH (n) WHERE n.name = "CREATE" RETURN n, which was accepted because the forbidden keyword was inside a literal rather than in Cypher syntax.
That last check is small but important. Security patches that break ordinary read queries tend to get reverted. The validator needed to block writes without making the visualiser unpleasant to use.
The broader pattern in AI graph tooling
Cypher injection is not new. NVD records Cisco SD-WAN vManage issues such as CVE-2021-1349 where crafted HTTP requests to a management interface could trigger Cypher query language injection. The pattern has now moved into AI infrastructure because graph databases are a convenient way to store code relationships, entities, memories and retrieval context.
The newer examples look different at the edge but similar at the sink. NVD describes CVE-2026-32247 in Graphiti as a Cypher injection in an AI agent temporal context graph, where attacker-controlled labels reached Cypher construction for Neo4j, FalkorDB and Neptune backends. The interesting part is the route into the bug: in MCP deployments, prompt injection could induce an LLM client to call a tool with attacker-controlled entity_types, which then became graph query structure.
I have seen the same shape in other AI projects. In LightRAG's Memgraph backend, an entity_type value was interpolated into a Cypher label. In full-stack-ai-agent-template, webhook URLs became server-side requests to attacker-chosen destinations. Different CWE numbers, same structural failure: an AI-facing framework takes flexible input, converts it into a privileged operation and forgets that the caller may be hostile or may be an LLM acting on hostile instructions.
CodeGraphContext's bug is simpler than Graphiti's label injection and less exposed than a public API server. Its importance is the trust boundary. The visualiser looked like a local developer convenience, but wildcard CORS turned it into a browser-reachable API. The database looked like an internal implementation detail, but custom Cypher made it a user-controlled query engine.
What remains to harden
PR #882 removes the most damaging class of impact by rejecting write Cypher. It does not make the visualisation server a hardened remote interface.
Three follow-ups are worth treating separately:
| Follow-up | Reason |
|---|---|
| Tighten wildcard CORS | The root drive-by condition is any site being able to call the local visualiser |
| Use driver-level read transactions | session.execute_read(...) is a stronger boundary than keyword matching |
| Share the validator with the MCP handler | Copying the blocklist by convention creates drift risk |
The second point is the most important long term. Keyword blocklists are useful guardrails, but they are not the same as a database-enforced read-only execution path. If the driver or database can enforce read semantics, the application should use that as the final boundary.
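A driver-level read path might look like the sketch below. It assumes the Neo4j Python driver 5.x, where `Session.execute_read` runs a unit of work in a read-mode transaction. `run_read_only` and `_collect` are hypothetical names and this is not code from the PR. How strictly the server rejects writes in read mode is worth verifying against the deployed Neo4j version, and a restricted database role remains a separate control.

```python
from typing import Any, Dict, List


def _collect(tx: Any, cypher_query: str) -> List[Dict[str, Any]]:
    """Unit of work: run the query and materialise every record.

    Results are consumed inside the transaction function because they
    should not be relied on after the managed transaction has closed.
    """
    return [record.data() for record in tx.run(cypher_query)]


def run_read_only(session: Any, cypher_query: str) -> List[Dict[str, Any]]:
    """Execute cypher_query through the driver's managed read transaction.

    `session` is expected to be a neo4j.Session (assumption: driver 5.x).
    It is typed as Any so the sketch does not require the driver to be
    installed in order to be imported.
    """
    return session.execute_read(_collect, cypher_query)
```

In the route, this would stand in for the direct `session.run(cypher_query)` call, with the keyword validator kept in front of it as an early, cheap rejection that produces a clear 400 message.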
The uncomfortable lesson is that localhost is no longer a quiet place. Developer tools increasingly expose local HTTP servers to browsers, editors and AI agents. Once those tools add permissive CORS and privileged backends, the difference between "local convenience endpoint" and "attack surface" is one visited web page.