TeamPCP compromised the AI proxy that holds everyone's API keys

LiteLLM versions 1.82.7 and 1.82.8 were published to PyPI on 24 March 2026 containing a credential harvester, a Kubernetes lateral movement toolkit and a persistent backdoor. The package had no corresponding release on GitHub. It was uploaded directly to PyPI using a stolen publishing token. In the 46 minutes before PyPI quarantined the malicious versions, they were downloaded 46,996 times.

LiteLLM is a universal proxy for large language model APIs. Its entire purpose is routing requests across providers: OpenAI, Anthropic, Google, Azure, Bedrock, dozens more. A typical deployment has more API keys in its environment than almost any other service in the stack. That is exactly why it was targeted.

How a vulnerability scanner became the attack vector

The compromise did not start with litellm. It started with Trivy, Aqua Security's open-source vulnerability scanner.

In late February 2026, a threat actor group tracked as TeamPCP exploited a pull_request_target workflow vulnerability in Trivy's GitHub repository. On 19 March, they force-pushed a malicious tag to publish Trivy v0.69.4. By 21 March, they had used the same technique to compromise Checkmarx's KICS, an infrastructure-as-code analyser. On 22 March, they defaced 44 Aqua Security repositories.

LiteLLM's CI/CD pipeline used Trivy for security scanning. The compromised Trivy action exfiltrated the PYPI_PUBLISH token from the GitHub Actions runner. With that token, TeamPCP published directly to PyPI, bypassing the normal release workflow entirely. No tag, no release, no changelog. Snyk's analysis traces the full chain: a poisoned security scanner backdooring the very project it was supposed to protect.

The irony is structural, not incidental. Security scanning tools run in privileged CI/CD contexts precisely because they need access to everything. They are the ideal vector for exfiltrating publishing credentials. TeamPCP understood this and built an entire campaign around it.

Two versions, two attack vectors

The two malicious versions are not duplicates. They target different entry points and exfiltrate to different infrastructure.

Version 1.82.7 injected 12 lines of obfuscated Python at line 128 of litellm/proxy/proxy_server.py, between two unrelated legitimate code blocks. Endor Labs confirmed the injection was performed during or after the wheel build process: the corresponding GitHub commit does not contain these lines. This version triggers when litellm.proxy.proxy_server is imported, meaning it primarily affects proxy server deployments rather than general SDK usage. Exfiltration target: checkmarx[.]zone/raw, a typosquat of the legitimate Checkmarx domain.

Version 1.82.8 included the same proxy_server.py injection but added a second, more dangerous vector: a file named litellm_init.pth (34,628 bytes) placed at the wheel root. Python .pth files in site-packages are processed by site.py at interpreter startup. Every Python process in the environment executes them. Including the Python process that runs during pip install itself.
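The mechanism is easy to reproduce harmlessly. The sketch below is an illustration, not code from the malware: it writes a benign .pth file into a temporary directory and asks the standard library's site.addsitedir to process that directory the way site.py processes site-packages at startup. The file name demo_init.pth and the marker variable PTH_DEMO_RAN are made up for the example.

```python
import os
import site
import sys
import tempfile
from pathlib import Path

MARKER = "PTH_DEMO_RAN"  # hypothetical environment variable used as a harmless marker


def demo_pth_execution() -> bool:
    """Return True if an 'import' line in a .pth file was executed by site."""
    os.environ.pop(MARKER, None)
    saved_path = list(sys.path)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            # A .pth line that starts with "import" is passed to exec() by site.py.
            pth = Path(tmp) / "demo_init.pth"
            pth.write_text(f"import os; os.environ['{MARKER}'] = '1'\n")
            # Process the directory as if it were a site-packages directory.
            site.addsitedir(tmp)
    finally:
        # addsitedir also appends the directory to sys.path; undo that.
        sys.path[:] = saved_path
    return os.environ.get(MARKER) == "1"
```

Calling demo_pth_execution() should return True: the statement after import ran without any module being imported by the caller. A real site-packages directory gets the same treatment on every interpreter start.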

The FutureSearch team that discovered the attack found it because the .pth launcher spawns a child Python process via subprocess.Popen, and that child re-triggers the .pth file, creating an exponential fork bomb that crashed the machine. The fork bomb is a bug in the malware. Without it, the attack would have been silent.

Of the 46,996 downloads during the attack window, FutureSearch's BigQuery analysis found that 23,142 were pip installs of 1.82.8. Every one of those represents an environment where the malware executed before any application code had a chance to run.

What the payload harvests

The base64 payload decodes to a three-stage attack chain that is exhaustive in what it collects.

Stage 1 orchestrates the operation. It runs the credential harvester, captures its output, encrypts the collected data with a randomly generated AES-256-CBC session key, wraps that key with a hardcoded RSA-4096 public key using OAEP padding, bundles everything into tpcp.tar.gz and POSTs it to the C2.

Stage 2 is a 332-line credential harvester that performs a systematic sweep. SSH keys and configs across all user directories. AWS credentials from disk, IMDS role credentials from the metadata service, and the contents of Secrets Manager and SSM Parameter Store accessed via a full AWS SigV4 signing implementation built into the malware. GCP application default credentials and gcloud configs. The entire Azure credential directory. Every Kubernetes secret across every namespace. Environment files (.env, .env.local, .env.production, .env.development, .env.staging) searched recursively six directories deep across nine common paths. Database credentials. Crypto wallets for Bitcoin, Ethereum, Solana, Cardano and six other currencies. TLS private keys. Terraform state. CI/CD configs. Shell history. /etc/shadow.

This is not a smash-and-grab. The SigV4 implementation alone, documented by Endor Labs, shows purpose-built tooling designed to actively authenticate against cloud APIs using whatever credentials it finds on disk.

Stage 3 handles lateral movement and persistence. If a Kubernetes service account token exists, the harvester enumerates all nodes in the cluster and deploys a privileged pod on every one. Each pod runs in kube-system, requests privileged: true with hostPID and hostNetwork, mounts the entire host filesystem and tolerates all taints. One API call and the attacker has root on every node in the cluster, including control plane nodes.
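The pod profile described above is distinctive enough to hunt for. The following is a minimal sketch, assuming kubectl is installed and configured for the cluster; the function names are illustrative. It flags pods that combine a privileged container, hostPID, hostNetwork and a hostPath mount of the host root. Some legitimate system components are privileged too, so a hit is a prompt for manual review, not proof of compromise.

```python
import json
import subprocess
from typing import Any


def is_privileged(spec: dict[str, Any]) -> bool:
    """True if any container in the pod spec requests privileged: true."""
    return any(
        (c.get("securityContext") or {}).get("privileged") is True
        for c in spec.get("containers") or []
    )


def mounts_host_root(spec: dict[str, Any]) -> bool:
    """True if any volume is a hostPath pointing at the host's root directory."""
    return any(
        (v.get("hostPath") or {}).get("path") == "/"
        for v in spec.get("volumes") or []
    )


def flag_risky_pods(pod_list: dict[str, Any]) -> list[str]:
    """Return names of pods matching all four traits described in the text."""
    flagged = []
    for pod in pod_list.get("items") or []:
        spec = pod.get("spec") or {}
        if (
            is_privileged(spec)
            and spec.get("hostPID") is True
            and spec.get("hostNetwork") is True
            and mounts_host_root(spec)
        ):
            flagged.append(pod["metadata"]["name"])
    return flagged


def fetch_kube_system_pods() -> dict[str, Any]:
    """Fetch kube-system pods as parsed JSON (assumes kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", "kube-system", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a machine with cluster access, flag_risky_pods(fetch_kube_system_pods()) returns the names to investigate. Keeping the filter separate from the kubectl call means it can also be run against a saved JSON dump from an incident snapshot.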

On each compromised host, a persistent backdoor is installed at ~/.config/sysmon/sysmon.py with a systemd user service disguised as "System Telemetry Service". After a five-minute delay, it polls checkmarx[.]zone/raw every 50 minutes for follow-on payloads. Sonatype researchers found that when they queried the C2, it returned a YouTube link to a remastered version of "Bad Apple!!", a known anti-sandbox technique.

The blast radius

FutureSearch's dependency analysis paints an uncomfortable picture. Of the 2,337 packages on PyPI that depend on litellm, 88% had version specifiers that would have resolved to the compromised versions during the attack window. Only 9% were pinned to a specific safe version. Only 3% had an upper bound that excluded 1.82.x.
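To see why so few specifiers offered protection, it helps to test a handful by hand. The sketch below is not FutureSearch's methodology. It is a deliberately simplified matcher that understands only plain numeric versions and the operators ==, !=, >=, <=, > and <, not the full PEP 440 grammar. The example specifiers, including the 1.81.0 pin, are hypothetical.

```python
import operator
import re

COMPROMISED = ("1.82.7", "1.82.8")

OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}

CLAUSE = re.compile(r"(==|!=|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*)")


def parse(version: str) -> tuple[int, ...]:
    """Turn '1.82.7' into (1, 82, 7). Numeric release segments only."""
    return tuple(int(part) for part in version.split("."))


def admits(specifier: str, version: str) -> bool:
    """True if every comma-separated clause of the specifier accepts the version."""
    v = parse(version)
    for clause in (c.strip() for c in specifier.split(",")):
        if not clause:
            continue  # an empty specifier means "any version"
        match = CLAUSE.fullmatch(clause)
        if match is None:
            raise ValueError(f"unsupported clause in this simplified sketch: {clause!r}")
        bound = parse(match.group(2))
        # Pad with zeros so that 1.82 compares as 1.82.0.
        width = max(len(v), len(bound))
        a = v + (0,) * (width - len(v))
        b = bound + (0,) * (width - len(bound))
        if not OPS[match.group(1)](a, b):
            return False
    return True


def exposed(specifier: str) -> bool:
    """True if the specifier could have resolved to a compromised release."""
    return any(admits(specifier, bad) for bad in COMPROMISED)
```

Under these assumptions, exposed(">=1.0") and exposed("") are both true, while a hypothetical exact pin such as "==1.81.0" or an upper bound such as ">=1.0,<1.82" is not. Those last two correspond to the small minority of dependents the analysis describes as protected.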

Lock files protected builds that had not regenerated during those 46 minutes. But lock files protect builds, not consumers. A library that declares litellm>=1.0 in its metadata means anyone installing that library fresh during the window pulled in the malicious version transitively. FutureSearch notes they did not trace transitive exposure. The actual blast radius is larger than 46,996.

The FutureSearch team themselves discovered the attack because litellm was pulled in as a transitive dependency by an MCP plugin running inside Cursor. They never ran pip install litellm directly.

The issue suppression

When a GitHub user reported the compromise in issue #24512, the response was telling. The issue was closed as "not planned" by the maintainer account, which at that point was still under attacker control. Then 88 bot comments from 73 unique accounts landed in a 102-second window, flooding the thread to dilute the disclosure.

This is not a novel technique, but it is a reminder that compromising a maintainer account gives you more than publishing access. You control the narrative. You decide which security reports are visible, which issues stay open, which pull requests get merged. BerriAI eventually regained control and opened a clean tracking issue, but the suppression window mattered.

A campaign, not an incident

LiteLLM is not an isolated target. Wiz describes TeamPCP's operation as an ecosystem-wide cascade targeting the modern cloud-native and AI stack. The campaign now spans five ecosystems: GitHub Actions, Docker Hub, npm, OpenVSX and PyPI.

The npm component is particularly notable. Tokens stolen during the Trivy compromise seeded CanisterWorm, a self-spreading worm that Aikido documented across 47 npm packages. It uses an Internet Computer Protocol (ICP) blockchain canister as its C2 dead-drop. As Mend.io's analysis notes, this is the first publicly documented npm malware to use decentralised infrastructure for command and control, making conventional domain takedown impossible.

And then there is the wiper. Aikido discovered a CanisterWorm payload variant that checks the system timezone and locale. If the target is Iranian (Asia/Tehran, Iran, or fa_IR), the script deploys a DaemonSet named host-provisioner-iran with a container called kamikaze that deletes the host filesystem on every node in the Kubernetes cluster and forces a reboot. Non-Kubernetes Iranian hosts get rm -rf / --no-preserve-root. Non-Iranian targets get the standard backdoor.

Geopolitically targeted destruction from the same campaign that backdoors AI infrastructure. Sonatype reports speculation about links between TeamPCP and LAPSUS$, but attribution remains under active investigation.

The .pth file problem Python refuses to fix

The .pth attack vector deserves its own scrutiny because it is not new and it is not fixed.

Python .pth files placed in site-packages can contain executable Python: any line that begins with import is run by site.py on every interpreter startup. This has been a known security concern for years. CPython issue #78125, proposing the deprecation and removal of code execution in .pth files, was opened in June 2018. It is still open. Issue #113659, specifically flagging the security risk of hidden .pth files, was opened in January 2024.

The litellm attack demonstrates the worst case. A .pth file distributed in a PyPI wheel executes during pip install itself, before any application code, before any import, before the developer has any opportunity to inspect what they just installed. The single-line constraint of .pth files means exec(base64.b64decode(...)) is the natural encoding: a single call that decodes and runs an arbitrary payload of any length.

This is not an exotic technique. It is a documented feature of Python being used exactly as designed. Eight years after the deprecation was proposed, .pth files still execute arbitrary code on startup, and the package ecosystem still distributes them with no special scrutiny.
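Until that changes, the practical option is to audit what is already installed. The following is a minimal sketch that lists every .pth file in the interpreter's site-packages directories containing a line site.py would execute, meaning a line starting with import followed by a space or tab. The function names are illustrative. Legitimate tools, including some editable-install mechanisms, also ship executable .pth lines, so the output is a review list and not a verdict.

```python
import site
from pathlib import Path
from typing import Iterable


def executable_pth_lines(pth_file: Path) -> list[str]:
    """Return the lines of a .pth file that site.py would pass to exec()."""
    text = pth_file.read_text(encoding="utf-8", errors="replace")
    return [ln for ln in text.splitlines() if ln.startswith(("import ", "import\t"))]


def candidate_dirs() -> list[Path]:
    """Collect the global and per-user site-packages directories that exist."""
    dirs = set()
    get_site_packages = getattr(site, "getsitepackages", None)
    if get_site_packages is not None:  # guard in case the function is unavailable
        dirs.update(get_site_packages())
    dirs.add(site.getusersitepackages())
    return sorted(Path(d) for d in dirs if Path(d).is_dir())


def audit(dirs: Iterable[Path]) -> dict[Path, list[str]]:
    """Map each .pth file with executable lines to those lines."""
    findings: dict[Path, list[str]] = {}
    for directory in dirs:
        for pth in sorted(directory.glob("*.pth")):
            hits = executable_pth_lines(pth)
            if hits:
                findings[pth] = hits
    return findings
```

Printing audit(candidate_dirs()) gives the files to inspect. One caveat: an ordinary interpreter start processes .pth files before this code runs, so on a suspect machine it is safer to run the audit with python -S, which skips the automatic site processing.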

What defenders should check

If you installed or upgraded litellm on 24 March 2026, assume compromise. FutureSearch published detailed triage steps:

Check for the .pth file: search pip and uv caches for litellm_init.pth. Check for persistence: look for ~/.config/sysmon/sysmon.py and ~/.config/systemd/user/sysmon.service. In Kubernetes environments, audit kube-system for pods matching node-setup-*. Check for outbound connections to models[.]litellm[.]cloud or checkmarx[.]zone.
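The host-level checks above can be scripted. This is a rough sketch, not FutureSearch's tooling, and the function names are illustrative. It reads the installed litellm version from package metadata without importing the package, looks for the two persistence paths, and searches whichever directories it is given for litellm_init.pth. Which directories to search is an assumption left to the caller: site-packages and any cache locations in use are sensible candidates. The search only finds unpacked copies and does not look inside .whl archives.

```python
from importlib import metadata
from pathlib import Path
from typing import Iterable, Optional

BAD_VERSIONS = {"1.82.7", "1.82.8"}

# Persistence locations named in the triage steps, relative to a home directory.
PERSISTENCE_PATHS = (
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
)


def installed_litellm_version() -> Optional[str]:
    """Read the installed version from metadata, without importing litellm."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


def persistence_hits(home: Path) -> list[Path]:
    """Return the backdoor files that exist under the given home directory."""
    return [home / rel for rel in PERSISTENCE_PATHS if (home / rel).exists()]


def pth_hits(roots: Iterable[Path]) -> list[Path]:
    """Recursively search the given directories for unpacked litellm_init.pth files."""
    hits: list[Path] = []
    for root in roots:
        if root.is_dir():
            hits.extend(sorted(root.rglob("litellm_init.pth")))
    return hits


def triage(home: Path, roots: Iterable[Path]) -> dict[str, object]:
    """Bundle the three checks into one report."""
    version = installed_litellm_version()
    return {
        "installed_version": version,
        "version_is_compromised": version in BAD_VERSIONS,
        "persistence_files": persistence_hits(home),
        "pth_files": pth_hits(roots),
    }
```

A call such as triage(Path.home(), [Path(p) for p in site.getsitepackages()]), with import site added, covers the current interpreter. A clean report does not clear a machine that installed either version during the window: the files could have been removed after the credentials were already exfiltrated.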

Removing the package is not sufficient. The malware establishes persistence and may have deployed follow-on payloads. Rotate every credential that was present on the affected machine: SSH keys, cloud provider tokens, API keys, database passwords, Kubernetes configs. In many cases, rebuilding from a known clean state is the only safe option.

Pin your dependencies. Use lock files. And consider whether the packages that hold your most sensitive credentials should be the ones with the loosest update policies.

The structural problem

TeamPCP's targeting is precise: Trivy, KICS, LiteLLM. A vulnerability scanner, an infrastructure-as-code analyser, an LLM API proxy. These are not random packages. They are tools that, by design, run in environments saturated with credentials. They have the access they need to do their jobs. That same access makes them the highest-value targets in the supply chain.

The AI stack has made this worse, not better. LiteLLM exists because organisations need a single point to manage API keys for a dozen model providers. That is a reasonable engineering decision. It is also a single point of failure that, when compromised, gives an attacker every key in the building.

Forty-six minutes and 46,996 downloads. The malware had a bug that made it visible. The next one might not.