npm, PyPI and Docker Hub in 2026: developer credentials became supply-chain infrastructure

Credential harvesting against npm, PyPI and Docker Hub has turned developer identity into supply-chain infrastructure, with package registries now functioning as distribution systems for stolen trust.

npm, PyPI and Docker Hub have become attractive credential-harvesting infrastructure because they sit at the point where developer identity turns into executable trust. A stolen maintainer account is not only a way into one project. It is a way to ship code through a channel that developers, build systems and production clusters already believe. That is the important change. The attacker does not need to defeat every downstream organisation if the registry will do the distribution work for them.

This is not a new weakness dressed up in new language. Package registries have always carried risk. What has changed is the operational emphasis. The attacker is no longer satisfied with publishing one malicious dependency and waiting for installs. The better return is to steal the credentials that allow more packages, more tokens, more images and more automation secrets to be abused in sequence. Credential harvesting has become a supply-chain primitive.

The pattern is visible across npm, PyPI and Docker Hub because the registries solve the same problem in similar ways. They let developers publish reusable software at scale. They authenticate publishers. They expose artifacts to automated consumers. They integrate cleanly with CI/CD systems. They make distribution cheap. Those are the same properties that make them useful as attack infrastructure.

The registry is a trust amplifier

A package registry is usually described as a repository of code. That description is technically correct and strategically incomplete. A registry is a trust amplifier. It takes a decision made by one maintainer, or by one automation token acting for that maintainer, and projects it across every user, build pipeline and service that consumes the artifact.

That amplification is why credential theft matters more in registries than it does in many ordinary web applications. If an attacker steals access to a small forum account, the blast radius is bounded by that account. If an attacker steals access to a maintainer account for a package that appears deep inside dependency graphs, the blast radius is no longer bounded by the account. It is bounded by the dependency graph, the update behaviour of downstream consumers and the ability of defenders to notice that a legitimate identity has begun producing illegitimate artifacts.
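The difference between an account-bounded and a graph-bounded blast radius can be sketched as a walk over reverse dependencies. This is a minimal illustration, not a registry API: the graph, the package names and the `transitive_dependents` helper are all hypothetical.

```python
from collections import deque


def transitive_dependents(reverse_deps, compromised):
    """Return every package that directly or indirectly depends on `compromised`."""
    seen = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for dependent in reverse_deps.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    # A cycle in the graph could lead back to the starting package.
    seen.discard(compromised)
    return seen


# Hypothetical graph: each key maps to the packages that depend on it.
REVERSE_DEPS = {
    "tiny-util": ["http-helper", "cli-kit"],
    "http-helper": ["web-framework"],
    "cli-kit": ["deploy-tool"],
    "web-framework": ["internal-portal"],
}

affected = transitive_dependents(REVERSE_DEPS, "tiny-util")
print(sorted(affected))
```

In this toy graph one compromised leaf utility reaches every other package. A real estimate would also have to account for version ranges and update behaviour, which this sketch ignores.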

Recent supply-chain reporting frames these compromises as attacks on weak links across third-party vendors, cloud services and shared infrastructure. That is accurate, but for developer ecosystems the weak link is often narrower than the phrase suggests. It is the gap between identity and intent. Registries can check that a valid account or token made a publication. They are much weaker at checking whether the human or automation behind that identity meant to do so.

This gap is the useful part for attackers. A package signed by a known publisher, uploaded through a normal registry path and pulled by an ordinary package manager does not look like intrusion at first glance. It looks like maintenance. The malicious act is embedded inside routine software distribution.

npm showed the value of maintainer identity

npm remains the clearest demonstration because JavaScript dependency graphs are dense, update velocity is high and install-time execution has long been an attractive abuse surface. The existing post on npm worms, credential harvesting and 2 billion weekly downloads covered the professionalisation of this model: compromised packages, developer phishing and self-propagating credential theft turned package installation into an identity collection mechanism.

The lesson from npm is not that JavaScript is uniquely careless. It is that high-throughput ecosystems reward attacks that automate trust abuse. A compromised npm token can publish a new version. A malicious package can run scripts during installation. A developer machine or CI runner may contain more credentials than the original npm token, including GitHub tokens, cloud keys, registry credentials and signing material. Once those are harvested, the attack can move from one registry account to a broader developer identity.
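Install-time execution is easy to see in a package manifest. The sketch below lists the npm lifecycle scripts that run automatically on install (`preinstall`, `install` and `postinstall`). The manifest content and the `install_time_scripts` helper are invented for illustration.

```python
import json

# npm lifecycle hooks that run automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text):
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {name: cmd for name, cmd in scripts.items() if name in INSTALL_HOOKS}


# Hypothetical manifest used only for this example.
EXAMPLE_MANIFEST = json.dumps({
    "name": "example-pkg",
    "version": "1.0.1",
    "scripts": {"test": "node test.js", "postinstall": "node setup.js"},
})

print(install_time_scripts(EXAMPLE_MANIFEST))
```

A hit is not evidence of malice, since many legitimate packages use these hooks. It is a reason to look at what changed between releases. Consumers who do not need install scripts can also disable them with npm's `--ignore-scripts` option.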

That is the epidemic shape. One stolen token creates a malicious release. The release lands on machines with more tokens. Those tokens create more malicious releases, access more repositories or poison more automation. The registry is not merely the place where the first payload is hosted. It becomes the mechanism by which identity theft is repeated.

This is why treating malicious packages as isolated malware samples misses the point. The package payload is often disposable. The durable value is the credential set collected from the environments that trusted the package enough to execute it.

PyPI has the same identity problem in a different accent

PyPI does not have npm's exact dependency culture, but it shares the same structural dependency on maintainer identity. Python packages are pulled into developer laptops, CI jobs, notebooks, containers, internal tools and production services. The Python ecosystem also has a long tail of projects maintained by individuals whose packages may be transitive dependencies of far larger systems.

The relevant arXiv brief, "Stdlib or Third-Party? Empirical Performance and Correctness of LLM-Assisted Zero-Dependency Python Libraries", is not a threat report. Its useful contribution here is simpler: it states the dependency trade-off plainly. Third-party Python libraries introduce dependency management overhead, supply-chain risk and deployment friction. That is not an argument for rewriting the ecosystem with the standard library. It is a reminder that every convenience dependency also imports a trust relationship.

For PyPI, credential harvesting is especially dangerous because Python sits in operationally sensitive places. It is used for deployment scripts, data pipelines, machine-learning workflows, security automation and internal administration. A malicious package that lands in one of those contexts may find API keys, cloud credentials, model registry tokens, database passwords or SSH material nearby. The package registry compromise becomes a credential discovery operation.
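Defenders can run the same discovery step against their own environments to see what a malicious install would find. The sketch below assumes a short, non-exhaustive list of common credential file locations and environment variable name fragments, and the `exposure_inventory` helper is hypothetical. It reports names only and never reads secret values.

```python
import os
from pathlib import Path

# Common default locations for developer credentials (illustrative, not exhaustive).
CREDENTIAL_FILES = (
    ".npmrc",
    ".pypirc",
    ".docker/config.json",
    ".aws/credentials",
    ".ssh/id_ed25519",
    ".ssh/id_rsa",
)

# Assumed name fragments that often mark a secret-bearing environment variable.
SENSITIVE_ENV_HINTS = ("TOKEN", "SECRET", "PASSWORD", "API_KEY")


def exposure_inventory(home, environ):
    """List credential files and secret-looking variable names visible to a process."""
    home = Path(home)
    files = [rel for rel in CREDENTIAL_FILES if (home / rel).is_file()]
    env_names = sorted(
        name for name in environ
        if any(hint in name.upper() for hint in SENSITIVE_ENV_HINTS)
    )
    return {"files": files, "env": env_names}


if __name__ == "__main__":
    print(exposure_inventory(Path.home(), os.environ))
```

Run inside a CI job or a notebook kernel, the output is a rough picture of what any package executed in that context could reach.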

The identity gap is familiar. A valid PyPI account publishes a release. The release name and version look plausible. Downstream dependency management systems fetch it. Software composition analysis may record that the dependency exists, but it does not prove the maintainer's session was clean, the token was not stolen or the release contents match the maintainer's intent. The tool can name the component. It cannot always name the betrayal.

PyPI has made meaningful moves on account security, including stronger authentication requirements for critical projects. Those measures matter. They also reveal the scale of the problem. If the highest-value projects require stronger controls while the long tail remains softer, attackers can choose between direct impact and propagation. A less popular package may still be valuable if it is installed by developers with privileged environments.

Docker Hub turns credentials into deployable infrastructure

Docker Hub changes the consequences because the artifact is not a library imported into a program. It is often the program's runtime environment. A compromised image can arrive with tools, scripts, users, entrypoints, network behaviour and embedded secrets already configured. If npm and PyPI compromise the dependency layer, Docker Hub compromises the execution substrate.

Container registries also inherit a difficult identity problem. A familiar image name and publisher identity carry enormous weight. Developers pull images into local tests. CI systems use them as build stages. Kubernetes clusters deploy them. Internal platforms mirror them. In many organisations the image is treated as operational infrastructure before anyone has inspected its provenance with the same care applied to application code.
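One small illustration of how much weight the name carries: a tag can be repointed by whoever controls the publishing account, while a digest reference names specific content. The sketch below flags Dockerfile base images that are referenced by name or tag only. The `unpinned_base_images` helper and the example Dockerfile are hypothetical, and the parsing is deliberately simplistic (no build-argument expansion, case-sensitive stage names).

```python
import re

# Matches a FROM instruction and captures the image reference.
FROM_LINE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
STAGE_ALIAS = re.compile(r"\s+AS\s+(\S+)\s*$", re.IGNORECASE)


def unpinned_base_images(dockerfile_text):
    """Return base images that are not pinned to a sha256 digest."""
    stages = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        match = FROM_LINE.match(line)
        if not match:
            continue
        image = match.group(1)
        is_local = image.lower() == "scratch" or image in stages
        if not is_local and "@sha256:" not in image:
            unpinned.append(image)
        alias = STAGE_ALIAS.search(line)
        if alias:
            stages.add(alias.group(1))
    return unpinned


# Hypothetical Dockerfile; the digest is a placeholder, not a real image.
EXAMPLE_DOCKERFILE = (
    "FROM python:3.12-slim AS build\n"
    "RUN pip install --no-cache-dir build\n"
    "FROM build AS test\n"
    "FROM example/runtime@sha256:" + "0" * 64 + "\n"
)

print(unpinned_base_images(EXAMPLE_DOCKERFILE))
```

Pinning by digest does not prove the image is benign. It only ensures that a later change to the tag cannot silently change what is deployed.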

Credential harvesting through Docker Hub can work in both directions. A compromised publisher account can distribute an image that steals environment variables, cloud metadata, registry credentials or mounted secrets. A compromised consumer environment can yield Docker credentials that allow further image publication or private registry access. The attacker can move between package ecosystems and container ecosystems because modern build pipelines already connect them.

This is where the phrase "attack infrastructure" becomes precise. Docker Hub is not only a target. It can be used to stage payloads, host credible artifacts, blend into normal deployment traffic and reach environments that block more obvious malware delivery paths. The trust relationship is doing the smuggling.

SCA describes the dependency, not the publication event

Traditional Software Composition Analysis tools still have value. They identify known vulnerable components, build inventories and help organisations understand what they depend on. The research brief notes that traditional SCA may not provide complete protection against open-source and supply-chain threats. That limitation is not a failure of implementation. It is a boundary of the model.

SCA usually answers questions after dependency selection. What packages are present? What versions are in use? What licences apply? Which known vulnerabilities map to those versions? Those are necessary questions. They are not sufficient against credential-driven publication abuse.

The harder questions are about the release event itself. Was this version published from the maintainer's normal environment? Was the authentication method stronger or weaker than previous releases? Did the package contents change in ways inconsistent with the project history? Did a new maintainer suddenly gain publication rights? Did an image add an unexpected entrypoint, binary blob or network client? Did a package begin collecting environment variables, reading credential files or reaching unfamiliar endpoints?
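Several of these questions reduce to comparing one release event with the one before it. The sketch below is a toy model under loud assumptions: the `ReleaseEvent` fields are hypothetical, and registries do not necessarily expose this metadata in this form or at all.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReleaseEvent:
    """Hypothetical metadata about a single publication event."""
    version: str
    publisher: str
    used_mfa: bool
    has_install_script: bool
    maintainers: frozenset = field(default_factory=frozenset)


def release_flags(previous, current):
    """Return human-readable reasons why `current` deviates from `previous`."""
    flags = []
    if current.publisher != previous.publisher:
        flags.append("publisher changed")
    if previous.used_mfa and not current.used_mfa:
        flags.append("weaker authentication than previous release")
    if current.has_install_script and not previous.has_install_script:
        flags.append("install script added")
    new_maintainers = current.maintainers - previous.maintainers
    if new_maintainers:
        flags.append("new maintainers: " + ", ".join(sorted(new_maintainers)))
    return flags


# Invented example data.
PREVIOUS = ReleaseEvent("1.4.2", "alice", True, False, frozenset({"alice"}))
CURRENT = ReleaseEvent("1.4.3", "alice", False, True, frozenset({"alice", "mallory"}))

for flag in release_flags(PREVIOUS, CURRENT):
    print(flag)
```

None of these flags is conclusive on its own. The point is that they describe the publication event rather than the component, which is the information a dependency inventory does not hold.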

Those questions require identity assurance, behavioural baselines, provenance and registry-side controls. They do not fit neatly into a dependency inventory spreadsheet. A malicious package can be brand new, so it has no CVE. It can use a legitimate package name, so it is not typo-squatting. It can be published by a valid maintainer account, so the registry event looks authorised. It can be removed quickly, so incident responders may find only downstream traces.

The practical result is that defenders often learn about these attacks from side effects: unexpected outbound connections, stolen tokens, poisoned CI jobs, unusual package updates or public reports from registry malware teams. By then the registry has already amplified the attack.

The maintainer is carrying the platform's risk

The most uncomfortable part of this problem is that the burden falls on maintainers who were never resourced like infrastructure operators. Open-source maintainers are expected to protect accounts, avoid phishing, secure local machines, manage tokens, review releases, respond to reports and absorb abuse when something goes wrong. Many do this for projects maintained in spare time, without security staff or operational budget.

That mismatch is not incidental. The software industry has externalised a large amount of infrastructure risk onto people who did not agree to operate critical infrastructure. A popular npm package, PyPI module or Docker image can become part of production systems across banks, hospitals, software vendors and government bodies. The maintainer may still be one person with a laptop, a password manager and a full-time job elsewhere.

Attackers understand this asymmetry. It is cheaper to phish a maintainer than to compromise every consumer. It is cheaper to steal a token from a CI log than to exploit hardened production systems directly. It is cheaper to publish through a trusted registry than to persuade a firewall to allow an unknown binary from an unknown domain. The supply chain is attractive because the hard work of distribution has already been done.

This does not mean maintainers are careless. It means the system asks them to provide assurance far beyond what the surrounding institutions support. Registry operators, package consumers and commercial beneficiaries all depend on maintainer security. Too often, only the maintainer is treated as responsible for delivering it.

What changes the economics

The defences that matter most are the ones that reduce the value of stolen credentials or slow their conversion into trusted artifacts.

Mandatory phishing-resistant multi-factor authentication should be the floor for package and image publication, not a premium control for famous projects. Token scope should be narrow by default. A token used by one automation job should not publish every package an account controls. Long-lived tokens should be treated as toxic assets. Publication from new environments should carry friction, particularly when the release includes install scripts, native binaries, credential access patterns or container entrypoint changes.
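Narrow scope and short lifetime can be expressed as a simple policy check. The token model below is hypothetical and does not mirror any registry's actual token format. The 30-day lifetime limit is an assumed policy value chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PublishToken:
    """Hypothetical publish token: which packages it covers and for how long."""
    allowed_packages: frozenset
    issued_at: datetime
    expires_at: datetime


def token_problems(token, package, now, max_lifetime=timedelta(days=30)):
    """Return the reasons this token should not be used to publish `package`."""
    problems = []
    if package not in token.allowed_packages:
        problems.append("package outside token scope")
    if now >= token.expires_at:
        problems.append("token expired")
    if token.expires_at - token.issued_at > max_lifetime:
        problems.append("token lifetime exceeds policy")
    return problems


# Invented example: a CI token scoped to one package and valid for one week.
ISSUED = datetime(2026, 1, 1, tzinfo=timezone.utc)
CI_TOKEN = PublishToken(frozenset({"example-lib"}), ISSUED, ISSUED + timedelta(days=7))
NOW = ISSUED + timedelta(days=1)

print(token_problems(CI_TOKEN, "example-lib", NOW))
print(token_problems(CI_TOKEN, "other-lib", NOW))
```

A stolen token that passes this kind of check is still dangerous, but it can publish one package for a few days rather than everything an account controls indefinitely.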

Provenance also needs to become enforceable rather than decorative. Consumers should be able to require that a package or image came from a specific repository, workflow and signing identity. Registries should make suspicious publication events harder to complete quickly. Organisations should pin dependencies where operationally possible, mirror critical artifacts and monitor for unexpected publication changes in the packages and images they rely on.
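Pinning is the simplest of these controls to illustrate. A minimal sketch, assuming the consumer recorded a SHA-256 digest when the artifact was reviewed: anything fetched later must hash to the same value. The artifact bytes and the `matches_pin` helper are placeholders for the example.

```python
import hashlib
import hmac


def matches_pin(artifact_bytes, pinned_sha256):
    """Return True if the artifact hashes to the digest recorded at review time."""
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    return hmac.compare_digest(actual, pinned_sha256.lower())


# Placeholder content standing in for a downloaded wheel, tarball or image layer.
REVIEWED_ARTIFACT = b"example artifact contents"
PINNED_DIGEST = hashlib.sha256(REVIEWED_ARTIFACT).hexdigest()

print(matches_pin(REVIEWED_ARTIFACT, PINNED_DIGEST))
print(matches_pin(b"tampered contents", PINNED_DIGEST))
```

Package managers already offer versions of this check. pip, for example, has a hash-checking mode that is enabled with `--require-hashes` and `--hash=sha256:...` entries in a requirements file.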

None of this removes the need to inspect code. It changes what inspection is attached to. The release event matters as much as the artifact. A tarball or image digest can tell you what was delivered. Provenance and identity controls help answer why the system believed it.
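A digest check answers what was delivered. An enforceable provenance policy asks where it came from. The sketch below uses a flat, hypothetical attestation dictionary and does not model any real attestation format. The repository URL, workflow name and signer identity are invented.

```python
def provenance_violations(attestation, policy):
    """Return the policy fields that the attestation fails to match."""
    return sorted(
        key for key, expected in policy.items()
        if attestation.get(key) != expected
    )


# Hypothetical policy a consumer might require for one dependency.
POLICY = {
    "repository": "https://example.invalid/org/example-lib",
    "workflow": "release.yml",
    "signer": "release-bot@example.invalid",
}

# Hypothetical attestation for a release built by an unexpected workflow.
ATTESTATION = dict(POLICY, workflow="manual-upload")

print(provenance_violations(ATTESTATION, POLICY))
```

The design choice is that a missing field counts as a violation. A release with no provenance at all fails the policy rather than passing by default.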

There is also a cultural correction needed. Organisations that depend heavily on open source cannot treat maintainer security as an externality and then act surprised when attackers exploit it. Funding maintainers, supporting registry hardening, contributing verification tooling and reducing unnecessary dependencies are security measures. They are less theatrical than buying another dashboard, which may explain why they are less popular.

The attack will follow the credentials

The common thread across npm, PyPI and Docker Hub is not a specific package manager feature. It is the use of developer identity as a distribution credential. Wherever a stolen credential can cause trusted infrastructure to publish, build or deploy something, attackers will keep shaping campaigns around that credential.

This is why the problem will not stay inside one registry. Developer workstations, CI systems, source hosts, package registries, container registries and cloud platforms are now one connected identity surface. A token stolen in one layer is often useful in another. A malicious dependency can reach a CI runner. A poisoned image can reach cloud credentials. A phished maintainer can become a publisher for an ecosystem they never meant to endanger.

The industry spent years treating supply-chain compromise as a problem of bad code entering good systems. That framing is too small. The more durable problem is stolen trust entering automated systems that are designed to obey it. npm, PyPI and Docker Hub are not broken because they distribute software quickly. They are dangerous because they distribute trust quickly, and trust remains much easier to steal than to rebuild.
