The same lock keeps breaking
Git tags, package registries and AI extension marketplaces all share the same authentication failure - and attackers have noticed the pattern before defenders did.
Every few months a supply chain attack surfaces that feels new. A GitHub Action rewrites its own tags. A package registry serves a dependency that nobody audited. An IDE extension quietly exfiltrates credentials. The disclosure lands, the incident response playbook runs and the community patches the specific hole. Then it happens again somewhere else.
The details change but the mechanism does not. Underneath every one of these incidents is the same structural failure: a mutable reference, an unverified publisher and a consumer that treats the reference as proof of integrity. Whether the reference is a Git tag, a package version string or an extension marketplace listing, the authentication model is functionally identical - and identically broken.
Mutable pointers as trust anchors
The tj-actions attack in March 2025 was the cleanest demonstration. The attacker overwrote every version tag of tj-actions/changed-files - from v1 through v45 - to point at a single malicious commit. Any CI pipeline that referenced the action by tag pulled the payload. Over 23,000 repositories were exposed.
This worked because Git tags are mutable by design. A tag is a named pointer to a commit. Anyone with push access can move it. When a workflow file says uses: tj-actions/changed-files@v44, GitHub resolves that tag to whatever commit it currently points at. There is no verification that the commit matches what was there when the developer wrote the reference. There is no signature. There is no content hash. The tag is the entire trust model and the tag can be silently changed.
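The failure mode is easy to see in miniature. A toy resolver - illustrative only, not GitHub's actual plumbing - shows why a consumer that follows the tag and a consumer that pinned the SHA end up in different places after a retag:

```python
# Toy model of Git reference resolution: tags are mutable pointers,
# commit SHAs are content-addressed and immutable. Illustrative only.

tags = {"v44": "a3a2d94"}           # tag -> commit; anyone with push access can move this
commits = {"a3a2d94": "safe code"}  # commit SHA -> content; cannot be swapped in place

def resolve_by_tag(tag: str) -> str:
    """What `uses: owner/repo@v44` effectively does: follow the pointer."""
    return commits[tags[tag]]

def resolve_by_sha(sha: str) -> str:
    """What SHA pinning does: fetch the exact commit, wherever the tag now points."""
    return commits[sha]

# Attacker with push access adds a malicious commit and retags v44 onto it.
commits["deadbee"] = "malicious payload"
tags["v44"] = "deadbee"

print(resolve_by_tag("v44"))      # the tag consumer now pulls the payload
print(resolve_by_sha("a3a2d94"))  # the SHA consumer still gets the original commit
```

The retag is invisible to the tag consumer: the reference string in its workflow file never changed, only what the string resolves to.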
Pan et al.'s study of CI/CD pipeline security across 320,000 repositories, published in IEEE Transactions on Dependable and Secure Computing in 2024, measured the scope of this exposure. They found that 94.93% of GitHub Actions references use tags rather than commit SHAs. The alternative - pinning to an immutable SHA like uses: tj-actions/changed-files@a3a2d9464ed2ef65c5e6bcfaa74a26900e4e9c6d - was used by fewer than 5% of repositories. The ecosystem's trust anchor is almost universally a mutable string.
The fix is obvious. The adoption is glacial. Pan et al. also found that action version references had an average update lag of 11.04 months. Even after the tj-actions incident, whose remediation advice was to switch to SHA pinning, the structural incentives run against adoption. Tags are readable. SHAs are not. Developers choose convenience and the attack surface stays open.
The same shape in package registries
Package registries - npm, PyPI, RubyGems, NuGet - solved some of these problems years ago. Published packages are generally immutable once a version is released. npm's package-lock.json pins to both a version and an integrity hash. Python's pip supports hash checking. These mechanisms exist because the package ecosystem learned the hard way, through typosquatting, dependency confusion and the event-stream incident in 2018, that version strings alone are not sufficient.
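npm's integrity field follows the Subresource Integrity format: the hash algorithm, a dash, then the base64-encoded digest of the package tarball. The mechanism fits in a few lines - the tarball bytes here are a stand-in:

```python
import base64
import hashlib

def integrity(data: bytes, algorithm: str = "sha512") -> str:
    """Compute a Subresource Integrity string in the format npm stores
    in package-lock.json: '<alg>-<base64 digest>' over the package tarball."""
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode()}"

def verify(data: bytes, expected: str) -> bool:
    """Re-derive the integrity string from the fetched bytes and compare.
    If the registry serves different content for the same version, this fails."""
    algorithm = expected.split("-", 1)[0]
    return integrity(data, algorithm) == expected

tarball = b"stand-in for a package tarball"
pin = integrity(tarball)

assert verify(tarball, pin)             # untouched content verifies
assert not verify(tarball + b"!", pin)  # any modification breaks the pin
```

The crucial property is that the pin is derived from content, not from a name: a registry compromise can change what a version string serves, but it cannot make different bytes hash to the recorded digest.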
But the newer registries that have emerged alongside AI tooling have not absorbed those lessons. MCP registries are the most visible example. When Wiz audited the major MCP server registries in early 2025, they found roughly 3,500 listed servers. Of those, approximately 100 pointed to GitHub repositories that did not exist anymore - ghost packages waiting for someone to claim the matching repository name and inject arbitrary code. This is dependency confusion transplanted from npm to a registry with even fewer safeguards.
The MCP ecosystem as of early 2026 has no package signing. No version pinning mechanism in the protocol specification. No lockfile equivalent. No reproducible builds. No publisher identity verification - Wiz demonstrated this by creating a fake "Azure MCP Server" and receiving registry labels like "verified" and "official" without any check that the publisher was Microsoft.
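Nothing stops a client from imposing its own pin, even though the protocol offers none. A minimal sketch of what a client-side lockfile check could look like - the lockfile format here is invented for illustration, since no such mechanism exists in the MCP specification:

```python
import hashlib

# Hypothetical client-side pin for an MCP server artifact. The MCP spec has
# no lockfile, so this structure is invented: map a server name to the
# sha256 of the artifact bytes that were actually audited.

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

audited_artifact = b"mcp server bundle as reviewed"
lockfile = {"example-server": sha256_hex(audited_artifact)}

def install(name: str, fetched: bytes) -> bytes:
    """Refuse to install unless the fetched bytes match the audited pin."""
    if sha256_hex(fetched) != lockfile[name]:
        raise ValueError(f"{name}: content does not match pinned digest")
    return fetched

install("example-server", audited_artifact)      # matches the pin, accepted
# install("example-server", b"swapped payload")  # would raise ValueError
```

A ghost package claimed by an attacker fails this check immediately: whatever the registry entry resolves to, it is no longer the bytes that were pinned.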
The trust model is a name string pointing to a mutable resource. The same pattern, the same weakness, the same exploitability. Only the substrate changed.
Extension marketplaces and the identity gap
Visual Studio Code's extension ecosystem introduces a third variation. The Visual Studio Marketplace and the Open VSX Registry host thousands of extensions that developers install with a single click. These extensions execute with the full permissions of the VS Code process - they can read files, write files, access the terminal, make network requests and interact with every API the IDE exposes.
The publisher verification model for both marketplaces is thin. Publishing a VS Code extension requires a publisher account. The account's identity is self-asserted. There is no code signing requirement. There are no reproducible builds. The marketplace runs automated malware scans, but these catch known signatures, not novel supply chain payloads.
The attack surface mirrors what happened with MCP registries and what happened with GitHub Actions before that. A developer searches for an extension, sees a name and a download count, and installs it. The name is the trust signal. The download count is the social proof. Neither attests to what the code actually does.
Research into extension marketplace security has demonstrated repeated instances of malicious extensions masquerading as popular tools. Typosquatting is common - an extension named Prettir instead of Prettier, or Python-Linter instead of Pylint. The attack does not require compromising an existing publisher. It requires creating a new name that looks close enough to a trusted one in a registry that does not enforce uniqueness constraints rigorously.
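The near-miss names are cheap to catch mechanically. A sketch of a registry-side check using Levenshtein distance - the popular-name list is illustrative, and compound squats like Python-Linter would need token-level heuristics on top of a plain distance check:

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Illustrative watchlist - a real registry would use its own popularity data.
POPULAR = {"prettier", "pylint", "eslint"}

def looks_like_typosquat(name: str, max_distance: int = 2) -> bool:
    """Flag names that are near-misses of a popular extension but not exact."""
    lowered = name.lower()
    if lowered in POPULAR:
        return False
    return any(edit_distance(lowered, p) <= max_distance for p in POPULAR)

print(looks_like_typosquat("Prettir"))   # True - one edit away from "prettier"
```

That the check is this simple makes the gap more striking: nothing in the marketplace's publishing pipeline is required to run it before a confusable name goes live.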
The pattern compounds when AI coding assistants enter the picture. Extensions that integrate with language models - Copilot competitors, code review tools, agentic coding assistants - request broad workspace access because they need it to function. A malicious extension posing as an AI assistant gets read access to every file in the workspace, write access to the filesystem and network access for "model inference" that could be exfiltration. The prompt injection vectors documented across Cursor, Claude Code and Amazon Q all apply equally to any extension that processes workspace content through a language model.
The authentication failure is always the same
Strip away the specifics and the structure is identical across all three surfaces:
A mutable reference - a Git tag, a package name at a version, an extension listing - serves as the trust anchor. The consumer resolves the reference at install or execution time and trusts whatever it gets back.
An unverified publisher identity - the reference points to a resource controlled by whoever currently holds the account, the repository or the registry entry. The identity behind it is either self-asserted or one compromise away from being someone else entirely.
No content-addressable verification - the consumer does not check whether the content at the reference matches a known-good hash, signature or attestation. The reference is the verification. If the reference moves, the consumer follows it.
This is not three different problems. It is one problem expressed in three different packaging formats. And each time a new distribution mechanism appears - GitHub Actions in 2019, MCP servers in 2024, agentic skill marketplaces in 2025 - the same pattern recurs because the new ecosystem's builders optimise for adoption speed over supply chain integrity.
Why the fix keeps not happening
The technical solutions are well understood. SHA pinning for Git references. Content hashes in lockfiles for packages. Code signing and provenance attestation for extensions. SLSA framework compliance for build pipelines. Sigstore for keyless signing. These exist, they work and they remain minority practices.
The adoption gap is a coordination problem. Each individual developer bears the inconvenience of SHA pinning but the security benefit is distributed across everyone who depends on their configuration. A registry that requires code signing raises the barrier to publishing and risks losing contributors to a competitor that does not. A marketplace that enforces reproducible builds needs infrastructure that nobody wants to fund until after the incident that proves it was necessary.
The hardware security community has been grappling with an analogous problem for decades. Boit et al.'s survey of hardware reverse engineering research, published in March 2025, identifies supply-chain assurance as one of the foundational motivations for the entire field. Integrated circuit verification exists because chips fabricated in untrusted facilities cannot be trusted based on their label alone - you have to inspect the physical artifact. The same logic applies to software artifacts. A package name is a label. A Git tag is a label. An extension listing is a label. Labels can be wrong.
The difference is that hardware verification infrastructure was built because the cost of a compromised chip in a defence system was immediately, catastrophically obvious. Software supply chain verification infrastructure is being built incrementally because the cost of a compromised npm package or a poisoned GitHub Action is distributed across thousands of small incidents rather than one dramatic failure. The aggregate cost may be larger, but the per-incident salience is lower. So the investment lags.
The compounding layer
What makes the current moment different from the npm supply chain concerns of 2018 or the SolarWinds conversation of 2020 is the addition of AI tools that read and act on workspace content. Traditional supply chain attacks had to deliver executable code. A compromised dependency had to contain code that a runtime would execute. A poisoned GitHub Action had to run in a CI pipeline.
AI coding assistants dissolve that constraint. A malicious payload does not need to be syntactically valid code. It needs to be text that a language model will interpret as an instruction. A comment in a configuration file works. A paragraph in a README works. A poisoned skill file works. The attack surface expanded from "code that runs" to "text that is read" and the authentication model did not update to match.
This means every attack surface discussed above - mutable Git tags, unverified package registries, self-asserted extension identities - now feeds into a secondary attack surface where the content delivered through those channels is processed by a model with system-level capabilities. A compromised MCP server does not just serve bad tool definitions. It serves tool definitions that can redirect the model's behaviour across every other tool connected to the same client. A malicious VS Code extension does not just run its own code. It can inject context that influences every AI assistant operating in the same workspace.
The compound risk is multiplicative. Each unverified trust boundary amplifies the impact of every other one.
What would move the needle
Incremental improvements within each ecosystem - better registry scanning, SHA pinning nudges in GitHub's UI, Marketplace malware detection - will help at the margins. They address symptoms. The structural problem is that every new software distribution surface reinvents the same weak trust model because it is the easiest thing to ship.
Three changes would alter the trajectory:
Default-immutable references. Git tags should be immutable by default, with mutation requiring an explicit and auditable override. GitHub could implement this tomorrow for Actions specifically - pinning workflows to the resolved SHA at commit time and warning when the upstream tag moves. npm already does something similar with lockfiles. The principle needs to be the platform default, not an opt-in best practice that 5% of developers adopt.
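The commit-time pinning idea can be sketched as a rewrite pass over a workflow file. The resolver here is a stubbed lookup table - a real tool would ask Git which SHA the tag currently points at - and the resolved SHA is kept alongside the human-readable tag as a comment, so pinning costs nothing in readability:

```python
import re

# Stubbed resolver: (action, tag) -> commit SHA. A real tool would query
# the remote repository at the moment the workflow is committed.
RESOLVED = {
    ("tj-actions/changed-files", "v44"):
        "a3a2d9464ed2ef65c5e6bcfaa74a26900e4e9c6d",
}

USES = re.compile(r"(uses:\s*)([\w./-]+)@(v[\w.]+)")

def pin_line(line: str) -> str:
    """Rewrite `uses: owner/repo@tag` to the resolved SHA, keeping the tag
    as a trailing comment. Lines without a resolvable tag pass through."""
    match = USES.search(line)
    if not match:
        return line
    prefix, action, tag = match.groups()
    sha = RESOLVED.get((action, tag))
    if sha is None:
        return line
    return USES.sub(f"{prefix}{action}@{sha}  # {tag}", line)

print(pin_line("      uses: tj-actions/changed-files@v44"))
```

After the rewrite, a moved upstream tag no longer changes what the workflow executes - at worst the trailing comment drifts out of date, which is a warning condition rather than a compromise.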
Cross-ecosystem provenance. Sigstore and SLSA provide a model for signed provenance that works across packaging formats. If a Git commit, an npm package and a VS Code extension all carry a signed attestation linking them to a verified build from a verified source, the mutable reference problem becomes less critical because the consumer can verify the chain regardless of which registry delivered it. This requires registries to mandate provenance rather than offer it as optional metadata.
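The verification step a consumer would run can be sketched as follows. The HMAC key stands in for Sigstore's certificate-based signatures and the builder identity string is illustrative; the point is that trust derives from the digest, the signature and the builder identity, not from whichever registry delivered the artifact:

```python
import hashlib
import hmac
import json

# Sketch in the SLSA/Sigstore spirit: an attestation binds an artifact
# digest to a builder identity, and the consumer checks the signature, the
# digest and the builder before trusting the artifact. The shared HMAC key
# is a stand-in for real certificate-based signing infrastructure.
SIGNING_KEY = b"stand-in-for-real-signing-keys"
TRUSTED_BUILDERS = {"https://builder.example/trusted-ci"}  # illustrative identity

def attest(artifact: bytes, builder: str) -> dict:
    payload = {"digest": hashlib.sha256(artifact).hexdigest(), "builder": builder}
    body = json.dumps(payload, sort_keys=True).encode()
    return {"payload": payload, "sig": hmac.new(SIGNING_KEY, body, "sha256").hexdigest()}

def verify(artifact: bytes, att: dict) -> bool:
    body = json.dumps(att["payload"], sort_keys=True).encode()
    ok_sig = hmac.compare_digest(att["sig"], hmac.new(SIGNING_KEY, body, "sha256").hexdigest())
    ok_digest = att["payload"]["digest"] == hashlib.sha256(artifact).hexdigest()
    ok_builder = att["payload"]["builder"] in TRUSTED_BUILDERS
    return ok_sig and ok_digest and ok_builder

artifact = b"extension bundle"
att = attest(artifact, "https://builder.example/trusted-ci")
print(verify(artifact, att))     # True
print(verify(b"tampered", att))  # False - digest no longer matches
```

Because the check is over content and signed claims, it works identically whether the artifact arrived as a Git checkout, an npm tarball or a marketplace download - which is exactly what makes the mutable reference less load-bearing.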
Publisher identity as a verified claim. Self-asserted publisher names are the weakest link across every marketplace. Binding publisher identity to a verified organisational identity - through something like the Sigstore certificate authority's OIDC integration - raises the cost of impersonation from "create a new account" to "compromise a verified identity provider." This is not foolproof. But it forces the attacker up the difficulty curve.
None of these are novel proposals. All of them have existed in draft specifications, conference talks and blog posts for years. The gap is between knowing what to build and having the economic incentive to build it before the next incident rather than after.
The lock keeps breaking because we keep installing the same one
The supply chain security conversation tends to fragment by ecosystem. GitHub Actions security is discussed by CI/CD specialists. npm supply chain attacks are discussed by JavaScript developers. MCP server risks are discussed by AI security researchers. VS Code extension threats are discussed by IDE teams. Each community develops its own detection tools, its own best practices and its own incident response playbooks.
But the attacker does not respect those boundaries. The tj-actions attack chain traversed SpotBugs, reviewdog and tj-actions across four months. The ClawHavoc campaign planted nearly 1,200 malicious skills in a marketplace to target developers using AI coding tools. The pattern is not "Git is insecure" or "MCP is insecure" or "VS Code extensions are insecure." The pattern is that every distribution mechanism that relies on a mutable name as its trust anchor will eventually be exploited through that mutability.
Until the default changes - until immutable references, verified identities and content-addressable distribution become the boring infrastructure that every new platform ships with rather than the optional hardening that security teams advocate for after a breach - the lock will keep breaking. Not because the attacks are sophisticated. Because the lock was never designed to hold.