
Project NOMAD PR #823: a hardcoded HMAC secret was real, but the fix was incomplete

A threat model analysis of Project NOMAD PR #823, where a hardcoded benchmark HMAC secret was a valid CWE-798 finding but the maintainer was right that client-side relocation was not enough.

Project NOMAD closed PR #823 without merge after I reported a hardcoded HMAC secret in its benchmark submission client. The finding was real. The maintainer said so directly. The proposed patch was also incomplete, because moving one global client secret out of benchmark_service.ts does not make that secret safe to distribute to every open-source installation.

That is the useful version of a disputed fix. This was not a maintainer saying hardcoded credentials are acceptable. It was a disagreement about where the security boundary actually sits. I reported a source-level CWE-798 issue. The maintainer answered with an architecture-level objection: a public client cannot keep a shared signing key private merely by changing which file contains it.

What the original code trusted

The vulnerable code lived in admin/app/services/benchmark_service.ts. The service builds a benchmark submission from local results, signs it and posts it to the central benchmark collector at benchmark.projectnomad.us.

The relevant flow in submitToRepository() was direct:

import { createHmac } from 'node:crypto'
import axios from 'axios'

const timestamp = Date.now().toString()
const payload = timestamp + JSON.stringify(submission)
const signature = createHmac('sha256', BENCHMARK_HMAC_SECRET)
  .update(payload)
  .digest('hex')

await axios.post(
  'https://benchmark.projectnomad.us/api/v1/submit',
  submission,
  {
    timeout: 30000,
    headers: {
      'X-NOMAD-Timestamp': timestamp,
      'X-NOMAD-Signature': signature,
    },
  }
)

The HMAC key was defined as a module-level constant in the same public TypeScript file. I am not repeating the full value here. It is already present in the public PR record and should be treated as compromised.

The surrounding comment described the secret as basic protection against casual API abuse, noted that a determined attacker could extract it because NOMAD is open source and pointed to challenge-response authentication for stronger protection. That comment was unusually honest. It also described the problem. If a signing key is committed to a public repository and reused across every installation, extraction is not much of a hurdle.

The impact is bounded but concrete: forged benchmark submissions, leaderboard or aggregate-data poisoning, spam through a supposedly verified channel and cheap load against the collector. This is not remote code execution on a NOMAD host. It is still a trust failure if the collector relies on X-NOMAD-Signature to distinguish genuine NOMAD submissions from arbitrary internet traffic.
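Forgery is cheap to demonstrate. A minimal sketch of what an attacker can do with the leaked key, assuming only the request format visible in the public client; the key is redacted to a placeholder here and the submission fields are invented for illustration:

```typescript
import { createHmac } from 'node:crypto'

// Placeholder: the real value sits in public repository history and the
// PR record, so any reader of the repository already has it.
const LEAKED_SECRET = '<value from public repository history>'

// An entirely fabricated result. The collector cannot distinguish this
// from output produced by a genuine installation.
const forged = { model: 'whatever', tokensPerSecond: 9999, gpu: 'imaginary' }

const timestamp = Date.now().toString()
const payload = timestamp + JSON.stringify(forged)
const signature = createHmac('sha256', LEAKED_SECRET)
  .update(payload)
  .digest('hex')

// These headers pass exactly the same check a genuine client passes.
console.log({ 'X-NOMAD-Timestamp': timestamp, 'X-NOMAD-Signature': signature })
```

Nothing here requires running NOMAD at all; a dozen lines and the public key material are enough.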

What PR #823 changed

The fix I submitted was intentionally narrow. It removed the literal value from benchmark_service.ts, imported the AdonisJS environment helper and read the key from the BENCHMARK_HMAC_SECRET environment variable:

import env from '#start/env'
 
const BENCHMARK_HMAC_SECRET = env.get('BENCHMARK_HMAC_SECRET')

It also added a runtime guard inside submitToRepository():

if (!BENCHMARK_HMAC_SECRET) {
  throw new Error(
    'Benchmark submission signing secret is not configured. Set the BENCHMARK_HMAC_SECRET environment variable.'
  )
}

The patch registered BENCHMARK_HMAC_SECRET in admin/start/env.ts and documented it in admin/.env.example with a generation hint. There was no fallback to the old constant, no unsigned mode and no change to the payload format or headers.
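For context, the env.ts registration in AdonisJS v6 looks roughly like this. This is a sketch assuming the framework's standard Env.create API, not the literal contents of the PR:

```typescript
import { Env } from '@adonisjs/core/env'

export default await Env.create(new URL('../', import.meta.url), {
  // Optional at boot-time validation; submitToRepository() throws at
  // runtime if it is missing, so unrelated parts of the app still start.
  BENCHMARK_HMAC_SECRET: Env.schema.string.optional(),
})
```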

As source hygiene, that was correct. Hardcoded signing material should not live in public source control. The PR also stated the operational point that the old value had to be rotated on the benchmark server, because repository history and forks preserve committed secrets.

As a complete authentication design, it was not enough.

Why the maintainer closed it

The maintainer closed the PR with a clear explanation rather than denial. They accepted that the analysis was correct and that the patch was well constructed. The objection was that any client-side-only change merely moves the shared secret around.

Their reasoning had three practical parts:

  • putting the secret into install_nomad.sh or a management compose template would put it back into distributed source
  • requiring existing operators to set BENCHMARK_HMAC_SECRET themselves would break benchmark submissions for the install base
  • automatically fetching the secret at install time would still hand it to anyone who can run the installer

The maintainer described the current HMAC scheme as a speed bump against casual abuse, with the real defences in the private benchmark repository's server-side validation. They also said the proper fix would require per-install enrolment plus a server-side challenge-response protocol.

That position holds against the patch as a full remediation. A public client cannot safely share one global secret with every installation. If every legitimate user must receive the same value, every user and every installer path becomes a redistribution channel. The server also has no per-install identity, no targeted revocation and no useful way to distinguish one abusive client from everyone else using the same key.

The adversarial model

The adversarial model is simple. There is a public client repository. Anyone can read it, clone it, modify it and run it. There is a central benchmark collector. It accepts submissions from clients and decides whether to publish or aggregate them. There is an attacker who wants to submit forged results, spam the collector or make the benchmark data less trustworthy.

The public client is not a trusted computing base. It runs on machines the server does not control. Its local database can be edited. Its code can be modified. Its HTTP requests can be recreated with a script or proxy. Its secrets can be read if they are distributed with the client.

Under that model, a single global client-side HMAC key cannot prove that a benchmark was produced by an honest Project NOMAD installation. If the key is in source, the attacker reads it. If the key is in an installer, the attacker reads the installer. If the key is fetched during setup, the attacker runs the setup path. If the same key is given to every operator, one leak compromises the scheme for everyone.

This is where the maintainer is right. My patch would have removed an obviously bad source-code secret, but it would not have created durable source authentication. A secret distributed to every public client is not a secret. It is a request format.

Where the vulnerability remains real

The maintainer's architecture objection does not make the original code safe. It shows that the original design was weaker than a small client patch could fix.

The question is how the private collector treats X-NOMAD-Signature.

If the signature is an authenticity boundary, the vulnerability is straightforward. Anyone with the publicly known key can sign arbitrary submissions that look legitimate to the verifier. The server cannot tell whether the request came from an unmodified NOMAD installation, a modified clone or a small purpose-built script.

If the signature is only a weak anti-abuse signal, severity is lower. The issue becomes poor source hygiene and a brittle speed bump rather than broken authentication. In that model, the server must rely on other controls: validation of submitted values, replay prevention, rate limits, anomaly detection, moderation and the ability to reject impossible benchmark results.
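Those other controls are implementable even with the shared key. A minimal server-side verifier, assuming the public payload format (timestamp + JSON body) and adding a freshness window, a constant-time digest comparison and an in-memory replay cache; every name here is illustrative, not the private collector's actual code:

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

const MAX_SKEW_MS = 5 * 60 * 1000        // reject stale or future-dated requests
const seenSignatures = new Set<string>()  // naive replay cache; use a TTL store in practice

function verifySubmission(
  secret: string,
  body: string,
  timestamp: string,
  signature: string,
  now = Date.now()
): boolean {
  // Freshness: a captured request should not replay indefinitely.
  const ts = Number(timestamp)
  if (!Number.isFinite(ts) || Math.abs(now - ts) > MAX_SKEW_MS) return false

  // Replay: the same signed request must not be accepted twice.
  if (seenSignatures.has(signature)) return false

  // Constant-time comparison of expected and presented digests.
  const expected = createHmac('sha256', secret).update(timestamp + body).digest()
  const presented = Buffer.from(signature, 'hex')
  if (presented.length !== expected.length) return false
  if (!timingSafeEqual(expected, presented)) return false

  seenSignatures.add(signature)
  return true
}
```

None of this fixes the shared-key problem; it only narrows what a captured or forged request can do.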

External review cannot verify the private benchmark server. The public evidence shows that the client signs a timestamp plus JSON submission and sends the digest as X-NOMAD-Signature. The maintainer says the real defence is server-side validation in the private repository. That may be true. It is also precisely why the public code should not imply that the HMAC proves a submission came from a genuine NOMAD instance unless the protocol can actually provide that property.

A speed bump is defensible when labelled as a speed bump. It becomes a vulnerability when treated as a passport.

What a complete fix requires

A complete fix starts by treating the leaked value as dead. Once a secret has been committed to a public repository, it should be assumed compromised. Rotation is not optional if the server still accepts it.

The larger fix is server-side identity and revocation. Several designs are workable, each with trade-offs:

  • per-install enrolment, where each installation receives a distinct credential through a registration flow
  • locally generated key pairs, where the client registers a public key and signs submissions with a private key generated locally
  • account-bound benchmark publishing, where users authenticate before submitting results to the public leaderboard
  • challenge-response with nonces, canonical payloads, freshness checks and replay prevention
  • explicitly unaudited community submissions, where the product stops presenting client-submitted data as strongly authenticated
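Of these, locally generated key pairs are the option that removes shared secrets entirely. A sketch using Node's built-in Ed25519 support; the enrolment step and variable names are illustrative, not part of NOMAD:

```typescript
import { generateKeyPairSync, sign, verify } from 'node:crypto'

// Install time: each installation generates its own pair and registers
// only the public key with the collector (hypothetical enrolment flow).
const { publicKey, privateKey } = generateKeyPairSync('ed25519')

// Submission time: sign with the local private key, which never leaves
// the machine and therefore never appears in a repository or installer.
const timestamp = Date.now().toString()
const payload = Buffer.from(timestamp + JSON.stringify({ result: 42 }))
const signature = sign(null, payload, privateKey)

// Collector side: verify against the enrolled public key. A leak of one
// install's key burns one identity, not the whole ecosystem, and the
// server can revoke that single key without touching anyone else.
const ok = verify(null, payload, publicKey, signature)
```

The design choice that matters is per-install identity: revocation and rate limiting become targeted instead of global.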

None of these proves that a benchmark was honestly run on the stated hardware. A user-controlled machine can always lie about user-controlled measurements. The realistic goal is narrower: make abuse attributable, rate-limitable and revocable, then avoid making provenance claims the system cannot support.

The environment-variable change still belongs in a mature design for server operators and private deployment secrets. It just cannot be the main trust boundary for a public client ecosystem.

The broader AI tooling pattern

Project NOMAD fits a familiar pattern in AI and LLM tooling. A local tool grows a shared service: benchmarks, registries, telemetry, plugin catalogues, MCP directories or hosted dashboards. The feature feels small because the original product is still local. The trust boundary changes anyway.

Security reports then collide with operational reality. The report says a secret is public. The maintainer sees an install base, a support burden and a protocol that was never meant to withstand active forgery. Both sides can be describing true things. The failure is usually that the threat model was implicit until the bug report forced it into the open.

The same shape appears across other AI-adjacent findings. MCPHub's default admin password was a basic credential issue in tooling that sits close to agent workflows. gptme's Docker API key exposure showed how log redaction missed the process table. AIPex's localhost WebSocket bridge showed how a local control surface becomes remotely interesting once the browser is part of the path. The broader MCP server attack surface has the same trust-boundary problem: local convenience features become distributed security decisions.

The bug classes are not new. The deployment shapes are. AI tooling tends to connect local automation, user credentials, community services and hosted coordination points before it has written down which component is allowed to trust which fact. That is where decorative authentication grows.

PR #823 was a valid report with an incomplete remediation. The maintainer was right to reject the idea that relocating a global client secret would solve the authentication model. The remaining risk depends on what the private benchmark server believes about the HMAC it receives. If it treats the signature as one noisy anti-abuse input, the design is untidy. If it treats it as proof that a submission came from an honest Project NOMAD instance, the adversary has already been given the pen.
