Gotenberg ExifTool Stdin Injection via Metadata Value Newlines - Incomplete Sanitization Bypass

Vulnerability Description

Gotenberg v8.30.1 introduced key-level sanitization (commit 405f106) via safeKeyPattern regex to block dangerous ExifTool pseudo-tags in metadata keys. However, the fix incompletely addresses the attack surface by validating keys while leaving metadata values entirely unsanitized. The go-exiftool library passes key/value pairs to ExifTool's stdin using fmt.Fprintln(e.stdin, "-"+k+"="+str), which treats each line as a separate argument. An attacker inserting a newline character (\n) within a metadata value causes the stdin buffer to split into multiple lines, effectively injecting a second ExifTool argument. This bypasses the key blocklist entirely by allowing attackers to inject any pseudo-tag (e.g., -FileName, -Directory, -SymLink, -HardLink) in the injected argument, not the key field.

Proof-of-Concept Significance

The disclosed PoC is highly reliable and requires minimal preconditions: Only HTTP POST access to the metadata write endpoint with attacker-controlled metadata values. Docker verification confirms reproducibility—the container returned HTTP 404 (indicating file relocation via -FileName injection) and created proof files in /tmp/. This demonstrates that the vulnerability is not theoretical; it reliably achieves arbitrary file system operations within the container context. The PoC proves that input validation based on key-only checks is fundamentally insufficient when underlying libraries process key/value pairs line-by-line without additional escaping.

Detection Guidance

Log Indicators:

HTTP POST requests to metadata endpoints with newline characters (%0A, \n) in request body JSON/form fields
ExifTool process logs showing unexpected pseudo-tag arguments (search for FileName=, Directory=, SymLink=, HardLink= in exiftool stderr/logs)
File system events: unexpected file moves, symlink creation, or hardlink creation triggered by document processing workflows

YARA Rule Pattern (generic):

rule Gotenberg_Metadata_Injection {
  strings:
    $newline_in_value = /"[^"]*\\n[^"]*"/
    $dangerous_tag = /-(FileName|Directory|SymLink|HardLink|MakerNote)\s*=/
  condition:
    any of them
}

Network Detection:

Monitor for POST requests to /api/*/metadata endpoints containing URL-encoded newlines in JSON payloads
Watch for ExifTool process spawns with suspicious pseudo-tag arguments in process argv

Mitigation Steps

Immediate: Upgrade Gotenberg to a patched version (> v8.30.1) that sanitizes both keys AND values
Workaround (if upgrade delayed): Implement application-level metadata value validation—reject any metadata values containing control characters (\n, \r, \x00, \t)
Input Filtering: Strip or reject newline characters from all metadata values at the HTTP handler level before passing to go-exiftool
Process Isolation: Run Gotenberg/ExifTool in restricted containers with read-only filesystems where possible, limiting blast radius of file manipulation
Audit: Review all Gotenberg instances for logs of metadata requests with unusual characters; investigate any 404 responses or unexpected file system events

Risk Assessment

Likelihood of Exploitation: Very High. The vulnerability requires only HTTP access and standard tooling (curl, Python requests). No authentication bypass needed if metadata endpoint is exposed. Attackers can achieve arbitrary file deletion, relocation, or symlink attacks—enabling lateral movement or denial of service in multi-tenant environments.

Threat Actor Interest: High. File manipulation capabilities align with container escape, supply-chain attacks, and persistence mechanisms. Organizations running Gotenberg in CI/CD pipelines, document processing services, or SaaS platforms face elevated risk. The incomplete fix in v8.30.1 suggests real-world exploitation may have driven discovery; defenders should assume active exploitation in the wild.