Wrapper Secret-Memory Hardening
This page documents the security controls applied to the cloudtaser-wrapper as of Epic #12 (wrapper-secret-memory-hardening, Gold tier, wrapper v0.2.11). These controls are relevant to operators and SREs deploying CloudTaser into regulated environments who need to understand the wrapper's memory isolation guarantees, crash behaviour, and rotation lifecycle.
The Memory Protection page covers the underlying kernel mechanisms (memfd_secret, mlock, MADV_DONTDUMP, PR_SET_DUMPABLE). This page covers the implementation hardening shipped on top of those mechanisms: how secret buffers are structured, how they are zeroed on every exit path, how rotation is made safe, and how error paths are sanitised.
Secret buffer allocation: memfd_secret primary, anonymous-mmap fallback
The wrapper allocates secret buffers using a two-tier strategy:
Primary (kernel 5.14+): memfd_secret(2) — pages are removed from the kernel direct map, making them unreadable via /proc/PID/mem, /dev/mem, or kernel modules regardless of privilege. This is the recommended production path.
Fallback (kernel < 5.14): mmap(MAP_ANONYMOUS|MAP_PRIVATE) with mlock to prevent swap exposure and madvise(MADV_DONTDUMP) to exclude the region from core dumps.
A regression fixed in this Epic addressed the /dev/shm tmpfile allocation path (issue #117): under certain container runtime configurations the wrapper was taking a third path that wrote a temporary file to the shared memory filesystem before mapping it. This path is now eliminated — allocation always goes through memfd_secret or the anonymous mmap fallback, never through a filesystem-backed tmpfile.
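The two-tier strategy described above can be sketched roughly as follows. This is an illustrative, Linux-only sketch and not the wrapper's actual code: `allocSecret`, the tier labels, and the hard-coded constants are assumptions. The syscall number 447 is the `memfd_secret` number on x86_64 and arm64, and `MADV_DONTDUMP` is defined locally rather than relying on the standard library to export it.

```go
package main

import (
	"fmt"
	"syscall"
)

const (
	sysMemfdSecret = 447  // memfd_secret(2) on x86_64 and arm64 (assumed target)
	madvDontDump   = 0x10 // MADV_DONTDUMP from <sys/mman.h>
)

// allocSecret returns a writable region of size bytes plus the tier used.
func allocSecret(size int) ([]byte, string, error) {
	rw := syscall.PROT_READ | syscall.PROT_WRITE

	// Tier 1: memfd_secret. Any failure (ENOSYS, EPERM, ...) falls through.
	if fd, _, errno := syscall.Syscall(sysMemfdSecret, 0, 0, 0); errno == 0 {
		defer syscall.Close(int(fd)) // the mapping stays valid after close
		if err := syscall.Ftruncate(int(fd), int64(size)); err == nil {
			if b, err := syscall.Mmap(int(fd), 0, size, rw, syscall.MAP_SHARED); err == nil {
				return b, "memfd_secret", nil
			}
		}
	}

	// Tier 2: anonymous private mapping, locked and excluded from core dumps.
	b, err := syscall.Mmap(-1, 0, size, rw, syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, "", fmt.Errorf("mmap: %w", err)
	}
	if err := syscall.Mlock(b); err != nil {
		syscall.Munmap(b)
		return nil, "", fmt.Errorf("mlock: %w", err)
	}
	if err := syscall.Madvise(b, madvDontDump); err != nil {
		syscall.Munmap(b)
		return nil, "", fmt.Errorf("madvise: %w", err)
	}
	return b, "anonymous-mmap", nil
}

func main() {
	b, tier, err := allocSecret(4096)
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	defer syscall.Munmap(b)
	fmt.Println("allocated", len(b), "bytes via", tier)
}
```

Note that neither tier touches a filesystem path, which is the property the #117 fix restores. A strict mode would presumably return an error instead of falling through to tier 2.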
To enforce the strongest path and prevent silent fallback to anonymous mmap, set:
The wrapper will refuse to start on kernels older than 5.14 when this flag is set. This is the recommended setting for production workloads where data sovereignty guarantees are contractually required.
Guard pages around secret buffers
Each secret buffer is surrounded by PROT_NONE guard pages — one page immediately below and one page immediately above the secret region. This is the same pattern used by sodium_malloc in libsodium.
Any access that overflows or underflows the secret region by even one byte will fault against the guard page and terminate the process immediately. This converts potential buffer-overflow attacks that try to pivot into a neighbouring secret into a visible crash rather than a silent read.
The guard pages are unmapped (not merely PROT_NONE-marked) on Close(), leaving no residual virtual address range that could be probed.
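A minimal sketch of the guard-page layout, assuming a Linux target and using hypothetical names (`guarded`, `newGuarded`): the whole range is mapped `PROT_NONE` first, and only the inner window is then opened for reading and writing. It is meant to illustrate the pattern, not to reproduce the wrapper's allocator.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// guarded holds the full mapping and the usable window between the guards.
type guarded struct {
	region []byte // guard page + data pages + guard page
	Data   []byte // readable and writable secret window
}

func newGuarded(size int) (*guarded, error) {
	page := os.Getpagesize()
	dataLen := (size + page - 1) / page * page // round up to whole pages

	// Map everything PROT_NONE so the guards are never accessible.
	region, err := syscall.Mmap(-1, 0, dataLen+2*page,
		syscall.PROT_NONE, syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	inner := region[page : page+dataLen]
	if err := syscall.Mprotect(inner, syscall.PROT_READ|syscall.PROT_WRITE); err != nil {
		syscall.Munmap(region)
		return nil, err
	}
	return &guarded{region: region, Data: inner[:size:size]}, nil
}

// Close zeroes the window, then unmaps data and guards in a single call.
func (g *guarded) Close() error {
	if g.region == nil {
		return nil
	}
	for i := range g.Data {
		g.Data[i] = 0
	}
	err := syscall.Munmap(g.region)
	g.region, g.Data = nil, nil
	return err
}

func main() {
	g, err := newGuarded(100)
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	copy(g.Data, "example-secret")
	fmt.Println("window bytes:", len(g.Data), "mapping bytes:", len(g.region))
	fmt.Println("close error:", g.Close())
}
```

Go already bounds-checks slice indexing, so the guards matter mainly for accesses that bypass those checks (unsafe pointers, cgo, or other native code in the process). In this simplified layout the window starts at a page boundary, so an overflow reaches the upper guard immediately only when the secret length is a multiple of the page size.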
Per-buffer canary
A 16-byte random canary is appended to each secret buffer at allocation time. The canary value is retained in the buffer metadata.
Every call to buffer.Get() validates the canary before returning the secret bytes. If the canary has been modified — by a write-overflow from adjacent memory, by a use-after-partial-free bug, or by an attacker who obtained write access to the process address space — the wrapper aborts immediately with a fail-closed crash rather than silently returning a potentially corrupted or substituted secret.
This means that an attacker who can write-corrupt the buffer but cannot read it (for example, a write-primitive-without-read in a co-located process) cannot substitute a controlled value and have it propagate to the child process.
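The canary check can be illustrated with a small sketch. The type and function names (`canaryBuffer`, `newCanaryBuffer`, `errCanary`) are hypothetical, the backing memory is an ordinary Go slice rather than a protected mapping, and the sketch returns an error where the wrapper is described as aborting.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"errors"
	"fmt"
)

const canaryLen = 16

var errCanary = errors.New("secret buffer canary mismatch")

type canaryBuffer struct {
	mem    []byte          // secret bytes followed by the canary
	n      int             // length of the secret
	canary [canaryLen]byte // reference copy kept in the metadata
}

func newCanaryBuffer(secret []byte) (*canaryBuffer, error) {
	b := &canaryBuffer{mem: make([]byte, len(secret)+canaryLen), n: len(secret)}
	if _, err := rand.Read(b.canary[:]); err != nil {
		return nil, err
	}
	copy(b.mem, secret)
	copy(b.mem[b.n:], b.canary[:]) // append the canary after the secret
	return b, nil
}

// Get validates the trailing canary before exposing the secret bytes.
func (b *canaryBuffer) Get() ([]byte, error) {
	if subtle.ConstantTimeCompare(b.mem[b.n:], b.canary[:]) != 1 {
		return nil, errCanary // fail closed: never return a suspect secret
	}
	return b.mem[:b.n:b.n], nil
}

func main() {
	b, err := newCanaryBuffer([]byte("example-secret"))
	if err != nil {
		fmt.Println("allocation failed:", err)
		return
	}
	_, err = b.Get()
	fmt.Println("intact buffer error:", err)

	b.mem[b.n] ^= 0xff // simulate a one-byte overflow into the canary
	_, err = b.Get()
	fmt.Println("corrupted buffer error:", err)
}
```

Because the canary is random per buffer, a writer that cannot read the region has no practical way to overwrite the secret and restore a matching canary.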
Explicit zeroing on all exit paths
Secret memory is explicitly zeroed before unmapping or freeing. Zeroing happens on four distinct paths:
Normal close: secret.Buffer.Close() zeroes the entire secret region before calling munmap. The guard pages are unmapped after the zeroing is complete, leaving no window where the secret is live but the guards are gone.
Rotation: When secrets are rotated (triggered by SIGHUP), the wrapper fetches a fresh set of secrets into a new buffer before acquiring the rotation lock. On swap-over, the prior buffer's Close() is called immediately: the old secret region is zeroed and unmapped before the lock is released. Readers that take the lock therefore never find both the old and the new buffer readable at the same time.
Signal shutdown (SIGTERM/SIGINT): Signal handlers installed at wrapper startup zero all live secret buffers before allowing the process to exit. This prevents secrets from remaining in physical RAM pages that the kernel recycles to another process.
Panic/crash paths: Go's defer-based cleanup ensures Close() is called on buffer allocation failure and on early-exit paths during secret fetching. The wrapper does not rely on process exit alone to reclaim protected pages.
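The exit paths listed above can be sketched as follows, under the assumption of a simple registry of live buffers. `secretBuf`, `registry`, and `wipeAll` are hypothetical names, and the secret lives in a plain Go slice for brevity.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// secretBuf is a stand-in for the wrapper's buffer type.
type secretBuf struct {
	mu   sync.Mutex
	data []byte
}

// Close overwrites the secret with zeros, then drops the reference.
// It is safe to call more than once.
func (b *secretBuf) Close() {
	b.mu.Lock()
	defer b.mu.Unlock()
	for i := range b.data {
		b.data[i] = 0
	}
	b.data = nil
}

// registry tracks live buffers so a shutdown path can wipe all of them.
type registry struct {
	mu   sync.Mutex
	live []*secretBuf
}

func (r *registry) add(b *secretBuf) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.live = append(r.live, b)
}

// wipeAll closes every registered buffer and reports how many it closed.
func (r *registry) wipeAll() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	n := len(r.live)
	for _, b := range r.live {
		b.Close()
	}
	r.live = nil
	return n
}

func main() {
	reg := &registry{}
	buf := &secretBuf{data: []byte("example-secret")}
	reg.add(buf)
	defer buf.Close() // covers normal return and early-exit paths

	// Deferred calls do not run on os.Exit, so the signal path wipes explicitly.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	go func() {
		<-sigs
		reg.wipeAll()
		os.Exit(0)
	}()

	fmt.Println("live buffers wiped:", reg.wipeAll())
}
```

Making `Close` idempotent lets the deferred call and the signal path overlap without either one having to know whether the other already ran.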
prctl(PR_SET_DUMPABLE, 0)
The wrapper calls prctl(PR_SET_DUMPABLE, 0) early in its startup sequence, before secrets are fetched. Effects:
| Effect | What it blocks |
|---|---|
| No core dumps | Kernel will not generate core dump files; crash dumps collected by cloud provider tooling contain no secret material |
| Restricted /proc access | /proc/PID/{mem,maps,environ,syscall,stack} become unreadable by non-root processes |
| ptrace restriction | ptrace(PTRACE_ATTACH, ...) from non-parent processes is denied (defence-in-depth with eBPF enforcement) |
The wrapper verifies the setting took effect by calling prctl(PR_GET_DUMPABLE) after the prctl(PR_SET_DUMPABLE, 0) call and aborting if it does not return 0. This check was added in this Epic to catch container runtimes or SELinux policies that silently re-enable dumpable after exec.
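A set-then-verify sequence of this kind might look like the following Linux-only sketch. `disableDumpable` is a hypothetical name, and the two `prctl` option values are taken from `<linux/prctl.h>` and defined locally.

```go
package main

import (
	"fmt"
	"syscall"
)

const (
	prGetDumpable = 3 // PR_GET_DUMPABLE from <linux/prctl.h>
	prSetDumpable = 4 // PR_SET_DUMPABLE from <linux/prctl.h>
)

// disableDumpable clears the dumpable bit and reads it back to confirm.
func disableDumpable() error {
	_, _, errno := syscall.Syscall6(syscall.SYS_PRCTL, prSetDumpable, 0, 0, 0, 0, 0)
	if errno != 0 {
		return fmt.Errorf("prctl(PR_SET_DUMPABLE, 0): %w", errno)
	}
	got, _, errno := syscall.Syscall6(syscall.SYS_PRCTL, prGetDumpable, 0, 0, 0, 0, 0)
	if errno != 0 {
		return fmt.Errorf("prctl(PR_GET_DUMPABLE): %w", errno)
	}
	if got != 0 {
		return fmt.Errorf("dumpable bit is %d after prctl, want 0", got)
	}
	return nil
}

func main() {
	if err := disableDumpable(); err != nil {
		fmt.Println("refusing to continue:", err)
		return
	}
	fmt.Println("process is non-dumpable; safe to fetch secrets")
}
```

Running the check before any secret is fetched means a failure can abort the process while there is still nothing sensitive in memory.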
Static binary edge case
On Alpine/musl images without a dynamic loader, the musl start function runs before the wrapper's main(), and certain kernel configurations may reset the dumpable bit during execve. This edge case (issue #119) is tracked for BPF LSM enforcement in a future phase. In practice, the wrapper binary itself is dynamically linked against glibc/musl and this reset does not occur on supported configurations.
RLIMIT_CORE=0 (defense-in-depth)
Immediately after prctl(PR_SET_DUMPABLE, 0), the wrapper sets RLIMIT_CORE to zero (both soft and hard limits) via setrlimit(RLIMIT_CORE, {0, 0}). This provides defense-in-depth against core dump generation even if the dumpable bit is reset by the container runtime, a kernel bug, or a setuid transition.
The kernel's core dump pipeline checks RLIMIT_CORE independently of the dumpable bit. Some container runtimes and system-level core dump collectors (systemd-coredump, core_pattern piped to a collector binary) run with elevated privileges that can override PR_SET_DUMPABLE. Setting the hard rlimit to zero blocks these paths because the rlimit check occurs in the kernel's do_coredump() before the core file is written.
The wrapper reads the limit back after the setrlimit call and logs a warning if the container runtime silently overrode the setting. Such overrides have been observed on certain container runtimes with elevated security profiles that preserve a non-zero RLIMIT_CORE for crash diagnostics.
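As a rough illustration, setting and re-reading the limit takes only a few lines with the standard `syscall` package. `zeroCoreLimit` is a hypothetical helper name, and the warning text is invented for the example.

```go
package main

import (
	"fmt"
	"syscall"
)

// zeroCoreLimit sets soft and hard RLIMIT_CORE to zero and reads them back.
// overridden is true when the observed limits are not both zero.
func zeroCoreLimit() (overridden bool, err error) {
	want := syscall.Rlimit{Cur: 0, Max: 0}
	if err := syscall.Setrlimit(syscall.RLIMIT_CORE, &want); err != nil {
		return false, fmt.Errorf("setrlimit(RLIMIT_CORE): %w", err)
	}
	var got syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_CORE, &got); err != nil {
		return false, fmt.Errorf("getrlimit(RLIMIT_CORE): %w", err)
	}
	return got.Cur != 0 || got.Max != 0, nil
}

func main() {
	overridden, err := zeroCoreLimit()
	switch {
	case err != nil:
		fmt.Println("could not set core limit:", err)
	case overridden:
		fmt.Println("warning: RLIMIT_CORE is non-zero after setrlimit")
	default:
		fmt.Println("RLIMIT_CORE is zero (soft and hard)")
	}
}
```

Lowering the hard limit is a one-way step for an unprivileged process, which is what makes it useful as a backstop.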
Environment map format injection guard
The CLOUDTASER_ENV_MAP parser (internal/envmap) uses explicit type handling when converting vault secret values to environment variable strings. This prevents format injection attacks where a crafted vault value could manipulate the child process's environment.
Specific controls:
| Value type | Handling | Rationale |
|---|---|---|
| string | Passed through, but rejected if it contains newline (\n), carriage return (\r), or NUL (\x00) | Newline/CR in an env var value can inject additional environment variables in some runtimes; NUL terminates the env entry early, allowing suffix injection |
| float64 | Formatted with strconv.FormatFloat(v, 'f', -1, 64) | Avoids scientific notation (1.5e+08) which breaks parsers expecting decimal format |
| json.Number | .String() | Preserves exact numeric representation from vault JSON |
| bool | "true" or "false" literal | Deterministic representation |
| nil | Empty string | Safe empty value |
| map / slice | JSON-marshalled, then scanned for newline/CR/NUL | Structured vault values become JSON strings; injection characters in the JSON representation are still rejected |
| Unknown types | Rejected with error | Fail-closed: the wrapper refuses to inject a value it cannot represent safely |
Previously, the parser used fmt.Sprintf("%s=%v", ...) which delegated formatting to Go's %v verb. This leaked Go-internal type representations (e.g., map[foo:bar] for maps) and did not validate string values for injection characters. A crafted vault value containing \nMALICIOUS_VAR=attacker_value would have been injected verbatim into the child environment.
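The handling in the table above maps naturally onto a Go type switch. The sketch below is an approximation under stated assumptions: `formatValue`, `checkSafe`, and `errInjection` are hypothetical names, and the real parser in internal/envmap may differ in structure and error wording.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errInjection = errors.New("value contains newline, carriage return, or NUL")

// checkSafe rejects characters that could split or truncate an env entry.
func checkSafe(s string) (string, error) {
	if strings.ContainsAny(s, "\n\r\x00") {
		return "", errInjection
	}
	return s, nil
}

// formatValue converts a decoded vault value into an env-safe string.
func formatValue(v any) (string, error) {
	switch t := v.(type) {
	case nil:
		return "", nil
	case string:
		return checkSafe(t)
	case bool:
		return strconv.FormatBool(t), nil
	case float64:
		return strconv.FormatFloat(t, 'f', -1, 64), nil // plain decimal, no exponent
	case json.Number:
		return t.String(), nil
	case map[string]any, []any:
		raw, err := json.Marshal(t)
		if err != nil {
			return "", err
		}
		// json.Marshal escapes control characters, so this scan is a backstop.
		return checkSafe(string(raw))
	default:
		return "", fmt.Errorf("unsupported value type %T", v) // fail closed
	}
}

func main() {
	samples := []any{1.5e8, true, nil, "plain", "bad\nEVIL=1", map[string]any{"k": "v"}, 42}
	for _, v := range samples {
		s, err := formatValue(v)
		fmt.Printf("%#v -> %q, err=%v\n", v, s, err)
	}
}
```

The `default` branch is what distinguishes this approach from `%v` formatting: a type the function does not recognise produces an error rather than a best-effort string.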
Error-path redaction
Vault URL sanitisation: Vault addresses sometimes embed credentials as basic-auth usernames or query-string tokens (e.g. https://<credential>@<vault-host>/v1/secret). The wrapper's sanitizeURL function strips any userinfo component and any query-string parameters whose keys look credential-like before including a URL in a log line or error message. This prevents credential material in Vault addresses from appearing in application logs forwarded to a SIEM or log aggregator.
Broker error truncation: Error response bodies returned by the Vault broker are length-truncated before being propagated to the caller or logged. A Vault server configured to include diagnostic detail (token policies, secret paths, internal identifiers) in its error responses cannot cause that material to appear in wrapper logs.
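Both redaction steps can be approximated with the standard library. In this sketch only the name `sanitizeURL` comes from the text above; its body, the `credentialKeys` list, the `truncateBody` helper, the truncation marker, and the example host are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"unicode/utf8"
)

// credentialKeys is an assumed list of substrings that mark a query key as sensitive.
var credentialKeys = []string{"token", "secret", "password", "key"}

// sanitizeURL drops userinfo and credential-like query parameters.
func sanitizeURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable URL>" // never echo input that failed to parse
	}
	u.User = nil
	q := u.Query()
	for name := range q {
		lower := strings.ToLower(name)
		for _, k := range credentialKeys {
			if strings.Contains(lower, k) {
				q.Del(name)
				break
			}
		}
	}
	u.RawQuery = q.Encode()
	return u.String()
}

// truncateBody caps an error body at max bytes without splitting a UTF-8 rune.
func truncateBody(body string, max int) string {
	if max < 0 {
		max = 0
	}
	if len(body) <= max {
		return body
	}
	for max > 0 && !utf8.RuneStart(body[max]) {
		max--
	}
	return body[:max] + "...[truncated]"
}

func main() {
	fmt.Println(sanitizeURL("https://user:pw@vault.example.com/v1/secret?token=abc&version=2"))
	fmt.Println(truncateBody("permission denied: policy details follow", 17))
}
```

Returning a fixed placeholder on parse failure avoids logging the raw input, which is the one case where the credential could not be located and removed.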
PID-1 zombie reaping
The wrapper runs as PID 1 inside its container. On Linux, orphaned processes are re-parented to PID 1, so PID 1 receives SIGCHLD when any of them exits and is responsible for reaping it. Without explicit reaping, a workload that forks grandchildren (for example, a web server that forks request-handling processes) will accumulate zombie entries in the process table until the PID namespace runs out of process IDs.
The wrapper installs a SIGCHLD handler that calls waitpid(-1, WNOHANG) in a loop to reap all available children. This runs on a dedicated goroutine that never touches the secret buffers.
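In Go, a reaping loop of this shape is typically written with `signal.Notify` and `syscall.Wait4`. The sketch below assumes a Linux target, and `reapAll` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// reapAll collects every child that has already exited, without blocking,
// and returns how many it reaped.
func reapAll() int {
	n := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue
		}
		// pid == 0: children exist but none has exited yet.
		// ECHILD: there are no children at all.
		if err != nil || pid <= 0 {
			return n
		}
		n++
	}
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGCHLD)
	go func() {
		for range sigs {
			reapAll() // one signal may stand for several exited children
		}
	}()

	fmt.Println("children reaped at startup:", reapAll())
}
```

The loop matters because SIGCHLD is not queued per child: several exits can collapse into one delivery. A program that also waits on its direct child through os/exec has to coordinate with a blanket wait on -1, since the reaper can otherwise collect that child first; the sketch leaves that coordination out.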
Rotation safety
Secret rotation is triggered by sending SIGHUP to the wrapper process (or by the operator's automatic rotation controller).
Two invariants are maintained during rotation:
Fresh credential fetch: The wrapper reads a fresh Kubernetes service-account token from the pod's projected volume at rotation time rather than using the cached token acquired at startup. This ensures rotation works correctly when the SA token has been rolled since the wrapper started.
Rotation lock prevents torn reads: A mutex protects the buffer swap. The child process reads secrets by value at exec time (they are in the child's environment at fork). For long-running children that read secrets through the LD_PRELOAD interposer, the interposer holds a read lock on the buffer pointer for the duration of each getenv() call. The rotation lock prevents the buffer from being swapped while an in-flight getenv() call holds a pointer into the old buffer.
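The two invariants can be modelled with a read-write mutex. This is a simplified sketch with hypothetical names (`store`, `View`, `Rotate`): the `fetch` callback stands in for re-reading the projected service-account token and requesting fresh secrets, and buffers are plain slices rather than protected mappings.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type buffer struct{ data []byte }

// Close zeroes the buffer before dropping it.
func (b *buffer) Close() {
	for i := range b.data {
		b.data[i] = 0
	}
	b.data = nil
}

type store struct {
	mu  sync.RWMutex
	cur *buffer
}

// View runs fn while holding the read lock, so the buffer cannot be
// swapped or wiped until fn returns.
func (s *store) View(fn func(secret []byte)) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.cur != nil {
		fn(s.cur.data)
	}
}

// Rotate fetches outside the lock, then swaps and wipes under the write lock.
func (s *store) Rotate(fetch func() ([]byte, error)) error {
	fresh, err := fetch() // slow work happens before the lock is taken
	if err != nil {
		return err // the current buffer stays live on failure
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	old := s.cur
	s.cur = &buffer{data: fresh}
	if old != nil {
		old.Close() // zeroed before the lock is released
	}
	return nil
}

func main() {
	s := &store{}
	_ = s.Rotate(func() ([]byte, error) { return []byte("v1"), nil })
	err := s.Rotate(func() ([]byte, error) { return nil, errors.New("vault unreachable") })
	fmt.Println("failed rotation:", err)
	s.View(func(b []byte) { fmt.Println("still serving", len(b), "bytes") })
}
```

Fetching before taking the write lock keeps readers from stalling on network latency, and a failed fetch leaves the previous secret in place.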
Deferred control: env-strip default-deny
The cloudtaser.io/env-allowlist annotation — which would allow operators to declare an explicit allowlist of environment variable names that may reach the child process, with all others stripped — is tracked as issue #111 in the wrapper repository. It was explicitly deferred from this Epic to a cross-component release that also touches the operator (annotation parsing) and the Helm chart (default configuration). Operators who need strict env-strip today should use the eBPF enforcement layer's /proc/PID/environ kprobe, which blocks third-party reads regardless of what the child's environ contains.
Summary table
| Control | What it defends against | Shipped issue |
|---|---|---|
| memfd_secret primary / anon-mmap fallback | Root read via /proc/PID/mem, kernel modules, /dev/mem | #117 (tmpfile path fixed) |
| Guard pages (PROT_NONE perimeter) | Buffer overflow pivoting into adjacent secret region | #93 |
| Per-buffer 16-byte canary | Write-corruption of secret buffer detected on every read | #93 |
| Explicit zero on Close() | Secret material in freed pages recycled to other processes | #143, #144 |
| Explicit zero on SIGTERM/SIGINT | Secret material in RAM at process exit | #143, #144 |
| Rotation-path zeroing of prior buffer | Dual-live secret window during rotation | #135, #141 |
| prctl(PR_SET_DUMPABLE, 0) + verified | ptrace attach, /proc/PID/mem write, core dump collection | #146 |
| RLIMIT_CORE=0 (hard + soft) | Core dump generation via systemd-coredump, core_pattern, or runtime override | #115 |
| envmap format injection guard | Newline/CR/NUL injection into child environment via crafted vault values | #133 |
| Vault URL sanitisation | Credential material in log lines / SIEM forwarders | #138 |
| Broker error truncation | Vault diagnostic detail in error propagation | #145 |
| PID-1 zombie reaping | PID namespace exhaustion on fork-heavy workloads | #127 |
| Fresh SA token on SIGHUP rotation | Rotation failure when startup SA token has been rolled | #141 |
| Rotation lock on buffer swap | Torn read via LD_PRELOAD interposer during rotation | #135 |