Memory Protection¶

cloudtaser stores secrets exclusively in process memory: never on disk, never in etcd, and never in the wrapper's own /proc/PID/environ. The application's environ does carry the secrets and is guarded by eBPF enforcement instead. This page documents each memory protection mechanism, how it works at the kernel level, and what it defends against.

Compliance Summary

cloudtaser layers six memory protection mechanisms that keep other software on the node, including root processes, from reading secrets in process memory. Hypervisor-level physical memory access is the one path these mechanisms do not cover. These protections satisfy the "supplementary technical measures" requirement under Schrems II (EDPB Recommendations 01/2020) and provide demonstrable controls for DORA ICT risk management and NIS2 security measures.

Overview¶

The wrapper binary applies protections in a specific order during startup:

1. Allocate memfd_secret region (or fall back to mmap + mlock)
2. Fetch secrets from EU vault over mTLS
3. Write secrets into protected memory region
4. Apply mlock, MADV_DONTDUMP, PR_SET_DUMPABLE(0)
5. Build the child's exec environment (BuildChildEnv) and verify the
   wrapper's own /proc/self/environ does not contain secret keys
6. Set up optional LD_PRELOAD interposer for glibc-dynamic apps
   (env-var delivery is the universal default for all binaries)
7. fork+exec the application process; eBPF kprobe on
   openat("/proc/<pid>/environ") protects the child's environ

Each mechanism addresses a different attack vector. Together, they form a layered defense that a root attacker must defeat simultaneously.

memfd_secret(2)¶

Kernel requirement: Linux 5.14+

memfd_secret is the strongest memory protection available in the Linux kernel. It creates an anonymous file descriptor backed by memory pages that are physically removed from the kernel's direct memory map.

How It Works¶

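// glibc provides no memfd_secret() wrapper, so invoke the syscall directly.
// Needs <sys/syscall.h>, <sys/mman.h>, and <unistd.h>.
int fd = syscall(SYS_memfd_secret, 0);
if (fd < 0) { /* handle ENOSYS: memfd_secret unavailable on this kernel */ }
ftruncate(fd, secret_size);
void *addr = mmap(NULL, secret_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
// Write secrets to addr
// The backing pages are absent from the kernel's direct map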

When memfd_secret is called, the kernel:

1. Allocates pages from the buddy allocator
2. Removes those pages from the kernel's direct map (the linear mapping of all physical memory)
3. Sets up page table entries only in the owning process's address space

What this means in practice

No kernel code path can read memfd_secret pages. Not /dev/mem. Not /proc/PID/mem. Not kernel modules. Not eBPF programs. Not kprobes. The pages simply do not exist in any kernel-accessible mapping. The only entity that can read them is the owning user-space process -- and the hypervisor, which has direct physical memory access.

What It Defends Against¶

Attack Vector	Blocked?	Why
Root reading /proc/PID/mem	Yes	Pages not in kernel direct map
Kernel module scanning memory	Yes	Pages not in kernel direct map
/dev/mem access	Yes	Pages not in kernel direct map
eBPF program reading memory	Yes	Pages not in kernel direct map
Hypervisor memory inspection	No	Physical memory access bypasses page tables

Configuration¶

env:
  - name: CLOUDTASER_REQUIRE_MEMFD_SECRET
    value: "true"   # Fail startup if memfd_secret unavailable

When CLOUDTASER_REQUIRE_MEMFD_SECRET=true, the wrapper refuses to start if memfd_secret is unavailable, for example on kernels older than 5.14. This ensures that secrets are never stored in memory that remains in the kernel's direct map.

mlock(2)¶

Kernel requirement: Any Linux

mlock pins memory pages in physical RAM, preventing the kernel from swapping them to disk.

How It Works¶

mlock(secret_addr, secret_size);

The kernel marks the pages as non-evictable. They will remain in RAM until explicitly unlocked or the process exits.
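The call above can fail, so a caller has to decide what a failure means. The following is a minimal sketch, not cloudtaser's implementation: lock_secret_region and its required flag are hypothetical names that mirror the behavior this page attributes to CLOUDTASER_REQUIRE_MLOCK.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper (illustration only).
 * Returns 0 if the region is locked, 1 if locking failed but was optional,
 * and -1 if locking failed and `required` is non-zero. */
int lock_secret_region(void *addr, size_t len, int required)
{
    assert(addr != NULL && len > 0);

    if (mlock(addr, len) == 0)
        return 0;

    int saved = errno;              /* keep errno before calling anything else */
    struct rlimit lim;
    if (getrlimit(RLIMIT_MEMLOCK, &lim) == 0)
        fprintf(stderr, "mlock failed: %s (RLIMIT_MEMLOCK soft limit: %llu bytes)\n",
                strerror(saved), (unsigned long long)lim.rlim_cur);
    else
        fprintf(stderr, "mlock failed: %s\n", strerror(saved));

    return required ? -1 : 1;
}
```

Reporting RLIMIT_MEMLOCK alongside the error is a convenience here: an unprivileged process can usually lock memory only up to that limit, so it is the first value to check when mlock fails.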

What It Defends Against¶

Attack Vector	Blocked?	Why
Swap file forensics	Yes	Pages never written to swap partition
Disk imaging after shutdown	Yes (for swap)	No swap footprint to recover
Swap-out under memory pressure	Yes	Pages stay pinned in RAM regardless of memory pressure

mlock does not hide memory from root

A root attacker can still read mlocked pages via /proc/PID/mem or ptrace. mlock only prevents disk exposure. It is a complement to memfd_secret, not a substitute.

Configuration¶

env:
  - name: CLOUDTASER_REQUIRE_MLOCK
    value: "true"   # Fail startup if mlock fails

CAP_IPC_LOCK

The wrapper container needs CAP_IPC_LOCK to lock memory beyond its RLIMIT_MEMLOCK limit. cloudtaser's Helm chart sets this capability by default. Without it, mlock can fail, and the wrapper then falls back to unlocked, swappable memory (unless CLOUDTASER_REQUIRE_MLOCK=true).

MADV_DONTDUMP¶

Kernel requirement: Linux 3.4+

MADV_DONTDUMP tells the kernel to exclude specific memory regions from core dumps.

How It Works¶

madvise(secret_addr, secret_size, MADV_DONTDUMP);

When the process crashes and the kernel generates a core dump, pages marked with MADV_DONTDUMP are omitted. The core file contains a hole where the secret region was.
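One way to confirm that the flag took effect is to look for the dd ("do not include area into core dump") entry on the VmFlags line of /proc/self/smaps. The sketch below assumes a kernel that exposes VmFlags there; the function names exclude_from_dumps and region_is_excluded_from_dumps are made up for this illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Marks a page-aligned region as excluded from core dumps. */
int exclude_from_dumps(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);
    return madvise(addr, len, MADV_DONTDUMP);
}

/* Returns 1 if the mapping containing addr carries the "dd" flag,
 * 0 if it does not, and -1 if that cannot be determined. */
int region_is_excluded_from_dumps(const void *addr)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f)
        return -1;

    unsigned long target = (unsigned long)addr, start, end;
    char line[1024], perms[5];
    int inside = 0, result = -1;

    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3) {
            /* Header line of a mapping: "start-end perms ..." */
            inside = (target >= start && target < end);
        } else if (inside && strncmp(line, "VmFlags:", 8) == 0) {
            /* Flags are two-letter codes separated by spaces. */
            result = (strstr(line, " dd ") || strstr(line, " dd\n")) ? 1 : 0;
            break;
        }
    }
    fclose(f);
    return result;
}
```

A startup self-check could call exclude_from_dumps on the secret region and then treat any result other than 1 from region_is_excluded_from_dumps as a reason to lower the reported protection level.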

What It Defends Against¶

Attack Vector	Blocked?	Why
Core dump analysis	Yes	Secret pages excluded from dump
Crash dump collection services	Yes	Secrets not in the dump file
Cloud provider crash analysis pipelines	Yes	Provider never sees secrets in crash reports

Defense in depth with PR_SET_DUMPABLE

MADV_DONTDUMP protects the specific memory region. PR_SET_DUMPABLE(0) prevents core dumps entirely. cloudtaser applies both.

PR_SET_DUMPABLE(0)¶

Kernel requirement: Any Linux

PR_SET_DUMPABLE(0) sets the process as non-dumpable, which has several security effects beyond just preventing core dumps.

How It Works¶

prctl(PR_SET_DUMPABLE, 0);

Effects¶

Effect	Description
No core dumps	Kernel will not generate core dump files for this process
Restricted /proc access	/proc/PID/{mem,maps,environ,syscall,stack} become unreadable by non-root
ptrace restriction	Blocks PTRACE_ATTACH from processes that lack CAP_SYS_PTRACE (defense in depth with eBPF)
No SIGABRT dumps	Abort signals do not produce dump files
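Because the effects listed above depend on the flag actually being cleared, it can be worth reading it back after setting it. A minimal sketch, with make_non_dumpable as a hypothetical helper name:

```c
#include <assert.h>
#include <sys/prctl.h>

/* Clears the dumpable flag and verifies the result.
 * Returns 0 on success, -1 if the flag could not be cleared. */
int make_non_dumpable(void)
{
    if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0)
        return -1;

    /* PR_GET_DUMPABLE returns the current value: 0 means not dumpable. */
    int state = prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
    assert(state >= -1);
    return state == 0 ? 0 : -1;
}
```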

Root can override PR_SET_DUMPABLE

A root attacker can re-enable dumpable via /proc/PID/coredump_filter or by loading a kernel module. This is why cloudtaser's eBPF agent monitors /proc writes and detects kernel module loading.

Wrapper environ is structurally clean (`environ_scrubbed`)¶

Kernel requirement: Any Linux

The wrapper's own /proc/self/environ does not contain secret values. This is a structural property of the fork+exec architecture, not a runtime mutation.

How It Works¶

/proc/PID/environ on Linux is a frozen snapshot of the bytes between mm->env_start and mm->env_end set at execve() time. The kernel does not update this region when a process subsequently calls os.Setenv / setenv / putenv — those calls modify the C library's environ array in heap memory, not the kernel's record of the original envp[].

The wrapper exploits this:

1. The wrapper's container is started by the operator webhook with a configuration-only env (CLOUDTASER_*, VAULT_ADDR, etc.). No secrets in envp[] at exec time.
2. After the wrapper starts, it fetches secrets from vault into Go heap memory, the protected memfd_secret region, or both. This Go-heap data does not flow back into the kernel's frozen /proc/self/environ.
3. When the wrapper builds the child's exec environment via BuildChildEnv, it constructs a fresh []string containing the secret-bearing variables and passes it directly to syscall.Exec. The secrets enter the child's /proc/PID/environ snapshot at the child's exec time, never the wrapper's.

The environ_scrubbed protection score check therefore confirms a structural property: at no point does the wrapper's own /proc/self/environ byte-range contain secret keys.
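The frozen-snapshot behavior is easy to observe from C. The sketch below is an illustration, not the wrapper's actual check: environ_snapshot_contains is a hypothetical helper that scans /proc/self/environ for a key, so a caller can see that a variable added with setenv() after startup does not appear there.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if /proc/self/environ holds an entry "key=...", 0 if it does
 * not, and -1 if the file cannot be read. */
int environ_snapshot_contains(const char *key)
{
    assert(key != NULL && *key != '\0');

    FILE *f = fopen("/proc/self/environ", "r");
    if (!f)
        return -1;

    size_t cap = 4096, len = 0, n;
    char *buf = malloc(cap);
    if (!buf) { fclose(f); return -1; }

    /* Read the whole file, always leaving room for a terminating NUL. */
    while ((n = fread(buf + len, 1, cap - len - 1, f)) > 0) {
        len += n;
        if (cap - len <= 1) {
            char *tmp = realloc(buf, cap *= 2);
            if (!tmp) { free(buf); fclose(f); return -1; }
            buf = tmp;
        }
    }
    fclose(f);
    buf[len] = '\0';

    /* Entries are NUL-separated "KEY=value" strings. */
    size_t klen = strlen(key);
    int found = 0;
    for (size_t off = 0; off < len; off += strlen(buf + off) + 1) {
        if (strncmp(buf + off, key, klen) == 0 && buf[off + klen] == '=') {
            found = 1;
            break;
        }
    }
    free(buf);
    return found;
}
```

Calling setenv("CT_DEMO_SECRET", "value", 1) and then environ_snapshot_contains("CT_DEMO_SECRET") shows the split: getenv() sees the new variable, while the /proc view still reflects only what was passed at execve() time. CT_DEMO_SECRET is a placeholder name.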

This does not protect the child's environ

Linux semantics mean the wrapper cannot scrub the child's /proc/<child_pid>/environ after exec — the kernel exposes the original env_start..env_end byte range and os.Setenv/equivalent in the child does not update it either. The protection of the child's environ from third-party reads is delegated to eBPF enforcement. The kprobe on openat("/proc/PID/environ") blocks reads from any PID in a protected pod cgroup that targets a monitored PID. See cloudtaser-ebpf#168 for the cgroup-array enforcement broadening that closed the sibling-cgroup gap.

What It Defends Against¶

Attack Vector	Blocked?	Why
Reading wrapper's `/proc/1/environ` for secrets	Yes	Wrapper exec'd without secrets in `envp[]`; secrets enter only after exec, in Go heap
Reading child's `/proc/<child>/environ`	Delegated to eBPF	kprobe on `openat` denies cross-process environ reads for all monitored PIDs
Container runtime env inspection (`docker inspect`)	Yes	Container spec contains only `CLOUDTASER_*` configuration, not secrets
Kubernetes API env inspection	N/A	Secrets never set via K8s env vars

Why this matters

Many secret injection tools set secrets as environment variables in the container spec (the env: field in the pod YAML, or envFrom against a Kubernetes Secret). Those values land in the kernel's envp[] at exec time and are then readable in /proc/PID/environ for the entire lifetime of the process — by anything with ptrace_may_access-equivalent permissions on the namespace. cloudtaser fetches secrets from EU vault directly into wrapper memory after the wrapper has already exec'd, so the wrapper's frozen envp[] snapshot never contained them. The child's snapshot does contain them (that's how the child reads them), and the eBPF layer protects it from cross-process reads.
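The delivery pattern described above, where secrets appear only in the child's envp[], can be sketched in C with fork and execve. The wrapper itself is described as using Go's syscall.Exec, so this is an analogous illustration rather than its code; run_with_env is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Launches argv[0] with exactly the environment given in envp.
 * Returns the child's exit status, or -1 on failure. */
int run_with_env(char *const argv[], char *const envp[])
{
    assert(argv != NULL && argv[0] != NULL && envp != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* envp becomes the child's exec-time environment; the parent's
         * own environment block is not modified by this call. */
        execve(argv[0], argv, envp);
        _exit(127);                 /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing argv = { "/bin/sh", "-c", "test \"$CT_DEMO_SECRET\" = example-value", NULL } with envp = { "CT_DEMO_SECRET=example-value", NULL } lets the shell confirm that the child received the variable, while the parent never had it in its own environment.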

LD_PRELOAD Interposer¶

Kernel requirement: Any Linux (glibc-dynamic applications only -- optional enhancement, not required for secret delivery)

The LD_PRELOAD interposer is an optional enhancement that solves a subtle but important problem: even when secrets are stored in memfd_secret, the application runtime may copy them to the heap when the application calls getenv(). The default fork+exec env-var delivery path works for all binaries regardless of libc; the interposer adds heap-copy elimination for glibc-linked workloads.

The Problem¶

Without interposer:
  1. Secret in memfd_secret region         [PROTECTED]
  2. App calls getenv("DB_PASSWORD")
  3. C library copies value to heap         [UNPROTECTED - on regular heap]
  4. App uses heap copy
  5. Two copies exist: memfd (safe) + heap (exposed to root)

The Solution¶

With interposer:
  1. Secret in memfd_secret region         [PROTECTED]
  2. App calls getenv("DB_PASSWORD")
  3. Interposer intercepts, returns pointer into memfd region  [PROTECTED]
  4. App uses memfd pointer directly
  5. Only one copy exists: memfd (safe)

How It Works¶

The wrapper sets LD_PRELOAD to load a shared library that interposes on getenv(). When the application calls getenv() for a secret key, the interposer:

1. Looks up the key in its memfd_secret-backed table
2. Returns a pointer directly into the memfd_secret region
3. The application uses this pointer without knowing it points to protected memory
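The lookup steps above can be sketched as a small getenv() replacement. This is an assumption-heavy illustration: secret_table, interposer_register, and the fixed table size are invented here, and how the real interposer populates its table from the memfd_secret region is not described on this page. For non-secret keys the sketch scans environ directly instead of calling back into libc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SECRETS 32              /* arbitrary size for this sketch */

extern char **environ;

/* Hypothetical table: each value pointer is expected to point into the
 * protected region populated by the wrapper. */
static struct { const char *key; char *value; } secret_table[MAX_SECRETS];
static size_t secret_count;

/* Registers a key whose value already lives in protected memory. */
int interposer_register(const char *key, char *value)
{
    assert(key != NULL && value != NULL);
    if (secret_count == MAX_SECRETS)
        return -1;
    secret_table[secret_count].key = key;
    secret_table[secret_count].value = value;
    secret_count++;
    return 0;
}

/* Same signature as libc's getenv, so a preloaded library that defines it
 * is resolved first by the dynamic linker. */
char *getenv(const char *name)
{
    for (size_t i = 0; i < secret_count; i++)
        if (strcmp(secret_table[i].key, name) == 0)
            return secret_table[i].value;   /* pointer handed out, no copy made */

    /* Not a secret: plain scan of the process environment. */
    size_t n = strlen(name);
    for (char **e = environ; e && *e; e++)
        if (strncmp(*e, name, n) == 0 && (*e)[n] == '=')
            return *e + n + 1;
    return NULL;
}
```

Built as a shared object (for example with cc -shared -fPIC) and named in LD_PRELOAD, a definition like this takes precedence over libc's getenv for dynamically linked programs, which is why the technique does not reach static binaries.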

Coverage¶

Application Type	Interposer Works?	Alternative
Dynamically linked C/C++	Yes	--
Python, Ruby, Node.js	Yes (C runtime underneath)	--
Java (JNI)	Yes	--
PostgreSQL, MySQL, Redis	Yes	--
Go (CGO_ENABLED=1)	Yes	--
Go (CGO_ENABLED=0, static)	No	cloudtaser SDK for direct memfd access
Rust (musl static)	No	cloudtaser SDK

Approximately 90% of Kubernetes workloads are dynamically linked

The interposer covers the vast majority of production workloads without any application code changes. For the remaining statically linked binaries, the cloudtaser SDK provides equivalent protection through direct memfd file descriptor access.

Fallback Behavior on Older Kernels¶

When memfd_secret is unavailable (Linux < 5.14), the wrapper falls back to a degraded but still defended posture:

Mechanism	5.14+	Pre-5.14
memfd_secret	Active	Unavailable -- falls back to anonymous mmap
mlock	Active	Active
MADV_DONTDUMP	Active	Active
PR_SET_DUMPABLE(0)	Active	Active
Wrapper environ clean (structural)	Active	Active
LD_PRELOAD interposer	Active	Active
eBPF enforcement	Active	Active
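The memfd_secret row above implies a try-then-fall-back allocation. The following sketch shows one way to structure it, assuming a hypothetical alloc_secret_region helper whose require_memfd_secret flag plays the role of CLOUDTASER_REQUIRE_MEMFD_SECRET; it is not the wrapper's actual code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447        /* assumed syscall number when headers predate 5.14 */
#endif

/* Returns a writable region of len bytes, or NULL on failure.
 * *used_memfd_secret is set to 1 if memfd_secret backs the region, else 0.
 * With require_memfd_secret non-zero there is no fallback. */
void *alloc_secret_region(size_t len, int require_memfd_secret,
                          int *used_memfd_secret)
{
    assert(len > 0 && used_memfd_secret != NULL);
    *used_memfd_secret = 0;

    int fd = (int)syscall(SYS_memfd_secret, 0);
    if (fd >= 0) {
        void *p = MAP_FAILED;
        if (ftruncate(fd, (off_t)len) == 0)
            p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                  /* the mapping keeps the pages alive */
        if (p != MAP_FAILED) {
            *used_memfd_secret = 1;
            return p;
        }
    }
    if (require_memfd_secret)
        return NULL;                /* strict mode: refuse to degrade */

    /* Fallback: anonymous private mapping, pinned in RAM where permitted. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    (void)mlock(p, len);            /* best effort in this sketch */
    return p;
}
```

Returning which path was taken lets the caller report the degraded posture instead of silently accepting it.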

Pre-5.14 kernels are vulnerable to root kernel module attacks

Without memfd_secret, a root attacker who loads a custom kernel module can scan process memory and find secrets. The eBPF agent detects module loading (global privilege escalation detection) but cannot block it from non-monitored PIDs without breaking system services.

Recommendation: Use Linux 5.14+ for production deployments. The protection score reflects memfd_secret availability with the highest single-mechanism point value (15 points).

Strict Mode¶

Set CLOUDTASER_REQUIRE_MEMFD_SECRET=true to prevent the wrapper from starting on older kernels:

env:
  - name: CLOUDTASER_REQUIRE_MEMFD_SECRET
    value: "true"

This is the recommended setting for production workloads where data sovereignty guarantees are contractually required.

Summary Table¶

Mechanism	Defends Against	Kernel	Can Root Bypass?	eBPF Backup?
memfd_secret	All software memory access	5.14+	No	N/A
mlock	Swap to disk	Any	No (but irrelevant -- root reads RAM directly)	N/A
MADV_DONTDUMP	Core dump exposure	3.4+	Yes (coredump_filter write)	Yes -- blocks /proc writes
PR_SET_DUMPABLE(0)	ptrace, /proc access, core dumps	Any	Yes (kernel module)	Yes -- blocks ptrace, detects modules
Wrapper environ clean (`environ_scrubbed`)	Structural read of wrapper's own `/proc/self/environ`	Any	N/A -- secrets never enter wrapper `envp[]` at exec time	eBPF kprobe on `openat("/proc/PID/environ")` blocks cross-process reads of the child's environ for all monitored PIDs
LD_PRELOAD	Heap copies of secrets	Any	N/A (prevents copies, not access)	N/A

