Root Attack Surface
This page analyzes the attack surface when the adversary has root access on the Kubernetes node. This is the primary threat model for cloudtaser: the cloud provider has root-equivalent access through hypervisor management, support tooling, and legal compulsion under the CLOUD Act.
Executive Summary
On Linux 5.14+ with BPF LSM enforcement active, cloudtaser blocks every known software-level attack path available to a root user on the Kubernetes node -- including the cloud provider operating under CLOUD Act compulsion. memfd_secret removes secret pages from the kernel direct map, and the BPF LSM hook lsm_ptrace_access_check synchronously denies process_vm_readv, ptrace, and /proc access against monitored processes. Enforcement requires bpf in the kernel's boot-time LSM stack; this has been verified on GKE Confidential Computing / COS (COS 6.12, cloudtaser-ebpf v0.4.54). Other managed Kubernetes platforms may support BPF LSM but have not been independently confirmed by CloudTaser -- verify per platform. The only remaining exposure is the hypervisor's physical memory view, which requires confidential computing hardware to close. On kernels without BPF LSM, these vectors are not blocked -- the read succeeds and the secret is disclosed before any reactive kill fires. See Platform Compatibility for the full matrix.
Threat Model
The Adversary
The adversary has full root access on the Kubernetes worker node. This includes:
- Cloud provider operations staff -- via hypervisor console, management plane APIs, support SSH access
- Cloud provider under legal compulsion -- CLOUD Act warrant, FISA 702 directive, National Security Letter
- An attacker who escalated to root -- through container escape, kernel exploit, or compromised node agent
The Goal
Extract secrets from a running cloudtaser-protected process. The secrets are:
- Database credentials, API keys, encryption keys
- Fetched from an EU-hosted OpenBao instance
- Stored exclusively in process memory (never on disk, never in etcd)
The Question
What can a root attacker actually do, and what does cloudtaser block?
Defense Layer 1: Wrapper (Process-Level)
The wrapper applies six memory protection mechanisms before launching the application:
| Defense | What It Does | Can Root Bypass? | Kernel Requirement |
|---|---|---|---|
| memfd_secret | Removes pages from kernel direct map | No -- hardware-level guarantee | Linux 5.14+ |
| mlock | Pins pages in RAM, prevents swap | No (but root can read RAM via other paths) | Any Linux |
| MADV_DONTDUMP | Excludes from core dumps | Yes -- can write to coredump_filter | Linux 3.4+ |
| PR_SET_DUMPABLE(0) | Blocks ptrace, restricts /proc | Yes -- can load kernel module to override | Any Linux |
| Wrapper environ clean (environ_scrubbed) | Wrapper exec'd without secrets in envp[]; /proc/self/environ snapshot frozen at exec time before vault fetch | Structural -- no secret bytes ever written to wrapper environ region | Any Linux |
| LD_PRELOAD interposer | Returns memfd pointers from getenv() for glibc-dynamic apps; env-var inheritance still works for static binaries | N/A -- prevents heap copies on the dynamic-link path | Any Linux |
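As a rough illustration of three of the mechanisms in the table (mlock, MADV_DONTDUMP, and PR_SET_DUMPABLE), the following Python sketch applies them to an anonymous buffer through ctypes. It is not the cloudtaser wrapper's implementation: the helper name harden_secret_buffer and the report keys are hypothetical, and the code assumes Linux with a libc that exports mlock and prctl.

```python
import ctypes
import mmap

# Values from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4


def harden_secret_buffer(size=mmap.PAGESIZE):
    """Allocate an anonymous private mapping and harden it (hypothetical helper).

    Returns (buffer, report); report maps each step to True if it succeeded.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)
    report = {"mlock": False, "dontdump": False, "dumpable_cleared": False}

    # mlock: pin the pages in RAM so they cannot be written to swap.
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    report["mlock"] = libc.mlock(addr, size) == 0

    # MADV_DONTDUMP: exclude the mapping from core dumps (Linux 3.4+).
    if hasattr(mmap, "MADV_DONTDUMP"):
        try:
            buf.madvise(mmap.MADV_DONTDUMP)
            report["dontdump"] = True
        except OSError:
            pass

    # PR_SET_DUMPABLE(0): mark the whole process non-dumpable, then read it back.
    if hasattr(libc, "prctl"):
        libc.prctl(PR_SET_DUMPABLE, 0, 0, 0, 0)
        report["dumpable_cleared"] = libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 0

    return buf, report
```

The report is returned rather than raised as an exception because mlock can fail under a low RLIMIT_MEMLOCK; a caller would decide whether a partial result is acceptable.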
memfd_secret is the critical defense
On Linux 5.14+, memfd_secret provides a hardware-backed guarantee that root cannot read secret pages through any software mechanism. Every other defense can theoretically be bypassed by root, but memfd_secret cannot -- because the pages are physically absent from the kernel's memory map.
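Because a 5.14+ version number alone does not prove the feature is usable (the kernel may have it compiled out or disabled), a deployment check can probe the syscall directly. The sketch below is an assumption-laden illustration, not cloudtaser code: it assumes syscall number 447 (the value on x86_64 and arm64), and the helper names are hypothetical.

```python
import ctypes
import os

# Assumption: 447 is the memfd_secret syscall number on x86_64 and arm64.
SYS_MEMFD_SECRET = 447


def _raw_syscall(number, flags):
    """Invoke a two-argument syscall through libc's syscall(2) wrapper."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    return libc.syscall(ctypes.c_long(number), ctypes.c_uint(flags))


def memfd_secret_available(syscall=_raw_syscall):
    """Return True if the kernel hands out a memfd_secret file descriptor."""
    fd = syscall(SYS_MEMFD_SECRET, 0)
    if fd < 0:
        # Typically ENOSYS: kernel too old, or secret memory not enabled.
        return False
    os.close(fd)
    return True
```

The syscall parameter is injectable only so the logic can be exercised without touching the real kernel interface; calling memfd_secret_available() with no arguments performs the actual probe.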
Defense Layer 2: eBPF Agent (Enforcement)
Even with memfd_secret, a root attacker may attempt secondary attacks. The eBPF agent blocks these in real time.
Secret Exfiltration Attempts
| Attack | Syscall | Enforcement | Blocked? |
|---|---|---|---|
| Write secret to file | write, writev | kprobe block (content matching) | Yes |
| Send secret over network | sendto, sendmsg | kprobe block (content matching) | Yes |
| Zero-copy exfiltration | sendfile, splice, tee, vmsplice | kprobe block | Yes |
| Async I/O bypass | io_uring_setup | kprobe block | Yes |
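The idea behind content matching can be modeled in userspace as a verbatim search of the outgoing buffer for known secret values. This is only a conceptual sketch: the real check runs inside the kernel in eBPF, its matching strategy is not described on this page, and the function names below are hypothetical.

```python
def find_leaked_secret(payload, secrets):
    """Return the first secret that appears verbatim in payload, else None."""
    for secret in secrets:
        # Skip empty entries: an empty byte string matches every payload.
        if secret and secret in payload:
            return secret
    return None


def allow_write(payload, secrets):
    """Model of the block decision: allow only if no secret is present."""
    return find_leaked_secret(payload, secrets) is None
```

For example, allow_write(b"password=hunter2\n", [b"hunter2"]) returns False, while a payload that contains none of the secrets is allowed through.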
Process Memory Access Attempts
| Attack | Syscall / Path | Enforcement | Blocked? |
|---|---|---|---|
| Read /proc/PID/mem | openat | kprobe block | Yes |
| Read /proc/PID/environ | openat | kprobe block (all monitored PIDs in cgroup-array, including child PIDs in sibling cgroups) | Yes |
| Cross-process read | process_vm_readv | BPF LSM hook (lsm_ptrace_access_check) | Blocked where BPF LSM active (verified on GKE CC/COS); not blocked where kernel lacks bpf in LSM stack -- read succeeds, reactive kill fires after disclosure |
| Cross-process write | process_vm_writev | BPF LSM hook (lsm_ptrace_access_check) | Blocked where BPF LSM active (verified on GKE CC/COS); not blocked otherwise -- write succeeds, reactive kill fires after disclosure |
| Debug attach | ptrace ATTACH/SEIZE | BPF LSM hook (lsm_ptrace_access_check) | Blocked where BPF LSM active (verified on GKE CC/COS); not blocked otherwise -- attach succeeds, reactive kill fires after disclosure |
| Performance sampling | perf_event_open | kprobe block | Yes |
| Page fault interception | userfaultfd | kprobe block | Yes |
| Raw memory device | /dev/mem, /dev/kmem, /proc/kcore | kprobe block | Yes |
Information Disclosure Attempts
| Attack | Path | Enforcement | Blocked? |
|---|---|---|---|
| Memory layout | /proc/PID/maps, pagemap, smaps | kprobe block | Yes |
| Register values | /proc/PID/syscall | kprobe block | Yes |
| Kernel stack trace | /proc/PID/stack | kprobe block | Yes |
| Process metadata | /proc/PID/cmdline, status, wchan | kprobe block | Yes |
| Re-enable core dumps | /proc/PID/coredump_filter write | kprobe block | Yes |
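The /proc rules in the tables above amount to a lookup on the target PID and the file name. The following Python model of that decision is illustrative only: the actual enforcement happens in kernel-side eBPF, and the names PROTECTED_PROC_FILES and is_blocked_open are hypothetical.

```python
import re

# File names taken from the tables above.
PROTECTED_PROC_FILES = frozenset({
    "mem", "environ",                    # process memory access
    "maps", "pagemap", "smaps",          # memory layout
    "syscall", "stack",                  # registers and kernel stack
    "cmdline", "status", "wchan",        # process metadata
    "coredump_filter",                   # re-enabling core dumps
})

_PROC_PATH = re.compile(r"^/proc/(\d+)/([^/]+)$")


def is_blocked_open(path, monitored_pids):
    """Return True if opening path would be denied under this model.

    Simplification: the open mode is ignored, although the table lists
    coredump_filter as blocked on write.
    """
    match = _PROC_PATH.match(path)
    if match is None:
        return False
    pid, name = int(match.group(1)), match.group(2)
    return pid in monitored_pids and name in PROTECTED_PROC_FILES
```

With monitored_pids = {1234}, is_blocked_open("/proc/1234/maps", monitored_pids) is True, while the same file under an unmonitored PID is not blocked.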
Privilege Escalation Attempts
| Attack | Syscall | Enforcement | Blocked? |
|---|---|---|---|
| Load kernel module | init_module, finit_module | Global detection + monitored PID block | Detected / Blocked |
| Load eBPF program | bpf(BPF_PROG_LOAD) | Global detection + monitored PID block | Detected / Blocked |
Defense Layer 3: Reactive Kill (Fallback)
On kernels where BPF LSM is not active in the boot-time LSM stack (i.e., bpf is absent from lsm=), the eBPF agent falls back to reactive enforcement for process-memory-access vectors (process_vm_readv, ptrace, /proc/PID/mem):
- The syscall executes (tracepoint fires on exit)
- The eBPF program detects the violation
- SIGKILL is sent to the offending process immediately
Reactive kill does NOT prevent secret disclosure
On kernels without BPF LSM, the process_vm_readv / ptrace / /proc/PID/mem call completes and returns the secret before the reactive kill fires. The attacker already has the data. CloudTaser detects the access and terminates the offending process after the fact, but this does not prevent the extraction. For synchronous protection, use a kernel with bpf in the boot-time LSM stack (verified on GKE Confidential Computing / COS). See Platform Compatibility.
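Whether a node falls into the synchronous or the reactive case can be checked by reading the kernel's active LSM list, which securityfs exposes as a comma-separated string at /sys/kernel/security/lsm. The sketch below is a minimal check under that assumption; the function names are hypothetical.

```python
from pathlib import Path

LSM_LIST_PATH = Path("/sys/kernel/security/lsm")


def bpf_lsm_active(lsm_list):
    """Return True if 'bpf' is one of the comma-separated active LSMs."""
    return "bpf" in [name.strip() for name in lsm_list.split(",")]


def read_lsm_list(path=LSM_LIST_PATH):
    """Read the active LSM list; return '' if securityfs is unavailable."""
    try:
        return path.read_text()
    except OSError:
        return ""
```

On a node, bpf_lsm_active(read_lsm_list()) returning False indicates that only reactive enforcement is possible for the process-memory-access vectors.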
Known Limitations
Root Loading a Kernel Module (Pre-5.14 Kernels)
The fundamental limitation on pre-5.14 kernels
A root attacker can load a custom kernel module from a non-monitored PID. The module runs with full kernel privileges and can scan all process memory.
On pre-5.14 kernels:
- cloudtaser detects the module load via global privilege escalation detection
- cloudtaser logs the event and can alert operators
- cloudtaser cannot block module loads from non-monitored PIDs without breaking system services (kubelet, kube-proxy, container runtime all load modules)
- The loaded module can scan process memory because memfd_secret is unavailable
On 5.14+ kernels:
- The same module load is detected
- But the module cannot read memfd_secret pages because they are physically absent from the kernel direct map
- The attack fails at the hardware level
Recommendation: Always target Linux 5.14+ for production deployments.
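The 5.14 threshold can be enforced as a simple pre-deployment gate on the kernel release string. This is a minimal sketch; kernel_at_least is a hypothetical helper, and it compares only the major and minor numbers reported by the kernel.

```python
import platform
import re


def kernel_at_least(release, minimum=(5, 14)):
    """Return True if a release string such as '6.12.0' meets the minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= minimum


if __name__ == "__main__":
    # platform.release() returns the running kernel release on Linux.
    print(kernel_at_least(platform.release()))
```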
Hibernate / Suspend-to-Disk
Full RAM contents are written to disk during hibernate. This is a kernel-level operation that:
- Bypasses mlock (pages are force-written to the hibernate image)
- Bypasses eBPF (no syscall to intercept)
- Bypasses memfd_secret (the kernel reads physical pages directly for the hibernate image)
Disable hibernate on cloudtaser nodes
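To audit a node, note that /sys/power/state lists the sleep states the kernel supports, and the token disk corresponds to hibernation. The sketch below only detects the capability; it does not disable it, and the function names are hypothetical.

```python
from pathlib import Path

POWER_STATE_PATH = Path("/sys/power/state")


def hibernation_supported(power_state):
    """Return True if the space-separated state list includes 'disk'."""
    return "disk" in power_state.split()


def read_power_state(path=POWER_STATE_PATH):
    """Read the supported sleep states; return '' if the file is absent."""
    try:
        return path.read_text()
    except OSError:
        return ""
```

A node where hibernation_supported(read_power_state()) returns True still offers suspend-to-disk and should be reviewed.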
io_uring for Application Performance
cloudtaser blocks io_uring_setup() for monitored processes because io_uring's submission queue bypasses per-syscall eBPF inspection. Applications that depend on io_uring for performance (high-throughput databases, network proxies) must use standard syscalls.
This is an intentional trade-off: security over performance for secret-handling workloads.
Confidential Computing (Not Yet Integrated)
The hypervisor has physical access to VM memory. On standard VMs, this means:
- The cloud provider can inspect memory through the hypervisor
- DMA attacks from host devices can read VM memory
- Live migration exposes memory contents
These attacks require confidential computing hardware (AMD SEV-SNP, Intel TDX, ARM CCA) to prevent. cloudtaser's software protections are necessary but not sufficient against a hypervisor-level adversary. Confidential computing integration is on the roadmap.
Attack Tree Summary
The following table shows what a root attacker can achieve at each kernel level:
| Attack Path | Pre-5.14, No eBPF | Pre-5.14, With eBPF | 5.14+, With eBPF | 5.14+, eBPF + CC |
|---|---|---|---|---|
| /proc/PID/mem | Read secrets | Blocked | Blocked + memfd | Blocked + memfd + encrypted VM |
| /proc/PID/environ (child) | Read secrets | Blocked by eBPF kprobe | Blocked + memfd | Blocked + memfd + encrypted VM |
| ptrace | Read secrets | Blocked | Blocked + memfd | Blocked + memfd + encrypted VM |
| process_vm_readv | Read secrets | Blocked (BPF LSM) or read succeeds (no LSM; reactive kill after disclosure) | Blocked (BPF LSM) + memfd | Blocked (BPF LSM) + memfd + encrypted VM |
| Kernel module scan | Read secrets | Detected (not blocked) | Detected, but memfd immune | Fully blocked |
| /dev/mem | Read secrets | Blocked | Blocked + memfd | Blocked + memfd + encrypted VM |
| Core dump | Read secrets | MADV_DONTDUMP | MADV_DONTDUMP + memfd | MADV_DONTDUMP + memfd + encrypted VM |
| Swap forensics | mlock protects | mlock protects | mlock protects | mlock + encrypted VM |
| Hypervisor inspect | Read secrets | Read secrets | Read secrets | Encrypted |
Read the table left to right
Each column adds a defense layer. The rightmost column (5.14+ with eBPF and confidential computing) has no remaining attack paths. The practical recommendation today is the third column: Linux 5.14+ with eBPF enforcement, which blocks all software-level attacks.
Protection Guarantees
Enforcement depends on kernel capabilities
The guarantees below assume BPF LSM is active in the kernel's boot-time LSM stack (lsm=...,bpf). This has been verified on GKE Confidential Computing / COS (COS 6.12 with cloudtaser-ebpf v0.4.54). Other managed Kubernetes platforms (EKS, AKS) may support BPF LSM but have not been independently confirmed by CloudTaser -- verify on your platform before relying on synchronous blocking. On kernels without BPF LSM, process-memory-access vectors (process_vm_readv, ptrace, /proc/PID/mem) are not blocked -- the read succeeds and the secret is disclosed. CloudTaser detects the access and terminates the offending process after the fact, but this does not prevent extraction. See Platform Compatibility for the full enforcement matrix.
On Linux 5.14+ with BPF LSM active (recommended):
- No software-level mechanism can read secrets from a cloudtaser-protected process -- memfd_secret removes pages from the kernel direct map, and the BPF LSM hook (lsm_ptrace_access_check) returns -EPERM synchronously for process_vm_readv, ptrace, and /proc/PID/mem against monitored PIDs
- The only remaining exposure is the hypervisor's physical memory view
- This matches the security posture of on-premises hardware (where you trust the physical host)
On Linux 5.14+ without BPF LSM (detect-only mode -- NOT protected against process-memory reads):
- memfd_secret still protects secret pages at the hardware level -- they are absent from the kernel direct map regardless of enforcement mode
- process_vm_readv, ptrace, and /proc/PID/mem access attempts are not blocked -- the read completes and the secret is returned to the attacker before any reactive action fires
- CloudTaser detects the access via tracepoint and terminates the offending process with SIGKILL, but this is post-hoc -- it does not prevent the initial disclosure
- For synchronous protection of process-memory-access vectors, use a kernel with BPF LSM active (e.g., GKE Confidential Computing / COS)
On pre-5.14 kernels with eBPF enforcement:
- All standard attack vectors are blocked
- A determined root attacker who loads a custom kernel module can scan memory
- This attack is detected and logged but not prevented
- mlock, MADV_DONTDUMP, and environment scrubbing remain effective
Without eBPF enforcement:
- Memory protections (memfd_secret, mlock, MADV_DONTDUMP, PR_SET_DUMPABLE) are still active
- No runtime monitoring of attack attempts
- Root can use /proc, ptrace, and process_vm_readv (blocked only by PR_SET_DUMPABLE, which root can override)
- Not recommended for production
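The four cases in this section can be summarized as a small decision function. The sketch below simply mirrors the headings on this page; the Posture enum and classify_posture are hypothetical names, and the three boolean inputs are assumed to come from separate node checks.

```python
from enum import Enum


class Posture(Enum):
    SYNCHRONOUS = "Linux 5.14+ with BPF LSM active (recommended)"
    DETECT_ONLY = "Linux 5.14+ without BPF LSM (detect-only)"
    PRE_5_14_EBPF = "pre-5.14 kernel with eBPF enforcement"
    NO_EBPF = "without eBPF enforcement (not recommended for production)"


def classify_posture(kernel_5_14_plus, ebpf_agent, bpf_lsm):
    """Map node capabilities to one of the four cases in this section."""
    if not ebpf_agent:
        return Posture.NO_EBPF
    if not kernel_5_14_plus:
        # This page does not split the pre-5.14 case by BPF LSM status.
        return Posture.PRE_5_14_EBPF
    return Posture.SYNCHRONOUS if bpf_lsm else Posture.DETECT_ONLY
```

For instance, classify_posture(True, True, False) yields Posture.DETECT_ONLY, the case in which process-memory reads are detected but not prevented.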