Tier-2 Kernel Enforcement

CloudTaser classifies kernel environments into two enforcement tiers based on eBPF capability. This page documents the Tier-2 threat model, the data-leak window bounds for secondary vectors, active mitigations, and customer deployment guidance.

Research basis

This threat model was researched and validated against live kernels as part of cloudtaser-ebpf#606 (2026-06-04). The worst-case window figures below are derived from measured kernel ring-buffer latency on GKE COS nodes.

Kernel Tier Definitions

Tier-1 — CONFIG_BPF_KPROBE_OVERRIDE=y available. All attack vectors are denied synchronously before the syscall executes. Zero data-leak window.

Tier-2 — BPF LSM available, CONFIG_BPF_KPROBE_OVERRIDE absent. This is the default for most managed Kubernetes environments: GKE COS, EKS Bottlerocket, standard Debian/Ubuntu kernels.

Tier	Kernel capability	Examples	Primary enforcement
Tier-1	`CONFIG_BPF_KPROBE_OVERRIDE=y`	Custom kernels, Azure Linux 3.0+, EKS AL2023	Synchronous deny — all vectors
Tier-2	BPF LSM, no kprobe override	GKE COS, EKS Bottlerocket, Debian 12, Ubuntu 22.04	Synchronous LSM deny (primary) + ReactiveKill (secondary)

Check your kernel tier

# Tier-1: must return "CONFIG_BPF_KPROBE_OVERRIDE=y"
zcat /proc/config.gz | grep CONFIG_BPF_KPROBE_OVERRIDE

# Tier-2 LSM support
zcat /proc/config.gz | grep CONFIG_BPF_LSM

# cloudtaser CLI
cloudtaser-cli status --enforcement-tier
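The same tier decision can be sketched in a few lines of Python, assuming the kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC). The function names and the "unsupported" label are illustrative and not part of the CloudTaser CLI.

```python
import gzip
import re


def read_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config (present only with CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as fh:
        return fh.read()


def classify_tier(kernel_config):
    """Map a kernel config dump to 'tier-1', 'tier-2', or 'unsupported'."""
    # Collect every option that is built in (=y); "is not set" lines are ignored.
    enabled = set(
        re.findall(r"^(CONFIG_[A-Z0-9_]+)=y$", kernel_config, flags=re.MULTILINE)
    )
    if "CONFIG_BPF_KPROBE_OVERRIDE" in enabled:
        return "tier-1"
    if "CONFIG_BPF_LSM" in enabled:
        return "tier-2"
    return "unsupported"  # hypothetical label; this page defines only two tiers


# Inline sample so the sketch runs without touching /proc.
sample = "CONFIG_BPF_LSM=y\n# CONFIG_BPF_KPROBE_OVERRIDE is not set\n"
print(classify_tier(sample))  # tier-2
```

On a live node, `classify_tier(read_kernel_config())` would apply the same check to the running kernel. The config option alone does not prove the BPF LSM is active at boot, so treat the result as a first-pass indication.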

Tier-2 Enforcement Model

On Tier-2 kernels, CloudTaser operates at two enforcement levels.

Synchronous LSM Denial (primary enforcement)

BPF LSM hooks intercept system calls before they execute. No data is read; the attacker receives an error immediately.

Hook	Vectors blocked	Return value
`lsm_ptrace_access_check`	`process_vm_readv`, `process_vm_writev`, `ptrace(PTRACE_ATTACH)`, `ptrace(PTRACE_SEIZE)`	`-EPERM`
`lsm_file_open`	Opens of `/proc/<pid>/environ`, `/proc/<pid>/mem`, `/proc/<pid>/maps`, `/proc/<pid>/pagemap`, `/proc/<pid>/smaps`, `/proc/<pid>/stack`, `/proc/<pid>/syscall`, `/dev/mem`, `/dev/kmem`, `/proc/kcore`	`-EPERM`

Attacker calls process_vm_readv(target_pid, ...)
  → lsm_ptrace_access_check BPF hook fires BEFORE syscall executes
  → Hook checks caller against protected cgroup-array
  → Returns -EPERM
  → No memory is read
  → Event logged: VMREADV_DENIED
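The flow above can be modelled in userspace to make the decision explicit. This is a simulation, not the eBPF program: the rule used here (deny when the target is in a protected cgroup and the caller is in a different one) is an assumption about how the cgroup-array check is applied, and all names are illustrative.

```python
import errno


def simulate_ptrace_access_check(caller_cgroup_id, target_cgroup_id, protected_cgroup_ids):
    """Return 0 to allow or -EPERM to deny, following the LSM return convention."""
    target_protected = target_cgroup_id in protected_cgroup_ids
    same_cgroup = caller_cgroup_id == target_cgroup_id
    # Assumed policy: only cross-cgroup access to a protected target is denied.
    if target_protected and not same_cgroup:
        return -errno.EPERM
    return 0


protected = {42}  # hypothetical cgroup IDs standing in for the cgroup-array

# A caller in cgroup 7 targets a process in protected cgroup 42: denied.
print(simulate_ptrace_access_check(7, 42, protected))
# A process inside the protected cgroup inspects a sibling: allowed.
print(simulate_ptrace_access_check(42, 42, protected))
```

Because the decision is returned from the hook before the syscall body runs, a deny result means no bytes are copied, which is what distinguishes this path from ReactiveKill.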

Primary vectors are zero-data-leak denials

process_vm_readv and /proc/<pid>/mem are the principal memory-read attack vectors. Both are denied synchronously by BPF LSM hooks on Tier-2. The attacker receives an error; no data is transferred.

Asynchronous ReactiveKill (secondary enforcement)

Eight syscall families have no BPF LSM hook in the supported kernel matrix. On Tier-2, the agent attaches via tracepoints, detects the violation at syscall-exit, and sends SIGKILL from userspace. The syscall completes before the signal arrives.

Syscall family	Attack scenario	Mechanism
`io_uring_*`	Bypasses per-syscall eBPF hooks via submission queue	Tracepoint + SIGKILL
`userfaultfd`	Page-fault interception for controlled memory read	Tracepoint + SIGKILL
`copy_file_range`	In-kernel zero-copy between files	Tracepoint + SIGKILL
`kcmp`	Process comparison to fingerprint targets	Tracepoint + SIGKILL
`process_madvise`	Advise changes on another process's memory	Tracepoint + SIGKILL
`init_module` / `finit_module`	Kernel module load for privilege escalation	Tracepoint + SIGKILL
`setns`	Namespace escape	Tracepoint + SIGKILL
`splice` / `tee` / `sendfile` / `vmsplice`	Zero-copy data exfiltration	Tracepoint + SIGKILL
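As a compact restatement of the two tables, the sketch below maps a syscall name and kernel tier to the enforcement path described on this page. The return labels and the function name are illustrative; only the syscall names come from the tables.

```python
# Syscalls named in the synchronous LSM table above.
LSM_DENIED = {"process_vm_readv", "process_vm_writev", "ptrace"}

# Syscall families listed as ReactiveKill on Tier-2 (io_uring_* is matched by prefix).
REACTIVE_KILL = {
    "userfaultfd", "copy_file_range", "kcmp", "process_madvise",
    "init_module", "finit_module", "setns",
    "splice", "tee", "sendfile", "vmsplice",
}


def enforcement_path(syscall, tier):
    """Return 'synchronous-deny', 'reactive-kill', or 'not-listed'."""
    is_secondary = syscall.startswith("io_uring_") or syscall in REACTIVE_KILL
    if not is_secondary and syscall not in LSM_DENIED:
        return "not-listed"  # outside the vectors documented on this page
    # Tier-1 denies every listed vector synchronously; Tier-2 only the LSM-hooked ones.
    if tier == "tier-1" or not is_secondary:
        return "synchronous-deny"
    return "reactive-kill"


for name in ("process_vm_readv", "sendfile", "io_uring_enter"):
    print(name, enforcement_path(name, "tier-2"), enforcement_path(name, "tier-1"))
```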

ReactiveKill has a data-leak window

The syscall completes before SIGKILL is delivered. The window is bounded by the sum of ring-buffer flush latency, Go goroutine scheduling delay, and signal delivery time, measured at approximately 5 ms on GKE COS nodes.

Worst-Case Data-Leak Window Analysis

Theoretical maximum

At sendfile(2) throughput (~10 GB/s kernel-to-kernel), the ~5 ms ReactiveKill window bounds a single call at roughly 10 GB/s × 5 ms ≈ 50 MB transferred before the SIGKILL arrives.
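The bound is simply throughput multiplied by window length. A minimal sketch of that arithmetic, using the ~10 GB/s and ~5 ms figures quoted on this page; the function name is illustrative, and the second call uses a made-up lower throughput only to show how the bound scales.

```python
def worst_case_bytes(throughput_bytes_per_s, window_ms):
    """Upper bound on bytes moved by one syscall before SIGKILL lands."""
    # Integer arithmetic keeps the result exact: bytes/s * ms / 1000.
    return throughput_bytes_per_s * window_ms // 1000


GB = 10**9
MB = 10**6

# Figures from this page: ~10 GB/s sendfile throughput, ~5 ms window.
print(worst_case_bytes(10 * GB, 5) // MB, "MB")  # 50 MB

# Illustrative only: a path limited to 1 GB/s would be bounded at a tenth of that.
print(worst_case_bytes(1 * GB, 5) // MB, "MB")  # 5 MB
```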

Practical exposure

The theoretical maximum does not reflect realistic attack conditions:

memfd_secret makes primary vectors unreachable. Secrets in the wrapper are stored in memfd_secret-backed memory (kernel 5.14+). This memory is removed from the kernel direct map — process_vm_readv returns EIO against these pages regardless of LSM enforcement. The LSM hook is defense-in-depth on top of hardware-level isolation.
ReactiveKill vectors are secondary side-channels. The ReactiveKill-only vector families (splice, sendfile, io_uring, etc.) are handled via tracepoints and are not direct memory-read paths. They require the attacker to already hold the data in a buffer they control; exfiltrating wrapper heap requires the primary vectors first, which LSM blocks.
Heap zeroing closes the transient plaintext window. Wrapper v0.2+ zeros Go heap copies of secrets immediately after use. A successful read of wrapper heap during the fetch window returns zeroed or partial plaintext.
Seccomp-bpf blocks vector creation at pod admission. The operator injects a RuntimeDefault seccomp profile that blocks process_vm_readv, pidfd_getfd, and userfaultfd at the kernel syscall filter level for attacker pods.

Window summary

Scenario	Data-leak window	Notes
Primary vectors (`process_vm_readv`, `/proc/<pid>/mem`)	None (synchronous LSM denial)	BPF LSM hook fires before syscall
Secondary vectors (`splice`, `sendfile`, `io_uring`)	~5 ms ReactiveKill window	`memfd_secret` pages still inaccessible inside window
Theoretical worst case (no mitigations)	~50 MB per call	Not achievable in practice given `memfd_secret`, seccomp, and heap zeroing

Active Mitigations on Tier-2

Mitigation	Status	Effect
`memfd_secret` (kernel 5.14+)	Active by default	Hardware-level page hiding; `process_vm_readv` returns `EIO` against secret pages
Heap zeroing	Active (wrapper v0.2+)	Transient Go heap copies zeroed after use; reduces plaintext exposure during fetch window
Seccomp-bpf (RuntimeDefault)	Active via operator injection	Blocks `process_vm_readv`, `pidfd_getfd`, `userfaultfd` on attacker pods at admission
BPF LSM (primary enforcement)	Active on Tier-2	Synchronous denial of all primary memory-read vectors
ReactiveKill (secondary enforcement)	Active on Tier-2	~5 ms kill window for secondary vectors
Kernel upgrade to Tier-1	Optional	Restores synchronous denial for all vectors including secondary families

Customer Guidance

Default (GKE, EKS, AKS on standard kernels)

Tier-2 enforcement provides synchronous denial for all primary secret-access vectors. The ~5 ms ReactiveKill window applies only to secondary side-channel vectors, which are not direct memory-read paths. Combined with memfd_secret and heap zeroing, this provides strong practical protection for regulated workloads.

Zero-tolerance posture

For workloads requiring synchronous denial on every vector without exception:

Deploy on Tier-1 kernels (custom kernel build with CONFIG_BPF_KPROBE_OVERRIDE=y, or Azure Linux 3.0+ / EKS AL2023)
Verify the enforcement tier: cloudtaser-cli status --enforcement-tier
Confirm the agent reports enforcement_mode: full on its /status endpoint

Verify your enforcement tier

# Check agent enforcement mode
kubectl exec -n cloudtaser ds/cloudtaser-ebpf -- \
  wget -qO- http://localhost:8080/status | jq .enforcement_mode

# Expected on Tier-1: "full"
# Expected on Tier-2: "lsm"
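For automated checks, the same comparison can be scripted. The sketch below assumes the /status endpoint returns a JSON object containing an enforcement_mode field, as the jq filter above implies; any other fields in the payload are unknown here, and the function name is illustrative.

```python
import json

# Expected values taken from the comments above.
EXPECTED_MODE = {"tier-1": "full", "tier-2": "lsm"}


def tier_from_status(status_json):
    """Map the agent's reported enforcement_mode to a tier name."""
    mode = json.loads(status_json).get("enforcement_mode")
    for tier, expected in EXPECTED_MODE.items():
        if mode == expected:
            return tier
    raise ValueError(f"unrecognised enforcement_mode: {mode!r}")


# Hypothetical payload; a real response may carry additional fields.
print(tier_from_status('{"enforcement_mode": "lsm"}'))  # tier-2
```

Raising on an unrecognised value, rather than defaulting to a tier, keeps a misconfigured or outdated agent from being silently counted as compliant.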

Enforcement tier is reported in audit events

Every enforcement event logged by the agent includes the enforcement_mode field. This allows compliance teams to verify that the declared tier matches the observed behaviour in audit logs.
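A compliance check along these lines might look like the following sketch. It assumes audit events are available as decoded JSON objects; apart from enforcement_mode, the field names in the sample (such as "event") are hypothetical.

```python
def mode_mismatches(events, declared_mode):
    """Return events whose enforcement_mode is missing or differs from the declared one."""
    return [e for e in events if e.get("enforcement_mode") != declared_mode]


# Hypothetical audit records for a cluster declared as Tier-2 ("lsm").
events = [
    {"event": "VMREADV_DENIED", "enforcement_mode": "lsm"},
    {"event": "VMREADV_DENIED", "enforcement_mode": "full"},
    {"event": "VMREADV_DENIED"},  # missing field is also flagged
]

for record in mode_mismatches(events, "lsm"):
    print("mismatch:", record)
```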
