Kernel Compatibility

cloudtaser's eBPF agent provides three layers of runtime enforcement, each with different kernel requirements. The agent automatically detects kernel capabilities at startup and uses the strongest available mechanism.

Summary

cloudtaser's eBPF agent supports full synchronous blocking on the current default node images of the major managed Kubernetes services (GKE, EKS, AKS). The most common gaps are RHEL 8 / OpenShift on RHEL 8 and Ubuntu 20.04 with the generic kernel, both of which fall back to reactive kill mode. Both modes provide complete audit trails for compliance. Upgrading to RHEL 9 or Ubuntu 22.04 enables synchronous blocking.

Enforcement Layers

Layer	Mechanism	Kernel Requirement	Guarantee
Synchronous block	Kprobes + `bpf_override_return()`	`CONFIG_BPF_KPROBE_OVERRIDE=y`	Syscall returns `-EACCES` before any data is read
Reactive kill	Agent sends SIGKILL on detection	Any BPF-capable kernel (4.15+)	Process terminated (small race window)
Detection + audit	Tracepoints	Any BPF-capable kernel (4.15+)	Event logged with full context

When kprobe override is available, `cat /proc/<pid>/environ` returns "Permission denied" with zero data leakage. When it is not available, the agent falls back to reactive kill (SIGKILL after detection). In both cases the access attempt is logged.
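The two outcomes described above can be told apart from a test shell. The following is a minimal sketch and not part of cloudtaser: `probe_environ_read` is a hypothetical helper, and it assumes that in reactive mode the SIGKILL hits the reading process itself (`cat` here), which a shell reports as exit status 137 (128 + 9).

```shell
# Hypothetical helper (not shipped with cloudtaser): classify what happens
# when a process tries to read an environ file.
#   $1 = path to read, e.g. /proc/<pid>/environ of a protected process
probe_environ_read() {
  # Capture stderr only; the file contents themselves are discarded.
  err=$(LC_ALL=C cat "$1" 2>&1 >/dev/null)
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "readable"        # the read succeeded; no enforcement applied
  elif [ "$status" -eq 137 ]; then
    echo "killed"          # reader got SIGKILL (assumed: reactive mode)
  else
    case "$err" in
      *"Permission denied"*) echo "denied" ;;       # consistent with synchronous block
      *)                     echo "error: $err" ;;  # e.g. no such process
    esac
  fi
}
```

Run it only against a test workload, for example `probe_environ_read /proc/1234/environ` where 1234 is a hypothetical PID, because on a reactive-mode node the probing process is expected to be terminated.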

Kernel Support Matrix

Per-distro kprobe-override capability

Use this as your architect-review checklist. For each node OS in your cluster, look up the version and confirm the enforcement mode you will get. Where a cell is marked ?, verify on a live node using the commands below.

Distro / Node OS	Version	`CONFIG_BPF_KPROBE_OVERRIDE`	Enforcement mode	Notes
Container-Optimized OS (GCP)	all LTS milestones (m101+)	yes	synchronous	GKE default. Covers >95% of production GKE
Bottlerocket (AWS / on-prem)	1.19+	yes	synchronous	EKS + Karpenter default on recent AMIs
Ubuntu (generic kernel)	22.04+	yes	synchronous	AKS default; also common on self-managed EKS/GKE
Ubuntu (generic kernel)	20.04	no	reactive	Focal ships the option disabled in the stock generic kernel. Upgrade to 22.04 for synchronous
Ubuntu (hwe kernel)	20.04 + `linux-generic-hwe-20.04`	yes	synchronous	Enabling the HWE stack pulls a 5.15+ kernel with the option
Debian	12 (Bookworm)	yes	synchronous	Kernel 6.1+
Debian	11 (Bullseye)	yes	synchronous	Kernel 5.10
Amazon Linux 2	default (kernel 4.14)	no	reactive	Upgrade to AL2023 or pin the 5.10 kernel variant for synchronous
Amazon Linux 2	5.10 kernel variant	yes	synchronous	`amazon-linux-extras install kernel-5.10`
Amazon Linux 2023	all	yes	synchronous	EKS-optimized AL2023 AMIs
RHEL 9 / Rocky 9 / Alma 9 / CentOS Stream 9	all	yes	synchronous	Kernel 5.14+
RHEL 8 / Rocky 8 / Alma 8 / CentOS Stream 8	all	no	reactive	Red Hat explicitly disabled the option in RHEL 8's kernel config. Most common enterprise gap. Fixed by upgrading to RHEL 9 / OpenShift on 9
RHEL 7 / CentOS 7	all	no	reactive	EOL June 2024. Kernel 3.10 is too old regardless
SUSE SLES 15	SP4 and earlier	? (verify)	likely reactive	SUSE historically disables the option in stock kernel builds. Confirm on your node before ruling it in
SUSE SLES 15	SP5+	? (verify)	may be synchronous	Some SP5 rebases re-enabled the option; check on a live node
Oracle Linux (UEK)	R6 / R7	? (verify)	varies	UEK rebases change kernel config; check each UEK release
Flatcar Container Linux	current stable	yes	synchronous	Tracks upstream mainline
Fedora CoreOS	current stable	yes	synchronous	Tracks Fedora kernel
Talos Linux	1.5+	yes	synchronous	Kernel 6.1+
Alpine (node OS)	all	no	reactive	Alpine's `linux-lts` kernel ships without `CONFIG_BPF_KPROBE_OVERRIDE`. Note that Alpine is uncommon as a K8s node OS (as opposed to a container base image)
Custom / hardened kernels (LKRG, grsecurity, strict CI builds)	--	? (verify)	varies	Security-focused distros often strip the option. Always verify the config

Two most common enterprise gaps

RHEL 8 / OpenShift on RHEL 8 -- Red Hat explicitly disables `CONFIG_BPF_KPROBE_OVERRIDE` in their kernel config. Fixed by upgrading to RHEL 9.
Ubuntu 20.04 generic -- shipped the option disabled. Fixed by upgrading to 22.04, or by installing the 20.04 HWE stack (`apt install linux-generic-hwe-20.04`).

Both fall back to reactive kill + full audit logging. The compliance posture (audit trail) is identical; the difference is whether the read is blocked before the syscall returns or the reading process is killed after it returns.
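As a rough first pass, the two gaps above can be flagged from a node inventory. The sketch below assumes tab-separated input of node name, kernel version, and OS image (the shape produced by the `kubectl get nodes` jsonpath command in the How to Check section). `flag_known_gaps` is a hypothetical helper, and the `osImage` patterns are assumptions about how each distro reports itself; the kernel config on the node remains the ground truth.

```shell
# Hypothetical helper: read "name<TAB>kernel<TAB>osImage" lines on stdin and
# print "name<TAB>hint" for the two common enterprise gaps.
flag_known_gaps() {
  tab=$(printf '\t')
  while IFS="$tab" read -r name kernel os_image; do
    case "$os_image" in
      "Red Hat Enterprise Linux 8"*|"Rocky Linux 8"*|"AlmaLinux 8"*|"CentOS Stream 8"*)
        hint="likely reactive (RHEL 8 family)" ;;
      "Ubuntu 20.04"*)
        case "$kernel" in
          # 5.4 is the stock generic kernel on 20.04.
          5.4.*) hint="likely reactive (Ubuntu 20.04 generic kernel)" ;;
          # Anything else may be the HWE stack; confirm on the node.
          *)     hint="verify (Ubuntu 20.04, non-generic kernel)" ;;
        esac ;;
      *)
        hint="see per-distro table" ;;
    esac
    printf '%s\t%s\n' "$name" "$hint"
  done
}
```

Treat the output as a shortlist of nodes to check, not as a verdict: a node flagged here should still be confirmed with one of the host-side checks.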

Managed Kubernetes Services

Service	Default Node OS	Kprobe Override
GKE	Container-Optimized OS / Ubuntu	Supported
EKS	Amazon Linux 2 / 2023	Supported on AL2023 and on the AL2 5.10 kernel variant; AL2 on kernel 4.14 falls back to reactive kill
AKS	Ubuntu	Supported
OpenShift (RHEL 8)	RHEL 8	Not supported -- reactive kill fallback
OpenShift (RHEL 9)	RHEL 9	Supported
RKE2 / k3s	Depends on host OS	Check host kernel config

How to Check

Check the kernel config (before or after installing cloudtaser)

On each node OS you plan to run, confirm what the kernel actually exposes. The distro table above is a planning hint; the kernel config on the running node is ground truth.

Ubuntu / Debian (/boot/config)

grep CONFIG_BPF_KPROBE_OVERRIDE /boot/config-$(uname -r)

Expected for synchronous: `CONFIG_BPF_KPROBE_OVERRIDE=y`. Either `# CONFIG_BPF_KPROBE_OVERRIDE is not set` or empty output → reactive mode. If the file is missing, the result is inconclusive; use one of the other checks.

Any kernel exposing /proc/config.gz

zcat /proc/config.gz 2>/dev/null | grep CONFIG_BPF_KPROBE_OVERRIDE

Whether this file exists varies by distro and image: it is present only when the kernel was built with `CONFIG_IKCONFIG_PROC`. Empty output can therefore mean that the file is absent rather than that the option is disabled.

When neither of the above works

# Node has neither /boot/config-* nor /proc/config.gz
# Fall back to a runtime probe from a privileged pod
# The pattern matches the config line when bpftool can read it, and the
# bpf_override_return helper if bpftool lists it for kprobe programs
bpftool feature probe kernel 2>/dev/null | grep -iE 'kprobe_override|override_return'

Requires bpftool in the probing pod (ship via a debug image) and a privileged pod with host PID access.

Inventory across the cluster

# List every node and its kernel version -- useful for heterogeneous clusters
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\t"}{.status.nodeInfo.osImage}{"\n"}{end}'

Pair each row against the per-distro table above. Any row you cannot map → run one of the host-side checks.
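The host-side config checks can be combined into one script that distinguishes "option disabled" from "no config source readable". This is a minimal sketch under that assumption; the function names are hypothetical and nothing here is shipped with cloudtaser.

```shell
# Hypothetical helpers (not part of cloudtaser).

# Read kernel config text on stdin and print the enforcement mode it implies.
classify_kprobe_override() {
  if grep -q '^CONFIG_BPF_KPROBE_OVERRIDE=y$'; then
    echo "synchronous"
  else
    echo "reactive"
  fi
}

# Print the running kernel's config from whichever source exists;
# return non-zero if neither source is readable.
read_kernel_config() {
  if [ -r "/boot/config-$(uname -r)" ]; then
    cat "/boot/config-$(uname -r)"
  elif [ -r /proc/config.gz ]; then
    zcat /proc/config.gz
  else
    return 1
  fi
}

# Combine the two. "unknown" means no config source was readable, which is
# not the same as the option being disabled.
expected_enforcement_mode() {
  if config=$(read_kernel_config); then
    printf '%s\n' "$config" | classify_kprobe_override
  else
    echo "unknown"
  fi
}
```

Calling `expected_enforcement_mode` on a node prints `synchronous`, `reactive`, or `unknown`; an `unknown` result is the cue to fall back to the runtime probe.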

Check what YOUR cluster is actually doing

Once cloudtaser is deployed, verify the enforcement mode is what you expect from the config check above -- don't infer from the distro alone.

Today (shipped) -- inspect the eBPF agent startup log:

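# With a label selector, kubectl logs returns only the last 10 lines per pod
# by default; --tail=-1 reads the full log so the startup line is not missed
kubectl logs -n cloudtaser-system -l app.kubernetes.io/name=cloudtaser-ebpf --tail=-1 \
  | grep -E '"msg":"eBPF agent started"'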

Expected output:

{"level":"INFO","msg":"eBPF agent started","enforce_mode":true,"kprobes_active":true}

kprobes_active: true → synchronous blocking active, zero data leakage pre-return.
kprobes_active: false → tracepoint detection + reactive SIGKILL; audit logged.

When kprobes fail to load (distro in reactive column, or an unexpected config gap), the agent logs a warning first and keeps running:

{"level":"WARN","msg":"kprobe programs failed to load, retrying without enforcement","error":"...bpf_override_return..."}
{"level":"INFO","msg":"eBPF agent started","enforce_mode":true,"kprobes_active":false}
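The log lines above can be reduced to a single mode string. The sketch below assumes the JSON field layout shown in the sample output and uses a hypothetical helper name, `mode_from_log`; it reads one agent's log on stdin and reports the mode from the most recent startup line.

```shell
# Hypothetical helper: read one agent's log on stdin and print
# "synchronous", "reactive", or "unknown".
mode_from_log() {
  # Keep only the most recent startup line; WARN lines are ignored.
  started=$(grep -F '"msg":"eBPF agent started"' | tail -n 1)
  case "$started" in
    *'"kprobes_active":true'*)  echo "synchronous" ;;
    *'"kprobes_active":false'*) echo "reactive" ;;
    *)                          echo "unknown" ;;   # no startup line found
  esac
}
```

A typical invocation would pipe a single pod's log into it, for example `kubectl logs -n cloudtaser-system <pod> | mode_from_log`.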

In a heterogeneous cluster (e.g. RHEL 9 + RHEL 8 mixed node pools), expect the log to differ per node. Aggregate with:

kubectl get pods -n cloudtaser-system -l app.kubernetes.io/name=cloudtaser-ebpf \
  -o jsonpath='{range .items[*]}{.spec.nodeName}{"\t"}{.metadata.name}{"\n"}{end}' \
  | while read -r node pod; do
      # Extract the kprobes_active flag from this pod's startup log line
      mode=$(kubectl logs -n cloudtaser-system "$pod" \
        | grep -oE '"kprobes_active":(true|false)' | head -1)
      # Print "unknown" when the startup line is not in the log
      echo "$node ${mode:-unknown}"
    done
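For larger clusters, the per-node lines printed by the loop above can be rolled up into counts. This is a small sketch with a hypothetical helper name, `summarize_modes`; it assumes each input line contains the `"kprobes_active":true` or `"kprobes_active":false` token emitted by that loop and counts anything else as unknown.

```shell
# Hypothetical helper: count nodes per enforcement mode from lines such as
#   node-a "kprobes_active":true
summarize_modes() {
  awk '
    /"kprobes_active":true/  { nsync++;  next }
    /"kprobes_active":false/ { nreact++; next }
                             { nunk++ }
    END {
      printf "synchronous=%d reactive=%d unknown=%d\n", nsync, nreact, nunk
    }
  '
}
```

Piping the aggregation loop into `summarize_modes` gives a one-line overview that is easy to compare between rollouts.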

Roadmap -- Prometheus gauge:

A cloudtaser_ebpf_enforcement_mode{mode="synchronous|reactive", node="..."} gauge is on the roadmap (see Trust Chain observability metrics) and will make this scriptable without parsing logs. Until that lands, use the log grep above.

What to do if a node is in reactive mode

A node in reactive mode is not automatically a blocker. Choose one of the following based on your threat model:

Upgrade the node OS to a row with yes in the per-distro table. Typical paths: Ubuntu 20.04 → 22.04; RHEL 8 → RHEL 9; Amazon Linux 2 (default) → AL2023 or the 5.10 kernel variant.
Move workloads to a separate node pool that runs a synchronous-capable distro (e.g. a GKE COS pool, Bottlerocket pool, RHEL 9 pool), and use a taint plus a nodeSelector to schedule your cloudtaser-protected workloads onto it.
Accept reactive mode. For GDPR / NIS2 / DORA compliance what matters is the audit trail, and reactive mode fully delivers that. See Compliance Implications below.
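The separate-node-pool option above can be sketched with standard `kubectl` commands. Everything named here is an assumption for illustration: the taint key `cloudtaser.example/sync-only`, the node label, the namespace, and the deployment name are hypothetical, and the `KUBECTL` variable exists only so the commands can be previewed by setting it to `echo`.

```shell
# Set KUBECTL=echo to preview the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# Taint every node matching a label selector so that only pods with a
# matching toleration are scheduled there.
#   $1 = node label selector, e.g. "pool=sync" (hypothetical label)
taint_sync_pool() {
  $KUBECTL taint nodes -l "$1" cloudtaser.example/sync-only=true:NoSchedule --overwrite
}

# Pin a deployment to that pool with a nodeSelector plus the toleration.
#   $1 = namespace, $2 = deployment, $3 = node label key, $4 = node label value
pin_deployment() {
  patch=$(printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"},"tolerations":[{"key":"cloudtaser.example/sync-only","operator":"Equal","value":"true","effect":"NoSchedule"}]}}}}' "$3" "$4")
  $KUBECTL patch deployment "$2" -n "$1" --type merge -p "$patch"
}
```

For example, `taint_sync_pool pool=sync` followed by `pin_deployment payments api pool sync` would taint the pool and move a hypothetical `api` deployment onto it. A merge patch replaces the whole `tolerations` list, so include any tolerations the workload already relies on.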

Compliance Implications

For GDPR, NIS2, and DORA compliance, what matters is the audit trail -- proof that access attempts are detected and recorded. Both enforcement modes (synchronous block and reactive kill) provide this.

The difference is operational:

	Synchronous Block	Reactive Kill
Data leaked?	No	Possible (small race window)
Process killed?	No (syscall fails cleanly)	Yes (SIGKILL)
Audit event logged?	Yes	Yes
Compliance requirement met?	Yes	Yes

Reactive kill is still effective in practice

The reactive kill race window is very small: microseconds between detection and SIGKILL. An attacker reading `/proc/<pid>/environ` is killed before the data can be exfiltrated over the network, because network sends are also monitored and blocked.