
Kernel Compatibility

cloudtaser's eBPF agent provides three layers of runtime enforcement, each with different kernel requirements. The agent automatically detects kernel capabilities at startup and uses the strongest available mechanism.

Summary

cloudtaser's eBPF agent provides full synchronous blocking on the default node images of all major managed Kubernetes services (GKE, EKS, AKS). The main gap is RHEL 8, including OpenShift on RHEL 8, which falls back to reactive kill mode; upgrading to RHEL 9 enables synchronous blocking. Both modes produce a complete audit trail for compliance.


Enforcement Layers

| Layer | Mechanism | Kernel Requirement | Guarantee |
| --- | --- | --- | --- |
| Synchronous block | Kprobes + bpf_override_return() | CONFIG_BPF_KPROBE_OVERRIDE=y | Syscall returns -EACCES before any data is read |
| Reactive kill | Agent sends SIGKILL on detection | Any BPF-capable kernel (4.15+) | Process terminated (small race window) |
| Detection + audit | Tracepoints | Any BPF-capable kernel (4.15+) | Event logged with full context |

When kprobe override is available, cat /proc/<pid>/environ returns "Permission denied" with zero data leakage. When it is not available, the agent falls back to reactive kill (SIGKILL after detection) and always logs the access attempt.


Kernel Support Matrix

Per-distro kprobe-override capability

Use this as your architect-review checklist. For each node OS in your cluster, look up the version and confirm the enforcement mode you will get. Where a cell is marked ?, verify on a live node using the commands below.

| Distro / Node OS | Version | CONFIG_BPF_KPROBE_OVERRIDE | Enforcement mode | Notes |
| --- | --- | --- | --- | --- |
| Container-Optimized OS (GCP) | all LTS milestones (m101+) | yes | synchronous | GKE default. Covers >95% of production GKE |
| Bottlerocket (AWS / on-prem) | 1.19+ | yes | synchronous | EKS + Karpenter default on recent AMIs |
| Ubuntu (generic kernel) | 22.04+ | yes | synchronous | AKS default; also common on self-managed EKS/GKE |
| Ubuntu (generic kernel) | 20.04 | no | reactive | Focal ships the option disabled in the stock generic kernel. Upgrade to 22.04 for synchronous |
| Ubuntu (hwe kernel) | 20.04 + linux-generic-hwe-20.04 | yes | synchronous | Enabling the HWE stack pulls a 5.15+ kernel with the option |
| Debian | 12 (Bookworm) | yes | synchronous | Kernel 6.1+ |
| Debian | 11 (Bullseye) | yes | synchronous | Kernel 5.10 backport |
| Amazon Linux 2 | default (kernel 4.14) | no | reactive | Upgrade to AL2023 or pin the 5.10 kernel variant for synchronous |
| Amazon Linux 2 | 5.10 kernel variant | yes | synchronous | amazon-linux-extras install kernel-5.10 |
| Amazon Linux 2023 | all | yes | synchronous | EKS-optimized AL2023 AMIs |
| RHEL 9 / Rocky 9 / Alma 9 / CentOS Stream 9 | all | yes | synchronous | Kernel 5.14+ |
| RHEL 8 / Rocky 8 / Alma 8 / CentOS Stream 8 | all | no | reactive | Red Hat explicitly disabled the option in RHEL 8's kernel config. Most common enterprise gap. Fixed by upgrading to RHEL 9 / OpenShift on 9 |
| RHEL 7 / CentOS 7 | all | no | reactive | EOL June 2024. Kernel 3.10 is too old regardless |
| SUSE SLES | 15 SP4 and earlier | ? (verify) | likely reactive | SUSE historically disables the option in stock kernel builds. Confirm on your node before ruling it in |
| SUSE SLES | 15 SP5+ | ? (verify) | may be synchronous | Some SP5 rebases re-enabled the option; check on a live node |
| Oracle Linux (UEK) | R6 / R7 | ? (verify) | varies | UEK rebases change kernel config; check each UEK release |
| Flatcar Container Linux | current stable | yes | synchronous | Tracks upstream mainline |
| Fedora CoreOS | current stable | yes | synchronous | Tracks Fedora kernel |
| Talos Linux | 1.5+ | yes | synchronous | Kernel 6.1+ |
| Alpine (node OS) | all | no | reactive | Alpine's linux-lts kernel ships without CONFIG_BPF_KPROBE_OVERRIDE. Also relevant: Alpine is uncommon as a K8s node OS (vs. as a container base image) |
| Custom / hardened kernels (LKRG, grsecurity, strict CI builds) | -- | ? (verify) | varies | Security-focused distros often strip the option. Always verify the config |

Two most common enterprise gaps

  1. RHEL 8 / OpenShift on RHEL 8 -- Red Hat explicitly disables CONFIG_BPF_KPROBE_OVERRIDE in their kernel config. Fixed by upgrading to RHEL 9.
  2. Ubuntu 20.04 generic -- shipped the option disabled. Fixed by upgrading to 22.04, or by installing the 20.04 HWE stack (apt install linux-generic-hwe-20.04).

Both fall back to reactive kill plus full audit logging. The compliance posture (the audit trail) is identical; the difference is whether the read is blocked before the syscall returns or the process is killed after it has returned.


Managed Kubernetes Services

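| Service | Default Node OS | Kprobe Override |
| --- | --- | --- |
| GKE | Container-Optimized OS / Ubuntu | Supported |
| EKS | Amazon Linux 2 / 2023 | Supported (on Amazon Linux 2, only with the 5.10 kernel variant; the default 4.14 kernel falls back to reactive kill) |
| AKS | Ubuntu | Supported |
| OpenShift (RHEL 8) | RHEL 8 | Not supported -- reactive kill fallback |
| OpenShift (RHEL 9) | RHEL 9 | Supported |
| RKE2 / k3s | Depends on host OS | Check host kernel config |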

How to Check

Check the kernel config (before or after installing cloudtaser)

On each node OS you plan to run, confirm what the kernel actually exposes. The distro table above is a planning hint; the kernel config on the running node is ground truth.

grep CONFIG_BPF_KPROBE_OVERRIDE /boot/config-$(uname -r)

Expected for synchronous: CONFIG_BPF_KPROBE_OVERRIDE=y. If the output is # CONFIG_BPF_KPROBE_OVERRIDE is not set, or empty, expect reactive mode. If the file is missing, the result is inconclusive; use one of the other checks in this section.

zcat /proc/config.gz 2>/dev/null | grep CONFIG_BPF_KPROBE_OVERRIDE

Most RHEL-family kernels expose this; some COS / Bottlerocket / stripped custom kernels do not.

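# Node has neither /boot/config-* nor /proc/config.gz
# Fall back to a runtime probe from a privileged pod.
# Match the helper name as well, in case bpftool cannot read the kernel config either.
bpftool feature probe kernel 2>/dev/null | grep -iE 'kprobe_override|bpf_override_return'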

The probing pod needs bpftool (ship it via a debug image) and must run privileged with hostPID enabled.
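The two config-file checks can be combined into one helper. The following is a minimal sketch, not part of cloudtaser: the function names `classify_config` and `detect_mode` are hypothetical, and printing `unknown` when neither file is readable is a choice made here, on the assumption that a missing config file says nothing about what the kernel supports.

```shell
#!/bin/sh
# classify_config (hypothetical helper): read kernel config text on stdin
# and print the enforcement mode that config implies.
classify_config() {
    if grep -q '^CONFIG_BPF_KPROBE_OVERRIDE=y$'; then
        echo synchronous
    else
        # "# CONFIG_BPF_KPROBE_OVERRIDE is not set" or no mention at all
        echo reactive
    fi
}

# detect_mode (hypothetical helper): try both config sources on this node.
# Prints "unknown" when neither source is readable.
detect_mode() {
    boot_config="/boot/config-$(uname -r)"
    if [ -r "$boot_config" ]; then
        classify_config < "$boot_config"
    elif [ -r /proc/config.gz ]; then
        zcat /proc/config.gz | classify_config
    else
        echo unknown
    fi
}
```

Running `detect_mode` on a node prints one of `synchronous`, `reactive`, or `unknown`. On `unknown`, fall back to the bpftool probe.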

# List every node and its kernel version -- useful for heterogeneous clusters
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kernelVersion}{"\t"}{.status.nodeInfo.osImage}{"\n"}{end}'

Pair each row against the per-distro table above. Any row you cannot map → run one of the host-side checks.
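In a large cluster it can help to collapse the node list into one row per distinct OS image and kernel before comparing against the table. A minimal sketch, assuming the tab-separated `name`, `kernelVersion`, `osImage` layout produced by the jsonpath query above; `summarize_nodes` is a hypothetical helper name.

```shell
#!/bin/sh
# summarize_nodes (hypothetical helper): read "name<TAB>kernel<TAB>osImage"
# rows on stdin and print "count<TAB>osImage<TAB>kernel", one line per
# distinct OS image and kernel, sorted by OS image.
summarize_nodes() {
    awk -F '\t' '
        NF >= 3 { seen[$3 "\t" $2]++ }
        END     { for (k in seen) printf "%d\t%s\n", seen[k], k }
    ' | sort -t "$(printf '\t')" -k2
}
```

Pipe the output of the `kubectl get nodes -o jsonpath=...` command into `summarize_nodes`; each resulting row is one lookup in the per-distro table.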

Check what YOUR cluster is actually doing

Once cloudtaser is deployed, verify the enforcement mode is what you expect from the config check above -- don't infer from the distro alone.

Today (shipped) -- inspect the eBPF agent startup log:

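# --tail=-1 is required: with a label selector, kubectl logs shows only the last 10 lines per pod by default
kubectl logs -n cloudtaser-system -l app.kubernetes.io/name=cloudtaser-ebpf --tail=-1 \
  | grep -E '"msg":"eBPF agent started"'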

Expected output:

{"level":"INFO","msg":"eBPF agent started","enforce_mode":true,"kprobes_active":true}
  • kprobes_active: true → synchronous blocking active, zero data leakage pre-return.
  • kprobes_active: false → tracepoint detection + reactive SIGKILL; audit logged.

When kprobes fail to load (the distro is listed as reactive, or the kernel config has an unexpected gap), the agent logs a warning first and keeps running:

{"level":"WARN","msg":"kprobe programs failed to load, retrying without enforcement","error":"...bpf_override_return..."}
{"level":"INFO","msg":"eBPF agent started","enforce_mode":true,"kprobes_active":false}

In a heterogeneous cluster (e.g. RHEL 9 + RHEL 8 mixed node pools), expect the log to differ per node. Aggregate with:

kubectl get pods -n cloudtaser-system -l app.kubernetes.io/name=cloudtaser-ebpf \
  -o jsonpath='{range .items[*]}{.spec.nodeName}{"\t"}{.metadata.name}{"\n"}{end}' \
  | while read -r node pod; do
      # First match is the startup line for this pod
      mode=$(kubectl logs -n cloudtaser-system "$pod" \
        | grep -oE '"kprobes_active":(true|false)' | head -1)
      echo "$node $mode"
    done
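To turn the per-node output into a quick summary, the startup line can be mapped to a mode name and counted. This is a sketch under the assumption that the log keeps the JSON layout shown above (no spaces around the colon, `msg` spelled exactly as in the sample); `mode_from_log` and `count_modes` are hypothetical helper names, not cloudtaser commands.

```shell
#!/bin/sh
# mode_from_log (hypothetical helper): read agent log lines on stdin and
# print the mode implied by the first "eBPF agent started" line.
mode_from_log() {
    started=$(grep -F '"msg":"eBPF agent started"' | head -n 1)
    case "$started" in
        *'"kprobes_active":true'*)  echo synchronous ;;
        *'"kprobes_active":false'*) echo reactive ;;
        *)                          echo unknown ;;   # startup line not found
    esac
}

# count_modes (hypothetical helper): read "node<TAB>mode" rows on stdin
# and print "mode<TAB>count", sorted by mode name.
count_modes() {
    awk -F '\t' '{ n[$2]++ } END { for (m in n) printf "%s\t%d\n", m, n[m] }' | sort
}
```

For example, `kubectl logs -n cloudtaser-system "$pod" | mode_from_log` yields one word per pod. Printing that next to the node name with a tab separator and piping the rows into `count_modes` shows how many nodes run in each mode.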

Roadmap -- Prometheus gauge:

A cloudtaser_ebpf_enforcement_mode{mode="synchronous|reactive", node="..."} gauge is on the roadmap (see Trust Chain observability metrics) and will make this scriptable without parsing logs. Until that lands, use the log grep above.

What to do if a node is in reactive mode

Not every distro cell is a blocker. Pick based on your threat model:

  1. Upgrade the node OS to a row with yes in the per-distro table. Typical paths: Ubuntu 20.04 → 22.04; RHEL 8 → RHEL 9; Amazon Linux 2 (default) → AL2023 or the 5.10 kernel variant.
  2. Move workloads to a separate node pool that runs a synchronous-capable distro (e.g. a GKE COS pool, Bottlerocket pool, RHEL 9 pool). Taint the pool, then give your cloudtaser-protected workloads a nodeSelector and a matching toleration so they schedule onto it.
  3. Accept reactive mode. For GDPR / NIS2 / DORA compliance what matters is the audit trail, and reactive mode fully delivers that. See Compliance Implications below.
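Option 2 can be sketched as a label, a taint, and a workload patch. The label/taint key `example.com/enforcement` and the helper `pool_patch` are placeholders chosen for illustration; cloudtaser does not define them, and your platform team may already have its own pool labels.

```shell
#!/bin/sh
# Hypothetical label/taint key and value; pick names that fit your cluster.
POOL_KEY="example.com/enforcement"
POOL_VALUE="synchronous"

# pool_patch (hypothetical helper): print a JSON merge patch that pins a
# workload's pods to the labelled pool and lets them tolerate its taint.
pool_patch() {
    cat <<EOF
{"spec":{"template":{"spec":{
  "nodeSelector":{"$POOL_KEY":"$POOL_VALUE"},
  "tolerations":[{"key":"$POOL_KEY","operator":"Equal","value":"$POOL_VALUE","effect":"NoSchedule"}]
}}}}
EOF
}

# Usage (placeholders in angle brackets):
#   kubectl label nodes <node> "$POOL_KEY=$POOL_VALUE"
#   kubectl taint nodes <node> "$POOL_KEY=$POOL_VALUE:NoSchedule"
#   kubectl patch deployment <name> -n <namespace> --type merge -p "$(pool_patch)"
```

A JSON merge patch replaces the whole `tolerations` list, so if the workload already carries tolerations, add this one to its manifest instead of patching.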

Compliance Implications

For GDPR, NIS2, and DORA compliance, what matters is the audit trail -- proof that access attempts are detected and recorded. Both enforcement modes (synchronous block and reactive kill) provide this.

The difference is operational:

| | Synchronous Block | Reactive Kill |
| --- | --- | --- |
| Data leaked? | No | Possible (small race window) |
| Process killed? | No (syscall fails cleanly) | Yes (SIGKILL) |
| Audit event logged? | Yes | Yes |
| Compliance requirement met? | Yes | Yes |

Reactive kill is still effective in practice

The reactive kill race window is very small: microseconds elapse between detection and SIGKILL. An attacker who reads /proc/<pid>/environ is killed before the data can be exfiltrated over the network, because network sends are also monitored and blocked.