eBPF in Production: Observability, Security, and Networking Without Kernel Modules




eBPF (extended Berkeley Packet Filter) has grown from a niche packet-filtering technology into one of the most powerful observability and security primitives available to platform engineers. By 2026, if you’re running production Kubernetes workloads without eBPF, you’re leaving significant capability on the table, and probably spending more than you need to on traditional monitoring agents.


What is eBPF and Why Should You Care?

eBPF programs run in a sandboxed virtual machine inside the Linux kernel. They can be attached to almost any kernel event (system calls, network packets, function entries and exits, hardware events) and typically add very little overhead. Before a program is loaded, the kernel verifier checks it statically and rejects unbounded loops and out-of-bounds memory access, so a buggy program cannot crash the kernel.

The key insight: traditional observability requires out-of-band agents (processes consuming CPU, injecting bytecode, patching libraries). eBPF is in-band — it hooks into the kernel’s existing execution path with minimal overhead.

Traditional Agent:           eBPF Approach:
                            
App → Kernel                App → Kernel
         ↓                          ↓
    Agent Process                eBPF Program (in kernel)
    (overhead: ~5% CPU)          (overhead: ~0.1% CPU)
         ↓                          ↓
    Telemetry Backend          Telemetry Backend

Service lookup cost, Cilium (eBPF networking) vs. kube-proxy in iptables mode (approximate, order-of-magnitude figures):

  • iptables (kube-proxy): O(n) rule lookup, ~100μs per connection at 10k services
  • eBPF (Cilium): O(1) hash table lookup, ~1μs per connection regardless of service count
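
The difference between the two lookup strategies can be sketched in plain Go. This is only an illustration of O(n) versus O(1) behavior, not how kube-proxy or Cilium are implemented; the `rule` type, the synthetic service IPs, and the helper names are all hypothetical.

```go
package main

import "fmt"

// rule is a hypothetical stand-in for one service-matching rule.
type rule struct {
	serviceIP string
	backend   string
}

// lookupLinear walks the rules in order, the way a sequential rule chain is
// evaluated, and reports how many rules it had to inspect.
func lookupLinear(rules []rule, ip string) (backend string, inspected int) {
	for i, r := range rules {
		if r.serviceIP == ip {
			return r.backend, i + 1
		}
	}
	return "", len(rules)
}

// lookupHash resolves the same service with a single map access.
func lookupHash(table map[string]string, ip string) (string, bool) {
	backend, ok := table[ip]
	return backend, ok
}

// buildServices creates n synthetic services in both representations.
func buildServices(n int) ([]rule, map[string]string) {
	rules := make([]rule, 0, n)
	table := make(map[string]string, n)
	for i := 0; i < n; i++ {
		ip := fmt.Sprintf("10.96.%d.%d", i/256, i%256)
		backend := fmt.Sprintf("backend-%d", i)
		rules = append(rules, rule{serviceIP: ip, backend: backend})
		table[ip] = backend
	}
	return rules, table
}

func main() {
	rules, table := buildServices(10000)
	target := rules[len(rules)-1].serviceIP // worst case for the linear scan

	b1, inspected := lookupLinear(rules, target)
	b2, _ := lookupHash(table, target)
	fmt.Printf("linear: %s after inspecting %d rules\n", b1, inspected)
	fmt.Printf("hash:   %s after one map lookup\n", b2)
}
```

The linear scan's cost grows with the number of services, while the map lookup stays flat. That is the property the figures above are pointing at.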

The Three Pillars: Observability, Security, Networking

Pillar 1: Zero-Instrumentation Observability

The holy grail of observability is getting telemetry without changing your application code. eBPF makes this real:

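Automatic HTTP/gRPC tracing: Attach to the kernel’s socket send and receive paths and parse HTTP headers from the payload. This yields request traces for any language runtime without a tracing library. TLS-encrypted traffic needs extra hooks, such as uprobes on the TLS library, because the kernel only sees ciphertext.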

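// Simplified eBPF program that intercepts outgoing HTTP requests.
// Assumes vmlinux.h plus libbpf's bpf_helpers.h, bpf_tracing.h and
// bpf_core_read.h, a BPF_MAP_TYPE_RINGBUF map named `events`, and a
// hypothetical inlined helper parse_http_request().
SEC("kprobe/tcp_sendmsg")
int BPF_KPROBE(trace_http_request, struct sock *sk, struct msghdr *msg) {
    // Kernel memory must be read through CO-RE helpers, not dereferenced
    // directly. The iov_iter layout differs across kernel versions (newer
    // kernels rename the iov field), so a real tool needs per-version handling.
    const void *base = BPF_CORE_READ(msg, msg_iter.iov, iov_base);

    // Copy the start of the user-space message buffer
    char buf[256] = {};
    if (bpf_probe_read_user(buf, sizeof(buf), base) < 0)
        return 0;

    // Parse HTTP method and path
    struct http_event_t event = {};
    parse_http_request(buf, &event);

    // Send to the userspace ring buffer
    bpf_ringbuf_output(&events, &event, sizeof(event), 0);
    return 0;
}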
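
The parsing step is often easier to do in userspace, where there is no verifier to satisfy. The following is a minimal sketch of extracting the method and path from the first bytes of a captured buffer. The `httpEvent` type and `parseRequestLine` function are illustrative names, not part of any eBPF library, and the sketch only handles plaintext HTTP/1.x request lines.

```go
package main

import (
	"bytes"
	"fmt"
)

// httpEvent is a hypothetical event shape for a parsed request.
type httpEvent struct {
	Method string
	Path   string
}

// Methods we are willing to recognize; anything else is treated as non-HTTP.
var methods = []string{"GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"}

// parseRequestLine inspects the start of a captured send buffer and returns
// the HTTP method and path if it looks like an HTTP/1.x request line.
func parseRequestLine(buf []byte) (httpEvent, bool) {
	// The request line ends at the first CRLF; a truncated capture may lack it.
	line := buf
	if i := bytes.Index(buf, []byte("\r\n")); i >= 0 {
		line = buf[:i]
	}
	// Expected shape: METHOD SP PATH SP HTTP/x.y
	parts := bytes.SplitN(line, []byte(" "), 3)
	if len(parts) < 3 || !bytes.HasPrefix(parts[2], []byte("HTTP/")) {
		return httpEvent{}, false
	}
	method := string(parts[0])
	for _, m := range methods {
		if m == method {
			return httpEvent{Method: method, Path: string(parts[1])}, true
		}
	}
	return httpEvent{}, false
}

func main() {
	captured := []byte("GET /api/v1/pay?id=7 HTTP/1.1\r\nHost: payment-service\r\n\r\n")
	if ev, ok := parseRequestLine(captured); ok {
		fmt.Printf("method=%s path=%s\n", ev.Method, ev.Path)
	}

	// A TLS record header is not an HTTP request line and is rejected.
	_, ok := parseRequestLine([]byte{0x16, 0x03, 0x01, 0x02, 0x00})
	fmt.Println("tls bytes parsed as http:", ok)
}
```

Rejecting anything that does not look like a request line matters in practice, because a send hook sees every TCP payload on the host, not just HTTP.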

Continuous profiling: Sample CPU stacks at the kernel level for every process, with no instrumentation. Tools like Parca (from Polar Signals), Grafana Pyroscope, and Pixie use eBPF for always-on profiling, typically reporting overhead below 1%.
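
Once stacks have been sampled, a profiler has to aggregate them. One common representation is the "folded" format used by flame graph tooling, where each unique stack is joined with semicolons and paired with a sample count. The sketch below shows only that aggregation step in userspace; the `sample` type and the frame names are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is a hypothetical, already-symbolized stack sample.
type sample struct {
	PID    int
	Frames []string // ordered from outermost caller to leaf function
}

// foldStacks counts how many times each unique stack was observed,
// keyed by the frames joined with ';'.
func foldStacks(samples []sample) map[string]int {
	counts := make(map[string]int)
	for _, s := range samples {
		counts[strings.Join(s.Frames, ";")]++
	}
	return counts
}

func main() {
	samples := []sample{
		{PID: 101, Frames: []string{"main", "handleRequest", "json.Marshal"}},
		{PID: 101, Frames: []string{"main", "handleRequest", "json.Marshal"}},
		{PID: 101, Frames: []string{"main", "handleRequest", "db.Query"}},
		{PID: 202, Frames: []string{"main", "gcWorker"}},
	}

	counts := foldStacks(samples)

	// Print in a stable order so the output is reproducible.
	stacks := make([]string, 0, len(counts))
	for stack := range counts {
		stacks = append(stacks, stack)
	}
	sort.Strings(stacks)
	for _, stack := range stacks {
		fmt.Printf("%s %d\n", stack, counts[stack])
	}
}
```

The more samples a stack accumulates, the more CPU time it represents. That is what makes sampling at a fixed frequency cheap enough to leave running.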

The Pixie Platform: eBPF Observability in Practice

Pixie is the most ambitious eBPF observability platform. After a helm install, you get:

  • Automatic service maps (no service mesh required)
  • Full request/response body capture for HTTP, gRPC, Postgres, MySQL, Redis, Kafka
  • CPU, memory, and network profiling per pod
  • SQL query profiling with execution plans
  • All with zero code changes to applications

# Install Pixie in a Kubernetes cluster
# (assumes the pixie-operator Helm repo has already been added)
helm install pixie pixie-operator/pixie-operator-chart \
  --set deployKey=<deploy-key> \
  --set clusterName=my-cluster \
  --namespace pl --create-namespace

# Run a PxL script from the CLI
px run px/http_data -- \
  --start_time="-5m" \
  --service="payment-service"

Pillar 2: Runtime Security with eBPF

Traditional container security relies on namespaces and cgroups to isolate processes. eBPF adds a behavioral layer — detecting anomalous activity at the kernel level, in real time.

Tetragon (by Cilium/Isovalent) is the leading eBPF-based runtime security tool:

# TracingPolicy: kill any process that tries to open /etc/shadow
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: detect-shadow-read
spec:
  kprobes:
  - call: "security_file_open"
    syscall: false
    args:
    - index: 0
      type: "file"
    selectors:
    # matchArgs and matchActions must sit in the same selector; as a separate
    # list item, the action would apply to every file open, not just this path
    - matchArgs:
      - index: 0
        operator: "Prefix"
        values:
        - "/etc/shadow"
      matchActions:
      - action: Sigkill  # Kill the process immediately

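This policy kills any process that tries to open a path beginning with /etc/shadow. The signal is sent from inside the kernel hook, with no round trip to a userspace agent, so the process never gets to use the file. An IDS that tails logs and reacts afterwards cannot respond this fast.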
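
One detail worth checking before enforcing a policy like this is what a prefix match actually covers. Assuming the Prefix operator is a plain string-prefix comparison on the path, the sketch below shows which paths a value of /etc/shadow would select. The `matchPrefix` helper is illustrative only and is not Tetragon code.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPrefix reports whether path starts with any of the configured values,
// mirroring a plain string-prefix selector.
func matchPrefix(path string, values []string) bool {
	for _, v := range values {
		if strings.HasPrefix(path, v) {
			return true
		}
	}
	return false
}

func main() {
	values := []string{"/etc/shadow"}
	paths := []string{
		"/etc/shadow",     // the intended target
		"/etc/shadow-",    // backup copy kept by some distributions
		"/etc/shadow.bak", // ad-hoc backup
		"/etc/passwd",     // unrelated file
	}
	for _, p := range paths {
		fmt.Printf("%-16s matched=%v\n", p, matchPrefix(p, values))
	}
}
```

Under that assumption the policy also covers backup copies such as /etc/shadow-, which is usually desirable here. An exact-match operator would be the narrower choice if only the one file should trigger the action.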

Key security use cases:

  1. Process execution monitoring — Every exec() call logged with full args, parent PID, container context
  2. Network policy enforcement — Block connections to unexpected external IPs at the kernel level
  3. Privilege escalation detection — Detect and block capability escalations (setuid, CAP_SYS_ADMIN)
  4. File integrity monitoring — Real-time alerts on changes to critical files without polling

Pillar 3: eBPF-Based Networking

Cilium has become one of the most widely adopted CNIs for high-performance Kubernetes networking. Its eBPF dataplane replaces iptables for service routing:

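# Install Cilium as CNI
helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --version 1.16.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=<api-server-host> \
  --set k8sServicePort=<api-server-port>

# Verify that the eBPF kube-proxy replacement is active
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement
# Expect a line starting with: KubeProxyReplacement:   True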

Bandwidth Manager: With bandwidthManager.enabled=true, Cilium enforces per-pod egress bandwidth limits in eBPF, pacing packets by earliest departure time (EDT) rather than queueing them in traditional token-bucket qdiscs:

# Kubernetes annotation for pod egress bandwidth limiting
# (Cilium's Bandwidth Manager does not enforce kubernetes.io/ingress-bandwidth)
metadata:
  annotations:
    kubernetes.io/egress-bandwidth: "100M"

Building Custom eBPF Tools with libbpf and Go

For custom observability in Go, cilium/ebpf (a pure-Go library, used below) is the common modern choice; libbpf-go is an alternative that wraps the C libbpf library via cgo. Both support BTF and CO-RE (Compile Once, Run Everywhere):

// main.go: attach an eBPF program that reports syscalls per process
package main

import (
    "bytes"
    "encoding/binary"
    "errors"
    "log"

    "github.com/cilium/ebpf/link"
    "github.com/cilium/ebpf/ringbuf"
    "github.com/cilium/ebpf/rlimit"
)

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -type event bpf syscall_counter.c

func main() {
    // Remove memory lock restriction (required for eBPF maps on older kernels)
    if err := rlimit.RemoveMemlock(); err != nil {
        log.Fatal("Remove memlock:", err)
    }

    // Load pre-compiled eBPF programs and maps generated by bpf2go
    objs := bpfObjects{}
    if err := loadBpfObjects(&objs, nil); err != nil {
        log.Fatal("Loading objects:", err)
    }
    defer objs.Close()

    // Attach to the sys_enter tracepoint
    tp, err := link.Tracepoint("raw_syscalls", "sys_enter", objs.CountSyscall, nil)
    if err != nil {
        log.Fatal("Opening tracepoint:", err)
    }
    defer tp.Close()

    // Read events from the ring buffer map
    rd, err := ringbuf.NewReader(objs.Events)
    if err != nil {
        log.Fatal("Creating event reader:", err)
    }
    defer rd.Close()

    for {
        record, err := rd.Read()
        if err != nil {
            if errors.Is(err, ringbuf.ErrClosed) {
                return
            }
            log.Printf("Reading from ring buffer: %v", err)
            continue
        }

        // Decode the raw sample into the struct generated from the C type
        // (assumes a little-endian host)
        var event bpfEvent
        if err := binary.Read(bytes.NewReader(record.RawSample), binary.LittleEndian, &event); err != nil {
            log.Printf("Decoding event: %v", err)
            continue
        }
        log.Printf("PID %d: syscall %d", event.Pid, event.Syscall)
    }
}
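
Logging every syscall quickly becomes noisy. A more useful view is a per-process count, which can be maintained in userspace as events arrive. The sketch below shows that aggregation on its own so it can run without a kernel; the `event` struct is a hypothetical stand-in for the generated event type, and the PIDs are sample data.

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical stand-in for the struct emitted by the eBPF program.
type event struct {
	Pid     uint32
	Syscall uint32
}

// pidCount pairs a process ID with the number of syscalls seen for it.
type pidCount struct {
	Pid   uint32
	Count uint64
}

// counter tracks syscall counts keyed by PID.
type counter map[uint32]uint64

// add records one syscall event.
func (c counter) add(e event) { c[e.Pid]++ }

// top returns up to n processes ordered by count (descending), then PID.
func (c counter) top(n int) []pidCount {
	out := make([]pidCount, 0, len(c))
	for pid, count := range c {
		out = append(out, pidCount{Pid: pid, Count: count})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Pid < out[j].Pid
	})
	if n >= 0 && n < len(out) {
		out = out[:n]
	}
	return out
}

func main() {
	c := counter{}
	for _, e := range []event{
		{Pid: 10, Syscall: 0}, {Pid: 10, Syscall: 1}, {Pid: 10, Syscall: 59},
		{Pid: 20, Syscall: 0}, {Pid: 20, Syscall: 1},
		{Pid: 30, Syscall: 3},
	} {
		c.add(e)
	}
	for _, pc := range c.top(2) {
		fmt.Printf("PID %d: %d syscalls\n", pc.Pid, pc.Count)
	}
}
```

For high event rates, the same counting is usually better done inside the kernel in a BPF hash map, with userspace reading the totals periodically instead of receiving one event per syscall.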

eBPF Tool Landscape in 2026

For Observability:

| Tool | Use Case |
|------|----------|
| Pixie | Full-stack auto-instrumentation |
| Parca | Continuous profiling |
| Grafana Beyla | Zero-code RED metrics for HTTP/gRPC |
| Inspektor Gadget | Interactive kernel debugging |
| BCC/bpftrace | Ad-hoc investigation |

For Security:

| Tool | Use Case |
|------|----------|
| Tetragon | Runtime enforcement + audit |
| Falco (eBPF driver) | Threat detection rules |
| KubeArmor | Policy-based workload hardening |
| Tracee | Container forensics |

For Networking:

| Tool | Use Case |
|------|----------|
| Cilium | CNI + service mesh + policy |
| Calico eBPF | Network policy at scale |
| Katran (Meta) | L4 load balancing |

Operational Considerations

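Kernel requirements: CO-RE needs a kernel built with BTF (CONFIG_DEBUG_INFO_BTF), and features used in this article such as the BPF ring buffer and CAP_BPF arrived in kernel 5.8, so treat 5.8+ as the practical baseline. Current production distributions ship newer kernels (RHEL 9: 5.14, Ubuntu 22.04 LTS: 5.15, Debian 12: 6.1). Check BTF support: ls /sys/kernel/btf/vmlinux.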
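
A deployment tool can run the version check itself before attempting to load programs. The sketch below parses a kernel release string, as printed by uname -r, and compares it with the 5.8 minimum mentioned above. The function names are illustrative, and a real check should also confirm that /sys/kernel/btf/vmlinux exists.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseKernelRelease extracts major and minor from strings such as
// "5.15.0-91-generic" or "6.1.0-18-amd64".
func parseKernelRelease(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unrecognized kernel release %q", release)
	}
	major, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %w", release, err)
	}
	// The minor field may carry a suffix (for example "8-rc1"); keep the digits.
	digits := parts[1]
	end := 0
	for end < len(digits) && digits[end] >= '0' && digits[end] <= '9' {
		end++
	}
	minor, err = strconv.Atoi(digits[:end])
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %w", release, err)
	}
	return major, minor, nil
}

// atLeast reports whether release is at or above wantMajor.wantMinor.
func atLeast(release string, wantMajor, wantMinor int) (bool, error) {
	major, minor, err := parseKernelRelease(release)
	if err != nil {
		return false, err
	}
	return major > wantMajor || (major == wantMajor && minor >= wantMinor), nil
}

func main() {
	// Sample release strings; on a live host, pass the output of `uname -r`.
	for _, rel := range []string{"5.15.0-91-generic", "6.1.0-18-amd64", "4.18.0-513.el8.x86_64"} {
		ok, err := atLeast(rel, 5, 8)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-24s meets 5.8 baseline: %v\n", rel, ok)
	}
}
```

Note that distribution kernels often backport eBPF features, so a version check is a reasonable first filter, not a guarantee in either direction.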

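Security implications: Loading eBPF programs requires CAP_BPF, usually combined with CAP_PERFMON for tracing or CAP_NET_ADMIN for networking programs (kernels before 5.8 require CAP_SYS_ADMIN). In Kubernetes, DaemonSets running eBPF tools need these elevated permissions, so scope them carefully.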

Debugging: Use bpftool prog list to see loaded programs and bpftool map show to inspect maps. The bpf_trace_printk() function writes to /sys/kernel/debug/tracing/trace_pipe for debug output.

Conclusion

eBPF has fundamentally changed the economics of production observability and security. Instead of choosing between high-overhead instrumentation or blind spots, eBPF delivers comprehensive kernel-level visibility with minimal performance impact.

For Kubernetes teams in 2026, the recommendation is clear:

  1. Replace kube-proxy with Cilium for instant networking performance gains
  2. Deploy Grafana Beyla or Pixie for zero-instrumentation observability
  3. Add Tetragon for runtime security and compliance audit trails

The kernel is your friend. eBPF is the key.

