eBPF in Production: Programmable Kernel Networking and Observability Without Code Changes



The Problem With Traditional Observability

Adding visibility to a running production system has always meant a trade-off: either instrument everything at build time (intrusive, verbose) or accept black boxes that you can only observe from the outside. Network traffic? Need a sidecar. Syscall patterns? Add an agent that might itself become a performance bottleneck. Kernel internals? Mostly guesswork.

eBPF (extended Berkeley Packet Filter) flips this equation. It lets you run safe, sandboxed programs directly inside the Linux kernel — without recompiling the kernel, without loading kernel modules, and without modifying your application code.



What Is eBPF?

eBPF evolved from the original BPF (Berkeley Packet Filter) used for network packet filtering. The “extended” version is far more powerful: it can attach to almost any kernel event and run custom logic there.

At its core, eBPF works like this:

  1. You write a program in restricted C (or use frontends in languages like Rust, Go, or Python)
  2. It’s compiled to eBPF bytecode
  3. Before loading, a verifier checks the program for safety (no infinite loops, no invalid memory access, bounded execution)
  4. A JIT compiler translates it to native machine code
  5. The program attaches to a hook point — a system call, network event, tracepoint, kprobe, or uprobe
  6. Every time that event fires, your program runs in kernel context, typically with sub-microsecond overhead

The programs are verifiably safe. They can’t crash the kernel. They can’t loop forever. They have bounded memory access. This is what makes eBPF production-safe.


Key Hook Points

Network:    XDP → TC → Socket → sk_buff hooks
Syscalls:   sys_enter_*, sys_exit_*
Tracepoints: sched:sched_switch, kmem:*, signal:*
kprobes:    Any kernel function (dynamic)
uprobes:    Any userspace function (dynamic)
LSM hooks:  Security enforcement points
Perf events: CPU counters, hardware events

The sheer surface area of attachment points means you can observe (or influence) virtually any behavior in the system.


Use Case 1: Zero-Instrumentation Distributed Tracing

Traditional distributed tracing requires SDK instrumentation in every service. With eBPF, you can do automatic tracing at the kernel level by intercepting:

  • TCP connect/accept events → infer service topology
  • TLS handshakes → identify which services talk to which
  • HTTP/2 and gRPC frames via protocol parsers → extract traces (for TLS traffic, uprobes on library functions like SSL_read/SSL_write capture data before encryption)
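The topology-inference step can be sketched in plain C: a hypothetical userspace consumer that aggregates the (source, destination) pairs a kernel-side eBPF program would emit on each TCP connect. The event shape and names here are illustrative, not any tool's actual format.

```c
#include <string.h>

/* Hypothetical event a kernel-side eBPF tracer might emit per connect. */
struct conn_event {
    char src[16];   /* connecting service */
    char dst[16];   /* accepting service  */
};

#define MAX_EDGES 64

/* One edge of the inferred service map, with a connection count. */
struct edge { char src[16]; char dst[16]; int count; };

static struct edge edges[MAX_EDGES];
static int n_edges;

/* Fold one connect event into the adjacency table. */
void record(const struct conn_event *ev) {
    for (int i = 0; i < n_edges; i++) {
        if (strcmp(edges[i].src, ev->src) == 0 &&
            strcmp(edges[i].dst, ev->dst) == 0) {
            edges[i].count++;
            return;
        }
    }
    if (n_edges < MAX_EDGES) {
        strcpy(edges[n_edges].src, ev->src);
        strcpy(edges[n_edges].dst, ev->dst);
        edges[n_edges].count = 1;
        n_edges++;
    }
}
```

Repeated connects between the same pair only bump a counter, which is why a service map stays cheap to maintain even at high connection rates.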

Tools like Pixie, Odigos, and Beyla implement this pattern. You deploy one DaemonSet, and suddenly you have traces, metrics, and service maps — no SDK changes required.

# Install Beyla for zero-code auto-instrumentation
helm repo add grafana https://grafana.github.io/helm-charts
helm install beyla grafana/beyla \
  --set preset=network \
  --set "config.discovery.services[0].name=my-service"

Within minutes, spans start flowing into your observability backend with full HTTP method, path, status code, and latency data.


Use Case 2: High-Performance Networking with XDP

XDP (eXpress Data Path) runs eBPF programs before the Linux network stack even sees a packet. This means:

  • DDoS mitigation at line rate — drop malicious packets before they consume CPU
  • Load balancing with no kernel overhead — used by Facebook, Cloudflare, and major CDNs
  • Packet forwarding bypassing the full kernel stack

Here’s a minimal XDP program that drops all ICMP packets:

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int drop_icmp(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data     = (void *)(long)ctx->data;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end) return XDP_PASS;
    if (eth->h_proto != __constant_htons(ETH_P_IP)) return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end) return XDP_PASS;
    if (ip->protocol == IPPROTO_ICMP) return XDP_DROP;

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";
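Those two bounds checks are exactly what the verifier demands: prove a pointer stays inside the packet before dereferencing it. The same logic can be replayed in ordinary userspace C over a raw byte buffer; this sketch uses simplified header layouts (not the kernel's header structs) and assumes a little-endian host:

```c
#include <stdint.h>

/* Simplified header layouts for illustration only. */
struct eth_hdr { uint8_t dst[6], src[6]; uint16_t proto; };      /* 14 bytes */
struct ip_hdr  { uint8_t ver_ihl, tos; uint16_t len, id, frag;
                 uint8_t ttl, protocol; uint16_t csum;
                 uint32_t saddr, daddr; };                       /* 20 bytes */

#define ETH_P_IP_BE  0x0008u   /* 0x0800 as read on a little-endian host */
#define PROTO_ICMP   1

enum verdict { PASS = 0, DROP = 1 };

/* Mirrors drop_icmp(): every header is bounds-checked against
 * data_end before any of its bytes are read. */
enum verdict classify(const uint8_t *data, const uint8_t *data_end) {
    const struct eth_hdr *eth = (const struct eth_hdr *)data;
    if ((const uint8_t *)(eth + 1) > data_end) return PASS;
    if (eth->proto != ETH_P_IP_BE) return PASS;

    const struct ip_hdr *ip = (const struct ip_hdr *)(eth + 1);
    if ((const uint8_t *)(ip + 1) > data_end) return PASS;
    if (ip->protocol == PROTO_ICMP) return DROP;

    return PASS;
}
```

Reorder a check after its read and the real program would be rejected at load time; here it would merely be a latent out-of-bounds read, which is precisely the class of bug the verifier rules out.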

Cilium, the Kubernetes CNI built entirely on eBPF, uses this approach to replace iptables with eBPF programs. The result: significantly lower latency and near-constant-time policy lookups via hash maps, compared to the O(n) iptables chain traversal, whose cost grows with cluster size.
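That scalability difference is at bottom a data-structure difference. A toy comparison (not Cilium's code): an iptables chain is evaluated rule by rule, while an eBPF program typically answers the same question with one hash-map lookup:

```c
#include <stdint.h>

#define N_RULES    1000
#define TABLE_SIZE 2048   /* power of two, so (hash & (size-1)) works */

static uint16_t chain[N_RULES];     /* iptables-style ordered rule list */
static int      n_chain;
static uint16_t table[TABLE_SIZE];  /* eBPF-map-style hash table, 0 = empty */

static uint32_t hash(uint16_t port) {
    return (port * 2654435761u) & (TABLE_SIZE - 1);  /* Knuth multiplicative */
}

void add_rule(uint16_t port) {
    chain[n_chain++] = port;
    uint32_t h = hash(port);
    while (table[h] != 0)            /* linear probing on collision */
        h = (h + 1) & (TABLE_SIZE - 1);
    table[h] = port;
}

/* Chain semantics: scan every rule until one matches. */
int chain_match(uint16_t port, int *comparisons) {
    for (int i = 0; i < n_chain; i++) {
        (*comparisons)++;
        if (chain[i] == port) return 1;
    }
    return 0;
}

/* Map semantics: hash straight to the candidate slot. */
int map_match(uint16_t port, int *comparisons) {
    uint32_t h = hash(port);
    while (table[h] != 0) {
        (*comparisons)++;
        if (table[h] == port) return 1;
        h = (h + 1) & (TABLE_SIZE - 1);
    }
    return 0;
}
```

With 1,000 rules, matching the last rule in the chain costs 1,000 comparisons per packet; the hash table answers in a handful of probes regardless of rule count.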


Use Case 3: Runtime Security Enforcement

Falco, Tetragon, and KubeArmor use eBPF to enforce security policies at kernel level:

# Tetragon policy: block any process reading /etc/shadow
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: "block-shadow-read"
spec:
  kprobes:
  - call: "security_file_open"
    syscall: false
    return: false
    args:
    - index: 0
      type: "file"
    selectors:
    - matchArgs:
      - index: 0
        operator: "Equal"
        values:
        - "/etc/shadow"
      matchActions:
      - action: Sigkill

This kills any process attempting to read /etc/shadow — at kernel level, before the read completes. No filesystem-level hook, no audit daemon delay, no userspace agent round-trip. Pure kernel enforcement.
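The selector logic above, match an argument and apply an action, boils down to a small decision function. A hypothetical C sketch of the shape of that check (Tetragon's real matcher supports many operators and argument types):

```c
#include <string.h>

enum action { ACT_ALLOW = 0, ACT_SIGKILL = 9 };

/* Hypothetical single-rule policy: one path value, one action. */
struct policy {
    const char *path;       /* value matched against the file argument */
    enum action on_match;
};

/* Conceptually invoked at security_file_open: the decision happens
 * before the open returns, so a denied read never completes. */
enum action check_open(const struct policy *p, const char *opened_path) {
    if (strcmp(opened_path, p->path) == 0)
        return p->on_match;
    return ACT_ALLOW;
}
```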


The Tooling Landscape

Low-Level

  • bcc — Python/C frontend for writing eBPF programs
  • bpftrace — High-level tracing language (like awk for kernel events)
  • libbpf — C library for CO-RE (Compile Once, Run Everywhere) eBPF
  • Aya — Rust-native eBPF library

High-Level / Production Tools

Category         Tools
Observability    Pixie, Beyla, Parca, Coroot
Networking/CNI   Cilium, Calico eBPF mode
Security         Falco, Tetragon, KubeArmor
Profiling        Pyroscope (eBPF profiler), Parca
Load Balancing   Katran (Meta), Unimog (Cloudflare)

CO-RE: Write Once, Run on Any Kernel

A historical pain point was that eBPF programs were tied to the kernel version they compiled against. CO-RE (Compile Once — Run Everywhere) solves this with BTF (BPF Type Format):

// CO-RE allows accessing kernel struct fields portably
// (assumes a vmlinux.h generated from BTF and <bpf/bpf_core_read.h>)
struct task_struct *task = (struct task_struct *)bpf_get_current_task();
pid_t pid = BPF_CORE_READ(task, pid);

BPF_CORE_READ generates relocation info so the eBPF loader can adapt field offsets at load time. Your program compiles once and runs on kernels from 5.4 all the way to the latest.
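The relocation idea can be illustrated in plain C. Imagine two kernel versions that place pid at different offsets inside task_struct: the compiler records which field it wants, and the loader fills in the running kernel's offset. This is a toy simulation of the concept, not libbpf's actual machinery:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Two hypothetical kernel layouts: 'pid' sits at different offsets. */
struct task_v1 { uint32_t state; int32_t pid; };
struct task_v2 { uint32_t state; uint64_t flags; int32_t pid; };

/* What compilation leaves behind: "read field 'pid'", with the
 * offset as a slot to be resolved at load time. */
struct relocation {
    const char *field;
    uint32_t    offset;   /* patched by the loader from the kernel's BTF */
};

/* The loader-resolved read: one compiled access works on any layout. */
int32_t core_read_pid(const void *task, const struct relocation *reloc) {
    int32_t pid;
    memcpy(&pid, (const uint8_t *)task + reloc->offset, sizeof pid);
    return pid;
}
```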


Performance Overhead

The common question: does this slow things down?

In practice, well-written eBPF programs have sub-microsecond overhead per event. For tracing 10,000 syscalls/second, the overhead is typically under 1% CPU. For XDP packet processing, throughput is in the tens of millions of packets per second per core.

The caveat: poorly written eBPF with complex maps or frequent userspace communication can add overhead. Benchmark your programs — but the baseline is very good.
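The standard fix for that caveat is to aggregate in kernel space and only ship summaries. A userspace sketch of the pattern, with hypothetical shapes rather than the real BPF map API: increment a per-key counter on every event, and read it out once per reporting interval.

```c
#include <stdint.h>

#define MAP_SLOTS 256   /* power of two; this sketch assumes it never fills */

/* Open-addressed counters, in the spirit of a BPF_MAP_TYPE_HASH keyed
 * by PID: one cheap in-kernel increment per event, one userspace read
 * per interval instead of one message per event. */
static uint32_t keys[MAP_SLOTS];
static uint64_t counts[MAP_SLOTS];

void map_inc(uint32_t key) {
    uint32_t h = key & (MAP_SLOTS - 1);
    while (counts[h] != 0 && keys[h] != key)
        h = (h + 1) & (MAP_SLOTS - 1);   /* linear probing */
    keys[h] = key;
    counts[h]++;
}

uint64_t map_get(uint32_t key) {
    uint32_t h = key & (MAP_SLOTS - 1);
    while (counts[h] != 0) {
        if (keys[h] == key) return counts[h];
        h = (h + 1) & (MAP_SLOTS - 1);
    }
    return 0;
}
```

A thousand events collapse into one counter bump apiece, so the userspace crossing, usually the expensive part, happens once per interval rather than once per event.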


Getting Started

# Install bpftrace (easiest way to experiment)
sudo apt install bpftrace   # Ubuntu 22.04+

# Trace all open() syscalls
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s opened %s\n", comm, str(args->filename)); }'

# Profile CPU usage by function for 10 seconds
sudo bpftrace -e 'profile:hz:99 { @[kstack] = count(); } interval:s:10 { exit(); }'

eBPF has moved from a Linux networking curiosity to a foundational layer of modern infrastructure. If you’re building observability tooling, writing a CNI, or hardening a Kubernetes cluster in 2026, eBPF proficiency is no longer optional — it’s the craft.
