eBPF is the Bee's Knees for Enterprise Environments

Let's get this out of the way first: Extended Berkeley Packet Filter (eBPF) is a technology that lets developers run sandboxed programs in the Linux kernel without modifying the kernel's source code or loading kernel modules. Patrick Laidlaw, in his blog post, details the difference between kernel and user spaces inside a machine. Originally designed for packet filtering, eBPF has evolved well beyond its predecessor BPF to support a wide range of applications, including performance monitoring, security enforcement, and network traffic management. eBPF surpassed BPF in Linux kernel version 3.19 (~2015) but kept its terrible name and acronym from the original Berkeley Packet Filter (BPF).

The core mechanism of eBPF

eBPF programs are at the heart of how eBPF views and controls traffic. An eBPF program is bytecode that the kernel executes in response to specific events (a.k.a. event-driven execution). That means eBPF programs can be attached to various hook points in the kernel and run whenever the matching event occurs. There are a lot of possibilities that stem from this event-driven execution. So many, in fact, that Brendan Gregg dubbed eBPF "superpowers for Linux!" Here's a breakdown of the main pieces of eBPF and how they interact.

eBPF programs: Compiled bytecode that defines the behavior desired in response to events. The compiled eBPF program is stored in an ELF (Executable and Linkable Format) object file. This file contains BTF (BPF Type Format) information and relocation info for the kernel's use. Basically, the ELF format allows the eBPF loader (e.g., libbpf) to process and adjust the BPF program dynamically regardless of kernel version.
BPF CO-RE: The Compile Once – Run Everywhere (CO-RE) concept brings together BTF type information, libbpf, and the compiler to produce a single executable binary that you can run on multiple kernel versions and configurations. This means you only have to write the code once for all the different flavors of Linux out there.
Maps: Data structures used for sharing information between eBPF programs, and between eBPF programs and user space. Maps can be hash tables, arrays, or other types of organizational structures that allow for efficient data access by eBPF. Maps can also hold state, so one eBPF program can store a value for later retrieval by another eBPF program (or by a future run of the same program). User space programs commonly write configuration into a map for an eBPF program to read.
Ring buffers: For efficient, asynchronous communication between eBPF programs and user space. In practice, eBPF ring buffers are typically used to stream event data out of an eBPF program to a user space program, which reads it at its own pace. Newer kernels also offer a user ring buffer for sending data in the other direction, from user space to an eBPF program.
Verifier: Before execution, every eBPF program undergoes verification to ensure it is safe, for example that it cannot loop forever or access memory out of bounds.
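To make the map and ring buffer roles a bit more concrete, here is a toy user-space model in Python. It is not real eBPF: the names exec_counts, events, on_exec, and drain are made up for illustration, a dict stands in for a BPF hash map, and a bounded deque stands in for a ring buffer. The one behavior it tries to mirror is that a full ring buffer causes the event to be dropped rather than blocking the kernel.

```python
from collections import deque

# Toy user-space model, NOT real eBPF. All names here are hypothetical.
exec_counts = {}            # stands in for a BPF hash map (key: pid)
events = deque(maxlen=4)    # stands in for a fixed-size ring buffer
dropped = 0                 # events that did not fit in the buffer


def on_exec(pid, comm):
    """Plays the role of an eBPF program attached to an 'exec' hook."""
    global dropped
    # Map update: this state survives between runs of the program.
    exec_counts[pid] = exec_counts.get(pid, 0) + 1
    # Ring buffer submit: when the buffer is full, the event is dropped
    # instead of making the kernel wait for a slow reader.
    if len(events) == events.maxlen:
        dropped += 1
        return
    events.append({"pid": pid, "comm": comm})


def drain():
    """Plays the role of the user-space consumer polling the ring buffer."""
    out = []
    while events:
        out.append(events.popleft())
    return out


for pid, comm in [(10, "bash"), (10, "ls"), (11, "curl"), (12, "cat"), (13, "ps")]:
    on_exec(pid, comm)

drained = drain()
print(exec_counts, len(drained), dropped)
```

The fifth event arrives before the consumer has drained anything, so it is counted in the map but dropped from the buffer. That split (cheap counters in maps, detailed records in a ring buffer) is a common pattern, though real tools size the buffer far larger.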

A tool is the sum of its parts, and eBPF uses these parts to make magic happen. While eBPF programs get a lot of attention, multiple support and validation pieces are at work to make those programs run right every time. The program is compiled into bytecode ahead of time. When that bytecode is loaded with the bpf() system call, a few things happen: the program is verified to be safe and correct, it is typically JIT-compiled to native instructions, and it is attached to hooks in the kernel, with maps carrying information out to applications in user space.
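If "bytecode" sounds abstract, here is a small Python sketch that hand-assembles the two-instruction eBPF program "r0 = 0; exit" using the 8-byte instruction layout (opcode, registers, offset, immediate). The opcode constants are the ones from the kernel headers. The toy_verify function is hypothetical and hugely simplified; it only gestures at the kind of structural check the real verifier performs, which is far more thorough (it tracks register state, memory bounds, and proves that loops terminate).

```python
import struct

# Opcode pieces from the eBPF instruction set (as defined in the kernel headers).
BPF_ALU64, BPF_JMP = 0x07, 0x05                 # instruction classes
BPF_MOV, BPF_EXIT, BPF_JA = 0xB0, 0x90, 0x00    # operations
BPF_K = 0x00                                    # operand is the 32-bit immediate


def insn(code, dst=0, src=0, off=0, imm=0):
    """Pack one 8-byte eBPF instruction."""
    regs = (src << 4) | dst    # dst in the low nibble, src in the high nibble
    return struct.pack("<BBhi", code, regs, off, imm)


# r0 = 0; exit  (the smallest useful program: "return 0")
program = insn(BPF_ALU64 | BPF_MOV | BPF_K, dst=0, imm=0) + insn(BPF_JMP | BPF_EXIT)


def toy_verify(bytecode):
    """Hypothetical, heavily simplified check in the spirit of the verifier."""
    if len(bytecode) == 0 or len(bytecode) % 8:
        return False           # must be a whole number of instructions
    insns = [struct.unpack("<BBhi", bytecode[i:i + 8])
             for i in range(0, len(bytecode), 8)]
    if insns[-1][0] != (BPF_JMP | BPF_EXIT):
        return False           # program must end by exiting
    for code, _regs, off, _imm in insns:
        is_jump = (code & 0x07) == BPF_JMP and code != (BPF_JMP | BPF_EXIT)
        if is_jump and off < 0:
            return False       # toy rule: reject any backward jump
    return True


print(program.hex())
print(toy_verify(program))
```

The program encodes to the bytes b7000000000000009500000000000000, which is what a loader hands to the bpf() system call (wrapped in a lot more metadata). Rejecting every backward jump is stricter than the real verifier, which accepts loops it can prove are bounded.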

The hooks that programs attach to provide a whole host of options: processing data through the eXpress Data Path (XDP), tracing what is happening in the kernel for troubleshooting, getting performance insights from the kernel, or just seeing what is happening in the processor. That last function is one of the more useful options for security folks. Kernel probes (kprobes) and user probes (uprobes) watch for specific events, and when they're triggered, eBPF lets the user know. This is where eBPF gets its event-driven execution from. It's waiting for something specific to happen, and triggering when there's a match. There's so much possibility with this design!
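The "wait for a match, then trigger" idea can be modeled in a few lines of Python. This is only a sketch of the dispatch pattern, not how the kernel implements probes. The attach and fire helpers and the watch_shadow handler are hypothetical, and the hook names are just illustrative labels borrowed from kernel function names.

```python
from collections import defaultdict

# Toy model of event-driven execution. Helpers and handlers are hypothetical.
hooks = defaultdict(list)   # hook point name -> attached programs
alerts = []


def attach(hook_name, program):
    """Attach a program (a plain function here) to a named hook point."""
    hooks[hook_name].append(program)


def fire(hook_name, ctx):
    """The 'kernel' hits a hook point: run every program attached to it."""
    return [program(ctx) for program in hooks[hook_name]]


def watch_shadow(ctx):
    # Only reacts when the event matches what we care about.
    if ctx["path"] == "/etc/shadow":
        alerts.append(f"pid {ctx['pid']} opened {ctx['path']}")
    return 0


attach("kprobe:do_sys_openat2", watch_shadow)

fire("kprobe:do_sys_openat2", {"pid": 41, "path": "/tmp/notes.txt"})
fire("kprobe:do_sys_openat2", {"pid": 42, "path": "/etc/shadow"})
fire("kprobe:tcp_connect", {"pid": 42})   # nothing attached, nothing runs

print(alerts)
```

Nothing runs until an event fires, and a hook with no program attached costs essentially nothing. That is the property that makes this model attractive for always-on monitoring.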

Why enterprises are looking at eBPF

Traffic control and visibility- In enterprise environments, eBPF's event-driven execution is used extensively for real-time network controls and insights.

Networking: eBPF programs can hook into networking tools to provide DDoS protection, load balance, split or customize TCP/UDP port traffic, tunnel traffic, provide QoS, or tune individual traffic sessions on a per-connection basis. That's some good stuff right there!
Visibility: eBPF can capture network flow data (like NetFlow), act as an inline TAP, or parse out L7 details to an IPS (or other tool) to detect malicious traffic.
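As a rough picture of what DDoS-style filtering at the XDP layer looks like, here is a Python sketch of the decision an XDP program makes for each frame: parse just enough of the headers, then return a verdict. Real XDP programs are written in C and run in the kernel. The xdp_filter and fake_frame functions and the blocked address are assumptions for the demo; only the verdict values (1 for drop, 2 for pass) and the header offsets come from the real formats.

```python
import ipaddress
import struct

XDP_DROP, XDP_PASS = 1, 2    # same numeric values as the kernel's xdp_action enum
ETH_P_IP = 0x0800            # EtherType for IPv4
blocklist = {ipaddress.ip_address("203.0.113.7")}   # hypothetical blocked source


def xdp_filter(frame: bytes) -> int:
    """Decide the fate of one raw Ethernet frame, XDP style."""
    # Bounds check first: the verifier rejects real programs that skip this.
    if len(frame) < 14 + 20:
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS      # not IPv4, leave it to the normal stack
    # The IPv4 source address sits 12 bytes into the IP header.
    src = ipaddress.ip_address(frame[14 + 12:14 + 16])
    return XDP_DROP if src in blocklist else XDP_PASS


def fake_frame(src_ip: str) -> bytes:
    """Build a minimal Ethernet + IPv4 header for the demo (no payload)."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = bytearray(20)
    ip[0] = 0x45             # version 4, header length of 5 32-bit words
    ip[12:16] = ipaddress.ip_address(src_ip).packed
    return eth + bytes(ip)


print(xdp_filter(fake_frame("203.0.113.7")), xdp_filter(fake_frame("198.51.100.1")))
```

Because the verdict is reached before the kernel allocates its usual per-packet structures, dropping unwanted traffic this early is cheap. In a real deployment the blocklist would live in a map that user space updates.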

Performance and debugging- eBPF shines in performance analysis, enabling enterprises to gain insights into system behavior and troubleshoot with minimal overhead:

Tracing and profiling: Tools like bpftrace and perf allow users to write custom scripts to capture function call durations, CPU usage, and memory allocations.
Latency monitoring: eBPF can track latency across different system calls and provide metrics that help identify bottlenecks.
Forensics: Tools like Tracee use eBPF to monitor behavioral patterns and capture system calls as events, all without requiring modifications to the application code.
Custom tracepoints: Developers can define tracepoints in their applications to capture relevant events, helping identify bugs or performance issues.
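Latency tracing tools tend to follow one recurring pattern: record a timestamp in a map when a call is entered, look it up when the call returns, and bump a power-of-two histogram bucket. The Python sketch below models that pattern in user space. The names start_ns, hist, on_enter, and on_exit are hypothetical, and the sample timings are invented rather than measured.

```python
from collections import Counter

start_ns = {}       # stands in for a BPF hash map keyed by thread id
hist = Counter()    # log2 bucket -> count


def on_enter(tid, now_ns):
    """Entry probe: remember when this thread entered the call."""
    start_ns[tid] = now_ns


def on_exit(tid, now_ns):
    """Exit probe: compute the duration and bump the matching bucket."""
    began = start_ns.pop(tid, None)
    if began is None:
        return                       # missed the entry event, skip it
    delta_us = (now_ns - began) // 1000
    # Bucket 0 holds 0 us; bucket b >= 1 covers [2**(b-1), 2**b) microseconds.
    bucket = delta_us.bit_length()
    hist[bucket] += 1


# Invented (tid, enter_ns, exit_ns) samples standing in for real system calls.
samples = [(1, 0, 3_000), (2, 100, 5_100), (1, 9_000, 9_900_000), (3, 50, 1_050)]
for tid, t0, t1 in samples:
    on_enter(tid, t0)
    on_exit(tid, t1)

for bucket in sorted(hist):
    print(f"bucket {bucket}: {hist[bucket]}")
```

Aggregating into a histogram on the kernel side means only a handful of counters cross into user space instead of one record per call, which is a big part of why this style of tracing has low overhead.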

Security for eBPF- Running in the kernel:

eBPF programs: Just-in-time (JIT) compiled and run in a sandbox inside the kernel. In addition, eBPF programs only run when the specific events they are attached to occur, and each program goes through multiple checks for errors or malicious additions to the code.
eBPF verifier: A set of security controls that every eBPF program must pass before it can run. Among other things, these controls prevent kernel information from leaking into user space.

Security by eBPF- Visibility and control options for applications:

Tetragon: Runtime policy enforcement with eBPF monitoring and filtering. It allows control over binary execution, detects privilege changes, helps prevent data exfiltration in Kubernetes, monitors file integrity, and looks for fileless execution.
Falco: Used for compliance checks and threat detection. Threat detection is aligned with the MITRE ATT&CK framework, and events are checked against a rule engine covering a large set of known attack behaviors.
Cilium: Provides security for workloads with node-to-node encryption without configuration overhead, applies API-level security controls for HTTP, Kafka, gRPC, etc. based on identities rather than IPs, and pairs with Tetragon for runtime enforcement.

Service mesh integration- This is where Cilium comes into the eBPF story, but more on this in a future article. Broadly speaking, eBPF enhances service mesh capabilities by enabling features like:

Dynamic routing: eBPF can direct traffic based on application-level policies and API-aware routing, while reducing overhead by executing logic in the kernel.
Observability: In-depth metrics collection and logging directly from the data path, facilitating real-time insights without impacting application performance.
Security: Using eBPF to control microservices at an API level allows for blocking communication between microservices and controlling the data microservices can use. This approach is more granular than IP and port controls alone.
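Here is a toy Python model of what "API level" control based on identity means in practice. It is not Cilium's API or policy format. The pods table, the allow_rules list, and the allowed function are all hypothetical, and a real implementation would enforce this in the data path rather than in application code.

```python
# Toy model of identity-based, API-aware policy. Labels and rules are hypothetical.
pods = {
    "10.0.1.5": {"app": "frontend"},
    "10.0.2.9": {"app": "orders"},
    "10.0.3.4": {"app": "batch"},
}

# Each rule: (source app, destination app, HTTP method, path prefix)
allow_rules = [
    ("frontend", "orders", "GET", "/orders"),
    ("frontend", "orders", "POST", "/orders"),
]


def allowed(src_ip, dst_ip, method, path):
    """Resolve IPs to identities, then match the request against the rules."""
    src, dst = pods.get(src_ip), pods.get(dst_ip)
    if src is None or dst is None:
        return False             # unknown workload: default deny
    return any(
        src["app"] == s and dst["app"] == d and method == m and path.startswith(p)
        for s, d, m, p in allow_rules
    )


print(allowed("10.0.1.5", "10.0.2.9", "GET", "/orders/17"))     # frontend reads an order
print(allowed("10.0.1.5", "10.0.2.9", "DELETE", "/orders/17"))  # same pods, other verb
print(allowed("10.0.3.4", "10.0.2.9", "GET", "/orders/17"))     # batch is not permitted
```

The rules never mention an IP address. If the frontend pod is rescheduled and gets a new address, only the pods table changes and the policy stays the same, which is the main practical advantage over IP and port rules.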

It's not all sunshine and rainbows with eBPF

While eBPF provides significant benefits, there are challenges as well. The biggest challenge is its complexity. Writing eBPF programs demands a deep understanding of kernel internals and low-level programming. This can steepen the learning curve for developers and make true mastery of eBPF a lifetime pursuit. Since we're on the subject, you know that famous Spider-Man quote, "With great power comes great responsibility"? Due to its ease of programmability and detailed observability, eBPF has been used as a tool for implementing microarchitectural timing side-channel attacks such as Spectre. The kernel community has since disabled unprivileged use by default to protect against future hardware vulnerabilities, but the potential is always going to be there in eBPF. Finally, while eBPF is efficient, there is a risk of performance overhead if programs are not optimized, which isn't the easiest thing to do given eBPF's complexity. Careful design and adherence to best practices are needed when implementing eBPF to mitigate these risks.

In summary

eBPF represents a paradigm shift in how enterprises can manage and observe their systems. By allowing safe execution of custom code in the kernel, eBPF enhances capabilities in network security, performance monitoring, and microservice traffic handling. As its adoption grows, especially in cloud-native architectures, utilizing eBPF's potential will be crucial for organizations aiming to secure and optimize their infrastructure.

For further exploration of eBPF benefits, take a look at the eBPF documentation, Cilium certification, Isovalent's blog, and security tools like Tetragon. eBPF and eBPF-based applications provide deeper insights, robust security, and control of communication in enterprise environments today. That's why it's the bee's knees, and maybe even the cat's pajamas, of network controls nowadays.
