Monitoring performance with eBPF, bpftrace and perf in Linux

Last update: 17/12/2025
Author Isaac
  • perf offers classic CPU profiling and hardware event analysis, ideal for quick performance diagnostics on Linux.
  • eBPF and bpftrace enable deep observability from the kernel with very low impact and high customization.
  • The combination of perf with eBPF covers everything from hardware counters to application metrics in production.
  • The eBPF ecosystem is consolidating itself as a standard for monitoring, security, and networking in modern systems.

Monitor performance with eBPF and bpftrace

In-depth performance monitoring on Linux servers can no longer settle for a handful of basic metrics such as CPU, memory, and disk. In modern environments, with microservices, containers, and hybrid clouds, you need to see what is really happening inside the kernel, at the level of system calls, network packets, and specific functions of your applications.

In this context, technologies such as perf, eBPF and bpftrace have become key components for systems, SRE, and observability teams. They make it possible to identify bottlenecks, reduce infrastructure costs, and improve the user experience with a level of detail that was unthinkable a few years ago, all with a very low impact on system load.

What is server performance analysis?

When we talk about performance analysis, we are referring to the process of measuring, interpreting, and optimizing how the services running on a server use resources: CPU, memory, disk I/O, and network, along with factors such as response latency or the number of requests handled.

In business and production environments, this analysis is critical because a small bottleneck can trigger crashes, unbearable response times, or exorbitant cloud bills if hardware is scaled without understanding the real problem.

Importance in business and production environments

  • Identifying bottlenecks is the first objective: locate the processes, threads, or functions that consume more CPU, I/O, or memory than necessary in order to optimize or redesign them.
  • Cost optimization comes right behind: if you know which services are saturating which resources, you can adjust machine sizes, container limits, or autoscaling policies to pay only for what you really need.
  • An improved user experience depends directly on how your servers respond; reducing latency at critical points (slow queries, disk locks, network queues) makes a huge difference in web and mobile applications.
  • Failure prevention relies on historical monitoring: if you collect metrics with sufficient detail, it is easier to detect anomalous patterns and anticipate serious downtime or service degradation.

Modern tools: perf, eBPF and bpftrace

Today, serious performance analysis on Linux relies on a set of tools that cover different levels: perf as the classic utility tied to hardware counters, and the eBPF ecosystem (BCC, bpftrace, libbpf, Cilium, etc.) for near-surgical observability from the kernel.

While perf focuses on predefined hardware and software events, eBPF opens the door to writing small programs that run in the kernel under strict control, reacting to events such as syscalls, function traces, network packets, or even perf events, and sending back only the information you need.

bpftrace adds a high-level layer that greatly simplifies working with eBPF, letting you write very expressive scripts with a syntax similar to AWK or DTrace, ideal for quick diagnostics in production.

perf: the classic tool for performance analysis

The perf utility has been part of the Linux ecosystem for years and is designed to record and analyze low-level events: CPU cycles, cache misses, executed instructions, page faults, syscalls, and much more.

Its typical usage flow is quite straightforward: first you record what happens over a period of time, and then you analyze the report to locate the problematic functions or processes.

What is perf and how does it work?

  • The administrator launches an analysis with perf record, indicating the command or PID to study, or even the entire system.
  • During the configured interval, perf stores hardware and kernel events, taking periodic samples without needing to instrument the source code.
  • With perf report, an interactive report is generated showing CPU consumption by function, file, symbol, or process, allowing navigation through the program's hot paths.

Key features of perf

Hardware event monitoring

  • It can count the CPU cycles consumed, which helps detect functions that are hogging the processor.
  • It reports cache misses, so you know when L1/L2/L3 cache lookups fail and the system has to go to main memory, increasing latency.
  • It collects the number of instructions executed, useful for analyzing the efficiency of an algorithm or comparing two different implementations.
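
As a rough illustration of how these counters are usually read together, the sketch below derives instructions per cycle (IPC) and a cache miss ratio from raw counts. The counter values are invented for the example (they are not measurements), and the function names are hypothetical helpers, not part of perf.

```python
def instructions_per_cycle(instructions: int, cycles: int) -> float:
    """Average number of instructions executed per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def cache_miss_ratio(misses: int, references: int) -> float:
    """Fraction of cache references that ended in a miss."""
    if references <= 0:
        raise ValueError("references must be positive")
    return misses / references


# Hypothetical counter values, keyed by the event names perf uses.
counters = {
    "cycles": 4_000_000_000,
    "instructions": 6_000_000_000,
    "cache-references": 200_000_000,
    "cache-misses": 30_000_000,
}

ipc = instructions_per_cycle(counters["instructions"], counters["cycles"])
miss_ratio = cache_miss_ratio(counters["cache-misses"], counters["cache-references"])
print(f"IPC: {ipc:.2f}")
print(f"cache miss ratio: {miss_ratio:.1%}")
```

A low IPC combined with a high miss ratio usually suggests that the code is waiting on memory rather than doing useful work, which is the kind of hint that justifies a deeper look with perf record.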

CPU cycle and latency analysis

By combining CPU samples with blocking events, I/O, or waits in syscalls, perf lets you study where cycles are consumed and which code segments add the most latency. This is especially useful in services with strict SLAs, where tail latency matters more than the average.
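
To see why the tail matters more than the average, the following sketch compares the mean with the 50th and 99th percentiles of a set of request latencies. The samples are hypothetical, and the nearest-rank percentile used here is only one of several common definitions.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds: 97 fast requests and 3 slow ones.
latencies_ms = [10] * 97 + [500] * 3

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
print(f"mean={mean_ms:.1f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

The mean looks healthy while the 99th percentile exposes the slow requests, which are exactly the ones that break an SLA.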


Common use cases with perf

  • Application optimization: locate hotspots within a binary, review call stacks, and rewrite or adjust specific sections of the code.
  • Diagnosing overloaded systems: understand which processes are driving CPU load or where thread contention is occurring.
  • Hardware performance analysis: evaluate how CPU, memory, and caches behave under different synthetic or real loads, comparing servers or processor generations.
  • Latency troubleshooting: study disk I/O wait peaks, mutex blocking, or network delays and determine whether the bottleneck is in the application or the system.

Installation and basic configuration of perf

In most Linux distributions, perf is installed from the official repositories as part of the package linux-tools or similar. It is important that the perf version matches the kernel version that you are using to avoid incompatibilities in the exposed events.

After installation, it is usually enough to make sure that the kernel parameters controlling performance counters (such as kernel.perf_event_paranoid) allow perf to access the necessary events, especially if you are going to profile the entire system.
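
A minimal sketch for checking that setting is shown below. It reads /proc/sys/kernel/perf_event_paranoid and prints a short interpretation for unprivileged users, based on the levels described in the kernel sysctl documentation. The helper names are hypothetical, and some distributions add stricter levels above 2 that this sketch simply groups with level 2.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def describe_paranoid(level: int) -> str:
    """Summarize what unprivileged users may do at a given paranoid level."""
    if level <= -1:
        return "almost all events allowed for all users"
    if level == 0:
        return "CPU events and kernel profiling allowed; raw tracepoints restricted"
    if level == 1:
        return "kernel profiling allowed; CPU event access restricted"
    # Level 2 and any stricter distribution-specific values.
    return "kernel profiling restricted; at most user-space measurements"


def read_paranoid(path: Path = PARANOID_PATH):
    """Return the current level, or None when the file is missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


level = read_paranoid()
if level is None:
    print("perf_event_paranoid is not available on this system")
else:
    print(f"perf_event_paranoid={level}: {describe_paranoid(level)}")
```

Lowering the value (for example with sysctl) widens what unprivileged users can observe, so it is a security trade-off rather than a purely technical switch.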

Practical examples with perf commands

A typical use case involves profiling a specific binary during its execution with perf record -g -- ./my_program and then analyzing the result with perf report, navigating through the most expensive functions and their call tree.
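
The record-then-report flow can be scripted. The sketch below only builds the two command lines and runs them when both perf and the target binary are present; ./my_program is the placeholder name used above, and the perf.data file name and helper functions are assumptions made for this example.

```python
import shlex
import shutil
import subprocess


def perf_record_cmd(target, output="perf.data"):
    """Build a `perf record` command line that samples call graphs for `target`."""
    return ["perf", "record", "-g", "-o", output, "--", *target]


def perf_report_cmd(data="perf.data"):
    """Build a `perf report` command line that prints a plain-text report."""
    return ["perf", "report", "-i", data, "--stdio"]


record = perf_record_cmd(["./my_program"])
report = perf_report_cmd()
print(shlex.join(record))
print(shlex.join(report))

# Run the commands only when perf and the placeholder binary actually exist.
if shutil.which("perf") and shutil.which("./my_program"):
    subprocess.run(record, check=True)
    subprocess.run(report, check=True)
```

The --stdio option is used here because a script has no terminal to drive the interactive browser; when working by hand, plain perf report is usually more comfortable.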

perf top can also be used to see in real time which functions consume the most CPU in the system, something very useful in production incidents when you don't have time to run a more sophisticated analysis.

eBPF: the future of monitoring and analysis

Extended Berkeley Packet Filter, or eBPF, has gone from being a curiosity associated with packet filtering to becoming the basis of many next-generation observability, security, and networking platforms.

Its key lies in allowing small, safe programs to run within the Linux kernel in response to events, with prior verification and access to internal structures, without the need to patch the kernel or load complex modules.

What is eBPF and how does it work in general?

  • It captures very detailed metrics from the system in real time: syscall latencies, TCP connection lifetimes, user function execution time, etc.
  • It hooks into kernel events (kprobes, tracepoints, sockets, perf events…) without modifying the kernel's own source code.
  • It is designed to have minimal impact on performance, running in a virtual machine within the kernel at a very low cost.

Why is eBPF so revolutionary?

  • Its very low overhead makes it ideal for production: by working within the kernel, it avoids many context switches between user space and kernel, minimizing unnecessary data traffic.
  • It offers enormous flexibility: you can write custom programs to monitor exactly what interests you, from network statistics per Kubernetes pod to latency per function in a specific microservice.
  • Safety is enforced by a verifier that checks each program before allowing its execution, rejecting unbounded loops, out-of-range accesses, or actions dangerous to the stability of the system.
  • It has an enormous range of applications: performance monitoring, deep network observability, security hardening, access control, advanced packet filtering, application profiling, etc.

How eBPF programs are executed

Normally the developer writes the logic in C or another high-level language. It is compiled to eBPF bytecode using LLVM/Clang and loaded into the kernel via the bpf() syscall or tools such as bpftool, BCC or bpftrace.

Once loaded, the program is associated with a concrete hook: a kernel function, a tracepoint, a socket, a perf event, or a user function call (uprobe), and is triggered each time that event occurs, reading parameters, updating maps, and, if appropriate, returning data to user space.

Advanced features of eBPF

Creation of customized programs for detailed analysis

With eBPF you can build custom tools that measure, for example, the exact latency of each query in a database, or that track calls to a specific library of a service, without touching a single line of that service.

Real-time monitoring with no noticeable impact

By running in the kernel and filtering data there, eBPF allows continuous monitoring with very low impact, ideal for critical systems where you cannot afford heavy agents or extremely verbose traces.

Most relevant use cases with eBPF

  • Application performance monitoring: locate which functions in your code are hogging resources, even if they are inside third-party libraries or in binaries without symbols.
  • Fine-grained network analysis: view the complete lifetime of TCP connections and packet statistics per service, and detect losses or retransmissions without resorting to massive packet captures.
  • Security and auditing: track suspicious system calls, unusual binary executions, or behaviors that fit intrusion patterns, generating real-time alerts.
  • Optimization of distributed systems: measure latency between microservices, internal queues, serialization and deserialization times, or the impact of load balancing on the application.

eBPF in detail: BPF, evolution and ecosystem

To understand eBPF, it helps to first understand classic BPF, originally designed as an in-kernel packet filter for tools such as tcpdump, Snort or Suricata, with the idea of avoiding copying unwanted packets to user space.

BPF implements a small virtual machine in the kernel where programs are loaded that are triggered by events, such as receiving a packet. Only packets that pass the filter reach the socket in user space, saving a considerable amount of CPU and memory.

Modern eBPF is so versatile that the eBPF Foundation and the Linux Foundation are pushing for its adoption in other operating systems, so that it is not limited to Linux and similar tracing and monitoring capabilities can be enjoyed on other platforms.

BPF programs and standard types

BPF programs are usually written in C, compiled to .o objects, and loaded into the kernel using the bpf() syscall. From there they are attached to different types of events depending on the intended use.

  • kprobe: allows you to monitor the execution of kernel functions and modules without needing to patch them, ideal for seeing what the system does internally.
  • uprobe: hooks into user space functions, for example within sshd or a database, to observe arguments or execution times without recompiling anything.
  • tracepoints: predefined instrumentation points in the kernel, very stable between versions, that expose high-level events such as process creation, disk I/O, or scheduler events.
  • perf_event: programs that are activated based on performance counters, thus combining the perf world with the flexible logic of eBPF.

BPF Maps and Links

BPF maps are key-value data structures in the kernel where programs store and share information: counters, histograms, complex structures, etc. User-space processes can read and write to these maps through the bpf() syscall or libraries like BCC.
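
To make the idea of a map concrete, here is a purely user-space model in Python. It is not eBPF code and does not talk to the kernel: ordinary counters stand in for two maps, one keyed by command name and one keyed by a power-of-two latency bucket, a grouping commonly used for latency histograms. The events and helper names are hypothetical.

```python
from collections import Counter


def log2_bucket(value: int) -> int:
    """Return the lower bound of the power-of-two bucket that contains value."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return 0
    return 1 << (value.bit_length() - 1)


# Stand-ins for maps: one keyed by command name, one keyed by latency bucket.
calls_by_comm = Counter()
latency_hist_us = Counter()

# Hypothetical events: (command name, latency in microseconds).
events = [("nginx", 3), ("nginx", 70), ("postgres", 130), ("nginx", 5), ("postgres", 100)]

for comm, latency_us in events:
    calls_by_comm[comm] += 1
    latency_hist_us[log2_bucket(latency_us)] += 1

for lower in sorted(latency_hist_us):
    upper = lower * 2 if lower else 1
    print(f"[{lower}, {upper}) us: {latency_hist_us[lower]}")
```

The point of the model is the shape of the data: the program that sees every event keeps only small aggregates, and the reader in user space collects a handful of counters instead of one record per event.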

Links connect specific BPF programs to instrumentation points; that is, they define which program runs on which event. This separation of programs, maps, and links provides a lot of flexibility and allows components to be reused.

The eBPF ecosystem: BCC, bpftrace and other utilities

The power of eBPF would be hard to exploit without a mature ecosystem of tools, libraries, and projects that make everyday work easier. Among them are BCC, bpftrace, libbpf, goebpf, and larger projects such as Cilium, Falco, or Katran.

Thanks to these utilities, even teams that are not kernel experts can take advantage of eBPF's advanced capabilities for observability, security, and networking in a relatively simple way.

BCC Installation and Configuration

BCC (BPF Compiler Collection) is a set of tools and libraries that make it easier to write and run eBPF programs, primarily using Python and C. It already includes dozens of ready-to-use scripts that provide statistics on network, CPU, disk, processes, etc.

In distributions like Ubuntu, the typical installation involves updating repositories and installing bpfcc-tools, the kernel headers, and the Python bindings, then verifying the installation by running a script such as execsnoop to watch new processes in real time.

Practical examples with BCC

With BCC you have an arsenal of pre-built tools that leverage eBPF to give you detailed metrics without complicated setup on your part.

  • tcplife: shows the life of TCP connections, with duration, bytes sent and received, ports, etc., very useful for understanding the behavior of network applications.
  • execsnoop: shows in real time the commands that are being executed on the system, including PID and arguments, perfect for auditing or for seeing what is happening on a host under suspicion.
  • biolatency: builds a histogram of disk I/O latency, helping to detect storage bottlenecks that you would otherwise only see as "high I/O wait".

Introduction to bpftrace

bpftrace is a high-level tool for writing eBPF scripts with a syntax similar to AWK or DTrace. It is ideal for quick experiments, ad-hoc analyses, and production checks without building a dedicated program.

A very simple example is hooking into the entry tracepoint of the clone syscall and counting calls per process with a one-liner, storing the results in a map indexed by the command name and displaying the counts upon exit.
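
The one-liner described above can be sketched as follows. The bpftrace program is kept as a string and wrapped in a small Python launcher so that it stays in the same language as the other examples; the tracepoint name assumes a kernel that exposes tracepoint:syscalls:sys_enter_clone, and the RUN switch and helper name are hypothetical.

```python
import shlex
import shutil
import subprocess

# bpftrace program: on each entry to the clone syscall, increment a counter
# keyed by the command name. bpftrace prints its maps when it exits (Ctrl-C).
PROGRAM = "tracepoint:syscalls:sys_enter_clone { @[comm] = count(); }"


def bpftrace_cmd(program: str):
    """Build the argv for running a bpftrace one-liner."""
    return ["bpftrace", "-e", program]


cmd = bpftrace_cmd(PROGRAM)
print(shlex.join(cmd))

# bpftrace needs root privileges, so this sketch only runs it on request.
RUN = False  # hypothetical switch; set to True on a test machine
if RUN and shutil.which("bpftrace"):
    subprocess.run(cmd, check=False)
```

The printed line is the command you would type in a root shell. The @ map is an anonymous bpftrace map, [comm] is its key, and count() is the aggregation applied in the kernel.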

Other tools based on eBPF

In addition to BCC and bpftrace, there are more general projects that use eBPF under the hood to offer complete solutions in different areas.

  • libbpf: C library for working with eBPF programs efficiently and close to the kernel, used as a basis in many modern projects.
  • goebpf: a library that brings eBPF to the Go ecosystem, very popular for writing observability agents or network components in cloud native environments.
  • Cilium: a networking and security platform for Kubernetes based on eBPF, which implements API-level policies and offers deep observability of traffic between pods.
  • Falco: a runtime security and intrusion detection tool that uses eBPF to monitor system calls and suspicious behavior, generating alerts when something deviates from the norm.
  • Katran: a high-performance load balancing library developed by Meta, which uses eBPF to efficiently and flexibly distribute traffic in large-scale environments.

Detailed comparison between perf and eBPF

Although perf and eBPF are often used for similar purposes, their focus and capabilities are very different, and understanding them helps you choose the right tool in each case or combine them meaningfully.

Focus and main purpose

Perf is mainly aimed at CPU profiling and hardware events already defined by the processor and kernel. It's fast to use, comes integrated, and works perfectly for many classic optimization scenarios.

eBPF, on the other hand, focuses on offering a generic framework to execute custom logic in the kernel, allowing you to observe and modify very specific flows, from network packets to user function calls, with almost unlimited granularity.

Flexibility and customization

  • perf: it has a very useful set of events for analyzing CPU, memory, I/O, and software, but its customization capabilities are limited to choosing which events to track and how to group them. It's ideal when you want something ready to use without too many complications.
  • eBPF: it allows you to build your own programs to collect exactly the information you're looking for, filtering, aggregating, and processing data in the kernel. It requires more knowledge, but its adaptability is far superior.

Impact on performance

  • perf: it can add a moderate load when many events are enabled or samples are taken very frequently, especially in highly saturated systems or during prolonged tracing.
  • eBPF: it is designed to minimize impact; programs are verified and executed in a highly optimized environment within the kernel, filtering data as early as possible to avoid flooding user space, which makes it suitable for monitoring even in production.

Compatibility and requirements

  • perf: it is available in virtually all Linux distributions and works with relatively old kernels. It does not require any extra components for basic use.
  • eBPF: it depends on a modern kernel (starting from version 4.9 for many of its advanced capabilities) and usually relies on additional tools such as BCC or bpftrace to make it usable without developing everything at a low level.
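
As a quick pre-flight check for the kernel requirement, the sketch below compares the running kernel release with the 4.9 baseline mentioned above. It is only a heuristic: distributions backport features, individual eBPF capabilities arrived in different releases, and the helper names are hypothetical.

```python
import platform
import re


def parse_kernel_release(release: str):
    """Extract (major, minor) from a release string such as '6.8.0-45-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum=(4, 9)) -> bool:
    """True when the release is at or above the minimum (major, minor) version."""
    return parse_kernel_release(release) >= minimum


release = platform.release()
try:
    print(f"{release}: meets the 4.9 baseline = {kernel_at_least(release)}")
except ValueError as exc:
    # Non-Linux platforms may report a release string in another format.
    print(exc)
```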

Integration of perf and eBPF for advanced analytics

The most interesting thing is not choosing between one or the other, but learning how to combine them. perf remains fantastic for a quick CPU-level overview, while eBPF is used to dig deeper where it hurts or to monitor elements that perf doesn't even see.

Benefits of combining both tools

  • Full system coverage: perf relies on hardware counters, while eBPF can cover everything from syscalls to application logic, thus obtaining a very complete picture of what is happening on your server.
  • Balance between speed and depth: with perf you can quickly detect that a certain process or function is consuming too much CPU, and then use an eBPF or bpftrace script to break down the problem and understand the context in detail.
  • Controlled impact: perf can be used in short time windows for specific analyses, while eBPF is configured to collect only the essential metrics over a longer period, reducing overall overhead.
  • Integration into DevOps workflows: both tools can be incorporated into CI/CD pipelines, scheduled diagnostic tasks, or observability dashboards, generating reports that can be used to continuously optimize resources.

Challenges and best practices

Most common challenges

  • The learning curve of eBPF is steep: you have to understand its programming model, the kernel verifier, and the implications of attaching to certain hooks.
  • An aggressive perf configuration or poorly designed eBPF scripts can add unnecessary load, something especially delicate in critical systems with little resource slack.
  • On servers with very old kernels or outdated distributions, eBPF may be limited or simply unavailable, forcing the use of more traditional techniques.

Key good practices

  • Always test new eBPF scripts in development or pre-production environments before deploying them in production, verifying stability and resource consumption.
  • Consciously restrict the amount of data collected; the more filters and aggregations you apply in the kernel, the less noise and less impact you will have.
  • Automate recurring analyses by combining perf and eBPF in scripts or playbooks that run standard diagnostics and generate comparable reports.