Jason Xing 77af22f93d net: xsk: introduce XDP_MAX_TX_SKB_BUDGET setsockopt
This patch provides a setsockopt method to let applications leverage to
adjust how many descs to be handled at most in one send syscall. It
mitigates the situation where the default value (32) that is too small
leads to higher frequency of triggering send syscall.

Considering the prosperity/complexity the applications have, there is no
absolutely ideal suggestion fitting all cases. So keep 32 as its default
value like before.

The patch does the following things:
- Add XDP_MAX_TX_SKB_BUDGET socket option.
- Set max_tx_budget to 32 by default in the initialization phase as a
  per-socket granular control.
- Set the range of max_tx_budget as [32, xs->tx->nentries].

The idea behind this comes out of real workloads in production. We use a
user-level stack with xsk support to accelerate sending packets and
minimize triggering syscalls. When the packets are aggregated, it's not
hard to hit the upper bound (namely, 32). The moment user-space stack
fetches the -EAGAIN error number passed from sendto(), it will loop to try
again until all the expected descs from tx ring are sent out to the driver.
Enlarging the XDP_MAX_TX_SKB_BUDGET value contributes to less frequency of
sendto() and higher throughput/PPS.

Here is what I did in production, along with some numbers as follows:
For one application I saw lately, I suggested using 128 as max_tx_budget
because I saw two limitations without changing any default configuration:
1) XDP_MAX_TX_SKB_BUDGET, 2) socket sndbuf which is 212992 decided by
net.core.wmem_default. As to XDP_MAX_TX_SKB_BUDGET, the scenario behind
this was I counted how many descs are transmitted to the driver at one
time of sendto() based on [1] patch and then I calculated the
possibility of hitting the upper bound. Finally I chose 128 as a
suitable value because 1) it covers most of the cases, 2) a higher
number would not bring evident results. After twisting the parameters,
a stable improvement of around 4% for both PPS and throughput and less
resources consumption were found to be observed by strace -c -p xxx:
1) %time was decreased by 7.8%
2) error counter was decreased from 18367 to 572

[1]: https://lore.kernel.org/all/20250619093641.70700-1-kerneljasonxing@gmail.com/

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250704160138.48677-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-12 12:51:17 -07:00
2022-08-24 21:51:42 -07:00
2025-07-01 15:38:22 -07:00
2021-02-22 11:35:49 -08:00
2024-01-25 16:47:44 -08:00

libbpf Github Actions Builds & Tests Coverity CodeQL OSS-Fuzz Status Read the Docs

This is the official home of the libbpf library.

Please use this Github repository for building and packaging libbpf and when using it in your projects through Git submodule.

Libbpf authoritative source code is developed as part of bpf-next Linux source tree under tools/lib/bpf subdirectory and is periodically synced to Github. As such, all the libbpf changes should be sent to BPF mailing list, please don't open PRs here unless you are changing Github-specific parts of libbpf (e.g., Github-specific Makefile).

Libbpf and general BPF usage questions

Libbpf documentation can be found here. It's an ongoing effort and has ways to go, but please take a look and consider contributing as well.

Please check out libbpf-bootstrap and the companion blog post for the examples of building BPF applications with libbpf. libbpf-tools are also a good source of the real-world libbpf-based tracing tools.

See also "BPF CO-RE reference guide" for the coverage of practical aspects of building BPF CO-RE applications and "BPF CO-RE" for general introduction into BPF portability issues and BPF CO-RE origins.

All general BPF questions, including kernel functionality, libbpf APIs and their application, should be sent to bpf@vger.kernel.org mailing list. You can subscribe to it here and search its archive here. Please search the archive before asking new questions. It very well might be that this was already addressed or answered before.

bpf@vger.kernel.org is monitored by many more people and they will happily try to help you with whatever issue you have. This repository's PRs and issues should be opened only for dealing with issues pertaining to specific way this libbpf mirror repo is set up and organized.

Building libbpf

libelf is an internal dependency of libbpf and thus it is required to link against and must be installed on the system for applications to work. pkg-config is used by default to find libelf, and the program called can be overridden with PKG_CONFIG.

If using pkg-config at build time is not desired, it can be disabled by setting NO_PKG_CONFIG=1 when calling make.

To build both static libbpf.a and shared libbpf.so:

$ cd src
$ make

To build only static libbpf.a library in directory build/ and install them together with libbpf headers in a staging directory root/:

$ cd src
$ mkdir build root
$ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install

To build both static libbpf.a and shared libbpf.so against a custom libelf dependency installed in /build/root/ and install them together with libbpf headers in a build directory /build/root/:

$ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install

BPF CO-RE (Compile Once Run Everywhere)

Libbpf supports building BPF CO-RE-enabled applications, which, in contrast to BCC, do not require Clang/LLVM runtime being deployed to target servers and doesn't rely on kernel-devel headers being available.

It does rely on kernel to be built with BTF type information, though. Some major Linux distributions come with kernel BTF already built in:

  • Fedora 31+
  • RHEL 8.2+
  • OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)
  • Arch Linux (from kernel 5.7.1.arch1-1)
  • Manjaro (from kernel 5.4 if compiled after 2021-06-18)
  • Ubuntu 20.10
  • Debian 11 (amd64/arm64)

If your kernel doesn't come with BTF built-in, you'll need to build custom kernel. You'll need:

  • pahole 1.16+ tool (part of dwarves package), which performs DWARF to BTF conversion;
  • kernel built with CONFIG_DEBUG_INFO_BTF=y option;
  • you can check if your kernel has BTF built-in by looking for /sys/kernel/btf/vmlinux file:
$ ls -la /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 3541561 Jun  2 18:16 /sys/kernel/btf/vmlinux

To develop and build BPF programs, you'll need Clang/LLVM 10+. The following distributions have Clang/LLVM 10+ packaged by default:

  • Fedora 32+
  • Ubuntu 20.04+
  • Arch Linux
  • Ubuntu 20.10 (LLVM 11)
  • Debian 11 (LLVM 11)
  • Alpine 3.13+

Otherwise, please make sure to update it on your system.

The following resources are useful to understand what BPF CO-RE is and how to use it:

Distributions

Distributions packaging libbpf from this mirror:

Benefits of packaging from the mirror over packaging from kernel sources:

  • Consistent versioning across distributions.
  • No ties to any specific kernel, transparent handling of older kernels. Libbpf is designed to be kernel-agnostic and work across multitude of kernel versions. It has built-in mechanisms to gracefully handle older kernels, that are missing some of the features, by working around or gracefully degrading functionality. Thus libbpf is not tied to a specific kernel version and can/should be packaged and versioned independently.
  • Continuous integration testing via GitHub Actions.
  • Static code analysis via LGTM and Coverity.

Package dependencies of libbpf, package names may vary across distros:

  • zlib
  • libelf

libbpf distro packaging status

bpf-next to Github sync

All the gory details of syncing can be found in scripts/sync-kernel.sh script. See SYNC.md for instruction.

Some header files in this repo (include/linux/*.h) are reduced versions of their counterpart files at bpf-next's tools/include/linux/*.h to make compilation successful.

License

This work is dual-licensed under BSD 2-clause license and GNU LGPL v2.1 license. You can choose between one of them if you use this work.

SPDX-License-Identifier: BSD-2-Clause OR LGPL-2.1

Description
Automated upstream mirror for libbpf stand-alone build.
Readme 13 MiB
Languages
C 98.3%
Shell 1.4%
Makefile 0.3%