Compare commits

..

220 Commits

Author SHA1 Message Date
thiagoftsm
057f85d000 Merge branch 'libbpf:master' into master
Some checks failed
libbpf-build / Debian Build (${{ matrix.name }}) (ASan+UBSan, RUN_ASAN) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (clang ASan+UBSan, RUN_CLANG_ASAN) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (clang, RUN_CLANG) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (clang-14, RUN_CLANG14) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (clang-15, RUN_CLANG15) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (clang-16, RUN_CLANG16) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (default, RUN) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (gcc-10 ASan+UBSan, RUN_GCC10_ASAN) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (gcc-10, RUN_GCC10) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (gcc-11, RUN_GCC11) (push) Has been cancelled
libbpf-build / Debian Build (${{ matrix.name }}) (gcc-12, RUN_GCC12) (push) Has been cancelled
libbpf-ci / Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests (s390x, LATEST, [s390x docker-noble-main]) (push) Has been cancelled
libbpf-ci / Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests (x86_64, 4.9.0, ubuntu-24.04) (push) Has been cancelled
libbpf-ci / Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests (x86_64, 5.5.0, ubuntu-24.04) (push) Has been cancelled
libbpf-ci / Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests (x86_64, LATEST, ubuntu-24.04) (push) Has been cancelled
libbpf-build / Ubuntu Build (${{ matrix.arch }}) (aarch64) (push) Has been cancelled
libbpf-build / Ubuntu Build (${{ matrix.arch }}) (ppc64le) (push) Has been cancelled
libbpf-build / Ubuntu Build (${{ matrix.arch }}) (s390x) (push) Has been cancelled
libbpf-build / Ubuntu Build (${{ matrix.arch }}) (x86) (push) Has been cancelled
CIFuzz / Fuzzing (address) (push) Has been cancelled
CIFuzz / Fuzzing (memory) (push) Has been cancelled
CIFuzz / Fuzzing (undefined) (push) Has been cancelled
CodeQL / Analyze (cpp) (push) Has been cancelled
CodeQL / Analyze (python) (push) Has been cancelled
lint / ShellCheck (push) Has been cancelled
pahole-staging / Kernel LATEST + staging pahole (push) Has been cancelled
libbpf-ci-coverity / Coverity (push) Has been cancelled
2024-09-04 01:21:11 +00:00
Andrii Nakryiko
caa17bdcbf ci: regenerate vmlinux.h
Regenerated latest vmlinux.h for old kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-30 16:29:01 -07:00
Andrii Nakryiko
76c9f50f3e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ec5b8c76ab1c6d163762d60cfbedcd27e7527144
Checkpoint bpf-next commit: 2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b
Baseline bpf commit:        e1533b6319ab9c3a97dad314dd88b3783bc41b69
Checkpoint bpf commit:      b408473ea01b2e499d23503e2bf898416da9d7ac

Alan Maguire (1):
  libbpf: Fix license for btf_relocate.c

Andrii Nakryiko (2):
  libbpf: Fix no-args func prototype BTF dumping syntax
  libbpf: Fix bpf_object__open_skeleton()'s mishandling of options

David Vernet (1):
  libbpf: Don't take direct pointers into BTF data from st_ops

Jordan Rome (1):
  bpf: Add bpf_copy_from_user_str kfunc

Kan Liang (1):
  perf/x86/intel: Support new data source for Lunar Lake

Sam James (1):
  libbpf: Workaround -Wmaybe-uninitialized false positive

Stanislav Fomichev (1):
  selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test

Tony Ambardar (1):
  libbpf: Ensure new BTF objects inherit input endianness

 include/uapi/linux/bpf.h        |  9 ++++
 include/uapi/linux/if_xdp.h     |  4 ++
 include/uapi/linux/perf_event.h |  6 ++-
 src/btf.c                       |  4 ++
 src/btf_dump.c                  |  8 ++--
 src/btf_relocate.c              |  2 +-
 src/elf.c                       |  3 ++
 src/libbpf.c                    | 75 ++++++++++++++-------------------
 8 files changed, 62 insertions(+), 49 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-30 16:29:01 -07:00
Tony Ambardar
fe28fae57a libbpf: Ensure new BTF objects inherit input endianness
New split BTF needs to preserve base's endianness. Similarly, when
creating a distilled BTF, we need to preserve original endianness.

Fix by updating libbpf's btf__distill_base() and btf_new_empty() to retain
the byte order of any source BTF objects when creating new ones.

Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support")
Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF")
Reported-by: Song Liu <song@kernel.org>
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/6358db36c5f68b07873a0a5be2d062b1af5ea5f8.camel@gmail.com/
Link: https://lore.kernel.org/bpf/20240830095150.278881-1-tony.ambardar@gmail.com
2024-08-30 16:29:01 -07:00
Andrii Nakryiko
f6f24022d3 libbpf: Fix bpf_object__open_skeleton()'s mishandling of options
We do an ugly copying of options in bpf_object__open_skeleton() just to
be able to set object name from skeleton's recorded name (while still
allowing user to override it through opts->object_name).

This is not just ugly, but it also is broken due to memcpy() that
doesn't take into account potential skel_opts' and user-provided opts'
sizes differences due to backward and forward compatibility. This leads
to copying over extra bytes and then failing to validate options
properly. It could, technically, lead also to SIGSEGV, if we are unlucky.

So just get rid of that memory copy completely and instead pass
default object name into bpf_object_open() directly, simplifying all
this significantly. The rule now is that obj_name should be non-NULL for
bpf_object_open() when called with in-memory buffer, so validate that
explicitly as well.

We adopt bpf_object__open_mem() to this as well and generate default
name (based on buffer memory address and size) outside of bpf_object_open().

Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support")
Reported-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Daniel Müller <deso@posteo.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240827203721.1145494-1-andrii@kernel.org
2024-08-30 16:29:01 -07:00
Jordan Rome
4bd31a1044 bpf: Add bpf_copy_from_user_str kfunc
This adds a kfunc wrapper around strncpy_from_user,
which can be called from sleepable BPF programs.

This matches the non-sleepable 'bpf_probe_read_user_str'
helper except it includes an additional 'flags'
param, which allows consumers to clear the entire
destination buffer on success or failure.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
Link: https://lore.kernel.org/r/20240823195101.3621028-1-linux@jordanrome.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-30 16:29:01 -07:00
Sam James
33b22671c2 libbpf: Workaround -Wmaybe-uninitialized false positive
In `elf_close`, we get this with GCC 15 -O3 (at least):
```
In function ‘elf_close’,
    inlined from ‘elf_close’ at elf.c:53:6,
    inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2:
elf.c:57:9: warning: ‘elf_fd.elf’ may be used uninitialized [-Wmaybe-uninitialized]
   57 |         elf_end(elf_fd->elf);
      |         ^~~~~~~~~~~~~~~~~~~~
elf.c: In function ‘elf_find_func_offset_from_file’:
elf.c:377:23: note: ‘elf_fd.elf’ was declared here
  377 |         struct elf_fd elf_fd;
      |                       ^~~~~~
In function ‘elf_close’,
    inlined from ‘elf_close’ at elf.c:53:6,
    inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2:
elf.c:58:9: warning: ‘elf_fd.fd’ may be used uninitialized [-Wmaybe-uninitialized]
   58 |         close(elf_fd->fd);
      |         ^~~~~~~~~~~~~~~~~
elf.c: In function ‘elf_find_func_offset_from_file’:
elf.c:377:23: note: ‘elf_fd.fd’ was declared here
  377 |         struct elf_fd elf_fd;
      |                       ^~~~~~
```

In reality, our use is fine, it's just that GCC doesn't model errno
here (see linked GCC bug). Suppress -Wmaybe-uninitialized accordingly
by initializing elf_fd.fd to -1 and elf_fd.elf to NULL.

I've done this in two other functions as well given it could easily
occur there too (same access/use pattern).

Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://gcc.gnu.org/PR114952
Link: https://lore.kernel.org/bpf/14ec488a1cac02794c2fa2b83ae0cef1bce2cb36.1723578546.git.sam@gentoo.org
2024-08-30 16:29:01 -07:00
Alan Maguire
8b29484790 libbpf: Fix license for btf_relocate.c
License should be

// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

...as with other libbpf files.

Fixes: 19e00c897d50 ("libbpf: Split BTF relocation")
Reported-by: Neill Kapron <nkapron@google.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240810093504.2111134-1-alan.maguire@oracle.com
2024-08-30 16:29:01 -07:00
David Vernet
7b5237996a libbpf: Don't take direct pointers into BTF data from st_ops
In struct bpf_struct_ops, we have take a pointer to a BTF type name, and
a struct btf_type. This was presumably done for convenience, but can
actually result in subtle and confusing bugs given that BTF data can be
invalidated before a program is loaded. For example, in sched_ext, we
may sometimes resize a data section after a skeleton has been opened,
but before the struct_ops scheduler map has been loaded. This may cause
the BTF data to be realloc'd, which can then cause a UAF when loading
the program because the struct_ops map has pointers directly into the
BTF data.

We're already storing the BTF type_id in struct bpf_struct_ops. Because
type_id is stable, we can therefore just update the places where we were
looking at those pointers to instead do the lookups we need from the
type_id.

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240724171459.281234-1-void@manifault.com
2024-08-30 16:29:01 -07:00
Stanislav Fomichev
a89e519b40 selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
This flag is now required to use tx_metadata_len.

Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata")
Reported-by: Julian Schindel <mail@arctic-alpaca.de>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me
2024-08-30 16:29:01 -07:00
Andrii Nakryiko
205e86de8b libbpf: Fix no-args func prototype BTF dumping syntax
For all these years libbpf's BTF dumper has been emitting not strictly
valid syntax for function prototypes that have no input arguments.

Instead of `int (*blah)()` we should emit `int (*blah)(void)`.

This is not normally a problem, but it manifests when we get kfuncs in
vmlinux.h that have no input arguments. Due to compiler internal
specifics, we get no BTF information for such kfuncs, if they are not
declared with proper `(void)`.

The fix is trivial. We also need to adjust a few ancient tests that
happily assumed `()` is correct.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org
2024-08-30 16:29:01 -07:00
Kan Liang
86fc78bd2b perf/x86/intel: Support new data source for Lunar Lake
A new PEBS data source format is introduced for the p-core of Lunar
Lake. The data source field is extended to 8 bits with new encodings.

A new layout is introduced into the union intel_x86_pebs_dse.
Introduce the lnl_latency_data() to parse the new format.
Enlarge the pebs_data_source[] accordingly to include new encodings.

Only the mem load and the mem store events can generate the data source.
Introduce INTEL_HYBRID_LDLAT_CONSTRAINT and
INTEL_HYBRID_STLAT_CONSTRAINT to mark them.

Add two new bits for the new cache-related data src, L2_MHB and MSC.
The L2_MHB is short for L2 Miss Handling Buffer, which is similar to
LFB (Line Fill Buffer), but to track the L2 Cache misses.
The MSC stands for the memory-side cache.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lkml.kernel.org/r/20240626143545.480761-6-kan.liang@linux.intel.com
2024-08-30 16:29:01 -07:00
Andrii Nakryiko
20ccbb303a ci: take into account common local DENYLIST/ALLOWLIST
Similar to naming convention in BPF selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-30 09:14:00 -07:00
chantra
26443a6d43 ci: fix test job names
* use the architecture name in job name instead of `runs_on` labels

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-29 10:58:52 +01:00
Andrii Nakryiko
22ec3eb15d ci: deny verify_pkcs7_sig as it keeps failing
This has nothing to do with libbpf and is probably failing due to
environment setup.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-08-27 12:51:55 -07:00
Manu Bretelle
bc24cd126a ci: run test on Ubuntu 24.04
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-22 12:59:18 -07:00
Manu Bretelle
92316f5072 ci: Pass llvm-version as an input and enforce passing it to build-selftests action
Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-21 16:04:36 -07:00
Manu Bretelle
a73c6f7f80 ci: Use llvm repositories matching the host we are running on
As this will change to a Ubuntu 24.04 runner, we want this to automatically detect
which ubuntu version it is running on.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-21 16:04:36 -07:00
Manu Bretelle
8e47e755cd ci: bump default llvm version to 17
Ubuntu 24.04's minimum llvm version is 17. Bumping this now to limit changes later.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-21 16:04:36 -07:00
Manu Bretelle
ec0d0fda8b ci: lock down s390x CI to Ubuntu 20.04 runners
I am working on upgrading to 24.04 runners. In order to make sure that current jobs are scheduled
on Ubuntu 20.04, we need to ask for runners with tag `docker-main`, which is currently
set by those old runners.
Later, we will be able to switch this tag to `docker-noble-main` which are Ubuntu 24.04 runners.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-08-21 16:04:36 -07:00
Ivan Shapovalov
b07dfe3b2a Makefile: ensure $(OBJDIR) is created before writing to it
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2024-07-29 14:05:05 -07:00
thiagoftsm
6923eb970e Merge branch 'libbpf:master' into master 2024-07-12 00:47:44 +00:00
Andrii Nakryiko
686f600bca sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a12978712d9001b060bcc10eaae42ad5102abe2b
Checkpoint bpf-next commit: ec5b8c76ab1c6d163762d60cfbedcd27e7527144
Baseline bpf commit:        b1c4b4d45263241ec6c2405a8df8265d4b58e707
Checkpoint bpf commit:      e1533b6319ab9c3a97dad314dd88b3783bc41b69

Alan Maguire (1):
  libbpf: Fix error handling in btf__distill_base()

Andreas Ziegler (1):
  libbpf: Add NULL checks to bpf_object__{prev_map,next_map}

Andrii Nakryiko (2):
  libbpf: fix BPF skeleton forward/backward compat handling
  libbpf: improve old BPF skeleton handling for map auto-attach

 src/btf.c    |  2 +-
 src/libbpf.c | 75 +++++++++++++++++++++++++++++-----------------------
 2 files changed, 43 insertions(+), 34 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-10 14:22:00 -07:00
Andrii Nakryiko
726d7f3722 sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-10 14:22:00 -07:00
Andrii Nakryiko
e6f1ae2557 libbpf: improve old BPF skeleton handling for map auto-attach
Improve how we handle old BPF skeletons when it comes to BPF map
auto-attachment. Emit one warn-level message per each struct_ops map
that could have been auto-attached, if user provided recent enough BPF
skeleton version. Don't spam log if there are no relevant struct_ops
maps, though.

This should help users realize that they probably need to regenerate BPF
skeleton header with more recent bpftool/libbpf-cargo (or whatever other
means of BPF skeleton generation).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240708204540.4188946-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-10 14:22:00 -07:00
Andrii Nakryiko
bf7ddbef99 libbpf: fix BPF skeleton forward/backward compat handling
BPF skeleton was designed from day one to be extensible. Generated BPF
skeleton code specifies actual sizes of map/prog/variable skeletons for
that reason and libbpf is supposed to work with newer/older versions
correctly.

Unfortunately, it was missed that we implicitly embed hard-coded most
up-to-date (according to libbpf's version of libbpf.h header used to
compile BPF skeleton header) sizes of those structs, which can differ
from the actual sizes at runtime when libbpf is used as a shared
library.

We have a few places were we just index array of maps/progs/vars, which
implicitly uses these potentially invalid sizes of structs.

This patch aims to fix this problem going forward. Once this lands,
we'll backport these changes in Github repo to create patched releases
for older libbpfs.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support")
Fixes: 430025e5dca5 ("libbpf: Add subskeleton scaffolding")
Fixes: 08ac454e258e ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton")
Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240708204540.4188946-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-10 14:22:00 -07:00
Andreas Ziegler
1867490d8f libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
In the current state, an erroneous call to
bpf_object__find_map_by_name(NULL, ...) leads to a segmentation
fault through the following call chain:

  bpf_object__find_map_by_name(obj = NULL, ...)
  -> bpf_object__for_each_map(pos, obj = NULL)
  -> bpf_object__next_map((obj = NULL), NULL)
  -> return (obj = NULL)->maps

While calling bpf_object__find_map_by_name with obj = NULL is
obviously incorrect, this should not lead to a segmentation
fault but rather be handled gracefully.

As __bpf_map__iter already handles this situation correctly, we
can delegate the check for the regular case there and only add
a check in case the prev or next parameter is NULL.

Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
2024-07-10 14:22:00 -07:00
Alan Maguire
24aca0740b libbpf: Fix error handling in btf__distill_base()
Coverity points out that after calling btf__new_empty_split() the wrong
value is checked for error.

Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240629100058.2866763-1-alan.maguire@oracle.com
2024-07-10 14:22:00 -07:00
Andrii Nakryiko
c1a6c770c4 libbpf: add btf_iter.o and btf_relocate.o to Makefile
Upstream libbpf got two new .c files, make sure they are built with
Github Makefile as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
223cd2273e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   531876c80004ecff7bfdbd8ba6c6b48835ef5e22
Checkpoint bpf-next commit: a12978712d9001b060bcc10eaae42ad5102abe2b
Baseline bpf commit:        62da3acd28955e7299babebdfcb14243b789e773
Checkpoint bpf commit:      b1c4b4d45263241ec6c2405a8df8265d4b58e707

Alan Maguire (6):
  libbpf: Add btf__distill_base() creating split BTF with distilled base
    BTF
  libbpf: Split BTF relocation
  libbpf: BTF relocation followup fixing naming, loop logic
  libbpf: Split field iter code into its own file kernel
  libbpf,bpf: Share BTF relocate-related code with kernel
  libbpf: Fix clang compilation error in btf_relocate.c

Andrii Nakryiko (4):
  libbpf: Add BTF field iterator
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Remove callback-based type/string BTF field visitor helpers

Antoine Tenart (1):
  libbpf: Skip base btf sanity checks

Donglin Peng (1):
  libbpf: Checking the btf_type kind when fixing variable offsets

Eduard Zingerman (1):
  libbpf: Make btf_parse_elf process .BTF.base transparently

Mykyta Yatsenko (1):
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton

Vadim Fedorenko (1):
  bpf: Add CHECKSUM_COMPLETE to bpf test progs

 include/uapi/linux/bpf.h |   2 +
 src/btf.c                | 696 +++++++++++++++++++++++++++------------
 src/btf.h                |  36 ++
 src/btf_iter.c           | 177 ++++++++++
 src/btf_relocate.c       | 519 +++++++++++++++++++++++++++++
 src/libbpf.c             |  64 +++-
 src/libbpf.h             |  18 +
 src/libbpf.map           |   4 +
 src/libbpf_internal.h    |  29 +-
 src/linker.c             |  69 ++--
 10 files changed, 1378 insertions(+), 236 deletions(-)
 create mode 100644 src/btf_iter.c
 create mode 100644 src/btf_relocate.c

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
dcd076347c sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-27 10:01:42 -07:00
Alan Maguire
e4982342e7 libbpf: Fix clang compilation error in btf_relocate.c
When building with clang for ARCH=i386, the following errors are
observed:

  CC      kernel/bpf/btf_relocate.o
./tools/lib/bpf/btf_relocate.c:206:23: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
  206 |                 info[id].needs_size = true;
      |                                     ^ ~
./tools/lib/bpf/btf_relocate.c:256:25: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
  256 |                         base_info.needs_size = true;
      |                                              ^ ~
2 errors generated.

The problem is we use 1-bit, 31-bit bitfields in a signed int.
Changing to

	bool needs_size: 1;
	unsigned int size:31;

...resolves the error and pahole reports that 4 bytes are used
for the underlying representation:

$ pahole btf_name_info tools/lib/bpf/btf_relocate.o
struct btf_name_info {
	const char  *              name;                 /*     0     8 */
	unsigned int               needs_size:1;         /*     8: 0  4 */
	unsigned int               size:31;              /*     8: 1  4 */
	__u32                      id;                   /*    12     4 */

	/* size: 16, cachelines: 1, members: 4 */
	/* last cacheline: 16 bytes */
};

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240624192903.854261-1-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Antoine Tenart
95c63a08f2 libbpf: Skip base btf sanity checks
When upgrading to libbpf 1.3 we noticed a big performance hit while
loading programs using CORE on non base-BTF symbols. This was tracked
down to the new BTF sanity check logic. The issue is the base BTF
definitions are checked first for the base BTF and then again for every
module BTF.

Loading 5 dummy programs (using libbpf-rs) that are using CORE on a
non-base BTF symbol on my system:
- Before this fix: 3s.
- With this fix: 0.1s.

Fix this by only checking the types starting at the BTF start id. This
should ensure the base BTF is still checked as expected but only once
(btf->start_id == 1 when creating the base BTF), and then only
additional types are checked for each module BTF.

Fixes: 3903802bb99a ("libbpf: Add basic BTF sanity validation")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240624090908.171231-1-atenart@kernel.org
2024-06-27 10:01:42 -07:00
Alan Maguire
27f0169332 libbpf,bpf: Share BTF relocate-related code with kernel
Share relocation implementation with the kernel.  As part of this,
we also need the type/string iteration functions so also share
btf_iter.c file. Relocation code in kernel and userspace is identical
save for the impementation of the reparenting of split BTF to the
relocated base BTF and retrieval of the BTF header from "struct btf";
these small functions need separate user-space and kernel implementations
for the separate "struct btf"s they operate upon.

One other wrinkle on the kernel side is we have to map .BTF.ids in
modules as they were generated with the type ids used at BTF encoding
time. btf_relocate() optionally returns an array mapping from old BTF
ids to relocated ids, so we use that to fix up these references where
needed for kfuncs.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-5-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Alan Maguire
4ffb92e204 libbpf: Split field iter code into its own file kernel
This will allow it to be shared with the kernel.  No functional change.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-4-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Alan Maguire
bc021a8b42 libbpf: BTF relocation followup fixing naming, loop logic
Use less verbose names in BTF relocation code and fix off-by-one error
and typo in btf_relocate.c.  Simplify loop over matching distilled
types, moving from assigning a _next value in loop body to moving
match check conditions into the guard.

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-2-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Donglin Peng
88a0787335 libbpf: Checking the btf_type kind when fixing variable offsets
I encountered an issue when building the test_progs from the repository [1]:

  $ pwd
  /work/Qemu/x86_64/linux-6.10-rc2/tools/testing/selftests/bpf/

  $ make test_progs V=1
  [...]
  ./tools/sbin/bpftool gen object ./ip_check_defrag.bpf.linked2.o ./ip_check_defrag.bpf.linked1.o
  libbpf: failed to find symbol for variable 'bpf_dynptr_slice' in section '.ksyms'
  Error: failed to link './ip_check_defrag.bpf.linked1.o': No such file or directory (2)
  [...]

Upon investigation, I discovered that the btf_types referenced in the '.ksyms'
section had a kind of BTF_KIND_FUNC instead of BTF_KIND_VAR:

  $ bpftool btf dump file ./ip_check_defrag.bpf.linked1.o
  [...]
  [2] DATASEC '.ksyms' size=0 vlen=2
        type_id=16 offset=0 size=0 (FUNC 'bpf_dynptr_from_skb')
        type_id=17 offset=0 size=0 (FUNC 'bpf_dynptr_slice')
  [...]
  [16] FUNC 'bpf_dynptr_from_skb' type_id=82 linkage=extern
  [17] FUNC 'bpf_dynptr_slice' type_id=85 linkage=extern
  [...]

For a detailed analysis, please refer to [2]. We can add a kind checking to
fix the issue.

  [1] https://github.com/eddyz87/bpf/tree/binsort-btf-dedup
  [2] https://lore.kernel.org/all/0c0ef20c-c05e-4db9-bad7-2cbc0d6dfae7@oracle.com/

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240619122355.426405-1-dolinux.peng@gmail.com
2024-06-27 10:01:42 -07:00
Eduard Zingerman
4bc5a64933 libbpf: Make btf_parse_elf process .BTF.base transparently
Update btf_parse_elf() to check if .BTF.base section is present.
The logic is as follows:

  if .BTF.base section exists:
     distilled_base := btf_new(.BTF.base)
  if distilled_base:
     btf := btf_new(.BTF, .base_btf=distilled_base)
     if base_btf:
        btf_relocate(btf, base_btf)
  else:
     btf := btf_new(.BTF)
  return btf

In other words:
- if .BTF.base section exists, load BTF from it and use it as a base
  for .BTF load;
- if base_btf is specified and .BTF.base section exist, relocate newly
  loaded .BTF against base_btf.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240613095014.357981-6-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Alan Maguire
2afe409348 libbpf: Split BTF relocation
Map distilled base BTF type ids referenced in split BTF and their
references to the base BTF passed in, and if the mapping succeeds,
reparent the split BTF to the base BTF.

Relocation is done by first verifying that distilled base BTF
only consists of named INT, FLOAT, ENUM, FWD, STRUCT and
UNION kinds; then we sort these to speed lookups.  Once sorted,
the base BTF is iterated, and for each relevant kind we check
for an equivalent in distilled base BTF.  When found, the
mapping from distilled -> base BTF id and string offset is recorded.
In establishing mappings, we need to ensure we check STRUCT/UNION
size when the STRUCT/UNION is embedded in a split BTF STRUCT/UNION,
and when duplicate names exist for the same STRUCT/UNION.  Otherwise
size is ignored in matching STRUCT/UNIONs.

Once all mappings are established, we can update type ids
and string offsets in split BTF and reparent it to the new base.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-4-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Alan Maguire
36cb1ad3ae libbpf: Add btf__distill_base() creating split BTF with distilled base BTF
To support more robust split BTF, adding supplemental context for the
base BTF type ids that split BTF refers to is required.  Without such
references, a simple shuffling of base BTF type ids (without any other
significant change) invalidates the split BTF.  Here the attempt is made
to store additional context to make split BTF more robust.

This context comes in the form of distilled base BTF providing minimal
information (name and - in some cases - size) for base INTs, FLOATs,
STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that
points at that base and contains any additional types needed (such as
TYPEDEF, PTR and anonymous STRUCT/UNION declarations).  This
information constitutes the minimal BTF representation needed to
disambiguate or remove split BTF references to base BTF.  The rules
are as follows:

- INT, FLOAT, FWD are recorded in full.
- if a named base BTF STRUCT or UNION is referred to from split BTF, it
  will be encoded as a zero-member sized STRUCT/UNION (preserving
  size for later relocation checks).  Only base BTF STRUCT/UNIONs
  that are either embedded in split BTF STRUCT/UNIONs or that have
  multiple STRUCT/UNION instances of the same name will _need_ size
  checks at relocation time, but as it is possible a different set of
  types will be duplicates in the later to-be-resolved base BTF,
  we preserve size information for all named STRUCT/UNIONs.
- if an ENUM[64] is named, a ENUM forward representation (an ENUM
  with no values) of the same size is used.
- in all other cases, the type is added to the new split BTF.

Avoiding struct/union/enum/enum64 expansion is important to keep the
distilled base BTF representation to a minimum size.

When successful, new representations of the distilled base BTF and new
split BTF that refers to it are returned.  Both need to be freed by the
caller.

So to take a simple example, with split BTF with a type referring
to "struct sk_buff", we will generate distilled base BTF with a
0-member STRUCT sk_buff of the appropriate size, and the split BTF
will refer to it instead.

Tools like pahole can utilize such split BTF to populate the .BTF
section (split BTF) and an additional .BTF.base section.  Then
when the split BTF is loaded, the distilled base BTF can be used
to relocate split BTF to reference the current (and possibly changed)
base BTF.

So for example if "struct sk_buff" was id 502 when the split BTF was
originally generated,  we can use the distilled base BTF to see that
id 502 refers to a "struct sk_buff" and replace instances of id 502
with the current (relocated) base BTF sk_buff type id.

Distilled base BTF is small; when building a kernel with all modules
using distilled base BTF as a test, overall module size grew by only
5.3Mb total across ~2700 modules.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-2-alan.maguire@oracle.com
2024-06-27 10:01:42 -07:00
Vadim Fedorenko
0a66859bf1 bpf: Add CHECKSUM_COMPLETE to bpf test progs
Add special flag to validate that TC BPF program properly updates
checksum information in skb.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240606145851.229116-1-vadfed@meta.com
2024-06-27 10:01:42 -07:00
Mykyta Yatsenko
be998aa3d4 libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
Similarly to `bpf_program`, support `bpf_map` automatic attachment in
`bpf_object__attach_skeleton`. Currently only struct_ops maps could be
attached.

On bpftool side, code-generate links in skeleton struct for struct_ops maps.
Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`.

On libbpf side, extend `bpf_map` with new `autoattach` field to support
enabling or disabling autoattach functionality, introducing
getter/setter for this field.

`bpf_object__(attach|detach)_skeleton` is extended with
attaching/detaching struct_ops maps logic.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
78c78e90cd libbpf: Remove callback-based type/string BTF field visitor helpers
Now that all libbpf/bpftool code switched to btf_field_iter, remove
btf_type_visit_type_ids() and btf_type_visit_str_offs() callback-based
helpers as not needed anymore.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-6-andrii@kernel.org
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
dd19c7ef77 libbpf: Make use of BTF field iterator in BTF handling code
Use new BTF field iterator logic to replace all the callback-based
visitor calls. There is still a .BTF.ext callback-based visitor APIs
that should be converted, which will happens as a follow up.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-4-andrii@kernel.org
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
13182b94f3 libbpf: Make use of BTF field iterator in BPF linker code
Switch all BPF linker code dealing with iterating BTF type ID and string
offset fields to new btf_field_iter facilities.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-3-andrii@kernel.org
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
cece3242fb libbpf: Add BTF field iterator
Implement iterator-based type ID and string offset BTF field iterator.
This is used extensively in BTF-handling code and BPF linker code for
various sanity checks, rewriting IDs/offsets, etc. Currently this is
implemented as visitor pattern calling custom callbacks, which makes the
logic (especially in simple cases) unnecessarily obscure and harder to
follow.

Having equivalent functionality using iterator pattern makes for simpler
to understand and maintain code. As we add more code for BTF processing
logic in libbpf, it's best to switch to iterator pattern before adding
more callback-based code.

The idea for iterator-based implementation is to record offsets of
necessary fields within fixed btf_type parts (which should be iterated
just once), and, for kinds that have multiple members (based on vlen
field), record where in each member necessary fields are located.

Generic iteration code then just keeps track of last offset that was
returned and handles N members correctly. Return type is just u32
pointer, where NULL is returned when all relevant fields were already
iterated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-2-andrii@kernel.org
2024-06-27 10:01:42 -07:00
Andrii Nakryiko
42065ea662 ci: make pahole-staging workflow manually triggerable
Allow to manually trigger pahole-staging workflow.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-06 14:39:09 -07:00
Andrii Nakryiko
764d19da07 ci: revert switching to ubuntu-latest for pahole-staging workflow
pahole staging workflow is using the same old VM image as BPF selftests
stages. It doesn't have recent enough glibc, so we can't yet switch to
newer Ubuntu, unfortunately.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-06 14:32:23 -07:00
thiagoftsm
7d1fe77f65 Merge branch 'libbpf:master' into master 2024-06-03 23:13:45 +00:00
Andrii Nakryiko
fbcb2871fe ci: regenerate vmlinux.h
Regenerated latest vmlinux.h.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-03 13:41:26 -07:00
Andrii Nakryiko
61a6e8edd7 github: remove PR template
No one is looking at it anyways. It just gets in the way.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-03 13:41:26 -07:00
Andrii Nakryiko
4ab7361e64 libbpf: don't close(-1) in multi-uprobe feature detector
Guard close(link_fd) with extra link_fd >= 0 check to prevent close(-1).

Detected by Coverity static analysis.

Fixes: 04d939a2ab22 ("libbpf: detect broken PID filtering logic for multi-uprobe")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240529231212.768828-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-03 13:41:26 -07:00
Andrii Nakryiko
ff856238e2 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   eb4e7726279a344c82e3c23be396bcfd0a4d5669
Checkpoint bpf-next commit: 531876c80004ecff7bfdbd8ba6c6b48835ef5e22
Baseline bpf commit:        9dfdb706e164ae869b1d97f83ebf8523b2809714
Checkpoint bpf commit:      62da3acd28955e7299babebdfcb14243b789e773

Andrii Nakryiko (1):
  libbpf: keep FD_CLOEXEC flag when dup()'ing FD

Jakub Kicinski (1):
  netdev: add qstat for csum complete

 include/uapi/linux/netdev.h |  1 +
 src/libbpf_internal.h       | 10 +++-------
 2 files changed, 4 insertions(+), 7 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-06-03 13:41:26 -07:00
Jakub Kicinski
c085e9c364 netdev: add qstat for csum complete
Recent commit 0cfe71f45f42 ("netdev: add queue stats") added
a lot of useful stats, but only those immediately needed by virtio.
Presumably virtio does not support CHECKSUM_COMPLETE,
so statistic for that form of checksumming wasn't included.
Other drivers will definitely need it, in fact we expect it
to be needed in net-next soon (mlx5). So let's add the definition
of the counter for CHECKSUM_COMPLETE to uAPI in net already,
so that the counters are in a more natural order (all subsequent
counters have not been present in any released kernel, yet).

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Fixes: 0cfe71f45f42 ("netdev: add queue stats")
Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-03 13:41:26 -07:00
Andrii Nakryiko
805b689cd2 libbpf: keep FD_CLOEXEC flag when dup()'ing FD
Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs.
Use dup3() with O_CLOEXEC flag for that.

Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF
map/prog FD, which is definitely not the right or expected behavior.

Reported-by: Lennart Poettering <lennart@poettering.net>
Fixes: bc308d011ab8 ("libbpf: call dup2() syscall directly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-03 13:41:26 -07:00
Andrii Nakryiko
9b789075a9 ci: switch to ubuntu-latest where possible
Track ubuntu-latest where relevant and possible.
We can't update to ubuntu-latest when building and running BPF
selftests, though, because our QEMU image has too old of an GLIBC.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-28 22:37:25 -07:00
Andrii Nakryiko
c22d662a95 ci: update vmlinux.h to latest version
Re-generate vmlinux.h to add latest kernel types necessary for BPF
selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-28 21:15:00 -07:00
Andrii Nakryiko
074445067f ci: add temporary patch for failing upstream BPF selftest
Add fix that landed in bpf tree to fix sk_storage_tracing selftest.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-28 20:39:55 -07:00
Andrii Nakryiko
9a1f1f28c6 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   009367099eb61a4fc2af44d4eb06b6b4de7de6db
Checkpoint bpf-next commit: eb4e7726279a344c82e3c23be396bcfd0a4d5669
Baseline bpf commit:        3e9bc0472b910d4115e16e9c2d684c7757cb6c60
Checkpoint bpf commit:      9dfdb706e164ae869b1d97f83ebf8523b2809714

Abhishek Chauhan (1):
  net: Add additional bit to support clockid_t timestamp type

Andrii Nakryiko (2):
  libbpf: fix feature detectors when using token_fd
  libbpf: detect broken PID filtering logic for multi-uprobe

Arnaldo Carvalho de Melo (1):
  tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and
    asm/fcntl.h

Daniel Jurgens (1):
  netdev: Add queue stats for TX stop and wake

Mykyta Yatsenko (1):
  libbpf: Configure log verbosity with env variable

Xuan Zhuo (1):
  netdev: add queue stats

 docs/libbpf_overview.rst     |   8 +++
 include/uapi/linux/bpf.h     |  15 +++--
 include/uapi/linux/fcntl.h   | 123 -----------------------------------
 include/uapi/linux/netdev.h  |  21 ++++++
 include/uapi/linux/openat2.h |  43 ------------
 src/bpf.c                    |   2 +-
 src/features.c               |  33 +++++++++-
 src/libbpf.c                 |  25 ++++++-
 src/libbpf.h                 |   5 +-
 9 files changed, 99 insertions(+), 176 deletions(-)
 delete mode 100644 include/uapi/linux/fcntl.h
 delete mode 100644 include/uapi/linux/openat2.h

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-28 20:39:55 -07:00
Andrii Nakryiko
0a519f87ee sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-28 20:39:55 -07:00
Andrii Nakryiko
d9f9fd5b22 libbpf: detect broken PID filtering logic for multi-uprobe
Libbpf is automatically (and transparently to user) detecting
multi-uprobe support in the kernel, and, if supported, uses
multi-uprobes to improve USDT attachment speed.

USDTs can be attached system-wide or for the specific process by PID. In
the latter case, we rely on correct kernel logic of not triggering USDT
for unrelated processes.

As such, on older kernels that do support multi-uprobes, but still have
broken PID filtering logic, we need to fall back to singular uprobes.

Unfortunately, whether user is using PID filtering or not is known at
the attachment time, which happens after relevant BPF programs were
loaded into the kernel. Also unfortunately, we need to make a call
whether to use multi-uprobes or singular uprobe for SEC("usdt") programs
during BPF object load time, at which point we have no information about
possible PID filtering.

The distinction between single and multi-uprobes is small, but important
for the kernel. Multi-uprobes get BPF_TRACE_UPROBE_MULTI attach type,
and kernel internally substitiute different implementation of some of
BPF helpers (e.g., bpf_get_attach_cookie()) depending on whether uprobe
is multi or singular. So, multi-uprobes and singular uprobes cannot be
intermixed.

All the above implies that we have to make an early and conservative
call about the use of multi-uprobes. And so this patch modifies libbpf's
existing feature detector for multi-uprobe support to also check correct
PID filtering. If PID filtering is not yet fixed, we fall back to
singular uprobes for USDTs.

This extension to feature detection is simple thanks to kernel's -EINVAL
addition for pid < 0.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240521163401.3005045-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-28 20:39:55 -07:00
Mykyta Yatsenko
d4d3e68e8d libbpf: Configure log verbosity with env variable
Configure logging verbosity by setting LIBBPF_LOG_LEVEL environment
variable, which is applied only to default logger. Once user set their
custom logging callback, it is up to them to handle filtering.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240524131840.114289-1-yatsenko@meta.com
2024-05-28 20:39:55 -07:00
Abhishek Chauhan
0babfb126a net: Add additional bit to support clockid_t timestamp type
tstamp_type is now set based on actual clockid_t compressed
into 2 bits.

To make the design scalable for future needs this commit bring in
the change to extend the tstamp_type:1 to tstamp_type:2 to support
other clockid_t timestamp.

We now support CLOCK_TAI as part of tstamp_type as part of this
commit with existing support CLOCK_MONOTONIC and CLOCK_REALTIME.

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509211834.3235191-3-quic_abchauha@quicinc.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-28 20:39:55 -07:00
Arnaldo Carvalho de Melo
89ed67d7ab tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h
These were used to build perf to provide defines not available in older
distros, but this was back in 2017, nowadays all the distros that are
supported and I have build containers for work using just the system
headers, so ditch them.

Some of these older distros may not have things that are used in 'perf
trace', but then they also don't have libtraceevent packages, so don't
build 'perf trace'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28 20:39:55 -07:00
Andrii Nakryiko
8dfa981c53 libbpf: fix feature detectors when using token_fd
Adjust `union bpf_attr` size passed to kernel in two feature-detecting
functions to take into account prog_token_fd field.

Libbpf is avoiding memset()'ing entire `union bpf_attr` by only using
minimal set of bpf_attr's fields. Two places have been missed when
wiring BPF token support in libbpf's feature detection logic.

Fix them trivially.

Fixes: f3dcee938f48 ("libbpf: Wire up token_fd into feature probing logic")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240513180804.403775-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-28 20:39:55 -07:00
Daniel Jurgens
15b461a608 netdev: Add queue stats for TX stop and wake
TX queue stop and wake are counted by some drivers.
Support reporting these via netdev-genl queue stats.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20240510201927.1821109-2-danielj@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 20:39:55 -07:00
Xuan Zhuo
ec3c369941 netdev: add queue stats
These stats are commonly. Support reporting those via netdev-genl queue
stats.

name: rx-hw-drops
name: rx-hw-drop-overruns
name: rx-csum-unnecessary
name: rx-csum-none
name: rx-csum-bad
name: rx-hw-gro-packets
name: rx-hw-gro-bytes
name: rx-hw-gro-wire-packets
name: rx-hw-gro-wire-bytes
name: rx-hw-drop-ratelimits
name: tx-hw-drops
name: tx-hw-drop-errors
name: tx-csum-none
name: tx-needs-csum
name: tx-hw-gso-packets
name: tx-hw-gso-bytes
name: tx-hw-gso-wire-packets
name: tx-hw-gso-wire-bytes
name: tx-hw-drop-ratelimits

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-28 20:39:55 -07:00
thiagoftsm
89aecd2188 Merge branch 'libbpf:master' into master 2024-05-13 01:33:50 +00:00
Andrii Nakryiko
02724cfd07 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   0737df6de94661ae55fd3343ce9abec32c687e62
Checkpoint bpf-next commit: 009367099eb61a4fc2af44d4eb06b6b4de7de6db
Baseline bpf commit:        3e9bc0472b910d4115e16e9c2d684c7757cb6c60
Checkpoint bpf commit:      3e9bc0472b910d4115e16e9c2d684c7757cb6c60

Andrii Nakryiko (6):
  libbpf: fix potential overflow in ring__consume_n()
  libbpf: fix ring_buffer__consume_n() return result logic
  libbpf: remove unnecessary struct_ops prog validity check
  libbpf: handle yet another corner case of nulling out struct_ops
    program
  libbpf: fix libbpf_strerror_r() handling unknown errors
  libbpf: improve early detection of doomed-to-fail BPF program loading

Jiri Olsa (2):
  libbpf: Fix error message in attach_kprobe_session
  libbpf: Fix error message in attach_kprobe_multi

Jose E. Marchesi (3):
  libbpf: Fix bpf_ksym_exists() in GCC
  libbpf: Avoid casts from pointers to enums in bpf_tracing.h
  bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD

 src/bpf_core_read.h |  1 +
 src/bpf_helpers.h   | 17 +++++++++--
 src/bpf_tracing.h   | 70 ++++++++++++++++++++++-----------------------
 src/libbpf.c        | 42 ++++++++++++++++++---------
 src/ringbuf.c       |  4 +--
 src/str_error.c     | 16 +++++++++--
 src/usdt.bpf.h      | 24 ++++++++--------
 7 files changed, 106 insertions(+), 68 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-08 16:04:40 -07:00
Jose E. Marchesi
3827aa514c bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD
[Changes from V1:
 - Use a default branch in the switch statement to initialize `val'.]

GCC warns that `val' may be used uninitialized in the
BPF_CRE_READ_BITFIELD macro, defined in bpf_core_read.h as:

	[...]
	unsigned long long val;						      \
	[...]								      \
	switch (__CORE_RELO(s, field, BYTE_SIZE)) {			      \
	case 1: val = *(const unsigned char *)p; break;			      \
	case 2: val = *(const unsigned short *)p; break;		      \
	case 4: val = *(const unsigned int *)p; break;			      \
	case 8: val = *(const unsigned long long *)p; break;		      \
        }       							      \
	[...]
	val;								      \
	}								      \

This patch adds a default entry in the switch statement that sets
`val' to zero in order to avoid the warning, and random values to be
used in case __builtin_preserve_field_info returns unexpected values
for BPF_FIELD_BYTE_SIZE.

Tested in bpf-next master.
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240508101313.16662-1-jose.marchesi@oracle.com
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
e5146eff75 libbpf: improve early detection of doomed-to-fail BPF program loading
Extend libbpf's pre-load checks for BPF programs, detecting more typical
conditions that are destinated to cause BPF program failure. This is an
opportunity to provide more helpful and actionable error message to
users, instead of potentially very confusing BPF verifier log and/or
error.

In this case, we detect struct_ops BPF program that was not referenced
anywhere, but still attempted to be loaded (according to libbpf logic).
Suggest that the program might need to be used in some struct_ops
variable. User will get a message of the following kind:

  libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it?

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-6-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
ed54f30307 libbpf: fix libbpf_strerror_r() handling unknown errors
strerror_r(), used from libbpf-specific libbpf_strerror_r() wrapper is
documented to return error in two different ways, depending on glibc
version. Take that into account when handling strerror_r()'s own errors,
which happens when we pass some non-standard (internal) kernel error to
it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is
quite confusing. Now for the same situation we'll see a bit less
visually scary "unknown error (-524)".

At least we won't confuse user with irrelevant EINVAL (22).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-5-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
fe5fe762b9 libbpf: handle yet another corner case of nulling out struct_ops program
There is yet another corner case where user can set STRUCT_OPS program
reference in STRUCT_OPS map to NULL, but libbpf will fail to disable
autoload for such BPF program. This time it's the case of "new" kernel
which has type information about callback field, but user explicitly
nulled-out program reference from user-space after opening BPF object.

Fix, hopefully, the last remaining unhandled case.

Fixes: 0737df6de946 ("libbpf: better fix for handling nulled-out struct_ops program")
Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-3-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
504369cba4 libbpf: remove unnecessary struct_ops prog validity check
libbpf ensures that BPF program references set in map->st_ops->progs[i]
during open phase are always valid STRUCT_OPS programs. This is done in
bpf_object__collect_st_ops_relos(). So there is no need to double-check
that in bpf_map__init_kern_struct_ops().

Simplify the code by removing unnecessary check. Also, we avoid using
local prog variable to keep code similar to the upcoming fix, which adds
similar logic in another part of bpf_map__init_kern_struct_ops().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Jose E. Marchesi
ea02e10fc4 libbpf: Avoid casts from pointers to enums in bpf_tracing.h
[Differences from V1:
  - Do not introduce a global typedef, as this is a public header.
  - Keep the void* casts in BPF_KPROBE_READ_RET_IP and
    BPF_KRETPROBE_READ_RET_IP, as these are necessary
    for converting to a const void* argument of
    bpf_probe_read_kernel.]

The BPF_PROG, BPF_KPROBE and BPF_KSYSCALL macros defined in
tools/lib/bpf/bpf_tracing.h use a clever hack in order to provide a
convenient way to define entry points for BPF programs as if they were
normal C functions that get typed actual arguments, instead of as
elements in a single "context" array argument.

For example, PPF_PROGS allows writing:

  SEC("struct_ops/cwnd_event")
  void BPF_PROG(cwnd_event, struct sock *sk, enum tcp_ca_event event)
  {
        bbr_cwnd_event(sk, event);
        dctcp_cwnd_event(sk, event);
        cubictcp_cwnd_event(sk, event);
  }

That expands into a pair of functions:

  void ____cwnd_event (unsigned long long *ctx, struct sock *sk, enum tcp_ca_event event)
  {
        bbr_cwnd_event(sk, event);
        dctcp_cwnd_event(sk, event);
        cubictcp_cwnd_event(sk, event);
  }

  void cwnd_event (unsigned long long *ctx)
  {
        _Pragma("GCC diagnostic push")
        _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")
        return ____cwnd_event(ctx, (void*)ctx[0], (void*)ctx[1]);
        _Pragma("GCC diagnostic pop")
  }

Note how the 64-bit unsigned integers in the incoming CTX get casted
to a void pointer, and then implicitly converted to whatever type of
the actual argument in the wrapped function.  In this case:

  Arg1: unsigned long long -> void * -> struct sock *
  Arg2: unsigned long long -> void * -> enum tcp_ca_event

The behavior of GCC and clang when facing such conversions differ:

  pointer -> pointer

    Allowed by the C standard.
    GCC: no warning nor error.
    clang: no warning nor error.

  pointer -> integer type

    [C standard says the result of this conversion is implementation
     defined, and it may lead to unaligned pointer etc.]

    GCC: error: integer from pointer without a cast [-Wint-conversion]
    clang: error: incompatible pointer to integer conversion [-Wint-conversion]

  pointer -> enumerated type

    GCC: error: incompatible types in assigment (*)
    clang: error: incompatible pointer to integer conversion [-Wint-conversion]

These macros work because converting pointers to pointers is allowed,
and converting pointers to integers also works provided a suitable
integer type even if it is implementation defined, much like casting a
pointer to uintptr_t is guaranteed to work by the C standard.  The
conversion errors emitted by both compilers by default are silenced by
the pragmas.

However, the GCC error marked with (*) above when assigning a pointer
to an enumerated value is not associated with the -Wint-conversion
warning, and it is not possible to turn it off.

This is preventing building the BPF kernel selftests with GCC.

This patch fixes this by avoiding intermediate casts to void*,
replaced with casts to `unsigned long long', which is an integer type
capable of safely store a BPF pointer, much like the standard
uintptr_t.

Testing performed in bpf-next master:
  - vmtest.sh -- ./test_verifier
  - vmtest.sh -- ./test_progs
  - make M=samples/bpf
No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240502170925.3194-1-jose.marchesi@oracle.com
2024-05-08 16:04:40 -07:00
Jose E. Marchesi
4ec5e360ae libbpf: Fix bpf_ksym_exists() in GCC
The macro bpf_ksym_exists is defined in bpf_helpers.h as:

  #define bpf_ksym_exists(sym) ({								\
  	_Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak");	\
  	!!sym;											\
  })

The purpose of the macro is to determine whether a given symbol has
been defined, given the address of the object associated with the
symbol.  It also has a compile-time check to make sure the object
whose address is passed to the macro has been declared as weak, which
makes the check on `sym' meaningful.

As it happens, the check for weak doesn't work in GCC in all cases,
because __builtin_constant_p not always folds at parse time when
optimizing.  This is because optimizations that happen later in the
compilation process, like inlining, may make a previously non-constant
expression a constant.  This results in errors like the following when
building the selftests with GCC:

  bpf_helpers.h:190:24: error: expression in static assertion is not constant
  190 |         _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak");       \
      |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fortunately recent versions of GCC support a __builtin_has_attribute
that can be used to directly check for the __weak__ attribute.  This
patch changes bpf_helpers.h to use that builtin when building with a
recent enough GCC, and to omit the check if GCC is too old to support
the builtin.

The macro used for GCC becomes:

  #define bpf_ksym_exists(sym) ({									\
	_Static_assert(__builtin_has_attribute (*sym, __weak__), #sym " should be marked as __weak");	\
	!!sym;												\
  })

Note that since bpf_ksym_exists is designed to get the address of the
object associated with symbol SYM, we pass *sym to
__builtin_has_attribute instead of sym.  When an expression is passed
to __builtin_has_attribute then it is the type of the passed
expression that is checked for the specified attribute.  The
expression itself is not evaluated.  This accommodates well with the
existing usages of the macro:

- For function objects:

  struct task_struct *bpf_task_acquire(struct task_struct *p) __ksym __weak;
  [...]
  bpf_ksym_exists(bpf_task_acquire)

- For variable objects:

  extern const struct rq runqueues __ksym __weak; /* typed */
  [...]
  bpf_ksym_exists(&runqueues)

Note also that BPF support was added in GCC 10 and support for
__builtin_has_attribute in GCC 9.

Locally tested in bpf-next master branch.
No regressions.

Signed-of-by: Jose E. Marchesi <jose.marchesi@oracle.com>

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240428112559.10518-1-jose.marchesi@oracle.com
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
cb7bfc5e51 libbpf: fix ring_buffer__consume_n() return result logic
Add INT_MAX check to ring_buffer__consume_n(). We do the similar check
to handle int return result of all these ring buffer APIs in other APIs
and ring_buffer__consume_n() is missing one. This patch fixes this
omission.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240430201952.888293-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Andrii Nakryiko
e3e84bd7d0 libbpf: fix potential overflow in ring__consume_n()
ringbuf_process_ring() return int64_t, while ring__consume_n() assigns
it to int. It's highly unlikely, but possible for ringbuf_process_ring()
to return value larger than INT_MAX, so use int64_t. ring__consume_n()
does check INT_MAX before returning int result to the user.

Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240430201952.888293-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-08 16:04:40 -07:00
Jiri Olsa
f3c4172c61 libbpf: Fix error message in attach_kprobe_multi
We just failed to retrieve pattern, so we need to print spec instead.

Fixes: ddc6b04989eb ("libbpf: Add bpf_program__attach_kprobe_multi_opts function")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240502075541.1425761-2-jolsa@kernel.org
2024-05-08 16:04:40 -07:00
Jiri Olsa
d045f7682b libbpf: Fix error message in attach_kprobe_session
We just failed to retrieve pattern, so we need to print spec instead.

Fixes: 2ca178f02b2f ("libbpf: Add support for kprobe session attach")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240502075541.1425761-1-jolsa@kernel.org
2024-05-08 16:04:40 -07:00
thiagoftsm
b39b7f426f Merge branch 'libbpf:master' into master 2024-05-02 11:20:27 +00:00
Quentin Monnet
e055420033 sync: Commit .mailmap changes from script when sync-ing repo
In commit 4794f18bf4 ("sync: Sync .mailmap entries"), we updated the
sync-up script to automatically update libbpf's .mailmap; however, the
script would not take care of committing the changes. Let's address
this.

The code is copied and adapted from the part where we commit changes to
src/bpf_helper_defs.h.

Signed-off-by: Quentin Monnet <qmo@kernel.org>
2024-05-01 17:38:26 -07:00
Andrii Nakryiko
255b705a16 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1bba3b3d373dbafae891e7cb06b8c82c8d62aba1
Checkpoint bpf-next commit: 0737df6de94661ae55fd3343ce9abec32c687e62
Baseline bpf commit:        b867247555c4181bf84eb10b72b176862c29112d
Checkpoint bpf commit:      3e9bc0472b910d4115e16e9c2d684c7757cb6c60

Andrii Nakryiko (1):
  libbpf: better fix for handling nulled-out struct_ops program

Jiri Olsa (3):
  bpf: Add support for kprobe session attach
  libbpf: Add support for kprobe session attach
  libbpf: Add kprobe session attach type name to attach_type_name

Viktor Malik (1):
  libbpf: support "module: Function" syntax for tracing programs

 include/uapi/linux/bpf.h |   1 +
 src/bpf.c                |   1 +
 src/libbpf.c             | 112 +++++++++++++++++++++++++++++++--------
 src/libbpf.h             |   4 +-
 4 files changed, 95 insertions(+), 23 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-05-01 15:20:15 -07:00
Andrii Nakryiko
6a41f02ad4 libbpf: better fix for handling nulled-out struct_ops program
Previous attempt to fix the handling of nulled-out (from skeleton)
struct_ops program is working well only if struct_ops program is defined
as non-autoloaded by default (i.e., has SEC("?struct_ops") annotation,
with question mark).

Unfortunately, that fix is incomplete due to how
bpf_object_adjust_struct_ops_autoload() is marking referenced or
non-referenced struct_ops program as autoloaded (or not). Because
bpf_object_adjust_struct_ops_autoload() is run after
bpf_map__init_kern_struct_ops() step, which sets program slot to NULL,
such programs won't be considered "referenced", and so its autoload
property won't be changed.

This all sounds convoluted and it is, but the desire is to have as
natural behavior (as far as struct_ops usage is concerned) as possible.

This fix is redoing the original fix but makes it work for
autoloaded-by-default struct_ops programs as well. We achieve this by
forcing prog->autoload to false if prog was declaratively set for some
struct_ops map, but then nulled-out from skeleton (programmatically).
This achieves desired effect of not autoloading it. If such program is
still referenced somewhere else (different struct_ops map or different
callback field), it will get its autoload property adjusted by
bpf_object_adjust_struct_ops_autoload() later.

We also fix selftest, which accidentally used SEC("?struct_ops")
annotation. It was meant to use autoload-by-default program from the
very beginning.

Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly")
Cc: Kui-Feng Lee <thinker.li@gmail.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240501041706.3712608-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-01 15:20:15 -07:00
Viktor Malik
dd589c3b31 libbpf: support "module: Function" syntax for tracing programs
In some situations, it is useful to explicitly specify a kernel module
to search for a tracing program target (e.g. when a function of the same
name exists in multiple modules or in vmlinux).

This patch enables that by allowing the "module:function" syntax for the
find_kernel_btf_id function. Thanks to this, the syntax can be used both
from a SEC macro (i.e. `SEC(fentry/module:function)`) and via the
bpf_program__set_attach_target API call.

Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/9085a8cb9a552de98e554deb22ff7e977d025440.1714469650.git.vmalik@redhat.com
2024-05-01 15:20:15 -07:00
Jiri Olsa
045a0372ef libbpf: Add kprobe session attach type name to attach_type_name
Adding kprobe session attach type name to attach_type_name,
so libbpf_bpf_attach_type_str returns proper string name for
BPF_TRACE_KPROBE_SESSION attach type.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240430112830.1184228-6-jolsa@kernel.org
2024-05-01 15:20:15 -07:00
Jiri Olsa
6c3cf5108e libbpf: Add support for kprobe session attach
Adding support to attach program in kprobe session mode
with bpf_program__attach_kprobe_multi_opts function.

Adding session bool to bpf_kprobe_multi_opts struct that allows
to load and attach the bpf program via kprobe session.
the attachment to create kprobe multi session.

Also adding new program loader section that allows:
 SEC("kprobe.session/bpf_fentry_test*")

and loads/attaches kprobe program as kprobe session.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240430112830.1184228-5-jolsa@kernel.org
2024-05-01 15:20:15 -07:00
Jiri Olsa
b63d2945ff bpf: Add support for kprobe session attach
Adding support to attach bpf program for entry and return probe
of the same function. This is common use case which at the moment
requires to create two kprobe multi links.

Adding new BPF_TRACE_KPROBE_SESSION attach type that instructs
kernel to attach single link program to both entry and exit probe.

It's possible to control execution of the bpf program on return
probe simply by returning zero or non zero from the entry bpf
program execution to execute or not the bpf program on return
probe respectively.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240430112830.1184228-2-jolsa@kernel.org
2024-05-01 15:20:15 -07:00
Andrii Nakryiko
d3e18fceec ci: remove tcp_rtt test from 5.5 ALLOWLIST
It's been updated to expecte the very latest kernel, can't succeed on
5.5 anymore.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-04-30 09:09:32 -07:00
Andrii Nakryiko
22bd976613 ci: update vmlinux.h
Regenerate vmlinux.h to get all the latest types.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-04-30 09:09:32 -07:00
Andrii Nakryiko
f9f3fbf72d sync: update .mailmap
Update .mailmap generated during sync.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-04-30 09:09:32 -07:00
Andrii Nakryiko
37b8e0eb2d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   82e38a505c9868e784ec31e743fd8a9fa5ca1084
Checkpoint bpf-next commit: 1bba3b3d373dbafae891e7cb06b8c82c8d62aba1
Baseline bpf commit:        5bcf0dcbf9066348058b88a510c57f70f384c92c
Checkpoint bpf commit:      b867247555c4181bf84eb10b72b176862c29112d

Andrii Nakryiko (1):
  libbpf: handle nulled-out program in struct_ops correctly

Jose E. Marchesi (1):
  bpf_helpers.h: Define bpf_tail_call_static when building with GCC

Philo Lu (1):
  bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args

 include/uapi/linux/bpf.h | 2 ++
 src/bpf_helpers.h        | 4 +++-
 src/libbpf.c             | 1 +
 3 files changed, 6 insertions(+), 1 deletion(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-04-30 09:09:32 -07:00
Andrii Nakryiko
f28271ab72 libbpf: handle nulled-out program in struct_ops correctly
If struct_ops has one of program callbacks set declaratively and host
kernel is old and doesn't support this callback, libbpf will allow to
load such struct_ops as long as that callback was explicitly nulled-out
(presumably through skeleton). This is all working correctly, except we
won't reset corresponding program slot to NULL before bailing out, which
will lead to libbpf not detecting that BPF program has to be not
auto-loaded. Fix this by unconditionally resetting corresponding program
slot to NULL.

Fixes: c911fc61a7ce ("libbpf: Skip zeroed or null fields if not found in the kernel type.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240428030954.3918764-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-04-30 09:09:32 -07:00
Jose E. Marchesi
b1051d9361 bpf_helpers.h: Define bpf_tail_call_static when building with GCC
The definition of bpf_tail_call_static in tools/lib/bpf/bpf_helpers.h
is guarded by a preprocessor check to assure that clang is recent
enough to support it.  This patch updates the guard so the function is
compiled when using GCC 13 or later as well.

Tested in bpf-next master. No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240426145158.14409-1-jose.marchesi@oracle.com
2024-04-30 09:09:32 -07:00
Philo Lu
43df08cd17 bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args
Two important arguments in RTT estimation, mrtt and srtt, are passed to
tcp_bpf_rtt(), so that bpf programs get more information about RTT
computation in BPF_SOCK_OPS_RTT_CB.

The difference between bpf_sock_ops->srtt_us and the srtt here is: the
former is an old rtt before update, while srtt passed by tcp_bpf_rtt()
is that after update.

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240425161724.73707-2-lulie@linux.alibaba.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-04-30 09:09:32 -07:00
Quentin Monnet
4794f18bf4 sync: Sync .mailmap entries
The kernel repository has a .mailmap file to remap author names and
email addresses to their desired format in Git logs (for details, see
gitmailmap documentation [0]). Alas, this is only visible for author
information when looking at the logs locally, as GitHub does not support
mailmaps at the moment [1].

This commit adds a .mailmap file for libbpf, automatically generated
from the kernel's version. The script to generate the .mailmap is added,
too: it works by grepping email addresses from authors in the
repository, and collecting all lines ending with this address in the
kernel's .mailmap - in other words, all lines where this address is used
as a pattern for a remapping.

To keep the .mailmap up-to-date, add a call to the script to
sync-kernel.sh.

[0] https://git-scm.com/docs/gitmailmap
[1] https://github.com/orgs/community/discussions/22518

Signed-off-by: Quentin Monnet <qmo@kernel.org>
2024-04-25 22:42:05 -07:00
Yonghong Song
2fdcc365a0 ci: regenerate latest vmlinux.h
Update vmlinux.h to make BPF selftests compile.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
2024-04-24 15:16:35 -07:00
Yonghong Song
52c37177cc Makefile: Ensure github libbpf version the same as the kernel one
The kernel libbpf version is 1.5 now. So change github libbpf
version to be 1.5 as well.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
2024-04-24 15:16:35 -07:00
Yonghong Song
7cbfddfdf2 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5
Checkpoint bpf-next commit: 82e38a505c9868e784ec31e743fd8a9fa5ca1084
Baseline bpf commit:        443574b033876c85a35de4c65c14f7fe092222b2
Checkpoint bpf commit:      5bcf0dcbf9066348058b88a510c57f70f384c92c

Andrea Righi (3):
  libbpf: Start v1.5 development cycle
  libbpf: ringbuf: Allow to consume up to a certain amount of items
  libbpf: Add ring__consume_n / ring_buffer__consume_n

Anton Protopopov (2):
  bpf: Add support for passing mark with bpf_fib_lookup
  bpf: Pack struct bpf_fib_lookup

Benjamin Tissoires (1):
  tools: sync include/uapi/linux/bpf.h

David Lechner (1):
  bpf: Fix typo in uapi doc comments

Mykyta Yatsenko (1):
  bpf: improve error message for unsupported helper

Quentin Deslandes (2):
  libbpf: Fix misaligned array closing bracket
  libbpf: Fix dump of subsequent char arrays

Tobias Böhm (1):
  libbpf: Use local bpf_helpers.h include

Yonghong Song (4):
  libbpf: Mark libbpf_kallsyms_parse static function
  libbpf: Handle <orig_name>.llvm.<hash> symbol properly
  bpf: Add bpf_link support for sk_msg and sk_skb progs
  libbpf: Add bpf_link support for BPF_PROG_TYPE_SOCKMAP

 include/uapi/linux/bpf.h | 35 +++++++++++++++++++++----
 src/bpf_core_read.h      |  2 +-
 src/btf_dump.c           |  5 ++++
 src/libbpf.c             | 33 ++++++++++++++++++++++--
 src/libbpf.h             | 14 ++++++++++
 src/libbpf.map           |  7 +++++
 src/libbpf_internal.h    |  5 ----
 src/libbpf_probes.c      |  6 +++--
 src/libbpf_version.h     |  2 +-
 src/ringbuf.c            | 55 +++++++++++++++++++++++++++++++++-------
 10 files changed, 139 insertions(+), 25 deletions(-)

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
2024-04-24 15:16:35 -07:00
Yonghong Song
f2fe16ec95 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
2024-04-24 15:16:35 -07:00
Benjamin Tissoires
a911ca1e3e tools: sync include/uapi/linux/bpf.h
cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-6-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Quentin Deslandes
24924003c6 libbpf: Fix dump of subsequent char arrays
When dumping a character array, libbpf will watch for a '\0' and set
is_array_terminated=true if found. This prevents libbpf from printing
the remaining characters of the array, treating it as a nul-terminated
string.

However, once this flag is set, it's never reset, leading to subsequent
characters array not being printed properly:

.str_multi = (__u8[2][16])[
    [
        'H',
        'e',
        'l',
    ],
],

This patch saves the is_array_terminated flag and restores its
default (false) value before looping over the elements of an array,
then restores it afterward. This way, libbpf's behavior is unchanged
when dumping the characters of an array, but subsequent arrays are
printed properly:

.str_multi = (__u8[2][16])[
    [
        'H',
        'e',
        'l',
    ],
    [
        'l',
        'o',
    ],
],

Signed-off-by: Quentin Deslandes <qde@naccy.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240413211258.134421-3-qde@naccy.de
2024-04-24 15:16:35 -07:00
Quentin Deslandes
2c6f445a8e libbpf: Fix misaligned array closing bracket
In btf_dump_array_data(), libbpf will call btf_dump_dump_type_data() for
each element. For an array of characters, each element will be
processed the following way:

- btf_dump_dump_type_data() is called to print the character
- btf_dump_data_pfx() prefixes the current line with the proper number
  of indentations
- btf_dump_int_data() is called to print the character
- After the last character is printed, btf_dump_dump_type_data() calls
  btf_dump_data_pfx() before writing the closing bracket

However, for an array containing characters, btf_dump_int_data() won't
print any '\0' and subsequent characters. This leads to situations where
the line prefix is written, no character is added, then the prefix is
written again before adding the closing bracket:

(struct sk_metadata){
    .str_array = (__u8[14])[
        'H',
        'e',
        'l',
        'l',
        'o',
                ],

This change solves this issue by printing the '\0' character, which
has two benefits:

- The bracket closing the array is properly aligned
- It's clear from a user point of view that libbpf uses '\0' as a
  terminator for arrays of characters.

Signed-off-by: Quentin Deslandes <qde@naccy.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240413211258.134421-2-qde@naccy.de
2024-04-24 15:16:35 -07:00
Yonghong Song
09397e309a libbpf: Add bpf_link support for BPF_PROG_TYPE_SOCKMAP
Introduce a libbpf API function bpf_program__attach_sockmap()
which allow user to get a bpf_link for their corresponding programs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240410043532.3737722-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Yonghong Song
62217fb32a bpf: Add bpf_link support for sk_msg and sk_skb progs
Add bpf_link support for sk_msg and sk_skb programs. We have an
internal request to support bpf_link for sk_msg programs so user
space can have a uniform handling with bpf_link based libbpf
APIs. Using bpf_link based libbpf API also has a benefit which
makes system robust by decoupling prog life cycle and
attachment life cycle.

Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240410043527.3737160-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Andrea Righi
b521a722b9 libbpf: Add ring__consume_n / ring_buffer__consume_n
Introduce a new API to consume items from a ring buffer, limited to a
specified amount, and return to the caller the actual number of items
consumed.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/lkml/20240310154726.734289-1-andrea.righi@canonical.com/T
Link: https://lore.kernel.org/bpf/20240406092005.92399-4-andrea.righi@canonical.com
2024-04-24 15:16:35 -07:00
Andrea Righi
98de9ace4d libbpf: ringbuf: Allow to consume up to a certain amount of items
In some cases, instead of always consuming all items from ring buffers
in a greedy way, we may want to consume up to a certain amount of items,
for example when we need to copy items from the BPF ring buffer to a
limited user buffer.

This change allows to set an upper limit to the amount of items consumed
from one or more ring buffers.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240406092005.92399-3-andrea.righi@canonical.com
2024-04-24 15:16:35 -07:00
Andrea Righi
26d9ab5f78 libbpf: Start v1.5 development cycle
Bump libbpf.map to v1.5.0 to start a new libbpf version cycle.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240406092005.92399-2-andrea.righi@canonical.com
2024-04-24 15:16:35 -07:00
Anton Protopopov
c5219d1b3d bpf: Pack struct bpf_fib_lookup
The struct bpf_fib_lookup is supposed to be of size 64. A recent commit
59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added
a static assertion to check this property so that future changes to the
structure will not accidentally break this assumption.

As it immediately turned out, on some 32-bit arm systems, when AEABI=n,
the total size of the structure was equal to 68, see [1]. This happened
because the bpf_fib_lookup structure contains a union of two 16-bit
fields:

    union {
            __u16 tot_len;
            __u16 mtu_result;
    };

which was supposed to compile to a 16-bit-aligned 16-bit field. On the
aforementioned setups it was instead both aligned and padded to 32-bits.

Declare this inner union as __attribute__((packed, aligned(2))) such
that it always is of size 2 and is aligned to 16 bits.

  [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: e1850ea9bd9e ("bpf: bpf_fib_lookup return MTU value as output when looked up")
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com
2024-04-24 15:16:35 -07:00
Tobias Böhm
8d3a3e138b libbpf: Use local bpf_helpers.h include
Commit 20d59ee55172fdf6 ("libbpf: add bpf_core_cast() macro") added a
bpf_helpers include in bpf_core_read.h as a system include. Usually, the
includes are local, though, like in bpf_tracing.h. This commit adjusts
the include to be local as well.

Signed-off-by: Tobias Böhm <tobias@aibor.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/q5d5bgc6vty2fmaazd5e73efd6f5bhiru2le6fxn43vkw45bls@fhlw2s5ootdb
2024-04-24 15:16:35 -07:00
David Lechner
9f2853a352 bpf: Fix typo in uapi doc comments
In a few places in the bpf uapi headers, EOPNOTSUPP is missing a "P" in
the doc comments. This adds the missing "P".

Signed-off-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240329152900.398260-2-dlechner@baylibre.com
2024-04-24 15:16:35 -07:00
Yonghong Song
d2f83fb976 libbpf: Handle <orig_name>.llvm.<hash> symbol properly
With CONFIG_LTO_CLANG_THIN enabled, with some of previous
version of kernel code base ([1]), I hit the following
error:
   test_ksyms:PASS:kallsyms_fopen 0 nsec
   test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found
   #118     ksyms:FAIL

The reason is that 'bpf_link_fops' is renamed to
   bpf_link_fops.llvm.8325593422554671469
Due to cross-file inlining, the static variable 'bpf_link_fops'
in syscall.c is used by a function in another file. To avoid
potential duplicated names, the llvm added suffix
'.llvm.<hash>' ([2]) to 'bpf_link_fops' variable.
Such renaming caused a problem in libbpf if 'bpf_link_fops'
is used in bpf prog as a ksym but 'bpf_link_fops' does not
match any symbol in /proc/kallsyms.

To fix this issue, libbpf needs to understand that suffix '.llvm.<hash>'
is caused by clang lto kernel and to process such symbols properly.

With latest bpf-next code base built with CONFIG_LTO_CLANG_THIN,
I cannot reproduce the above failure any more. But such an issue
could happen with other symbols or in the future for bpf_link_fops symbol.

For example, with my current kernel, I got the following from
/proc/kallsyms:
  ffffffff84782154 d __func__.net_ratelimit.llvm.6135436931166841955
  ffffffff85f0a500 d tk_core.llvm.726630847145216431
  ffffffff85fdb960 d __fs_reclaim_map.llvm.10487989720912350772
  ffffffff864c7300 d fake_dst_ops.llvm.54750082607048300

I could not easily create a selftest to test newly-added
libbpf functionality with a static C test since I do not know
which symbol is cross-file inlined. But based on my particular kernel,
the following test change can run successfully.

>  diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c
>  index 6a86d1f07800..904a103f7b1d 100644
>  --- a/tools/testing/selftests/bpf/prog_tests/ksyms.c
>  +++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c
>  @@ -42,6 +42,7 @@ void test_ksyms(void)
>          ASSERT_EQ(data->out__bpf_link_fops, link_fops_addr, "bpf_link_fops");
>          ASSERT_EQ(data->out__bpf_link_fops1, 0, "bpf_link_fops1");
>          ASSERT_EQ(data->out__btf_size, btf_size, "btf_size");
>  +       ASSERT_NEQ(data->out__fake_dst_ops, 0, "fake_dst_ops");
>          ASSERT_EQ(data->out__per_cpu_start, per_cpu_start_addr, "__per_cpu_start");
>
>   cleanup:
>  diff --git a/tools/testing/selftests/bpf/progs/test_ksyms.c b/tools/testing/selftests/bpf/progs/test_ksyms.c
>  index 6c9cbb5a3bdf..fe91eef54b66 100644
>  --- a/tools/testing/selftests/bpf/progs/test_ksyms.c
>  +++ b/tools/testing/selftests/bpf/progs/test_ksyms.c
>  @@ -9,11 +9,13 @@ __u64 out__bpf_link_fops = -1;
>   __u64 out__bpf_link_fops1 = -1;
>   __u64 out__btf_size = -1;
>   __u64 out__per_cpu_start = -1;
>  +__u64 out__fake_dst_ops = -1;
>
>   extern const void bpf_link_fops __ksym;
>   extern const void __start_BTF __ksym;
>   extern const void __stop_BTF __ksym;
>   extern const void __per_cpu_start __ksym;
>  +extern const void fake_dst_ops __ksym;
>   /* non-existing symbol, weak, default to zero */
>   extern const void bpf_link_fops1 __ksym __weak;
>
>  @@ -23,6 +25,7 @@ int handler(const void *ctx)
>          out__bpf_link_fops = (__u64)&bpf_link_fops;
>          out__btf_size = (__u64)(&__stop_BTF - &__start_BTF);
>          out__per_cpu_start = (__u64)&__per_cpu_start;
>  +       out__fake_dst_ops = (__u64)&fake_dst_ops;
>
>          out__bpf_link_fops1 = (__u64)&bpf_link_fops1;

This patch fixed the issue in libbpf such that
the suffix '.llvm.<hash>' will be ignored during comparison of
bpf prog ksym vs. symbols in /proc/kallsyms, this resolved the issue.
Currently, only static variables in /proc/kallsyms are checked
with '.llvm.<hash>' suffix since in bpf programs function ksyms
with '.llvm.<hash>' suffix are most likely kfunc's and unlikely
to be cross-file inlined.

Note that currently kernel does not support gcc build with lto.

  [1] https://lore.kernel.org/bpf/20240302165017.1627295-1-yonghong.song@linux.dev/
  [2] https://github.com/llvm/llvm-project/blob/release/18.x/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1714-L1719

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041458.1198161-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Yonghong Song
8b9cb7d479 libbpf: Mark libbpf_kallsyms_parse static function
Currently libbpf_kallsyms_parse() function is declared as a global
function but actually it is not a API and there is no external
users in bpftool/bpf-selftests. So let us mark the function as
static.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240326041453.1197949-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Mykyta Yatsenko
b062410166 bpf: improve error message for unsupported helper
BPF verifier emits "unknown func" message when given BPF program type
does not support BPF helper. This message may be confusing for users, as
important context that helper is unknown only to current program type is
not provided.

This patch changes message to "program of this type cannot use helper "
and aligns dependent code in libbpf and tests. Any suggestions on
improving/changing this message are welcome.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20240325152210.377548-1-yatsenko@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Anton Protopopov
89d8cdf741 bpf: Add support for passing mark with bpf_fib_lookup
Extend the bpf_fib_lookup() helper by making it to utilize mark if
the BPF_FIB_LOOKUP_MARK flag is set. In order to pass the mark the
four bytes of struct bpf_fib_lookup are used, shared with the
output-only smac/dmac fields.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240326101742.17421-2-aspsk@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-24 15:16:35 -07:00
Song Liu
8a2054f417 ci/diffs: Add temporary fix for mitigation config
Upstream is discussing the exact config to ship. In the meanwhile, which
would unblock CI.

More discussions here:

https://lore.kernel.org/lkml/20240423045548.1324969-1-song@kernel.org/T/#u

Signed-off-by: Song Liu <song@kernel.org>
2024-04-24 10:42:01 -07:00
thiagoftsm
b981a3a138 Merge branch 'libbpf:master' into master 2024-04-21 14:42:55 +00:00
Geyslan Gregório
46eafba62e ci: bump run-on-arch action to v2.7.1
More info: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

Signed-off-by: Geyslan Gregório <geyslan@gmail.com>
2024-04-02 21:51:25 -07:00
Geyslan Gregório
6d3595d215 ci: bump checkout action to v4
Due to the transition from Node 16 to Node 20, the checkout action
needs to be updated to v4.

More info: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/

Signed-off-by: Geyslan Gregório <geyslan@gmail.com>
2024-04-02 10:25:11 -07:00
Andrii Nakryiko
20ea95b450 ci: sync DENYLISTs with BPF CI
Keep all the denylisted tests in sync.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
902af6913a ci: clean up temporary patch
It's already applied upstream.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
25a9cc27d7 ci: regenerate latest vmlinux.h
Update vmlinux.h to make BPF selftests compile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
8db4a2feeb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e63985ecd22681c7f5975f2e8637187a326b6791
Checkpoint bpf-next commit: 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5
Baseline bpf commit:        2487007aa3b9fafbd2cb14068f49791ce1d7ede5
Checkpoint bpf commit:      443574b033876c85a35de4c65c14f7fe092222b2

Alexei Starovoitov (6):
  libbpf: Allow specifying 64-bit integers in map BTF.
  bpf: Introduce bpf_arena.
  bpf: Disasm support for addr_space_cast instruction.
  libbpf: Add __arg_arena to bpf_helpers.h
  libbpf: Add support for bpf_arena.
  libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM

Andrii Nakryiko (4):
  libbpf: Recognize __arena global variables.
  bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs
  libbpf: add support for BPF cookie for raw_tp/tp_btf programs
  libbpf: fix u64-to-pointer cast on 32-bit arches

Arnaldo Carvalho de Melo (1):
  libbpf: Define MFD_CLOEXEC if not available

Jakub Kicinski (2):
  netdev: add per-queue statistics
  netdev: add queue stat for alloc failures

Kui-Feng Lee (1):
  libbpf: Skip zeroed or null fields if not found in the kernel type.

Mykyta Yatsenko (1):
  libbpbpf: Check bpf_map/bpf_program fd validity

Quentin Monnet (1):
  libbpf: Prevent null-pointer dereference when prog to load has no BTF

Yonghong Song (2):
  libbpf: Add new sec_def "sk_skb/verdict"
  bpf: Sync uapi bpf.h to tools directory

 include/uapi/linux/bpf.h    |  20 ++-
 include/uapi/linux/netdev.h |  20 +++
 src/bpf.c                   |  16 +-
 src/bpf.h                   |   9 +
 src/bpf_helpers.h           |   2 +
 src/libbpf.c                | 322 +++++++++++++++++++++++++++++++-----
 src/libbpf.h                |  13 +-
 src/libbpf.map              |   2 +
 src/libbpf_probes.c         |   7 +
 9 files changed, 366 insertions(+), 45 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-25 21:58:26 -07:00
Arnaldo Carvalho de Melo
7fee466676 libbpf: Define MFD_CLOEXEC if not available
Since its going directly to the syscall to avoid not having
memfd_create() available in some systems, do the same for its
MFD_CLOEXEC flags, defining it if not available.

This fixes the build in those systems, noticed while building perf on a
set of build containers.

Fixes: 9fa5e1a180aa639f ("libbpf: Call memfd_create() syscall directly")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/ZfxZ9nCyKvwmpKkE@x1
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
ddf722fb5c libbpf: fix u64-to-pointer cast on 32-bit arches
It's been reported that (void *)map->map_extra is causing compilation
warnings on 32-bit architectures. It's easy enough to fix this by
casting to long first.

Fixes: 79ff13e99169 ("libbpf: Add support for bpf_arena.")
Reported-by: Ryan Eatmon <reatmon@ti.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319215143.1279312-1-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
137193b655 libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM
The selftests use
to tell LLVM about special pointers. For LLVM there is nothing "arena"
about them. They are simply pointers in a different address space.
Hence LLVM diff https://github.com/llvm/llvm-project/pull/85161 renamed:
. macro __BPF_FEATURE_ARENA_CAST -> __BPF_FEATURE_ADDR_SPACE_CAST
. global variables in __attribute__((address_space(N))) are now
  placed in section named ".addr_space.N" instead of ".arena.N".

Adjust libbpf, bpftool, and selftests to match LLVM.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20240315021834.62988-3-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Yonghong Song
d2676a58de bpf: Sync uapi bpf.h to tools directory
There is a difference between kernel uapi bpf.h and tools
uapi bpf.h. There is no functionality difference, but let
us sync properly to make it easy for later bpf.h update.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240325033842.1693553-1-yonghong.song@linux.dev
2024-03-25 21:58:26 -07:00
Yonghong Song
4d95d8b7f0 libbpf: Add new sec_def "sk_skb/verdict"
The new sec_def specifies sk_skb program type with
BPF_SK_SKB_VERDICT attachment type. This way, libbpf
will set expected_attach_type properly for the program.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240319175412.2941149-1-yonghong.song@linux.dev
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
f5828cc352 libbpf: add support for BPF cookie for raw_tp/tp_btf programs
Wire up BPF cookie passing or raw_tp and tp_btf programs, both in
low-level and high-level APIs.

Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-5-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
cbd6e3596c bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs
Wire up BPF cookie for raw tracepoint programs (both BTF and non-BTF
aware variants). This brings them up to part w.r.t. BPF cookie usage
with classic tracepoint and fentry/fexit programs.

Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Message-ID: <20240319233852.1977493-4-andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-25 21:58:26 -07:00
Mykyta Yatsenko
7cfc365995 libbpbpf: Check bpf_map/bpf_program fd validity
libbpf creates bpf_program/bpf_map structs for each program/map that
user defines, but it allows to disable creating/loading those objects in
kernel, in that case they won't have associated file descriptor
(fd < 0). Such functionality is used for backward compatibility
with some older kernels.

Nothing prevents users from passing these maps or programs with no
kernel counterpart to libbpf APIs. This change introduces explicit
checks for kernel objects existence, aiming to improve visibility of
those edge cases and provide meaningful warnings to users.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240318131808.95959-1-yatsenko@meta.com
2024-03-25 21:58:26 -07:00
Kui-Feng Lee
a5459eac49 libbpf: Skip zeroed or null fields if not found in the kernel type.
Accept additional fields of a struct_ops type with all zero values even if
these fields are not in the corresponding type in the kernel. This provides
a way to be backward compatible. User space programs can use the same map
on a machine running an old kernel by clearing fields that do not exist in
the kernel.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240313214139.685112-2-thinker.li@gmail.com
2024-03-25 21:58:26 -07:00
Quentin Monnet
f84ee80801 libbpf: Prevent null-pointer dereference when prog to load has no BTF
In bpf_objec_load_prog(), there's no guarantee that obj->btf is non-NULL
when passing it to btf__fd(), and this function does not perform any
check before dereferencing its argument (as bpf_object__btf_fd() used to
do). As a consequence, we get segmentation fault errors in bpftool (for
example) when trying to load programs that come without BTF information.

v2: Keep btf__fd() in the fix instead of reverting to bpf_object__btf_fd().

Fixes: df7c3f7d3a3d ("libbpf: make uniform use of btf__fd() accessor inside libbpf")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240314150438.232462-1-qmo@kernel.org
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
2d042d22a7 libbpf: Recognize __arena global variables.
LLVM automatically places __arena variables into ".arena.1" ELF section.
In order to use such global variables bpf program must include definition
of arena map in ".maps" section, like:
struct {
       __uint(type, BPF_MAP_TYPE_ARENA);
       __uint(map_flags, BPF_F_MMAPABLE);
       __uint(max_entries, 1000);         /* number of pages */
       __ulong(map_extra, 2ull << 44);    /* start of mmap() region */
} arena SEC(".maps");

libbpf recognizes both uses of arena and creates single `struct bpf_map *`
instance in libbpf APIs.
".arena.1" ELF section data is used as initial data image, which is exposed
through skeleton and bpf_map__initial_value() to the user, if they need to tune
it before the load phase. During load phase, this initial image is copied over
into mmap()'ed region corresponding to arena, and discarded.

Few small checks here and there had to be added to make sure this
approach works with bpf_map__initial_value(), mostly due to hard-coded
assumption that map->mmaped is set up with mmap() syscall and should be
munmap()'ed. For arena, .arena.1 can be (much) smaller than maximum
arena size, so this smaller data size has to be tracked separately.
Given it is enforced that there is only one arena for entire bpf_object
instance, we just keep it in a separate field. This can be generalized
if necessary later.

All global variables from ".arena.1" section are accessible from user space
via skel->arena->name_of_var.

For bss/data/rodata the skeleton/libbpf perform the following sequence:
1. addr = mmap(MAP_ANONYMOUS)
2. user space optionally modifies global vars
3. map_fd = bpf_create_map()
4. bpf_update_map_elem(map_fd, addr) // to store values into the kernel
5. mmap(addr, MAP_FIXED, map_fd)
after step 5 user spaces see the values it wrote at step 2 at the same addresses

arena doesn't support update_map_elem. Hence skeleton/libbpf do:
1. addr = malloc(sizeof SEC ".arena.1")
2. user space optionally modifies global vars
3. map_fd = bpf_create_map(MAP_TYPE_ARENA)
4. real_addr = mmap(map->map_extra, MAP_SHARED | MAP_FIXED, map_fd)
5. memcpy(real_addr, addr) // this will fault-in and allocate pages

At the end look and feel of global data vs __arena global data is the same from
bpf prog pov.

Another complication is:
struct {
  __uint(type, BPF_MAP_TYPE_ARENA);
} arena SEC(".maps");

int __arena foo;
int bar;

  ptr1 = &foo;   // relocation against ".arena.1" section
  ptr2 = &arena; // relocation against ".maps" section
  ptr3 = &bar;   // relocation against ".bss" section

Fo the kernel ptr1 and ptr2 has point to the same arena's map_fd
while ptr3 points to a different global array's map_fd.
For the verifier:
ptr1->type == unknown_scalar
ptr2->type == const_ptr_to_map
ptr3->type == ptr_to_map_value

After verification, from JIT pov all 3 ptr-s are normal ld_imm64 insns.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20240308010812.89848-11-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
4524a45a2a libbpf: Add support for bpf_arena.
mmap() bpf_arena right after creation, since the kernel needs to
remember the address returned from mmap. This is user_vm_start.
LLVM will generate bpf_arena_cast_user() instructions where
necessary and JIT will add upper 32-bit of user_vm_start
to such pointers.

Fix up bpf_map_mmap_sz() to compute mmap size as
map->value_size * map->max_entries for arrays and
PAGE_SIZE * map->max_entries for arena.

Don't set BTF at arena creation time, since it doesn't support it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240308010812.89848-9-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
086825355f libbpf: Add __arg_arena to bpf_helpers.h
Add __arg_arena to bpf_helpers.h

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240308010812.89848-8-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
6de941bc1e bpf: Disasm support for addr_space_cast instruction.
LLVM generates rX = addr_space_cast(rY, dst_addr_space, src_addr_space)
instruction when pointers in non-zero address space are used by the bpf
program. Recognize this insn in uapi and in bpf disassembler.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20240308010812.89848-3-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
1675c13fae bpf: Introduce bpf_arena.
Introduce bpf_arena, which is a sparse shared memory region between the bpf
program and user space.

Use cases:
1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed
   anonymous region, like memcached or any key/value storage. The bpf
   program implements an in-kernel accelerator. XDP prog can search for
   a key in bpf_arena and return a value without going to user space.
2. The bpf program builds arbitrary data structures in bpf_arena (hash
   tables, rb-trees, sparse arrays), while user space consumes it.
3. bpf_arena is a "heap" of memory from the bpf program's point of view.
   The user space may mmap it, but bpf program will not convert pointers
   to user base at run-time to improve bpf program speed.

Initially, the kernel vm_area and user vma are not populated. User space
can fault in pages within the range. While servicing a page fault,
bpf_arena logic will insert a new page into the kernel and user vmas. The
bpf program can allocate pages from that region via
bpf_arena_alloc_pages(). This kernel function will insert pages into the
kernel vm_area. The subsequent fault-in from user space will populate that
page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena creation time
can be used to prevent fault-in from user space. In such a case, if a page
is not allocated by the bpf program and not present in the kernel vm_area,
the user process will segfault. This is useful for use cases 2 and 3 above.

bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages
either at a specific address within the arena or allocates a range with the
maple tree. bpf_arena_free_pages() is analogous to munmap(), which frees
pages and removes the range from the kernel vm_area and from user process
vmas.

bpf_arena can be used as a bpf program "heap" of up to 4GB. The speed of
bpf program is more important than ease of sharing with user space. This is
use case 3. In such a case, the BPF_F_NO_USER_CONV flag is recommended.
It will tell the verifier to treat the rX = bpf_arena_cast_user(rY)
instruction as a 32-bit move wX = wY, which will improve bpf prog
performance. Otherwise, bpf_arena_cast_user is translated by JIT to
conditionally add the upper 32 bits of user vm_start (if the pointer is not
NULL) to arena pointers before they are stored into memory. This way, user
space sees them as valid 64-bit pointers.

Diff https://github.com/llvm/llvm-project/pull/84410 enables LLVM BPF
backend generate the bpf_addr_space_cast() instruction to cast pointers
between address_space(1) which is reserved for bpf_arena pointers and
default address space zero. All arena pointers in a bpf program written in
C language are tagged as __attribute__((address_space(1))). Hence, clang
provides helpful diagnostics when pointers cross address space. Libbpf and
the kernel support only address_space == 1. All other address space
identifiers are reserved.

rX = bpf_addr_space_cast(rY, /* dst_as */ 1, /* src_as */ 0) tells the
verifier that rX->type = PTR_TO_ARENA. Any further operations on
PTR_TO_ARENA register have to be in the 32-bit domain. The verifier will
mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate
them as kern_vm_start + 32bit_addr memory accesses. The behavior is similar
to copy_from_kernel_nofault() except that no address checks are necessary.
The address is guaranteed to be in the 4GB range. If the page is not
present, the destination register is zeroed on read, and the operation is
ignored on write.

rX = bpf_addr_space_cast(rY, 0, 1) tells the verifier that rX->type =
unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the
verifier converts such cast instructions to mov32. Otherwise, JIT will emit
native code equivalent to:
rX = (u32)rY;
if (rY)
  rX |= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */

After such conversion, the pointer becomes a valid user pointer within
bpf_arena range. The user process can access data structures created in
bpf_arena without any additional computations. For example, a linked list
built by a bpf program can be walked natively by user space.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Barret Rhoden <brho@google.com>
Link: https://lore.kernel.org/bpf/20240308010812.89848-2-alexei.starovoitov@gmail.com
2024-03-25 21:58:26 -07:00
Alexei Starovoitov
385d344839 libbpf: Allow specifying 64-bit integers in map BTF.
__uint() macro that is used to specify map attributes like:
  __uint(type, BPF_MAP_TYPE_ARRAY);
  __uint(map_flags, BPF_F_MMAPABLE);
It is limited to 32-bit, since BTF_KIND_ARRAY has u32 "number of elements"
field in "struct btf_array".

Introduce __ulong() macro that allows specifying values bigger than 32-bit.
In map definition "map_extra" is the only u64 field, so far.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20240307031228.42896-5-alexei.starovoitov@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-03-25 21:58:26 -07:00
Jakub Kicinski
d71c0ed2ef netdev: add queue stat for alloc failures
Rx alloc failures are commonly counted by drivers.
Support reporting those via netdev-genl queue stats.

Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-25 21:58:26 -07:00
Jakub Kicinski
5e80833e50 netdev: add per-queue statistics
The ethtool-nl family does a good job exposing various protocol
related and IEEE/IETF statistics which used to get dumped under
ethtool -S, with creative names. Queue stats don't have a netlink
API, yet, and remain a lion's share of ethtool -S output for new
drivers. Not only is that bad because the names differ driver to
driver but it's also bug-prone. Intuitively drivers try to report
only the stats for active queues, but querying ethtool stats
involves multiple system calls, and the number of stats is
read separately from the stats themselves. Worse still when user
space asks for values of the stats, it doesn't inform the kernel
how big the buffer is. If number of stats increases in the meantime
kernel will overflow user buffer.

Add a netlink API for dumping queue stats. Queue information is
exposed via the netdev-genl family, so add the stats there.
Support per-queue and sum-for-device dumps. Latter will be useful
when subsequent patches add more interesting common stats than
just bytes and packets.

The API does not currently distinguish between HW and SW stats.
The expectation is that the source of the stats will either not
matter much (good packets) or be obvious (skb alloc errors).

Acked-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-25 21:58:26 -07:00
Andrii Nakryiko
2778cbce60 ci: add xdp_bonding fixes from bpf/master
bpf tree has fixes for xdp_bonding selftests which are not yet in
bpf-next, so add them as temporary CI-only patches.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-06 15:22:13 -08:00
Andrii Nakryiko
4f875865b7 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2ab256e93249f5ac1da665861aa0f03fb4208d9c
Checkpoint bpf-next commit: 7d763bc4a44a51e48dde406d6c6c8a26a60ec647
Baseline bpf commit:        dced881ead78e4d6add3735d02a9186ba2415630
Checkpoint bpf commit:      2487007aa3b9fafbd2cb14068f49791ce1d7ede5

Aahil Awatramani (1):
  bonding: Add independent control state machine

Alexei Starovoitov (1):
  bpf: Introduce may_goto instruction

Chen Shen (1):
  libbpf: Correct debug message in btf__load_vmlinux_btf

Eduard Zingerman (7):
  libbpf: Allow version suffixes (___smth) for struct_ops types
  libbpf: Tie struct_ops programs to kernel BTF ids, not to local ids
  libbpf: Honor autocreate flag for struct_ops maps
  libbpf: Sync progs autoload with maps autocreate for struct_ops maps
  libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_type
  libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link")
  libbpf: Rewrite btf datasec names starting from '?'

Kees Cook (1):
  bpf: Replace bpf_lpm_trie_key 0-length array with flexible array

Kui-Feng Lee (2):
  libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
  libbpf: Convert st_ops->data to shadow type.

 include/uapi/linux/bpf.h     |  24 +++-
 include/uapi/linux/if_link.h |   1 +
 src/btf.c                    |   2 +-
 src/features.c               |  22 +++
 src/libbpf.c                 | 252 +++++++++++++++++++++++++++--------
 src/libbpf_internal.h        |   2 +
 6 files changed, 242 insertions(+), 61 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-06 15:22:13 -08:00
Eduard Zingerman
438adf417d libbpf: Rewrite btf datasec names starting from '?'
Optional struct_ops maps are defined using question mark at the start
of the section name, e.g.:

    SEC("?.struct_ops")
    struct test_ops optional_map = { ... };

This commit teaches libbpf to detect if kernel allows '?' prefix
in datasec names, and if it doesn't then to rewrite such names
by replacing '?' with '_', e.g.:

    DATASEC ?.struct_ops -> DATASEC _.struct_ops

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-13-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
fc8b86bda2 libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link")
Allow using two new section names for struct_ops maps:
- SEC("?.struct_ops")
- SEC("?.struct_ops.link")

To specify maps that have bpf_map->autocreate == false after open.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-12-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
d5d0b6e920 libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_type
The next patch would add two new section names for struct_ops maps.
To make working with multiple struct_ops sections more convenient:
- remove fields like elf_state->st_ops_{shndx,link_shndx};
- mark section descriptions hosting struct_ops as
  elf_sec_desc->sec_type == SEC_ST_OPS;

After these changes struct_ops sections could be processed uniformly
by iterating bpf_object->efile.secs entries.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-11-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
060c604db8 libbpf: Sync progs autoload with maps autocreate for struct_ops maps
Automatically select which struct_ops programs to load depending on
which struct_ops maps are selected for automatic creation.
E.g. for the BPF code below:

    SEC("struct_ops/test_1") int BPF_PROG(foo) { ... }
    SEC("struct_ops/test_2") int BPF_PROG(bar) { ... }

    SEC(".struct_ops.link")
    struct test_ops___v1 A = {
        .foo = (void *)foo
    };

    SEC(".struct_ops.link")
    struct test_ops___v2 B = {
        .foo = (void *)foo,
        .bar = (void *)bar,
    };

And the following libbpf API calls:

    bpf_map__set_autocreate(skel->maps.A, true);
    bpf_map__set_autocreate(skel->maps.B, false);

The autoload would be enabled for program 'foo' and disabled for
program 'bar'.

During load, for each struct_ops program P, referenced from some
struct_ops map M:
- set P.autoload = true if M.autocreate is true for some M;
- set P.autoload = false if M.autocreate is false for all M;
- don't change P.autoload, if P is not referenced from any map.

Do this after bpf_object__init_kern_struct_ops_maps()
to make sure that shadow vars assignment is done.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-9-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
cb426140d0 libbpf: Honor autocreate flag for struct_ops maps
Skip load steps for struct_ops maps not marked for automatic creation.
This should allow to load bpf object in situations like below:

    SEC("struct_ops/foo") int BPF_PROG(foo) { ... }
    SEC("struct_ops/bar") int BPF_PROG(bar) { ... }

    struct test_ops___v1 {
    	int (*foo)(void);
    };

    struct test_ops___v2 {
    	int (*foo)(void);
    	int (*does_not_exist)(void);
    };

    SEC(".struct_ops.link")
    struct test_ops___v1 map_for_old = {
    	.test_1 = (void *)foo
    };

    SEC(".struct_ops.link")
    struct test_ops___v2 map_for_new = {
    	.test_1 = (void *)foo,
    	.does_not_exist = (void *)bar
    };

Suppose program is loaded on old kernel that does not have definition
for 'does_not_exist' struct_ops member. After this commit it would be
possible to load such object file after the following tweaks:

    bpf_program__set_autoload(skel->progs.bar, false);
    bpf_map__set_autocreate(skel->maps.map_for_new, false);

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20240306104529.6453-4-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
aded62b120 libbpf: Tie struct_ops programs to kernel BTF ids, not to local ids
Enforce the following existing limitation on struct_ops programs based
on kernel BTF id instead of program-local BTF id:

    struct_ops BPF prog can be re-used between multiple .struct_ops &
    .struct_ops.link as long as it's the same struct_ops struct
    definition and the same function pointer field

This allows reusing same BPF program for versioned struct_ops map
definitions, e.g.:

    SEC("struct_ops/test")
    int BPF_PROG(foo) { ... }

    struct some_ops___v1 { int (*test)(void); };
    struct some_ops___v2 { int (*test)(void); };

    SEC(".struct_ops.link") struct some_ops___v1 a = { .test = foo }
    SEC(".struct_ops.link") struct some_ops___v2 b = { .test = foo }

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-3-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Eduard Zingerman
f7fd5dbc07 libbpf: Allow version suffixes (___smth) for struct_ops types
E.g. allow the following struct_ops definitions:

    struct bpf_testmod_ops___v1 { int (*test)(void); };
    struct bpf_testmod_ops___v2 { int (*test)(void); };

    SEC(".struct_ops.link")
    struct bpf_testmod_ops___v1 a = { .test = ... }
    SEC(".struct_ops.link")
    struct bpf_testmod_ops___v2 b = { .test = ... }

Where both bpf_testmod_ops__v1 and bpf_testmod_ops__v2 would be
resolved as 'struct bpf_testmod_ops' from kernel BTF.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240306104529.6453-2-eddyz87@gmail.com
2024-03-06 13:58:27 -08:00
Alexei Starovoitov
00b08dceea bpf: Introduce may_goto instruction
Introduce may_goto instruction that from the verifier pov is similar to
open coded iterators bpf_for()/bpf_repeat() and bpf_loop() helper, but it
doesn't iterate any objects.
In assembly 'may_goto' is a nop most of the time until bpf runtime has to
terminate the program for whatever reason. In the current implementation
may_goto has a hidden counter, but other mechanisms can be used.
For programs written in C the later patch introduces 'cond_break' macro
that combines 'may_goto' with 'break' statement and has similar semantics:
cond_break is a nop until bpf runtime has to break out of this loop.
It can be used in any normal "for" or "while" loop, like

  for (i = zero; i < cnt; cond_break, i++) {

The verifier recognizes that may_goto is used in the program, reserves
additional 8 bytes of stack, initializes them in subprog prologue, and
replaces may_goto instruction with:
aux_reg = *(u64 *)(fp - 40)
if aux_reg == 0 goto pc+off
aux_reg -= 1
*(u64 *)(fp - 40) = aux_reg

may_goto instruction can be used by LLVM to implement __builtin_memcpy,
__builtin_strcmp.

may_goto is not a full substitute for bpf_for() macro.
bpf_for() doesn't have induction variable that verifiers sees,
so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded.

But when the code is written as:
for (i = 0; i < 100; cond_break, i++)
the verifier see 'i' as precise constant zero,
hence cond_break (aka may_goto) doesn't help to converge the loop.
A static or global variable can be used as a workaround:
static int zero = 0;
for (i = zero; i < 100; cond_break, i++) // works!

may_goto works well with arena pointers that don't need to be bounds
checked on access. Load/store from arena returns imprecise unbounded
scalar and loops with may_goto pass the verifier.

Reserve new opcode BPF_JMP | BPF_JCOND for may_goto insn.
JCOND stands for conditional pseudo jump.
Since goto_or_nop insn was proposed, it may use the same opcode.
may_goto vs goto_or_nop can be distinguished by src_reg:
code = BPF_JMP | BPF_JCOND
src_reg = 0 - may_goto
src_reg = 1 - goto_or_nop

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com
2024-03-06 13:58:27 -08:00
Chen Shen
bf52494e2b libbpf: Correct debug message in btf__load_vmlinux_btf
In the function btf__load_vmlinux_btf, the debug message incorrectly
refers to 'path' instead of 'sysfs_btf_path'.

Signed-off-by: Chen Shen <peterchenshen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240302062218.3587-1-peterchenshen@gmail.com
2024-03-06 13:58:27 -08:00
Kui-Feng Lee
acfaeffeaa libbpf: Convert st_ops->data to shadow type.
Convert st_ops->data to the shadow type of the struct_ops map. The shadow
type of a struct_ops type is a variant of the original struct type
providing a way to access/change the values in the maps of the struct_ops
type.

bpf_map__initial_value() will return st_ops->data for struct_ops types. The
skeleton is going to use it as the pointer to the shadow type of the
original struct type.

One of the main differences between the original struct type and the shadow
type is that all function pointers of the shadow type are converted to
pointers of struct bpf_program. Users can replace these bpf_program
pointers with other BPF programs. The st_ops->progs[] will be updated
before updating the value of a map to reflect the changes made by users.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240229064523.2091270-3-thinker.li@gmail.com
2024-03-06 13:58:27 -08:00
Kui-Feng Lee
0758d8b0f2 libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
For a struct_ops map, btf_value_type_id is the type ID of it's struct
type. This value is required by bpftool to generate skeleton including
pointers of shadow types. The code generator gets the type ID from
bpf_map__btf_value_type_id() in order to get the type information of the
struct type of a map.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240229064523.2091270-2-thinker.li@gmail.com
2024-03-06 13:58:27 -08:00
Kees Cook
fa4d00254d bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
Replace deprecated 0-length array in struct bpf_lpm_trie_key with
flexible array. Found with GCC 13:

../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=]
  207 |                                        *(__be16 *)&key->data[i]);
      |                                                   ^~~~~~~~~~~~~
../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16'
  102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
      |                                                      ^
../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu'
   97 | #define be16_to_cpu __be16_to_cpu
      |                     ^~~~~~~~~~~~~
../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu'
  206 |                 u16 diff = be16_to_cpu(*(__be16 *)&node->data[i]
^
      |                            ^~~~~~~~~~~
In file included from ../include/linux/bpf.h:7:
../include/uapi/linux/bpf.h:82:17: note: while referencing 'data'
   82 |         __u8    data[0];        /* Arbitrary size */
      |                 ^~~~

And found at run-time under CONFIG_FORTIFY_SOURCE:

  UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49
  index 0 is out of range for type '__u8 [*]'

Changing struct bpf_lpm_trie_key is difficult since has been used by
userspace. For example, in Cilium:

	struct egress_gw_policy_key {
	        struct bpf_lpm_trie_key lpm_key;
	        __u32 saddr;
	        __u32 daddr;
	};

While direct references to the "data" member haven't been found, there
are static initializers what include the final member. For example,
the "{}" here:

        struct egress_gw_policy_key in_key = {
                .lpm_key = { 32 + 24, {} },
                .saddr   = CLIENT_IP,
                .daddr   = EXTERNAL_SVC_IP & 0Xffffff,
        };

To avoid the build time and run time warnings seen with a 0-sized
trailing array for struct bpf_lpm_trie_key, introduce a new struct
that correctly uses a flexible array for the trailing bytes,
struct bpf_lpm_trie_key_u8. As part of this, include the "header"
portion (which is just the "prefixlen" member), so it can be used
by anything building a bpf_lpr_trie_key that has trailing members that
aren't a u8 flexible array (like the self-test[1]), which is named
struct bpf_lpm_trie_key_hdr.

Unfortunately, C++ refuses to parse the __struct_group() helper, so
it is not possible to define struct bpf_lpm_trie_key_hdr directly in
struct bpf_lpm_trie_key_u8, so we must open-code the union directly.

Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out,
and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment
to the UAPI header directing folks to the two new options.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Closes: https://paste.debian.net/hidden/ca500597/
Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1]
Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org
2024-03-06 13:58:27 -08:00
Aahil Awatramani
f749be80b7 bonding: Add independent control state machine
Add support for the independent control state machine per IEEE
802.1AX-2008 5.4.15 in addition to the existing implementation of the
coupled control state machine.

Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
the LACP MUX state machine for separated handling of an initial
Collecting state before the Collecting and Distributing state. This
enables a port to be in a state where it can receive incoming packets
while not still distributing. This is useful for reducing packet loss when
a port begins distributing before its partner is able to collect.

Added new functions such as bond_set_slave_tx_disabled_flags and
bond_set_slave_rx_enabled_flags to precisely manage the port's collecting
and distributing states. Previously, there was no dedicated method to
disable TX while keeping RX enabled, which this patch addresses.

Note that the regular flow process in the kernel's bonding driver remains
unaffected by this patch. The extension requires explicit opt-in by the
user (in order to ensure no disruptions for existing setups) via netlink
support using the new bonding parameter coupled_control. The default value
for coupled_control is set to 1 so as to preserve existing behaviour.

Signed-off-by: Aahil Awatramani <aahila@google.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240202175858.1573852-1-aahila@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-06 13:58:27 -08:00
Andrii Nakryiko
fb98d4bd25 include: fix BPF_CALL_REL definition
Fix our Github-specific definition of BPF_CALL_REL macro. It was missing
the code part.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-03-01 15:39:45 -08:00
Kui-Feng Lee
f4e9b606f4 ci: clean up bpf_test_no_cfi.ko for v5.5.0 and v4.9.0.
bpf_test_no_cfi.ko is not available for v5.5.0 and v4.9.0.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
2024-02-27 10:14:31 -08:00
Kui-Feng Lee
ff95bd6238 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   92a871ab9fa59a74d013bc04f321026a057618e7
Checkpoint bpf-next commit: 2ab256e93249f5ac1da665861aa0f03fb4208d9c
Baseline bpf commit:        577e4432f3ac810049cb7e6b71f4d96ec7c6e894
Checkpoint bpf commit:      dced881ead78e4d6add3735d02a9186ba2415630

Arnaldo Carvalho de Melo (1):
  tools headers UAPI: Sync linux/fcntl.h with the kernel sources

Cupertino Miranda (1):
  libbpf: Add support to GCC in CORE macro definitions

Martin Kelly (1):
  bpf: Clarify batch lookup/lookup_and_delete semantics

Matt Bobrowski (1):
  libbpf: Make remark about zero-initializing bpf_*_info structs

 include/uapi/linux/bpf.h   |  6 ++++-
 include/uapi/linux/fcntl.h |  3 +++
 src/bpf.h                  | 39 ++++++++++++++++++++++++---------
 src/bpf_core_read.h        | 45 ++++++++++++++++++++++++++++++++------
 4 files changed, 75 insertions(+), 18 deletions(-)

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
2024-02-27 10:14:31 -08:00
Arnaldo Carvalho de Melo
a894b0cb9b tools headers UAPI: Sync linux/fcntl.h with the kernel sources
To get the changes in:

  8a924db2d7b5eb69 ("fs: Pass AT_GETATTR_NOSEC flag to getattr interface function")

That don't add anything that is handled by existing hard coded tables or
table generation scripts.

This silences this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Link: https://lore.kernel.org/lkml/ZbJv9fGF_k2xXEdr@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-02-27 10:14:31 -08:00
Martin Kelly
afa81fb1cb bpf: Clarify batch lookup/lookup_and_delete semantics
The batch lookup and lookup_and_delete APIs have two parameters,
in_batch and out_batch, to facilitate iterative
lookup/lookup_and_deletion operations for supported maps. Except NULL
for in_batch at the start of these two batch operations, both parameters
need to point to memory equal or larger than the respective map key
size, except for various hashmaps (hash, percpu_hash, lru_hash,
lru_percpu_hash) where the in_batch/out_batch memory size should be
at least 4 bytes.

Document these semantics to clarify the API.

Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240221211838.1241578-1-martin.kelly@crowdstrike.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-02-27 10:14:31 -08:00
Matt Bobrowski
16e68ab13c libbpf: Make remark about zero-initializing bpf_*_info structs
In some situations, if you fail to zero-initialize the
bpf_{prog,map,btf,link}_info structs supplied to the set of LIBBPF
helpers bpf_{prog,map,btf,link}_get_info_by_fd(), you can expect the
helper to return an error. This can possibly leave people in a
situation where they're scratching their heads for an unnnecessary
amount of time. Make an explicit remark about the requirement of
zero-initializing the supplied bpf_{prog,map,btf,link}_info structs
for the respective LIBBPF helpers.

Internally, LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd()
call into bpf_obj_get_info_by_fd() where the bpf(2)
BPF_OBJ_GET_INFO_BY_FD command is used. This specific command is
effectively backed by restrictions enforced by the
bpf_check_uarg_tail_zero() helper. This function ensures that if the
size of the supplied bpf_{prog,map,btf,link}_info structs are larger
than what the kernel can handle, trailing bits are zeroed. This can be
a problem when compiling against UAPI headers that don't necessarily
match the sizes of the same underlying types known to the kernel.

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/ZcyEb8x4VbhieWsL@google.com
2024-02-27 10:14:31 -08:00
Cupertino Miranda
b19fdbf1be libbpf: Add support to GCC in CORE macro definitions
Due to internal differences between LLVM and GCC the current
implementation for the CO-RE macros does not fit GCC parser, as it will
optimize those expressions even before those would be accessible by the
BPF backend.

As examples, the following would be optimized out with the original
definitions:
  - As enums are converted to their integer representation during
  parsing, the IR would not know how to distinguish an integer
  constant from an actual enum value.
  - Types need to be kept as temporary variables, as the existing type
  casts of the 0 address (as expanded for LLVM), are optimized away by
  the GCC C parser, never really reaching GCCs IR.

Although, the macros appear to add extra complexity, the expanded code
is removed from the compilation flow very early in the compilation
process, not really affecting the quality of the generated assembly.

Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240213173543.1397708-1-cupertino.miranda@oracle.com
2024-02-27 10:14:31 -08:00
Manu Bretelle
445486dcbf ci: Pass arch parameter to setup-build-env
Since 1bc40aecb3
arch parameter needs to be passed to `setup-build-env`

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
2024-02-15 10:44:45 -08:00
Andrii Nakryiko
820bca2cb6 ci: verifier_global_subprogs can't be run on 5.5
We get:

  libbpf: struct_ops init_kern: struct bpf_dummy_ops is not found in kernel BTF

So even though it's irrelevant to the subtests we do want to test,
entire test has to be skipped, unfortunately.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-02-06 11:52:00 -08:00
Andrii Nakryiko
8a8feae5f4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   943b043aeecce9accb6d367af47791c633e95e4d
Checkpoint bpf-next commit: 92a871ab9fa59a74d013bc04f321026a057618e7
Baseline bpf commit:        577e4432f3ac810049cb7e6b71f4d96ec7c6e894
Checkpoint bpf commit:      577e4432f3ac810049cb7e6b71f4d96ec7c6e894

Andrii Nakryiko (1):
  libbpf: fix return value for PERF_EVENT __arg_ctx type fix up check

Toke Høiland-Jørgensen (1):
  libbpf: Use OPTS_SET() macro in bpf_xdp_query()

 src/libbpf.c  | 6 +++---
 src/netlink.c | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-02-06 11:52:00 -08:00
Toke Høiland-Jørgensen
a20b60f971 libbpf: Use OPTS_SET() macro in bpf_xdp_query()
When the feature_flags and xdp_zc_max_segs fields were added to the libbpf
bpf_xdp_query_opts, the code writing them did not use the OPTS_SET() macro.
This causes libbpf to write to those fields unconditionally, which means
that programs compiled against an older version of libbpf (with a smaller
size of the bpf_xdp_query_opts struct) will have its stack corrupted by
libbpf writing out of bounds.

The patch adding the feature_flags field has an early bail out if the
feature_flags field is not part of the opts struct (via the OPTS_HAS)
macro, but the patch adding xdp_zc_max_segs does not. For consistency, this
fix just changes the assignments to both fields to use the OPTS_SET()
macro.

Fixes: 13ce2daa259a ("xsk: add new netlink attribute dedicated for ZC max frags")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240206125922.1992815-1-toke@redhat.com
2024-02-06 11:52:00 -08:00
Andrii Nakryiko
b24a6277cc libbpf: fix return value for PERF_EVENT __arg_ctx type fix up check
If PERF_EVENT program has __arg_ctx argument with matching
architecture-specific pt_regs/user_pt_regs/user_regs_struct pointer
type, libbpf should still perform type rewrite for old kernels, but not
emit the warning. Fix copy/paste from kernel code where 0 is meant to
signify "no error" condition. For libbpf we need to return "true" to
proceed with type rewrite (which for PERF_EVENT program will be
a canonical `struct bpf_perf_event_data *` type).

Fixes: 9eea8fafe33e ("libbpf: fix __arg_ctx type enforcement for perf_event programs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240206002243.1439450-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-06 11:52:00 -08:00
Andrii Nakryiko
25fe467af4 ci: allowlist tests validating libbpf's __arg_ctx type rewrite logic
Allowlist test_global_funcs/arg_tag_ctx* and a few of
verifier_global_subprogs subtests that validate libbpf's logic for
rewriting __arg_ctx globl subprog argument types on kernels that don't
natively support __arg_ctx.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-02-06 10:17:28 -08:00
Andrii Nakryiko
f11758a780 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ced33f2cfa21a14a292a00e31dc9f85c1bfbda1c
Checkpoint bpf-next commit: 943b043aeecce9accb6d367af47791c633e95e4d
Baseline bpf commit:        577e4432f3ac810049cb7e6b71f4d96ec7c6e894
Checkpoint bpf commit:      577e4432f3ac810049cb7e6b71f4d96ec7c6e894

Andrii Nakryiko (8):
  libbpf: integrate __arg_ctx feature detector into kernel_supports()
  libbpf: fix __arg_ctx type enforcement for perf_event programs
  libbpf: add __arg_trusted and __arg_nullable tag macros
  libbpf: add bpf_core_cast() macro
  libbpf: Call memfd_create() syscall directly
  libbpf: Add missing LIBBPF_API annotation to libbpf_set_memlock_rlim
    API
  libbpf: Add btf__new_split() API that was declared but not implemented
  libbpf: Add missed btf_ext__raw_data() API

Eduard Zingerman (1):
  libbpf: Remove unnecessary null check in kernel_supports()

Ian Rogers (1):
  libbpf: Add some details for BTF parsing failures

 src/bpf.h             |  2 +-
 src/bpf_core_read.h   | 13 ++++++
 src/bpf_helpers.h     |  2 +
 src/btf.c             | 33 ++++++++++++---
 src/features.c        | 58 +++++++++++++++++++++++++
 src/libbpf.c          | 99 ++++++++++++++-----------------------------
 src/libbpf.map        |  5 ++-
 src/libbpf_internal.h |  2 +
 src/linker.c          |  2 +-
 9 files changed, 140 insertions(+), 76 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
cbb8ba352d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
95b4beb502 libbpf: Add missed btf_ext__raw_data() API
Another API that was declared in libbpf.map but actual implementation
was missing. btf_ext__get_raw_data() was intended as a discouraged alias
to consistently-named btf_ext__raw_data(), so make this an actuality.

Fixes: 20eccf29e297 ("libbpf: hide and discourage inconsistently named getters")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-5-andrii@kernel.org
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
5b7613e50f libbpf: Add btf__new_split() API that was declared but not implemented
Seems like original commit adding split BTF support intended to add
btf__new_split() API, and even declared it in libbpf.map, but never
added (trivial) implementation. Fix this.

Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-4-andrii@kernel.org
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
245394fb36 libbpf: Add missing LIBBPF_API annotation to libbpf_set_memlock_rlim API
LIBBPF_API annotation seems missing on libbpf_set_memlock_rlim API, so
add it to make this API callable from libbpf's shared library version.

Fixes: e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF")
Fixes: ab9a5a05dc48 ("libbpf: fix up few libbpf.map problems")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-3-andrii@kernel.org
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
3b19b1bb55 libbpf: Call memfd_create() syscall directly
Some versions of Android do not implement memfd_create() wrapper in
their libc implementation, leading to build failures ([0]). On the other
hand, memfd_create() is available as a syscall on quite old kernels
(3.17+, while bpf() syscall itself is available since 3.18+), so it is
ok to assume that syscall availability and call into it with syscall()
helper to avoid Android-specific workarounds.

Validated in libbpf-bootstrap's CI ([1]).

  [0] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7701003207/job/20986080319#step:5:83
  [1] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7715988887/job/21031767212?pr=253

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240201172027.604869-2-andrii@kernel.org
2024-02-01 15:10:17 -08:00
Eduard Zingerman
7529e0c4c7 libbpf: Remove unnecessary null check in kernel_supports()
After recent changes, Coverity complained about inconsistent null checks
in kernel_supports() function:

    kernel_supports(const struct bpf_object *obj, ...)
    [...]
    // var_compare_op: Comparing obj to null implies that obj might be null
    if (obj && obj->gen_loader)
        return true;

    // var_deref_op: Dereferencing null pointer obj
    if (obj->token_fd)
        return feat_supported(obj->feat_cache, feat_id);
    [...]

- The original null check was introduced by commit [0], which introduced
  a call `kernel_supports(NULL, ...)` in function bump_rlimit_memlock();
- This call was refactored to use `feat_supported(NULL, ...)` in commit [1].

Looking at all places where kernel_supports() is called:

- There is either `obj->...` access before the call;
- Or `obj` comes from `prog->obj` expression, where `prog` comes from
  enumeration of programs in `obj`;
- Or `obj` comes from `prog->obj`, where `prog` is a parameter to one
  of the API functions:
  - bpf_program__attach_kprobe_opts;
  - bpf_program__attach_kprobe;
  - bpf_program__attach_ksyscall.

Assuming correct API usage, it appears that `obj` can never be null when
passed to kernel_supports(). Silence the Coverity warning by removing
redundant null check.

  [0] e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF")
  [1] d6dd1d49367a ("libbpf: Further decouple feature checking logic from bpf_object")

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240131212615.20112-1-eddyz87@gmail.com
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
688879fb01 libbpf: add bpf_core_cast() macro
Add bpf_core_cast() macro that wraps bpf_rdonly_cast() kfunc. It's more
ergonomic than kfunc, as it automatically extracts btf_id with
bpf_core_type_id_kernel(), and works with type names. It also casts result
to (T *) pointer. See the definition of the macro, it's self-explanatory.

libbpf declares bpf_rdonly_cast() extern as __weak __ksym and should be
safe to not conflict with other possible declarations in user code.

But we do have a conflict with current BPF selftests that declare their
externs with first argument as `void *obj`, while libbpf opts into more
permissive `const void *obj`. This causes conflict, so we fix up BPF
selftests uses in the same patch.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130212023.183765-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
0303e25be3 libbpf: add __arg_trusted and __arg_nullable tag macros
Add __arg_trusted to annotate global func args that accept trusted
PTR_TO_BTF_ID arguments.

Also add __arg_nullable to combine with __arg_trusted (and maybe other
tags in the future) to force global subprog itself (i.e., callee) to do
NULL checks, as opposed to default non-NULL semantics (and thus caller's
responsibility to ensure non-NULL values).

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240130000648.2144827-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-01 15:10:17 -08:00
Ian Rogers
9b306ac9be libbpf: Add some details for BTF parsing failures
As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find
valid kernel BTF" message makes diagnosing the kernel build issue somewhat
cryptic. Add a little more detail with the hope of helping users.

Before:
```
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
```

After not accessible:
```
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -3
```

After not readable:
```
libbpf: failed to read kernel BTF from (/sys/kernel/btf/vmlinux): -1
```

Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240125231840.1647951-1-irogers@google.com
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
c57fb75864 libbpf: fix __arg_ctx type enforcement for perf_event programs
Adjust PERF_EVENT type enforcement around __arg_ctx to match exactly
what kernel is doing.

Fixes: 76ec90a996e3 ("libbpf: warn on unexpected __arg_ctx type when rewriting BTF")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240125205510.3642094-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
0b412d1918 libbpf: integrate __arg_ctx feature detector into kernel_supports()
Now that feature detection code is in bpf-next tree, integrate __arg_ctx
kernel-side support into kernel_supports() framework.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240125205510.3642094-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-01 15:10:17 -08:00
Andrii Nakryiko
3b09738928 sync: remove NETDEV_XSK_FLAGS_MASK which is not in bpf/bpf-next anymore
This part of code is not present in either bpf or bpf-next trees
anymore, so manually remove it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-29 10:48:12 -08:00
Andrii Nakryiko
5139f12ef1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c8632acf193beac64bbdaebef013368c480bf74f
Checkpoint bpf-next commit: ced33f2cfa21a14a292a00e31dc9f85c1bfbda1c
Baseline bpf commit:        0a5bd0ffe790511d802e7f40898429a89e2487df
Checkpoint bpf commit:      577e4432f3ac810049cb7e6b71f4d96ec7c6e894

Andrii Nakryiko (1):
  libbpf: Fix faccessat() usage on Android

 src/libbpf_internal.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-29 10:48:12 -08:00
Andrii Nakryiko
830e0d017b libbpf: Fix faccessat() usage on Android
Android implementation of libc errors out with -EINVAL in faccessat() if
passed AT_EACCESS ([0]), this leads to ridiculous issue with libbpf
refusing to load /sys/kernel/btf/vmlinux on Androids ([1]). Fix by
detecting Android and redefining AT_EACCESS to 0, it's equivalent on
Android.

  [0] https://android.googlesource.com/platform/bionic/+/refs/heads/android13-release/libc/bionic/faccessat.cpp#50
  [1] https://github.com/libbpf/libbpf-bootstrap/issues/250#issuecomment-1911324250

Fixes: 6a4ab8869d0b ("libbpf: Fix the case of running as non-root with capabilities")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240126220944.2497665-1-andrii@kernel.org
2024-01-29 10:48:12 -08:00
Andrii Nakryiko
fad5d91381 libbpf: make sure linux/kernel.h includes linux/compiler.h
This replicates kernel upstream setup and brings READ_ONCE() and
WRITE_ONCE() macros anywhere where linux/kernel.h is included, which is
assumption libbpf code makes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
8ca30626cc Makefile: add features.o to Makefile
Libbpf got new source code file, features.c, we need to add it to
Makefile here on Github version as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
274d6037f8 libbpf: add BPF_CALL_REL() macro implementation
Add BPF_CALL_REL() macro implementation into include/linux/filter.h
header, which is now used by libbpf code for feature detection.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
0f84f3bef6 ci: regenerate vmlinux.h
Update vmlinux.h for old kernel CI workflows.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
3dea2db84b ci: drop custom patches for fixing upstream kernel issues
All the issues should be fixed upstream already.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
2f81310ec0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   98e20e5e13d2811898921f999288be7151a11954
Checkpoint bpf-next commit: c8632acf193beac64bbdaebef013368c480bf74f
Baseline bpf commit:        7c5e046bdcb2513f9decb3765d8bf92d604279cf
Checkpoint bpf commit:      0a5bd0ffe790511d802e7f40898429a89e2487df

Andrey Grafin (1):
  libbpf: Apply map_set_def_max_entries() for inner_maps on creation

Andrii Nakryiko (17):
  libbpf: feature-detect arg:ctx tag support in kernel
  libbpf: warn on unexpected __arg_ctx type when rewriting BTF
  libbpf: call dup2() syscall directly
  bpf: Introduce BPF token object
  bpf: Add BPF token support to BPF_MAP_CREATE command
  bpf: Add BPF token support to BPF_BTF_LOAD command
  bpf: Add BPF token support to BPF_PROG_LOAD command
  libbpf: Add bpf_token_create() API
  libbpf: Add BPF token support to bpf_map_create() API
  libbpf: Add BPF token support to bpf_btf_load() API
  libbpf: Add BPF token support to bpf_prog_load() API
  libbpf: Split feature detectors definitions from cached results
  libbpf: Further decouple feature checking logic from bpf_object
  libbpf: Move feature detection code into its own file
  libbpf: Wire up token_fd into feature probing logic
  libbpf: Wire up BPF token support at BPF object level
  libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH
    envvar

Daniel Borkmann (1):
  bpf: Sync uapi bpf.h header for the tooling infra

Dima Tisnek (1):
  libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo struct

Jiri Olsa (2):
  bpf: Add cookie to perf_event bpf_link_info records
  bpf: Store cookies in kprobe_multi bpf_link_info data

Kan Liang (2):
  perf: Add branch stack counters
  perf/x86/intel: Support branch counters logging

Kui-Feng Lee (3):
  bpf: pass btf object id in bpf_map_info.
  bpf: pass attached BTF to the bpf_struct_ops subsystem
  libbpf: Find correct module BTFs for struct_ops maps and progs.

Martin KaFai Lau (1):
  libbpf: Ensure undefined bpf_attr field stays 0

 include/uapi/linux/bpf.h        |  79 +++-
 include/uapi/linux/perf_event.h |  13 +
 src/bpf.c                       |  42 +-
 src/bpf.h                       |  38 +-
 src/bpf_core_read.h             |   2 +-
 src/btf.c                       |  10 +-
 src/elf.c                       |   2 -
 src/features.c                  | 503 +++++++++++++++++++++
 src/libbpf.c                    | 744 ++++++++++++--------------------
 src/libbpf.h                    |  21 +-
 src/libbpf.map                  |   1 +
 src/libbpf_internal.h           |  50 ++-
 src/libbpf_probes.c             |  12 +-
 src/str_error.h                 |   3 +
 14 files changed, 1019 insertions(+), 501 deletions(-)
 create mode 100644 src/features.c

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
0e57fade4e sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
a36646e2b3 libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar
To allow external admin authority to override default BPF FS location
(/sys/fs/bpf) for implicit BPF token creation, teach libbpf to recognize
LIBBPF_BPF_TOKEN_PATH envvar. If it is specified and user application
didn't explicitly specify bpf_token_path option, it will be treated
exactly like bpf_token_path option, overriding default /sys/fs/bpf
location and making BPF token mandatory.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-29-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
e1a43809a9 libbpf: Wire up BPF token support at BPF object level
Add BPF token support to BPF object-level functionality.

BPF token is supported by BPF object logic either as an explicitly
provided BPF token from outside (through BPF FS path), or implicitly
(unless prevented through bpf_object_open_opts).

Implicit mode is assumed to be the most common one for user namespaced
unprivileged workloads. The assumption is that privileged container
manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token
delegation options (delegate_{cmds,maps,progs,attachs} mount options).
BPF object during loading will attempt to create BPF token from
/sys/fs/bpf location, and pass it for all relevant operations
(currently, map creation, BTF load, and program load).

In this implicit mode, if BPF token creation fails due to whatever
reason (BPF FS is not mounted, or kernel doesn't support BPF token,
etc), this is not considered an error. BPF object loading sequence will
proceed with no BPF token.

In explicit BPF token mode, user provides explicitly custom BPF FS mount
point path. In such case, BPF object will attempt to create BPF token
from provided BPF FS location. If BPF token creation fails, that is
considered a critical error and BPF object load fails with an error.

Libbpf provides a way to disable implicit BPF token creation, if it
causes any troubles (BPF token is designed to be completely optional and
shouldn't cause any problems even if provided, but in the world of BPF
LSM, custom security logic can be installed that might change outcome
depending on the presence of BPF token). To disable libbpf's default BPF
token creation behavior user should provide either invalid BPF token FD
(negative), or empty bpf_token_path option.

BPF token presence can influence libbpf's feature probing, so if BPF
object has associated BPF token, feature probing is instructed to use
BPF object-specific feature detection cache and token FD.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-26-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
a3b317a9c0 libbpf: Wire up token_fd into feature probing logic
Adjust feature probing callbacks to take into account optional token_fd.
In unprivileged contexts, some feature detectors would fail to detect
kernel support just because BPF program, BPF map, or BTF object can't be
loaded due to privileged nature of those operations. So when BPF object
is loaded with BPF token, this token should be used for feature probing.

This patch is setting support for this scenario, but we don't yet pass
non-zero token FD. This will be added in the next patch.

We also switched BPF cookie detector from using kprobe program to
tracepoint one, as tracepoint is somewhat less dangerous BPF program
type and has higher likelihood of being allowed through BPF token in the
future. This change has no effect on detection behavior.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-25-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
d42f0b8943 libbpf: Move feature detection code into its own file
It's quite a lot of well isolated code, so it seems like a good
candidate to move it out of libbpf.c to reduce its size.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-24-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
9bf95048b7 libbpf: Further decouple feature checking logic from bpf_object
Add feat_supported() helper that accepts feature cache instead of
bpf_object. This allows low-level code in bpf.c to not know or care
about higher-level concept of bpf_object, yet it will be able to utilize
custom feature checking in cases where BPF token might influence the
outcome.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-23-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
9454419946 libbpf: Split feature detectors definitions from cached results
Split a list of supported feature detectors with their corresponding
callbacks from actual cached supported/missing values. This will allow
to have more flexible per-token or per-object feature detectors in
subsequent refactorings.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-22-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
8082a311d3 libbpf: Add BPF token support to bpf_prog_load() API
Wire through token_fd into bpf_prog_load().

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-16-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
ac4a66ea12 libbpf: Add BPF token support to bpf_btf_load() API
Allow user to specify token_fd for bpf_btf_load() API that wraps
kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged
process as long as it has BPF token allowing BPF_BTF_LOAD command, which
can be created and delegated by privileged process.

Wire through new btf_flags as well, so that user can provide
BPF_F_TOKEN_FD flag, if necessary.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-15-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
8002c052f3 libbpf: Add BPF token support to bpf_map_create() API
Add ability to provide token_fd for BPF_MAP_CREATE command through
bpf_map_create() API.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-14-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
5cc8482fe2 libbpf: Add bpf_token_create() API
Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-13-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
21fb08cb35 bpf: Add BPF token support to BPF_PROG_LOAD command
Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag
should be set in prog_flags field when providing prog_token_fd.

Wire through a set of allowed BPF program types and attach types,
derived from BPF FS at BPF token creation time. Then make sure we
perform bpf_token_capable() checks everywhere where it's relevant.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
eb9d10835c bpf: Add BPF token support to BPF_BTF_LOAD command
Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading
through delegated BPF token. BPF_F_TOKEN_FD flag has to be specified
when passing BPF token FD. Given BPF_BTF_LOAD command didn't have flags
field before, we also add btf_flags field.

BTF loading is a pretty straightforward operation, so as long as BPF
token is created with allow_cmds granting BPF_BTF_LOAD command, kernel
proceeds to parsing BTF data and creating BTF object.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-6-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
1386c15b7b bpf: Add BPF token support to BPF_MAP_CREATE command
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled
BPF map creation from unprivileged process through delegated BPF token.
New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD
for BPF_MAP_CREATE command.

Wire through a set of allowed BPF map types to BPF token, derived from
BPF FS at BPF token creation time. This, in combination with allowed_cmds
allows to create a narrowly-focused BPF token (controlled by privileged
agent) with a restrictive set of BPF maps that application can attempt
to create.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
5cd6a8493b bpf: Introduce BPF token object
Add new kind of BPF kernel object, BPF token. BPF token is meant to
allow delegating privileged BPF functionality, like loading a BPF
program or creating a BPF map, from privileged process to a *trusted*
unprivileged process, all while having a good amount of control over which
privileged operations could be performed using provided BPF token.

This is achieved through mounting BPF FS instance with extra delegation
mount options, which determine what operations are delegatable, and also
constraining it to the owning user namespace (as mentioned in the
previous patch).

BPF token itself is just a derivative from BPF FS and can be created
through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF
FS FD, which can be attained through open() API by opening BPF FS mount
point. Currently, BPF token "inherits" delegated command, map types,
prog type, and attach type bit sets from BPF FS as is. In the future,
having an BPF token as a separate object with its own FD, we can allow
to further restrict BPF token's allowable set of things either at the
creation time or after the fact, allowing the process to guard itself
further from unintentionally trying to load undesired kind of BPF
programs. But for now we keep things simple and just copy bit sets as is.

When BPF token is created from BPF FS mount, we take reference to the
BPF super block's owning user namespace, and then use that namespace for
checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN}
capabilities that are normally only checked against init userns (using
capable()), but now we check them using ns_capable() instead (if BPF
token is provided). See bpf_token_capable() for details.

Such setup means that BPF token in itself is not sufficient to grant BPF
functionality. User namespaced process has to *also* have necessary
combination of capabilities inside that user namespace. So while
previously CAP_BPF was useless when granted within user namespace, now
it gains a meaning and allows container managers and sys admins to have
a flexible control over which processes can and need to use BPF
functionality within the user namespace (i.e., container in practice).
And BPF FS delegation mount options and derived BPF tokens serve as
a per-container "flag" to grant overall ability to use bpf() (plus further
restrict on which parts of bpf() syscalls are treated as namespaced).

Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF)
within the BPF FS owning user namespace, rounding up the ns_capable()
story of BPF token. Also creating BPF token in init user namespace is
currently not supported, given BPF token doesn't have any effect in init
user namespace anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/bpf/20240124022127.2379740-4-andrii@kernel.org
2024-01-26 18:12:29 -05:00
Martin KaFai Lau
385ae492fa libbpf: Ensure undefined bpf_attr field stays 0
The commit 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.")
sets a newly added field (value_type_btf_obj_fd) to -1 in libbpf when
the caller of the libbpf's bpf_map_create did not define this field by
passing a NULL "opts" or passing in a "opts" that does not cover this
new field. OPT_HAS(opts, field) is used to decide if the field is
defined or not:

	((opts) && opts->sz >= offsetofend(typeof(*(opts)), field))

Once OPTS_HAS decided the field is not defined, that field should
be set to 0. For this particular new field (value_type_btf_obj_fd),
its corresponding map_flags "BPF_F_VTYPE_BTF_OBJ_FD" is not set.
Thus, the kernel does not treat it as an fd field.

Fixes: 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240124224418.2905133-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Dima Tisnek
517ff8d823 libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo struct
Past commit ([0]) removed the last vestiges of struct bpf_field_reloc,
it's called struct bpf_core_relo now.

  [0] 28b93c64499a ("libbpf: Clean up and improve CO-RE reloc logging")

Signed-off-by: Dima Tisnek <dimaqq@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240121060126.15650-1-dimaqq@gmail.com
2024-01-26 18:12:29 -05:00
Kui-Feng Lee
0b0dfaf1be libbpf: Find correct module BTFs for struct_ops maps and progs.
Locate the module BTFs for struct_ops maps and progs and pass them to the
kernel. This ensures that the kernel correctly resolves type IDs from the
appropriate module BTFs.

For the map of a struct_ops object, the FD of the module BTF is set to
bpf_map to keep a reference to the module BTF. The FD is passed to the
kernel as value_type_btf_obj_fd when the struct_ops object is loaded.

For a bpf_struct_ops prog, attach_btf_obj_fd of bpf_prog is the FD of a
module BTF in the kernel.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240119225005.668602-13-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-26 18:12:29 -05:00
Kui-Feng Lee
d767ead51f bpf: pass attached BTF to the bpf_struct_ops subsystem
Pass the fd of a btf from the userspace to the bpf() syscall, and then
convert the fd into a btf. The btf is generated from the module that
defines the target BPF struct_ops type.

In order to inform the kernel about the module that defines the target
struct_ops type, the userspace program needs to provide a btf fd for the
respective module's btf. This btf contains essential information on the
types defined within the module, including the target struct_ops type.

A btf fd must be provided to the kernel for struct_ops maps and for the bpf
programs attached to those maps.

In the case of the bpf programs, the attach_btf_obj_fd parameter is passed
as part of the bpf_attr and is converted into a btf. This btf is then
stored in the prog->aux->attach_btf field. Here, it just let the verifier
access attach_btf directly.

In the case of struct_ops maps, a btf fd is passed as value_type_btf_obj_fd
of bpf_attr. The bpf_struct_ops_map_alloc() function converts the fd to a
btf and stores it as st_map->btf. A flag BPF_F_VTYPE_BTF_OBJ_FD is added
for map_flags to indicate that the value of value_type_btf_obj_fd is set.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240119225005.668602-9-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-26 18:12:29 -05:00
Kui-Feng Lee
d70e2071e2 bpf: pass btf object id in bpf_map_info.
Include btf object id (btf_obj_id) in bpf_map_info so that tools (ex:
bpftools struct_ops dump) know the correct btf from the kernel to look up
type information of struct_ops types.

Since struct_ops types can be defined and registered in a module. The
type information of a struct_ops type are defined in the btf of the
module defining it.  The userspace tools need to know which btf is for
the module defining a struct_ops type.

Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240119225005.668602-7-thinker.li@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-01-26 18:12:29 -05:00
Jiri Olsa
fe508381b4 bpf: Store cookies in kprobe_multi bpf_link_info data
Storing cookies in kprobe_multi bpf_link_info data. The cookies
field is optional and if provided it needs to be an array of
__u64 with kprobe_multi.count length.

Acked-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240119110505.400573-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Jiri Olsa
de2f366450 bpf: Add cookie to perf_event bpf_link_info records
At the moment we don't store cookie for perf_event probes,
while we do that for the rest of the probes.

Adding cookie fields to struct bpf_link_info perf event
probe records:

  perf_event.uprobe
  perf_event.kprobe
  perf_event.tracepoint
  perf_event.perf_event

And the code to store that in bpf_link_info struct.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/r/20240119110505.400573-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
390c6234eb libbpf: call dup2() syscall directly
We've ran into issues with using dup2() API in production setting, where
libbpf is linked into large production environment and ends up calling
unintended custom implementations of dup2(). These custom implementations
don't provide atomic FD replacement guarantees of dup2() syscall,
leading to subtle and hard to debug issues.

To prevent this in the future and guarantee that no libc implementation
will do their own custom non-atomic dup2() implementation, call dup2()
syscall directly with syscall(SYS_dup2).

Note that some architectures don't seem to provide dup2 and have dup3
instead. Try to detect and pick best syscall.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240119210201.1295511-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Andrey Grafin
89ca11a79b libbpf: Apply map_set_def_max_entries() for inner_maps on creation
This patch allows to auto create BPF_MAP_TYPE_ARRAY_OF_MAPS and
BPF_MAP_TYPE_HASH_OF_MAPS with values of BPF_MAP_TYPE_PERF_EVENT_ARRAY
by bpf_object__load().

Previous behaviour created a zero filled btf_map_def for inner maps and
tried to use it for a map creation but the linux kernel forbids to create
a BPF_MAP_TYPE_PERF_EVENT_ARRAY map with max_entries=0.

Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
Signed-off-by: Andrey Grafin <conquistador@yandex-team.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20240117130619.9403-1-conquistador@yandex-team.ru
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Daniel Borkmann
4c3742d9c1 bpf: Sync uapi bpf.h header for the tooling infra
Both commit 91051f003948 ("tcp: Dump bound-only sockets in inet_diag.")
and commit 985b8ea9ec7e ("bpf, docs: Fix bpf_redirect_peer header doc")
missed the tooling header sync. Fix it.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
f4211a704f libbpf: warn on unexpected __arg_ctx type when rewriting BTF
On kernel that don't support arg:ctx tag, before adjusting global
subprog BTF information to match kernel's expected canonical type names,
make sure that types used by user are meaningful, and if not, warn and
don't do BTF adjustments.

This is similar to checks that kernel performs, but narrower in scope,
as only a small subset of BPF program types can be accommodated by
libbpf using canonical type names.

Libbpf unconditionally allows `struct pt_regs *` for perf_event program
types, unlike kernel, which supports that conditionally on architecture.
This is done to keep things simple and not cause unnecessary false
positives. This seems like a minor and harmless deviation, which in
real-world programs will be caught by kernels with arg:ctx tag support
anyways. So KISS principle.

This logic is hard to test (especially on latest kernels), so manual
testing was performed instead. Libbpf emitted the following warning for
perf_event program with wrong context argument type:

  libbpf: prog 'arg_tag_ctx_perf': subprog 'subprog_ctx_tag' arg#0 is expected to be of `struct bpf_perf_event_data *` type

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
939ab641b8 libbpf: feature-detect arg:ctx tag support in kernel
Add feature detector of kernel-side arg:ctx (__arg_ctx) tag support. If
this is detected, libbpf will avoid doing any __arg_ctx-related BTF
rewriting and checks in favor of letting kernel handle this completely.

test_global_funcs/ctx_arg_rewrite subtest is adjusted to do the same
feature detection (albeit in much simpler, though round-about and
inefficient, way), and skip the tests. This is done to still be able to
execute this test on older kernels (like in libbpf CI).

Note, BPF token series ([0]) does a major refactor and code moving of
libbpf-internal feature detection "framework", so to avoid unnecessary
conflicts we keep newly added feature detection stand-alone with ad-hoc
result caching. Once things settle, there will be a small follow up to
re-integrate everything back and move code into its final place in
newly-added (by BPF token series) features.c file.

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=814209&state=*

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240118033143.3384355-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-26 18:12:29 -05:00
Kan Liang
82ebbd97c8 perf/x86/intel: Support branch counters logging
The branch counters logging (A.K.A LBR event logging) introduces a
per-counter indication of precise event occurrences in LBRs. It can
provide a means to attribute exposed retirement latency to combinations
of events across a block of instructions. It also provides a means of
attributing Timed LBR latencies to events.

The feature is first introduced on SRF/GRR. It is an enhancement of the
ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences
of events on the GP counters. The information is displayed by the order
of counters.

The design proposed in this patch requires that the events which are
logged must be in a group with the event that has LBR. If there are
more than one LBR group, the counters logging information only from the
current group (overflowed) are stored for the perf tool, otherwise the
perf tool cannot know which and when other groups are scheduled
especially when multiplexing is triggered. The user can ensure it uses
the maximum number of counters that support LBR info (4 by now) by
making the group large enough.

The HW only logs events by the order of counters. The order may be
different from the order of enabling which the perf tool can understand.
When parsing the information of each branch entry, convert the counter
order to the enabled order, and store the enabled order in the extension
space.

Unconditionally reset LBRs for an LBR event group when it's deleted. The
logged counter information is only valid for the current LBR group. If
another LBR group is scheduled later, the information from the stale
LBRs would be otherwise wrongly interpreted.

Add a sanity check in intel_pmu_hw_config(). Disable the feature if other
counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode
is enabled. (For the LBR call stack mode, we cannot simply flush the
LBR, since it will break the call stack. Also, there is no obvious usage
with the call stack mode for now.)

Only applying the PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch
stack setup.

Expose the maximum number of supported counters and the width of the
counters into the sysfs. The perf tool can use the information to parse
the logged counters in each branch.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231025201626.3000228-5-kan.liang@linux.intel.com
2024-01-26 18:12:29 -05:00
Kan Liang
9705c4c622 perf: Add branch stack counters
Currently, the additional information of a branch entry is stored in a
u64 space. With more and more information added, the space is running
out. For example, the information of occurrences of events will be added
for each branch.

Two places were suggested to append the counters.
https://lore.kernel.org/lkml/20230802215814.GH231007@hirez.programming.kicks-ass.net/
One place is right after the flags of each branch entry. It changes the
existing struct perf_branch_entry. The later ARCH specific
implementation has to be really careful to consistently pick
the right struct.
The other place is right after the entire struct perf_branch_stack.
The disadvantage is that the pointer of the extra space has to be
recorded. The common interface perf_sample_save_brstack() has to be
updated.

The latter is much straightforward, and should be easily understood and
maintained. It is implemented in the patch.

Add a new branch sample type, PERF_SAMPLE_BRANCH_COUNTERS, to indicate
the event which is recorded in the branch info.

The "u64 counters" may store the occurrences of several events. The
information regarding the number of events/counters and the width of
each counter should be exposed via sysfs as a reference for the perf
tool. Define the branch_counter_nr and branch_counter_width ABI here.
The support will be implemented later in the Intel-specific patch.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20231025201626.3000228-1-kan.liang@linux.intel.com
2024-01-26 18:12:29 -05:00
Andrii Nakryiko
528cb9d3e9 README: update Ubuntu link
Do what [0] proposed to do, but with properly formatted commit message
and Signed-off-by.

  [0] https://github.com/libbpf/libbpf/pull/742

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-01-25 16:47:44 -08:00
66 changed files with 100182 additions and 93562 deletions

View File

@@ -1,3 +0,0 @@
Thank you for considering a contribution!
Please note that the `libbpf` authoritative source code is developed as part of bpf-next Linux source tree under tools/lib/bpf subdirectory and is periodically synced to Github. As such, all the libbpf changes should be sent to BPF mailing list, please don't open PRs here unless you are changing Github-specific parts of libbpf (e.g., Github-specific Makefile).

View File

@@ -12,6 +12,9 @@ inputs:
description: 'where is vmlinux file'
required: true
default: '${{ github.workspace }}/vmlinux'
llvm-version:
description: 'llvm version'
required: true
runs:
using: "composite"
@@ -28,4 +31,6 @@ runs:
export REPO_ROOT="${{ github.workspace }}"
export REPO_PATH="${{ inputs.repo-path }}"
export VMLINUX_BTF="${{ inputs.vmlinux }}"
export LLVM_VERSION="${{ inputs.llvm-version }}"
${{ github.action_path }}/build_selftests.sh

View File

@@ -10,22 +10,21 @@ foldable start prepare_selftests "Building selftests"
LIBBPF_PATH="${REPO_ROOT}"
llvm_default_version() {
echo "16"
}
llvm_latest_version() {
echo "17"
echo "19"
}
LLVM_VERSION=$(llvm_default_version)
if [[ "${LLVM_VERSION}" == $(llvm_latest_version) ]]; then
REPO_DISTRO_SUFFIX=""
else
REPO_DISTRO_SUFFIX="-${LLVM_VERSION}"
fi
echo "deb https://apt.llvm.org/focal/ llvm-toolchain-focal${REPO_DISTRO_SUFFIX} main" \
DISTRIB_CODENAME="noble"
test -f /etc/lsb-release && . /etc/lsb-release
echo "${DISTRIB_CODENAME}"
echo "deb https://apt.llvm.org/${DISTRIB_CODENAME}/ llvm-toolchain-${DISTRIB_CODENAME}${REPO_DISTRO_SUFFIX} main" \
| sudo tee /etc/apt/sources.list.d/llvm.list
PREPARE_SELFTESTS_SCRIPT=${THISDIR}/prepare_selftests-${KERNEL}.sh

View File

@@ -1,3 +1,5 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile
printf "all:\n\ttouch bpf_test_no_cfi.ko\n\nclean:\n" > bpf_test_no_cfi/Makefile

View File

@@ -1,3 +1,5 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile
printf "all:\n\ttouch bpf_test_no_cfi.ko\n\nclean:\n" > bpf_test_no_cfi/Makefile

File diff suppressed because it is too large Load Diff

View File

@@ -13,6 +13,10 @@ inputs:
description: 'pahole rev or master'
required: true
default: 'master'
llvm-version:
description: 'llvm version'
required: false
default: '17'
runs:
using: "composite"
steps:
@@ -37,6 +41,8 @@ runs:
uses: libbpf/ci/setup-build-env@main
with:
pahole: ${{ inputs.pahole }}
arch: ${{ inputs.arch }}
llvm-version: ${{ inputs.llvm-version }}
# 1. download CHECKPOINT kernel source
- name: Get checkpoint commit
shell: bash
@@ -92,6 +98,7 @@ runs:
with:
repo-path: '.kernel'
kernel: ${{ inputs.kernel }}
llvm-version: ${{ inputs.llvm-version }}
# 4. prepare rootfs
- name: prepare rootfs
uses: libbpf/ci/prepare-rootfs@main

View File

@@ -42,7 +42,7 @@ jobs:
- name: gcc-12
target: RUN_GCC12
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
name: Checkout
- uses: ./.github/actions/setup
name: Setup
@@ -53,7 +53,7 @@ jobs:
ubuntu:
runs-on: ubuntu-latest
name: Ubuntu Focal Build (${{ matrix.arch }})
name: Ubuntu Build (${{ matrix.arch }})
strategy:
fail-fast: false
matrix:
@@ -63,19 +63,19 @@ jobs:
- arch: s390x
- arch: x86
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
name: Checkout
- uses: ./.github/actions/setup
name: Pre-Setup
- run: source /tmp/ci_setup && sudo -E $CI_ROOT/managers/ubuntu.sh
if: matrix.arch == 'x86'
name: Setup
- uses: uraimo/run-on-arch-action@v2.0.5
- uses: uraimo/run-on-arch-action@v2.7.1
name: Build in docker
if: matrix.arch != 'x86'
with:
distro:
ubuntu20.04
ubuntu22.04
arch:
${{ matrix.arch }}
setup:

View File

@@ -17,7 +17,7 @@ permissions:
jobs:
analyze:
name: Analyze
runs-on: ubuntu-22.04
runs-on: ubuntu-latest
concurrency:
group: ${{ github.workflow }}-${{ matrix.language }}-${{ github.ref }}
cancel-in-progress: true
@@ -32,7 +32,7 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v2

View File

@@ -11,7 +11,7 @@ jobs:
if: github.repository == 'libbpf/libbpf'
name: Coverity
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- name: Run coverity
run: |

View File

@@ -12,7 +12,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v4
- name: Run ShellCheck
uses: ludeeus/action-shellcheck@master
env:

View File

@@ -25,7 +25,7 @@ jobs:
runs-on: ubuntu-latest
name: vmtest with customized pahole/Kernel
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:

View File

@@ -1,10 +1,10 @@
name: pahole-staging
on:
workflow_dispatch:
schedule:
- cron: '0 18 * * *'
jobs:
vmtest:
runs-on: ubuntu-20.04
@@ -12,7 +12,7 @@ jobs:
env:
STAGING: tmp.master
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:

View File

@@ -13,25 +13,25 @@ concurrency:
jobs:
vmtest:
runs-on: ${{ matrix.runs_on }}
name: Kernel ${{ matrix.kernel }} on ${{ matrix.runs_on }} + selftests
name: Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests
strategy:
fail-fast: false
matrix:
include:
- kernel: 'LATEST'
runs_on: ubuntu-20.04
runs_on: ubuntu-24.04
arch: 'x86_64'
- kernel: '5.5.0'
runs_on: ubuntu-20.04
runs_on: ubuntu-24.04
arch: 'x86_64'
- kernel: '4.9.0'
runs_on: ubuntu-20.04
runs_on: ubuntu-24.04
arch: 'x86_64'
- kernel: 'LATEST'
runs_on: s390x
runs_on: ["s390x", "docker-noble-main"]
arch: 's390x'
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
name: Checkout
- uses: ./.github/actions/setup
name: Setup

21
.mailmap Normal file
View File

@@ -0,0 +1,21 @@
Alexei Starovoitov <ast@kernel.org> <alexei.starovoitov@gmail.com>
Antoine Tenart <atenart@kernel.org> <antoine.tenart@bootlin.com>
Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@redhat.com>
Björn Töpel <bjorn@kernel.org> <bjorn.topel@intel.com>
Changbin Du <changbin.du@intel.com> <changbin.du@gmail.com>
Colin Ian King <colin.i.king@gmail.com> <colin.king@canonical.com>
Dan Carpenter <error27@gmail.com> <dan.carpenter@oracle.com>
Geliang Tang <geliang@kernel.org> <geliang.tang@suse.com>
Herbert Xu <herbert@gondor.apana.org.au>
Jakub Kicinski <kuba@kernel.org> <jakub.kicinski@netronome.com>
Kees Cook <kees@kernel.org> <keescook@chromium.org>
Leo Yan <leo.yan@linux.dev> <leo.yan@linaro.org>
Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>
Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@mellanox.com>
Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@nvidia.com>
Puranjay Mohan <puranjay@kernel.org> <puranjay12@gmail.com>
Quentin Monnet <qmo@kernel.org> <quentin@isovalent.com>
Quentin Monnet <qmo@kernel.org> <quentin.monnet@netronome.com>
Stanislav Fomichev <sdf@fomichev.me> <sdf@google.com>
Vadim Fedorenko <vadim.fedorenko@linux.dev> <vadfed@meta.com>
Vadim Fedorenko <vadim.fedorenko@linux.dev> <vfedorenko@novek.ru>

View File

@@ -1 +1 @@
7c5e046bdcb2513f9decb3765d8bf92d604279cf
b408473ea01b2e499d23503e2bf898416da9d7ac

View File

@@ -1 +1 @@
98e20e5e13d2811898921f999288be7151a11954
2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b

View File

@@ -146,7 +146,7 @@ Distributions packaging libbpf from this mirror:
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://archlinux.org/packages/core/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/impish/libbpf)
- [Ubuntu](https://packages.ubuntu.com/source/jammy/libbpf)
- [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)
Benefits of packaging from the mirror over packaging from kernel sources:

View File

@@ -0,0 +1,69 @@
From c71766e8ff7a7f950522d25896fba758585500df Mon Sep 17 00:00:00 2001
From: Song Liu <song@kernel.org>
Date: Mon, 22 Apr 2024 21:14:40 -0700
Subject: [PATCH] arch/Kconfig: Move SPECULATION_MITIGATIONS to arch/Kconfig
SPECULATION_MITIGATIONS is currently defined only for x86. As a result,
IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) is always false for other
archs. f337a6a21e2f effectively set "mitigations=off" by default on
non-x86 archs, which is not desired behavior. Jakub observed this
change when running bpf selftests on s390 and arm64.
Fix this by moving SPECULATION_MITIGATIONS to arch/Kconfig so that it is
available in all archs and thus can be used safely in kernel/cpu.c
Fixes: f337a6a21e2f ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n")
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 10 ----------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 9f066785bb71..8f4af75005f8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1609,4 +1609,14 @@ config CC_HAS_SANE_FUNCTION_ALIGNMENT
# strict alignment always, even with -falign-functions.
def_bool CC_HAS_MIN_FUNCTION_ALIGNMENT || CC_IS_CLANG
+menuconfig SPECULATION_MITIGATIONS
+ bool "Mitigations for speculative execution vulnerabilities"
+ default y
+ help
+ Say Y here to enable options which enable mitigations for
+ speculative execution hardware vulnerabilities.
+
+ If you say N, all mitigations will be disabled. You really
+ should know what you are doing to say so.
+
endmenu
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39886bab943a..50c890fce5e0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2486,16 +2486,6 @@ config PREFIX_SYMBOLS
def_bool y
depends on CALL_PADDING && !CFI_CLANG
-menuconfig SPECULATION_MITIGATIONS
- bool "Mitigations for speculative execution vulnerabilities"
- default y
- help
- Say Y here to enable options which enable mitigations for
- speculative execution hardware vulnerabilities.
-
- If you say N, all mitigations will be disabled. You really
- should know what you are doing to say so.
-
if SPECULATION_MITIGATIONS
config MITIGATION_PAGE_TABLE_ISOLATION
--
2.43.0

View File

@@ -1,29 +0,0 @@
From 61e8893a1e32ab57d15974427f41b75de608dbda Mon Sep 17 00:00:00 2001
From: Andrii Nakryiko <andrii@kernel.org>
Date: Mon, 4 Dec 2023 21:21:23 -0800
Subject: [PATCH] bpf: patch out BPF_F_TEST_REG_INVARIANTS for old kernels
CI-only patch to avoid setting BPF_F_TEST_REG_INVARIANTS flag for old
kernels that don't support it.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
tools/include/uapi/linux/bpf.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index e88746ba7d21..8344c9ce60e0 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1201,7 +1201,7 @@ enum bpf_perf_event_type {
#define BPF_F_XDP_DEV_BOUND_ONLY (1U << 6)
/* The verifier internal test flag. Behavior is undefined */
-#define BPF_F_TEST_REG_INVARIANTS (1U << 7)
+#define BPF_F_TEST_REG_INVARIANTS (0)
/* link_create.kprobe_multi.flags used in LINK_CREATE command for
* BPF_TRACE_KPROBE_MULTI attach type to create return probe.
--
2.34.1

View File

@@ -0,0 +1,32 @@
From 0daad0a615e687e1247230f3d0c31ae60ba32314 Mon Sep 17 00:00:00 2001
From: Andrii Nakryiko <andrii@kernel.org>
Date: Tue, 28 May 2024 15:29:38 -0700
Subject: [PATCH bpf-next] selftests/bpf: fix inet_csk_accept prototype in
test_sk_storage_tracing.c
Recent kernel change ([0]) changed inet_csk_accept() prototype. Adapt
progs/test_sk_storage_tracing.c to take that into account.
[0] 92ef0fd55ac8 ("net: change proto and proto_ops accept type")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c b/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
index 02e718f06e0f..40531e56776e 100644
--- a/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
+++ b/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
@@ -84,7 +84,7 @@ int BPF_PROG(trace_tcp_connect, struct sock *sk)
}
SEC("fexit/inet_csk_accept")
-int BPF_PROG(inet_csk_accept, struct sock *sk, int flags, int *err, bool kern,
+int BPF_PROG(inet_csk_accept, struct sock *sk, struct proto_accept_arg *arg,
struct sock *accepted_sk)
{
set_task_info(accepted_sk);
--
2.43.0

View File

@@ -1,89 +0,0 @@
From fe69a1b1b6ed9ffc2c578c63f526026a8ab74f0c Mon Sep 17 00:00:00 2001
From: Anders Roxell <anders.roxell@linaro.org>
Date: Thu, 9 Nov 2023 18:43:28 +0100
Subject: [PATCH] selftests: bpf: xskxceiver: ksft_print_msg: fix format type
error
Crossbuilding selftests/bpf for architecture arm64, format specifies
type error show up like.
xskxceiver.c:912:34: error: format specifies type 'int' but the argument
has type '__u64' (aka 'unsigned long long') [-Werror,-Wformat]
ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n",
~~
%llu
__func__, pkt->pkt_nb, meta->count);
^~~~~~~~~~~
xskxceiver.c:929:55: error: format specifies type 'unsigned long long' but
the argument has type 'u64' (aka 'unsigned long') [-Werror,-Wformat]
ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len);
~~~~ ^~~~
Fixing the issues by casting to (unsigned long long) and changing the
specifiers to be %llu from %d and %u, since with u64s it might be %llx
or %lx, depending on architecture.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20231109174328.1774571-1-anders.roxell@linaro.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/xskxceiver.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/xskxceiver.c b/tools/testing/selftests/bpf/xskxceiver.c
index 591ca9637b23..b604c570309a 100644
--- a/tools/testing/selftests/bpf/xskxceiver.c
+++ b/tools/testing/selftests/bpf/xskxceiver.c
@@ -908,8 +908,9 @@ static bool is_metadata_correct(struct pkt *pkt, void *buffer, u64 addr)
struct xdp_info *meta = data - sizeof(struct xdp_info);
if (meta->count != pkt->pkt_nb) {
- ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%d]\n",
- __func__, pkt->pkt_nb, meta->count);
+ ksft_print_msg("[%s] expected meta_count [%d], got meta_count [%llu]\n",
+ __func__, pkt->pkt_nb,
+ (unsigned long long)meta->count);
return false;
}
@@ -926,11 +927,13 @@ static bool is_frag_valid(struct xsk_umem_info *umem, u64 addr, u32 len, u32 exp
if (addr >= umem->num_frames * umem->frame_size ||
addr + len > umem->num_frames * umem->frame_size) {
- ksft_print_msg("Frag invalid addr: %llx len: %u\n", addr, len);
+ ksft_print_msg("Frag invalid addr: %llx len: %u\n",
+ (unsigned long long)addr, len);
return false;
}
if (!umem->unaligned_mode && addr % umem->frame_size + len > umem->frame_size) {
- ksft_print_msg("Frag crosses frame boundary addr: %llx len: %u\n", addr, len);
+ ksft_print_msg("Frag crosses frame boundary addr: %llx len: %u\n",
+ (unsigned long long)addr, len);
return false;
}
@@ -1029,7 +1032,8 @@ static int complete_pkts(struct xsk_socket_info *xsk, int batch_size)
u64 addr = *xsk_ring_cons__comp_addr(&xsk->umem->cq, idx + rcvd - 1);
ksft_print_msg("[%s] Too many packets completed\n", __func__);
- ksft_print_msg("Last completion address: %llx\n", addr);
+ ksft_print_msg("Last completion address: %llx\n",
+ (unsigned long long)addr);
return TEST_FAILURE;
}
@@ -1513,8 +1517,9 @@ static int validate_tx_invalid_descs(struct ifobject *ifobject)
}
if (stats.tx_invalid_descs != ifobject->xsk->pkt_stream->nb_pkts / 2) {
- ksft_print_msg("[%s] tx_invalid_descs incorrect. Got [%u] expected [%u]\n",
- __func__, stats.tx_invalid_descs,
+ ksft_print_msg("[%s] tx_invalid_descs incorrect. Got [%llu] expected [%u]\n",
+ __func__,
+ (unsigned long long)stats.tx_invalid_descs,
ifobject->xsk->pkt_stream->nb_pkts);
return TEST_FAILURE;
}
--
2.34.1

View File

@@ -0,0 +1,56 @@
From f267f262815033452195f46c43b572159262f533 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue, 5 Mar 2024 10:08:28 +0100
Subject: [PATCH 2/2] xdp, bonding: Fix feature flags when there are no slave
devs anymore
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
changed the driver from reporting everything as supported before a device
was bonded into having the driver report that no XDP feature is supported
until a real device is bonded as it seems to be more truthful given
eventually real underlying devices decide what XDP features are supported.
The change however did not take into account when all slave devices get
removed from the bond device. In this case after 9b0ed890ac2a, the driver
keeps reporting a feature mask of 0x77, that is, NETDEV_XDP_ACT_MASK &
~NETDEV_XDP_ACT_XSK_ZEROCOPY whereas it should have reported a feature
mask of 0.
Fix it by resetting XDP feature flags in the same way as if no XDP program
is attached to the bond device. This was uncovered by the XDP bond selftest
which let BPF CI fail. After adjusting the starting masks on the latter
to 0 instead of NETDEV_XDP_ACT_MASK the test passes again together with
this fix.
Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Prashant Batra <prbatra.mail@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Message-ID: <20240305090829.17131-1-daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
drivers/net/bonding/bond_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a11748b8d69b..cd0683bcca03 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1811,7 +1811,7 @@ void bond_xdp_set_features(struct net_device *bond_dev)
ASSERT_RTNL();
- if (!bond_xdp_check(bond)) {
+ if (!bond_xdp_check(bond) || !bond_has_slaves(bond)) {
xdp_clear_features_flag(bond_dev);
return;
}
--
2.43.0

View File

@@ -41,7 +41,7 @@ task_fd_query_rawtp
task_fd_query_tp
tc_bpf
tcp_estats
tcp_rtt
test_global_funcs/arg_tag_ctx*
tp_attach_query
usdt/urand_pid_attach
xdp

View File

@@ -0,0 +1,15 @@
# TEMPORARY
btf_dump/btf_dump: syntax
kprobe_multi_bench_attach
core_reloc/enum64val
core_reloc/size___diff_sz
core_reloc/type_based___diff_sz
test_ima # All of CI is broken on it following 6.3-rc1 merge
lwt_reroute # crashes kernel after netnext merge from 2ab1efad60ad "net/sched: cls_api: complement tcf_tfilter_dump_policy"
tc_links_ingress # started failing after net-next merge from 2ab1efad60ad "net/sched: cls_api: complement tcf_tfilter_dump_policy"
xdp_bonding/xdp_bonding_features # started failing after net merge from 359e54a93ab4 "l2tp: pass correct message length to ip6_append_data"
tc_redirect/tc_redirect_dtime # uapi breakage after net-next commit 885c36e59f46 ("net: Re-use and set mono_delivery_time bit for userspace tstamp packets")
migrate_reuseport/IPv4 TCP_NEW_SYN_RECV reqsk_timer_handler # flaky, under investigation
migrate_reuseport/IPv6 TCP_NEW_SYN_RECV reqsk_timer_handler # flaky, under investigation
verify_pkcs7_sig # keeps failing

View File

@@ -10,3 +10,4 @@ send_signal/send_signal_nmi_thread
lwt_reroute # crashes kernel, fix pending upstream
tc_links_ingress # fails, same fix is pending upstream
tc_redirect # enough is enough, banned for life for flakiness

View File

@@ -2,3 +2,16 @@
sockmap_listen/sockhash VSOCK test_vsock_redir
usdt/basic # failing verifier due to bounds check after LLVM update
usdt/multispec # same as above
deny_namespace # not yet in bpf denylist
tc_redirect/tc_redirect_dtime # very flaky
lru_bug # not yet in bpf-next denylist
# Disabled temporarily for a crash.
# https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d34c@gmail.com/
dummy_st_ops/dummy_init_ptr_arg
fexit_bpf2bpf
tailcalls
trace_ext
xdp_bpf2bpf
xdp_metadata

View File

@@ -67,12 +67,14 @@ local_configs_path=${PROJECT_NAME}/vmtest/configs
DENYLIST=$(read_lists \
"$configs_path/DENYLIST" \
"$configs_path/DENYLIST.${ARCH}" \
"$local_configs_path/DENYLIST" \
"$local_configs_path/DENYLIST-${KERNEL}" \
"$local_configs_path/DENYLIST-${KERNEL}.${ARCH}" \
)
ALLOWLIST=$(read_lists \
"$configs_path/ALLOWLIST" \
"$configs_path/ALLOWLIST.${ARCH}" \
"$local_configs_path/ALLOWLIST" \
"$local_configs_path/ALLOWLIST-${KERNEL}" \
"$local_configs_path/ALLOWLIST-${KERNEL}.${ARCH}" \
)

View File

@@ -219,6 +219,14 @@ compilation and skeleton generation. Using Libbpf-rs will make building user
space part of the BPF application easier. Note that the BPF program themselves
must still be written in plain C.
libbpf logging
==============
By default, libbpf logs informational and warning messages to stderr. The
verbosity of these messages can be controlled by setting the environment
variable LIBBPF_LOG_LEVEL to either warn, info, or debug. A custom log
callback can be set using ``libbpf_set_print()``.
Additional Documentation
========================

View File

@@ -37,6 +37,14 @@
.off = 0, \
.imm = IMM })
#define BPF_CALL_REL(DST) \
((struct bpf_insn) { \
.code = BPF_JMP | BPF_CALL, \
.dst_reg = 0, \
.src_reg = BPF_PSEUDO_CALL, \
.off = 0, \
.imm = DST })
#define BPF_EXIT_INSN() \
((struct bpf_insn) { \
.code = BPF_JMP | BPF_EXIT, \

View File

@@ -3,6 +3,8 @@
#ifndef __LINUX_KERNEL_H
#define __LINUX_KERNEL_H
#include <linux/compiler.h>
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#endif

View File

@@ -42,6 +42,7 @@
#define BPF_JSGE 0x70 /* SGE is signed '>=', GE in x86 */
#define BPF_JSLT 0xc0 /* SLT is signed, '<' */
#define BPF_JSLE 0xd0 /* SLE is signed, '<=' */
#define BPF_JCOND 0xe0 /* conditional pseudo jumps: may_goto, goto_or_nop */
#define BPF_CALL 0x80 /* function call */
#define BPF_EXIT 0x90 /* function return */
@@ -50,6 +51,10 @@
#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
enum bpf_cond_pseudo_jmp {
BPF_MAY_GOTO = 0,
};
/* Register numbers */
enum {
BPF_REG_0 = 0,
@@ -77,12 +82,29 @@ struct bpf_insn {
__s32 imm; /* signed immediate constant */
};
/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */
/* Deprecated: use struct bpf_lpm_trie_key_u8 (when the "data" member is needed for
* byte access) or struct bpf_lpm_trie_key_hdr (when using an alternative type for
* the trailing flexible array member) instead.
*/
struct bpf_lpm_trie_key {
__u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */
__u8 data[0]; /* Arbitrary size */
};
/* Header for bpf_lpm_trie_key structs */
struct bpf_lpm_trie_key_hdr {
__u32 prefixlen;
};
/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry, with trailing byte array. */
struct bpf_lpm_trie_key_u8 {
union {
struct bpf_lpm_trie_key_hdr hdr;
__u32 prefixlen;
};
__u8 data[]; /* Arbitrary size */
};
struct bpf_cgroup_storage_key {
__u64 cgroup_inode_id; /* cgroup inode id */
__u32 attach_type; /* program attach type (enum bpf_attach_type) */
@@ -617,7 +639,11 @@ union bpf_iter_link_info {
* to NULL to begin the batched operation. After each subsequent
* **BPF_MAP_LOOKUP_BATCH**, the caller should pass the resultant
* *out_batch* as the *in_batch* for the next operation to
* continue iteration from the current point.
* continue iteration from the current point. Both *in_batch* and
* *out_batch* must point to memory large enough to hold a key,
* except for maps of type **BPF_MAP_TYPE_{HASH, PERCPU_HASH,
* LRU_HASH, LRU_PERCPU_HASH}**, for which batch parameters
* must be at least 4 bytes wide regardless of key size.
*
* The *keys* and *values* are output parameters which must point
* to memory large enough to hold *count* items based on the key
@@ -847,6 +873,36 @@ union bpf_iter_link_info {
* Returns zero on success. On error, -1 is returned and *errno*
* is set appropriately.
*
* BPF_TOKEN_CREATE
* Description
* Create BPF token with embedded information about what
* BPF-related functionality it allows:
* - a set of allowed bpf() syscall commands;
* - a set of allowed BPF map types to be created with
* BPF_MAP_CREATE command, if BPF_MAP_CREATE itself is allowed;
* - a set of allowed BPF program types and BPF program attach
* types to be loaded with BPF_PROG_LOAD command, if
* BPF_PROG_LOAD itself is allowed.
*
* BPF token is created (derived) from an instance of BPF FS,
* assuming it has necessary delegation mount options specified.
* This BPF token can be passed as an extra parameter to various
* bpf() syscall commands to grant BPF subsystem functionality to
* unprivileged processes.
*
* When created, BPF token is "associated" with the owning
* user namespace of BPF FS instance (super block) that it was
* derived from, and subsequent BPF operations performed with
* BPF token would be performing capabilities checks (i.e.,
* CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN) within
* that user namespace. Without BPF token, such capabilities
* have to be granted in init user namespace, making bpf()
* syscall incompatible with user namespace, for the most part.
*
* Return
* A new file descriptor (a nonnegative integer), or -1 if an
* error occurred (in which case, *errno* is set appropriately).
*
* NOTES
* eBPF objects (maps and programs) can be shared between processes.
*
@@ -901,6 +957,8 @@ enum bpf_cmd {
BPF_ITER_CREATE,
BPF_LINK_DETACH,
BPF_PROG_BIND_MAP,
BPF_TOKEN_CREATE,
__MAX_BPF_CMD,
};
enum bpf_map_type {
@@ -951,6 +1009,8 @@ enum bpf_map_type {
BPF_MAP_TYPE_BLOOM_FILTER,
BPF_MAP_TYPE_USER_RINGBUF,
BPF_MAP_TYPE_CGRP_STORAGE,
BPF_MAP_TYPE_ARENA,
__MAX_BPF_MAP_TYPE
};
/* Note that tracing related programs such as
@@ -995,6 +1055,7 @@ enum bpf_prog_type {
BPF_PROG_TYPE_SK_LOOKUP,
BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
BPF_PROG_TYPE_NETFILTER,
__MAX_BPF_PROG_TYPE
};
enum bpf_attach_type {
@@ -1054,6 +1115,7 @@ enum bpf_attach_type {
BPF_CGROUP_UNIX_GETSOCKNAME,
BPF_NETKIT_PRIMARY,
BPF_NETKIT_PEER,
BPF_TRACE_KPROBE_SESSION,
__MAX_BPF_ATTACH_TYPE
};
@@ -1074,6 +1136,7 @@ enum bpf_link_type {
BPF_LINK_TYPE_TCX = 11,
BPF_LINK_TYPE_UPROBE_MULTI = 12,
BPF_LINK_TYPE_NETKIT = 13,
BPF_LINK_TYPE_SOCKMAP = 14,
__MAX_BPF_LINK_TYPE,
};
@@ -1278,6 +1341,10 @@ enum {
*/
#define BPF_PSEUDO_KFUNC_CALL 2
enum bpf_addr_space_cast {
BPF_ADDR_SPACE_CAST = 1,
};
/* flags for BPF_MAP_UPDATE_ELEM command */
enum {
BPF_ANY = 0, /* create new element or update existing */
@@ -1330,6 +1397,18 @@ enum {
/* Get path from provided FD in BPF_OBJ_PIN/BPF_OBJ_GET commands */
BPF_F_PATH_FD = (1U << 14),
/* Flag for value_type_btf_obj_fd, the fd is available */
BPF_F_VTYPE_BTF_OBJ_FD = (1U << 15),
/* BPF token FD is passed in a corresponding command's token_fd field */
BPF_F_TOKEN_FD = (1U << 16),
/* When user space page faults in bpf_arena send SIGSEGV instead of inserting new page */
BPF_F_SEGV_ON_FAULT = (1U << 17),
/* Do not translate kernel bpf_arena pointers to user pointers */
BPF_F_NO_USER_CONV = (1U << 18),
};
/* Flags for BPF_PROG_QUERY. */
@@ -1346,6 +1425,8 @@ enum {
#define BPF_F_TEST_RUN_ON_CPU (1U << 0)
/* If set, XDP frames will be transmitted after processing */
#define BPF_F_TEST_XDP_LIVE_FRAMES (1U << 1)
/* If set, apply CHECKSUM_COMPLETE to skb and validate the checksum */
#define BPF_F_TEST_SKB_CHECKSUM_COMPLETE (1U << 2)
/* type for BPF_ENABLE_STATS */
enum bpf_stats_type {
@@ -1401,8 +1482,20 @@ union bpf_attr {
* BPF_MAP_TYPE_BLOOM_FILTER - the lowest 4 bits indicate the
* number of hash functions (if 0, the bloom filter will default
* to using 5 hash functions).
*
* BPF_MAP_TYPE_ARENA - contains the address where user space
* is going to mmap() the arena. It has to be page aligned.
*/
__u64 map_extra;
__s32 value_type_btf_obj_fd; /* fd pointing to a BTF
* type data for
* btf_vmlinux_value_type_id.
*/
/* BPF token FD to use with BPF_MAP_CREATE operation.
* If provided, map_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 map_token_fd;
};
struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -1472,6 +1565,10 @@ union bpf_attr {
* truncated), or smaller (if log buffer wasn't filled completely).
*/
__u32 log_true_size;
/* BPF token FD to use with BPF_PROG_LOAD operation.
* If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 prog_token_fd;
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -1569,8 +1666,10 @@ union bpf_attr {
} query;
struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
__u64 name;
__u32 prog_fd;
__u64 name;
__u32 prog_fd;
__u32 :32;
__aligned_u64 cookie;
} raw_tracepoint;
struct { /* anonymous struct for BPF_BTF_LOAD */
@@ -1584,6 +1683,11 @@ union bpf_attr {
* truncated), or smaller (if log buffer wasn't filled completely).
*/
__u32 btf_log_true_size;
__u32 btf_flags;
/* BPF token FD to use with BPF_BTF_LOAD operation.
* If provided, btf_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 btf_token_fd;
};
struct {
@@ -1714,6 +1818,11 @@ union bpf_attr {
__u32 flags; /* extra flags */
} prog_bind_map;
struct { /* struct used by BPF_TOKEN_CREATE command */
__u32 flags;
__u32 bpffs_fd;
} token_create;
} __attribute__((aligned(8)));
/* The description below is an attempt at providing documentation to eBPF
@@ -3289,6 +3398,10 @@ union bpf_attr {
* for the nexthop. If the src addr cannot be derived,
* **BPF_FIB_LKUP_RET_NO_SRC_ADDR** is returned. In this
* case, *params*->dmac and *params*->smac are not set either.
* **BPF_FIB_LOOKUP_MARK**
* Use the mark present in *params*->mark for the fib lookup.
* This option should not be used with BPF_FIB_LOOKUP_DIRECT,
* as it only has meaning for full lookups.
*
* *ctx* is either **struct xdp_md** for XDP programs or
* **struct sk_buff** tc cls_act programs.
@@ -4839,9 +4952,9 @@ union bpf_attr {
* going through the CPU's backlog queue.
*
* The *flags* argument is reserved and must be 0. The helper is
* currently only supported for tc BPF program types at the ingress
* hook and for veth device types. The peer device must reside in a
* different network namespace.
* currently only supported for tc BPF program types at the
* ingress hook and for veth and netkit target device types. The
* peer device must reside in a different network namespace.
* Return
* The helper returns **TC_ACT_REDIRECT** on success or
* **TC_ACT_SHOT** on error.
@@ -4917,7 +5030,7 @@ union bpf_attr {
* bytes will be copied to *dst*
* Return
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if IMA is disabled or **-EINVAL** if
* **-EOPNOTSUPP** if IMA is disabled or **-EINVAL** if
* invalid arguments are passed.
*
* struct socket *bpf_sock_from_file(struct file *file)
@@ -5403,7 +5516,7 @@ union bpf_attr {
* bytes will be copied to *dst*
* Return
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
* **-EOPNOTSUPP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
*
* void *bpf_kptr_xchg(void *map_value, void *ptr)
@@ -6096,12 +6209,17 @@ union { \
__u64 :64; \
} __attribute__((aligned(8)))
/* The enum used in skb->tstamp_type. It specifies the clock type
* of the time stored in the skb->tstamp.
*/
enum {
BPF_SKB_TSTAMP_UNSPEC,
BPF_SKB_TSTAMP_DELIVERY_MONO, /* tstamp has mono delivery time */
/* For any BPF_SKB_TSTAMP_* that the bpf prog cannot handle,
* the bpf prog should handle it like BPF_SKB_TSTAMP_UNSPEC
* and try to deduce it by ingress, egress or skb->sk->sk_clockid.
BPF_SKB_TSTAMP_UNSPEC = 0, /* DEPRECATED */
BPF_SKB_TSTAMP_DELIVERY_MONO = 1, /* DEPRECATED */
BPF_SKB_CLOCK_REALTIME = 0,
BPF_SKB_CLOCK_MONOTONIC = 1,
BPF_SKB_CLOCK_TAI = 2,
/* For any future BPF_SKB_CLOCK_* that the bpf prog cannot handle,
* the bpf prog can try to deduce it by ingress/egress/skb->sk->sk_clockid.
*/
};
@@ -6487,7 +6605,7 @@ struct bpf_map_info {
__u32 btf_id;
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 :32; /* alignment pad */
__u32 btf_vmlinux_id;
__u64 map_extra;
} __attribute__((aligned(8)));
@@ -6563,6 +6681,7 @@ struct bpf_link_info {
__u32 count; /* in/out: kprobe_multi function count */
__u32 flags;
__u64 missed;
__aligned_u64 cookies;
} kprobe_multi;
struct {
__aligned_u64 path;
@@ -6582,6 +6701,7 @@ struct bpf_link_info {
__aligned_u64 file_name; /* in/out */
__u32 name_len;
__u32 offset; /* offset from file_name */
__u64 cookie;
} uprobe; /* BPF_PERF_EVENT_UPROBE, BPF_PERF_EVENT_URETPROBE */
struct {
__aligned_u64 func_name; /* in/out */
@@ -6589,14 +6709,19 @@ struct bpf_link_info {
__u32 offset; /* offset from func_name */
__u64 addr;
__u64 missed;
__u64 cookie;
} kprobe; /* BPF_PERF_EVENT_KPROBE, BPF_PERF_EVENT_KRETPROBE */
struct {
__aligned_u64 tp_name; /* in/out */
__u32 name_len;
__u32 :32;
__u64 cookie;
} tracepoint; /* BPF_PERF_EVENT_TRACEPOINT */
struct {
__u64 config;
__u32 type;
__u32 :32;
__u64 cookie;
} event; /* BPF_PERF_EVENT_EVENT */
};
} perf_event;
@@ -6608,6 +6733,10 @@ struct bpf_link_info {
__u32 ifindex;
__u32 attach_type;
} netkit;
struct {
__u32 map_id;
__u32 attach_type;
} sockmap;
};
} __attribute__((aligned(8)));
@@ -6826,6 +6955,8 @@ enum {
* socket transition to LISTEN state.
*/
BPF_SOCK_OPS_RTT_CB, /* Called on every RTT.
* Arg1: measured RTT input (mrtt)
* Arg2: updated srtt
*/
BPF_SOCK_OPS_PARSE_HDR_OPT_CB, /* Parse the header option.
* It will be called to handle
@@ -6904,6 +7035,7 @@ enum {
BPF_TCP_LISTEN,
BPF_TCP_CLOSING, /* Now a valid state */
BPF_TCP_NEW_SYN_RECV,
BPF_TCP_BOUND_INACTIVE,
BPF_TCP_MAX_STATES /* Leave at the end! */
};
@@ -7007,6 +7139,7 @@ enum {
BPF_FIB_LOOKUP_SKIP_NEIGH = (1U << 2),
BPF_FIB_LOOKUP_TBID = (1U << 3),
BPF_FIB_LOOKUP_SRC = (1U << 4),
BPF_FIB_LOOKUP_MARK = (1U << 5),
};
enum {
@@ -7039,7 +7172,7 @@ struct bpf_fib_lookup {
/* output: MTU value */
__u16 mtu_result;
};
} __attribute__((packed, aligned(2)));
/* input: L3 device index for lookup
* output: device index from FIB lookup
*/
@@ -7084,8 +7217,19 @@ struct bpf_fib_lookup {
__u32 tbid;
};
__u8 smac[6]; /* ETH_ALEN */
__u8 dmac[6]; /* ETH_ALEN */
union {
/* input */
struct {
__u32 mark; /* policy routing */
/* 2 4-byte holes for input */
};
/* output: source and dest mac */
struct {
__u8 smac[6]; /* ETH_ALEN */
__u8 dmac[6]; /* ETH_ALEN */
};
};
};
struct bpf_redir_neigh {
@@ -7172,6 +7316,10 @@ struct bpf_timer {
__u64 __opaque[2];
} __attribute__((aligned(8)));
struct bpf_wq {
__u64 __opaque[2];
} __attribute__((aligned(8)));
struct bpf_dynptr {
__u64 __opaque[2];
} __attribute__((aligned(8)));
@@ -7364,4 +7512,13 @@ struct bpf_iter_num {
__u64 __opaque[1];
} __attribute__((aligned(8)));
/*
* Flags to control BPF kfunc behaviour.
* - BPF_F_PAD_ZEROS: Pad destination buffer with zeros. (See the respective
* helper documentation for details.)
*/
enum bpf_kfunc_flags {
BPF_F_PAD_ZEROS = (1ULL << 0),
};
#endif /* _UAPI__LINUX_BPF_H__ */

View File

@@ -1,120 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _UAPI_LINUX_FCNTL_H
#define _UAPI_LINUX_FCNTL_H
#include <asm/fcntl.h>
#include <linux/openat2.h>
#define F_SETLEASE (F_LINUX_SPECIFIC_BASE + 0)
#define F_GETLEASE (F_LINUX_SPECIFIC_BASE + 1)
/*
* Cancel a blocking posix lock; internal use only until we expose an
* asynchronous lock api to userspace:
*/
#define F_CANCELLK (F_LINUX_SPECIFIC_BASE + 5)
/* Create a file descriptor with FD_CLOEXEC set. */
#define F_DUPFD_CLOEXEC (F_LINUX_SPECIFIC_BASE + 6)
/*
* Request nofications on a directory.
* See below for events that may be notified.
*/
#define F_NOTIFY (F_LINUX_SPECIFIC_BASE+2)
/*
* Set and get of pipe page size array
*/
#define F_SETPIPE_SZ (F_LINUX_SPECIFIC_BASE + 7)
#define F_GETPIPE_SZ (F_LINUX_SPECIFIC_BASE + 8)
/*
* Set/Get seals
*/
#define F_ADD_SEALS (F_LINUX_SPECIFIC_BASE + 9)
#define F_GET_SEALS (F_LINUX_SPECIFIC_BASE + 10)
/*
* Types of seals
*/
#define F_SEAL_SEAL 0x0001 /* prevent further seals from being set */
#define F_SEAL_SHRINK 0x0002 /* prevent file from shrinking */
#define F_SEAL_GROW 0x0004 /* prevent file from growing */
#define F_SEAL_WRITE 0x0008 /* prevent writes */
#define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */
#define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */
/* (1U << 31) is reserved for signed error codes */
/*
* Set/Get write life time hints. {GET,SET}_RW_HINT operate on the
* underlying inode, while {GET,SET}_FILE_RW_HINT operate only on
* the specific file.
*/
#define F_GET_RW_HINT (F_LINUX_SPECIFIC_BASE + 11)
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#define F_GET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 13)
#define F_SET_FILE_RW_HINT (F_LINUX_SPECIFIC_BASE + 14)
/*
* Valid hint values for F_{GET,SET}_RW_HINT. 0 is "not set", or can be
* used to clear any hints previously set.
*/
#define RWH_WRITE_LIFE_NOT_SET 0
#define RWH_WRITE_LIFE_NONE 1
#define RWH_WRITE_LIFE_SHORT 2
#define RWH_WRITE_LIFE_MEDIUM 3
#define RWH_WRITE_LIFE_LONG 4
#define RWH_WRITE_LIFE_EXTREME 5
/*
* The originally introduced spelling is remained from the first
* versions of the patch set that introduced the feature, see commit
* v4.13-rc1~212^2~51.
*/
#define RWF_WRITE_LIFE_NOT_SET RWH_WRITE_LIFE_NOT_SET
/*
* Types of directory notifications that may be requested.
*/
#define DN_ACCESS 0x00000001 /* File accessed */
#define DN_MODIFY 0x00000002 /* File modified */
#define DN_CREATE 0x00000004 /* File created */
#define DN_DELETE 0x00000008 /* File removed */
#define DN_RENAME 0x00000010 /* File renamed */
#define DN_ATTRIB 0x00000020 /* File changed attibutes */
#define DN_MULTISHOT 0x80000000 /* Don't remove notifier */
/*
* The constants AT_REMOVEDIR and AT_EACCESS have the same value. AT_EACCESS is
* meaningful only to faccessat, while AT_REMOVEDIR is meaningful only to
* unlinkat. The two functions do completely different things and therefore,
* the flags can be allowed to overlap. For example, passing AT_REMOVEDIR to
* faccessat would be undefined behavior and thus treating it equivalent to
* AT_EACCESS is valid undefined behavior.
*/
#define AT_FDCWD -100 /* Special value used to indicate
openat should use the current
working directory. */
#define AT_SYMLINK_NOFOLLOW 0x100 /* Do not follow symbolic links. */
#define AT_EACCESS 0x200 /* Test access permitted for
effective IDs, not real IDs. */
#define AT_REMOVEDIR 0x200 /* Remove directory instead of
unlinking file. */
#define AT_SYMLINK_FOLLOW 0x400 /* Follow symbolic links. */
#define AT_NO_AUTOMOUNT 0x800 /* Suppress terminal automount traversal */
#define AT_EMPTY_PATH 0x1000 /* Allow empty relative pathname */
#define AT_STATX_SYNC_TYPE 0x6000 /* Type of synchronisation required from statx() */
#define AT_STATX_SYNC_AS_STAT 0x0000 /* - Do whatever stat() does */
#define AT_STATX_FORCE_SYNC 0x2000 /* - Force the attributes to be sync'd with the server */
#define AT_STATX_DONT_SYNC 0x4000 /* - Don't sync attributes with the server */
#define AT_RECURSIVE 0x8000 /* Apply to the entire subtree */
/* Flags for name_to_handle_at(2). We reuse AT_ flag space to save bits... */
#define AT_HANDLE_FID AT_REMOVEDIR /* file handle is needed to
compare object identity and may not
be usable to open_by_handle_at(2) */
#endif /* _UAPI_LINUX_FCNTL_H */

View File

@@ -974,6 +974,7 @@ enum {
IFLA_BOND_AD_LACP_ACTIVE,
IFLA_BOND_MISSED_MAX,
IFLA_BOND_NS_IP6_TARGET,
IFLA_BOND_COUPLED_CONTROL,
__IFLA_BOND_MAX,
};

View File

@@ -41,6 +41,10 @@
*/
#define XDP_UMEM_TX_SW_CSUM (1 << 1)
/* Request to reserve tx_metadata_len bytes of per-chunk metadata.
*/
#define XDP_UMEM_TX_METADATA_LEN (1 << 2)
struct sockaddr_xdp {
__u16 sxdp_family;
__u16 sxdp_flags;

View File

@@ -63,9 +63,6 @@ enum netdev_xdp_rx_metadata {
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
/* private: */
NETDEV_XSK_FLAGS_MASK = 3,
};
enum netdev_queue_type {
@@ -73,6 +70,10 @@ enum netdev_queue_type {
NETDEV_QUEUE_TYPE_TX,
};
enum netdev_qstats_scope {
NETDEV_QSTATS_SCOPE_QUEUE = 1,
};
enum {
NETDEV_A_DEV_IFINDEX = 1,
NETDEV_A_DEV_PAD,
@@ -135,6 +136,43 @@ enum {
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
};
enum {
NETDEV_A_QSTATS_IFINDEX = 1,
NETDEV_A_QSTATS_QUEUE_TYPE,
NETDEV_A_QSTATS_QUEUE_ID,
NETDEV_A_QSTATS_SCOPE,
NETDEV_A_QSTATS_RX_PACKETS = 8,
NETDEV_A_QSTATS_RX_BYTES,
NETDEV_A_QSTATS_TX_PACKETS,
NETDEV_A_QSTATS_TX_BYTES,
NETDEV_A_QSTATS_RX_ALLOC_FAIL,
NETDEV_A_QSTATS_RX_HW_DROPS,
NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS,
NETDEV_A_QSTATS_RX_CSUM_COMPLETE,
NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY,
NETDEV_A_QSTATS_RX_CSUM_NONE,
NETDEV_A_QSTATS_RX_CSUM_BAD,
NETDEV_A_QSTATS_RX_HW_GRO_PACKETS,
NETDEV_A_QSTATS_RX_HW_GRO_BYTES,
NETDEV_A_QSTATS_RX_HW_GRO_WIRE_PACKETS,
NETDEV_A_QSTATS_RX_HW_GRO_WIRE_BYTES,
NETDEV_A_QSTATS_RX_HW_DROP_RATELIMITS,
NETDEV_A_QSTATS_TX_HW_DROPS,
NETDEV_A_QSTATS_TX_HW_DROP_ERRORS,
NETDEV_A_QSTATS_TX_CSUM_NONE,
NETDEV_A_QSTATS_TX_NEEDS_CSUM,
NETDEV_A_QSTATS_TX_HW_GSO_PACKETS,
NETDEV_A_QSTATS_TX_HW_GSO_BYTES,
NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS,
NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES,
NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS,
NETDEV_A_QSTATS_TX_STOP,
NETDEV_A_QSTATS_TX_WAKE,
__NETDEV_A_QSTATS_MAX,
NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)
};
enum {
NETDEV_CMD_DEV_GET = 1,
NETDEV_CMD_DEV_ADD_NTF,
@@ -147,6 +185,7 @@ enum {
NETDEV_CMD_PAGE_POOL_STATS_GET,
NETDEV_CMD_QUEUE_GET,
NETDEV_CMD_NAPI_GET,
NETDEV_CMD_QSTATS_GET,
__NETDEV_CMD_MAX,
NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)

View File

@@ -1,43 +0,0 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _UAPI_LINUX_OPENAT2_H
#define _UAPI_LINUX_OPENAT2_H
#include <linux/types.h>
/*
* Arguments for how openat2(2) should open the target path. If only @flags and
* @mode are non-zero, then openat2(2) operates very similarly to openat(2).
*
* However, unlike openat(2), unknown or invalid bits in @flags result in
* -EINVAL rather than being silently ignored. @mode must be zero unless one of
* {O_CREAT, O_TMPFILE} are set.
*
* @flags: O_* flags.
* @mode: O_CREAT/O_TMPFILE file mode.
* @resolve: RESOLVE_* flags.
*/
struct open_how {
__u64 flags;
__u64 mode;
__u64 resolve;
};
/* how->resolve flags for openat2(2). */
#define RESOLVE_NO_XDEV 0x01 /* Block mount-point crossings
(includes bind-mounts). */
#define RESOLVE_NO_MAGICLINKS 0x02 /* Block traversal through procfs-style
"magic-links". */
#define RESOLVE_NO_SYMLINKS 0x04 /* Block traversal through all symlinks
(implies OEXT_NO_MAGICLINKS) */
#define RESOLVE_BENEATH 0x08 /* Block "lexical" trickery like
"..", symlinks, and absolute
paths which escape the dirfd. */
#define RESOLVE_IN_ROOT 0x10 /* Make all jumps to "/" and ".."
be scoped inside the dirfd
(similar to chroot(2)). */
#define RESOLVE_CACHED 0x20 /* Only complete if resolution can be
completed through cached lookup. May
return -EAGAIN if that's not
possible. */
#endif /* _UAPI_LINUX_OPENAT2_H */

View File

@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT = 18, /* save privilege mode */
PERF_SAMPLE_BRANCH_COUNTERS_SHIFT = 19, /* save occurrences of events on a branch */
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_PRIV_SAVE = 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
PERF_SAMPLE_BRANCH_COUNTERS = 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
@@ -982,6 +986,12 @@ enum perf_event_type {
* { u64 nr;
* { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
* { u64 from, to, flags } lbr[nr];
* #
* # The format of the counters is decided by the
* # "branch_counter_nr" and "branch_counter_width",
* # which are defined in the ABI.
* #
* { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
* } && PERF_SAMPLE_BRANCH_STACK
*
* { u64 abi; # enum perf_sample_regs_abi
@@ -1339,12 +1349,14 @@ union perf_mem_data_src {
#define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
#define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
#define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
/* 5-0x7 available */
#define PERF_MEM_LVLNUM_L2_MHB 0x05 /* L2 Miss Handling Buffer */
#define PERF_MEM_LVLNUM_MSC 0x06 /* Memory-side Cache */
/* 0x7 available */
#define PERF_MEM_LVLNUM_UNC 0x08 /* Uncached */
#define PERF_MEM_LVLNUM_CXL 0x09 /* CXL */
#define PERF_MEM_LVLNUM_IO 0x0a /* I/O */
#define PERF_MEM_LVLNUM_ANY_CACHE 0x0b /* Any cache */
#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB */
#define PERF_MEM_LVLNUM_LFB 0x0c /* LFB / L1 Miss Handling Buffer */
#define PERF_MEM_LVLNUM_RAM 0x0d /* RAM */
#define PERF_MEM_LVLNUM_PMEM 0x0e /* PMEM */
#define PERF_MEM_LVLNUM_NA 0x0f /* N/A */
@@ -1427,6 +1439,9 @@ struct perf_branch_entry {
reserved:31;
};
/* Size of used info bits in struct perf_branch_entry */
#define PERF_BRANCH_ENTRY_INFO_BITS_MAX 33
union perf_sample_weight {
__u64 full;
#if defined(__LITTLE_ENDIAN_BITFIELD)

37
scripts/mailmap-update.sh Executable file
View File

@@ -0,0 +1,37 @@
#!/usr/bin/env bash
set -eu
usage () {
echo "USAGE: ./mailmap-update.sh <libbpf-repo> <linux-repo>"
exit 1
}
LIBBPF_REPO="${1-""}"
LINUX_REPO="${2-""}"
if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ]; then
echo "Error: libbpf or linux repos are not specified"
usage
fi
LIBBPF_MAILMAP="${LIBBPF_REPO}/.mailmap"
LINUX_MAILMAP="${LINUX_REPO}/.mailmap"
tmpfile="$(mktemp)"
cleanup() {
rm -f "${tmpfile}"
}
trap cleanup EXIT
grep_lines() {
local pattern="$1"
local file="$2"
grep "${pattern}" "${file}" || true
}
while read -r email; do
grep_lines "${email}$" "${LINUX_MAILMAP}" >> "${tmpfile}"
done < <(git log --format='<%ae>' | sort -u)
sort -u "${tmpfile}" > "${LIBBPF_MAILMAP}"

View File

@@ -295,6 +295,22 @@ Latest changes to BPF helper definitions.
" -- src/bpf_helper_defs.h
fi
echo "Regenerating .mailmap..."
cd_to "${LINUX_REPO}"
git checkout "${TIP_SYM_REF}"
cd_to "${LIBBPF_REPO}"
"${LIBBPF_REPO}"/scripts/mailmap-update.sh "${LIBBPF_REPO}" "${LINUX_REPO}"
# if anything changed, commit it
mailmap_changes=$(git status --porcelain .mailmap | wc -l)
if ((${mailmap_changes} == 1)); then
git add .mailmap
git commit -s -m "sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.
" -- .mailmap
fi
# Use generated cover-letter as a template for "sync commit" with
# baseline and checkpoint commits from kernel repo (and leave summary
# from cover letter intact, of course)

View File

@@ -9,7 +9,7 @@ else
endif
LIBBPF_MAJOR_VERSION := 1
LIBBPF_MINOR_VERSION := 4
LIBBPF_MINOR_VERSION := 5
LIBBPF_PATCH_VERSION := 0
LIBBPF_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).$(LIBBPF_PATCH_VERSION)
LIBBPF_MAJMIN_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).0
@@ -55,7 +55,7 @@ STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o \
btf_dump.o hashmap.o ringbuf.o strset.o linker.o gen_loader.o \
relo_core.o usdt.o zip.o elf.o
relo_core.o usdt.o zip.o elf.o features.o btf_iter.o btf_relocate.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
@@ -119,13 +119,13 @@ $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)
-Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$^ $(ALL_LDFLAGS) -o $@
$(OBJDIR)/libbpf.pc: force
$(OBJDIR)/libbpf.pc: force | $(OBJDIR)
$(Q)sed -e "s|@PREFIX@|$(PREFIX)|" \
-e "s|@LIBDIR@|$(LIBDIR_PC)|" \
-e "s|@VERSION@|$(LIBBPF_VERSION)|" \
< libbpf.pc.template > $@
$(STATIC_OBJDIR) $(SHARED_OBJDIR):
$(OBJDIR) $(STATIC_OBJDIR) $(SHARED_OBJDIR):
$(call msg,MKDIR,$@)
$(Q)mkdir -p $@

View File

@@ -103,9 +103,9 @@ int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size, int attempts)
* [0] https://lore.kernel.org/bpf/20201201215900.3569844-1-guro@fb.com/
* [1] d05512618056 ("bpf: Add bpf_ktime_get_coarse_ns helper")
*/
int probe_memcg_account(void)
int probe_memcg_account(int token_fd)
{
const size_t attr_sz = offsetofend(union bpf_attr, attach_btf_obj_fd);
const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
struct bpf_insn insns[] = {
BPF_EMIT_CALL(BPF_FUNC_ktime_get_coarse_ns),
BPF_EXIT_INSN(),
@@ -120,6 +120,9 @@ int probe_memcg_account(void)
attr.insns = ptr_to_u64(insns);
attr.insn_cnt = insn_cnt;
attr.license = ptr_to_u64("GPL");
attr.prog_token_fd = token_fd;
if (token_fd)
attr.prog_flags |= BPF_F_TOKEN_FD;
prog_fd = sys_bpf_fd(BPF_PROG_LOAD, &attr, attr_sz);
if (prog_fd >= 0) {
@@ -146,7 +149,7 @@ int bump_rlimit_memlock(void)
struct rlimit rlim;
/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
if (memlock_bumped || feat_supported(NULL, FEAT_MEMCG_ACCOUNT))
return 0;
memlock_bumped = true;
@@ -169,7 +172,7 @@ int bpf_map_create(enum bpf_map_type map_type,
__u32 max_entries,
const struct bpf_map_create_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, map_extra);
const size_t attr_sz = offsetofend(union bpf_attr, map_token_fd);
union bpf_attr attr;
int fd;
@@ -181,7 +184,7 @@ int bpf_map_create(enum bpf_map_type map_type,
return libbpf_err(-EINVAL);
attr.map_type = map_type;
if (map_name && kernel_supports(NULL, FEAT_PROG_NAME))
if (map_name && feat_supported(NULL, FEAT_PROG_NAME))
libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name));
attr.key_size = key_size;
attr.value_size = value_size;
@@ -191,6 +194,7 @@ int bpf_map_create(enum bpf_map_type map_type,
attr.btf_key_type_id = OPTS_GET(opts, btf_key_type_id, 0);
attr.btf_value_type_id = OPTS_GET(opts, btf_value_type_id, 0);
attr.btf_vmlinux_value_type_id = OPTS_GET(opts, btf_vmlinux_value_type_id, 0);
attr.value_type_btf_obj_fd = OPTS_GET(opts, value_type_btf_obj_fd, 0);
attr.inner_map_fd = OPTS_GET(opts, inner_map_fd, 0);
attr.map_flags = OPTS_GET(opts, map_flags, 0);
@@ -198,6 +202,8 @@ int bpf_map_create(enum bpf_map_type map_type,
attr.numa_node = OPTS_GET(opts, numa_node, 0);
attr.map_ifindex = OPTS_GET(opts, map_ifindex, 0);
attr.map_token_fd = OPTS_GET(opts, token_fd, 0);
fd = sys_bpf_fd(BPF_MAP_CREATE, &attr, attr_sz);
return libbpf_err_errno(fd);
}
@@ -232,7 +238,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
const struct bpf_insn *insns, size_t insn_cnt,
struct bpf_prog_load_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, log_true_size);
const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
void *finfo = NULL, *linfo = NULL;
const char *func_info, *line_info;
__u32 log_size, log_level, attach_prog_fd, attach_btf_obj_fd;
@@ -261,8 +267,9 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
attr.prog_flags = OPTS_GET(opts, prog_flags, 0);
attr.prog_ifindex = OPTS_GET(opts, prog_ifindex, 0);
attr.kern_version = OPTS_GET(opts, kern_version, 0);
attr.prog_token_fd = OPTS_GET(opts, token_fd, 0);
if (prog_name && kernel_supports(NULL, FEAT_PROG_NAME))
if (prog_name && feat_supported(NULL, FEAT_PROG_NAME))
libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name));
attr.license = ptr_to_u64(license);
@@ -759,6 +766,7 @@ int bpf_link_create(int prog_fd, int target_fd,
return libbpf_err(-EINVAL);
break;
case BPF_TRACE_KPROBE_MULTI:
case BPF_TRACE_KPROBE_SESSION:
attr.link_create.kprobe_multi.flags = OPTS_GET(opts, kprobe_multi.flags, 0);
attr.link_create.kprobe_multi.cnt = OPTS_GET(opts, kprobe_multi.cnt, 0);
attr.link_create.kprobe_multi.syms = ptr_to_u64(OPTS_GET(opts, kprobe_multi.syms, 0));
@@ -778,6 +786,7 @@ int bpf_link_create(int prog_fd, int target_fd,
if (!OPTS_ZEROED(opts, uprobe_multi))
return libbpf_err(-EINVAL);
break;
case BPF_TRACE_RAW_TP:
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
@@ -1166,23 +1175,34 @@ int bpf_link_get_info_by_fd(int link_fd, struct bpf_link_info *info, __u32 *info
return bpf_obj_get_info_by_fd(link_fd, info, info_len);
}
int bpf_raw_tracepoint_open(const char *name, int prog_fd)
int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, raw_tracepoint);
union bpf_attr attr;
int fd;
if (!OPTS_VALID(opts, bpf_raw_tp_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, attr_sz);
attr.raw_tracepoint.name = ptr_to_u64(name);
attr.raw_tracepoint.prog_fd = prog_fd;
attr.raw_tracepoint.name = ptr_to_u64(OPTS_GET(opts, tp_name, NULL));
attr.raw_tracepoint.cookie = OPTS_GET(opts, cookie, 0);
fd = sys_bpf_fd(BPF_RAW_TRACEPOINT_OPEN, &attr, attr_sz);
return libbpf_err_errno(fd);
}
int bpf_raw_tracepoint_open(const char *name, int prog_fd)
{
LIBBPF_OPTS(bpf_raw_tp_opts, opts, .tp_name = name);
return bpf_raw_tracepoint_open_opts(prog_fd, &opts);
}
int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, btf_log_true_size);
const size_t attr_sz = offsetofend(union bpf_attr, btf_token_fd);
union bpf_attr attr;
char *log_buf;
size_t log_size;
@@ -1207,6 +1227,10 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, struct bpf_btf_load_opts
attr.btf = ptr_to_u64(btf_data);
attr.btf_size = btf_size;
attr.btf_flags = OPTS_GET(opts, btf_flags, 0);
attr.btf_token_fd = OPTS_GET(opts, token_fd, 0);
/* log_level == 0 and log_buf != NULL means "try loading without
* log_buf, but retry with log_buf and log_level=1 on error", which is
* consistent across low-level and high-level BTF and program loading
@@ -1287,3 +1311,20 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
ret = sys_bpf(BPF_PROG_BIND_MAP, &attr, attr_sz);
return libbpf_err_errno(ret);
}
int bpf_token_create(int bpffs_fd, struct bpf_token_create_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, token_create);
union bpf_attr attr;
int fd;
if (!OPTS_VALID(opts, bpf_token_create_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, attr_sz);
attr.token_create.bpffs_fd = bpffs_fd;
attr.token_create.flags = OPTS_GET(opts, flags, 0);
fd = sys_bpf_fd(BPF_TOKEN_CREATE, &attr, attr_sz);
return libbpf_err_errno(fd);
}

View File

@@ -35,7 +35,7 @@
extern "C" {
#endif
int libbpf_set_memlock_rlim(size_t memlock_bytes);
LIBBPF_API int libbpf_set_memlock_rlim(size_t memlock_bytes);
struct bpf_map_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
@@ -51,8 +51,12 @@ struct bpf_map_create_opts {
__u32 numa_node;
__u32 map_ifindex;
__s32 value_type_btf_obj_fd;
__u32 token_fd;
size_t :0;
};
#define bpf_map_create_opts__last_field map_ifindex
#define bpf_map_create_opts__last_field token_fd
LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
const char *map_name,
@@ -102,9 +106,10 @@ struct bpf_prog_load_opts {
* If kernel doesn't support this feature, log_size is left unchanged.
*/
__u32 log_true_size;
__u32 token_fd;
size_t :0;
};
#define bpf_prog_load_opts__last_field log_true_size
#define bpf_prog_load_opts__last_field token_fd
LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
@@ -130,9 +135,12 @@ struct bpf_btf_load_opts {
* If kernel doesn't support this feature, log_size is left unchanged.
*/
__u32 log_true_size;
__u32 btf_flags;
__u32 token_fd;
size_t :0;
};
#define bpf_btf_load_opts__last_field log_true_size
#define bpf_btf_load_opts__last_field token_fd
LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
struct bpf_btf_load_opts *opts);
@@ -182,10 +190,14 @@ LIBBPF_API int bpf_map_delete_batch(int fd, const void *keys,
/**
* @brief **bpf_map_lookup_batch()** allows for batch lookup of BPF map elements.
*
* The parameter *in_batch* is the address of the first element in the batch to read.
* *out_batch* is an output parameter that should be passed as *in_batch* to subsequent
* calls to **bpf_map_lookup_batch()**. NULL can be passed for *in_batch* to indicate
* that the batched lookup starts from the beginning of the map.
* The parameter *in_batch* is the address of the first element in the batch to
* read. *out_batch* is an output parameter that should be passed as *in_batch*
* to subsequent calls to **bpf_map_lookup_batch()**. NULL can be passed for
* *in_batch* to indicate that the batched lookup starts from the beginning of
* the map. Both *in_batch* and *out_batch* must point to memory large enough to
* hold a single key, except for maps of type **BPF_MAP_TYPE_{HASH, PERCPU_HASH,
* LRU_HASH, LRU_PERCPU_HASH}**, for which the memory size must be at
* least 4 bytes wide regardless of key size.
*
* The *keys* and *values* are output parameters which must point to memory large enough to
* hold *count* items based on the key and value size of the map *map_fd*. The *keys*
@@ -218,7 +230,10 @@ LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,
*
* @param fd BPF map file descriptor
* @param in_batch address of the first element in batch to read, can pass NULL to
* get address of the first element in *out_batch*
* get address of the first element in *out_batch*. If not NULL, must be large
* enough to hold a key. For **BPF_MAP_TYPE_{HASH, PERCPU_HASH, LRU_HASH,
* LRU_PERCPU_HASH}**, the memory size must be at least 4 bytes wide regardless
* of key size.
* @param out_batch output parameter that should be passed to next call as *in_batch*
* @param keys pointer to an array of *count* keys
* @param values pointer to an array large enough for *count* values
@@ -492,7 +507,10 @@ LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
* program corresponding to *prog_fd*.
*
* Populates up to *info_len* bytes of *info* and updates *info_len* with the
* actual number of bytes written to *info*.
* actual number of bytes written to *info*. Note that *info* should be
* zero-initialized or initialized as expected by the requested *info*
* type. Failing to (zero-)initialize *info* under certain circumstances can
* result in this helper returning an error.
*
* @param prog_fd BPF program file descriptor
* @param info pointer to **struct bpf_prog_info** that will be populated with
@@ -509,7 +527,10 @@ LIBBPF_API int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info,
* map corresponding to *map_fd*.
*
* Populates up to *info_len* bytes of *info* and updates *info_len* with the
* actual number of bytes written to *info*.
* actual number of bytes written to *info*. Note that *info* should be
* zero-initialized or initialized as expected by the requested *info*
* type. Failing to (zero-)initialize *info* under certain circumstances can
* result in this helper returning an error.
*
* @param map_fd BPF map file descriptor
* @param info pointer to **struct bpf_map_info** that will be populated with
@@ -522,11 +543,14 @@ LIBBPF_API int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info,
LIBBPF_API int bpf_map_get_info_by_fd(int map_fd, struct bpf_map_info *info, __u32 *info_len);
/**
* @brief **bpf_btf_get_info_by_fd()** obtains information about the
* @brief **bpf_btf_get_info_by_fd()** obtains information about the
* BTF object corresponding to *btf_fd*.
*
* Populates up to *info_len* bytes of *info* and updates *info_len* with the
* actual number of bytes written to *info*.
* actual number of bytes written to *info*. Note that *info* should be
* zero-initialized or initialized as expected by the requested *info*
* type. Failing to (zero-)initialize *info* under certain circumstances can
* result in this helper returning an error.
*
* @param btf_fd BTF object file descriptor
* @param info pointer to **struct bpf_btf_info** that will be populated with
@@ -543,7 +567,10 @@ LIBBPF_API int bpf_btf_get_info_by_fd(int btf_fd, struct bpf_btf_info *info, __u
* link corresponding to *link_fd*.
*
* Populates up to *info_len* bytes of *info* and updates *info_len* with the
* actual number of bytes written to *info*.
* actual number of bytes written to *info*. Note that *info* should be
* zero-initialized or initialized as expected by the requested *info*
* type. Failing to (zero-)initialize *info* under certain circumstances can
* result in this helper returning an error.
*
* @param link_fd BPF link file descriptor
* @param info pointer to **struct bpf_link_info** that will be populated with
@@ -590,6 +617,15 @@ LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
__u32 query_flags, __u32 *attach_flags,
__u32 *prog_ids, __u32 *prog_cnt);
struct bpf_raw_tp_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
const char *tp_name;
__u64 cookie;
size_t :0;
};
#define bpf_raw_tp_opts__last_field cookie
LIBBPF_API int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts);
LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
__u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
@@ -640,6 +676,30 @@ struct bpf_test_run_opts {
LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
struct bpf_test_run_opts *opts);
struct bpf_token_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 flags;
size_t :0;
};
#define bpf_token_create_opts__last_field flags
/**
* @brief **bpf_token_create()** creates a new instance of BPF token derived
* from specified BPF FS mount point.
*
* BPF token created with this API can be passed to bpf() syscall for
* commands like BPF_PROG_LOAD, BPF_MAP_CREATE, etc.
*
* @param bpffs_fd FD for BPF FS instance from which to derive a BPF token
* instance.
* @param opts optional BPF token creation options, can be NULL
*
* @return BPF token FD > 0, on success; negative error code, otherwise (errno
* is also set to the error code)
*/
LIBBPF_API int bpf_token_create(int bpffs_fd,
struct bpf_token_create_opts *opts);
#ifdef __cplusplus
} /* extern "C" */
#endif

View File

@@ -2,6 +2,8 @@
#ifndef __BPF_CORE_READ_H__
#define __BPF_CORE_READ_H__
#include "bpf_helpers.h"
/*
* enum bpf_field_info_kind is passed as a second argument into
* __builtin_preserve_field_info() built-in to get a specific aspect of
@@ -44,7 +46,7 @@ enum bpf_enum_value_kind {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read_kernel( \
(void *)dst, \
(void *)dst, \
__CORE_RELO(src, fld, BYTE_SIZE), \
(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))
#else
@@ -102,6 +104,7 @@ enum bpf_enum_value_kind {
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
default: val = 0; break; \
} \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
@@ -143,8 +146,29 @@ enum bpf_enum_value_kind {
} \
})
/* Differentiator between compilers builtin implementations. This is a
* requirement due to the compiler parsing differences where GCC optimizes
* early in parsing those constructs of type pointers to the builtin specific
* type, resulting in not being possible to collect the required type
* information in the builtin expansion.
*/
#ifdef __clang__
#define ___bpf_typeof(type) ((typeof(type) *) 0)
#else
#define ___bpf_typeof1(type, NR) ({ \
extern typeof(type) *___concat(bpf_type_tmp_, NR); \
___concat(bpf_type_tmp_, NR); \
})
#define ___bpf_typeof(type) ___bpf_typeof1(type, __COUNTER__)
#endif
#ifdef __clang__
#define ___bpf_field_ref1(field) (field)
#define ___bpf_field_ref2(type, field) (((typeof(type) *)0)->field)
#define ___bpf_field_ref2(type, field) (___bpf_typeof(type)->field)
#else
#define ___bpf_field_ref1(field) (&(field))
#define ___bpf_field_ref2(type, field) (&(___bpf_typeof(type)->field))
#endif
#define ___bpf_field_ref(args...) \
___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args)
@@ -194,7 +218,7 @@ enum bpf_enum_value_kind {
* BTF. Always succeeds.
*/
#define bpf_core_type_id_local(type) \
__builtin_btf_type_id(*(typeof(type) *)0, BPF_TYPE_ID_LOCAL)
__builtin_btf_type_id(*___bpf_typeof(type), BPF_TYPE_ID_LOCAL)
/*
* Convenience macro to get BTF type ID of a target kernel's type that matches
@@ -204,7 +228,7 @@ enum bpf_enum_value_kind {
* - 0, if no matching type was found in a target kernel BTF.
*/
#define bpf_core_type_id_kernel(type) \
__builtin_btf_type_id(*(typeof(type) *)0, BPF_TYPE_ID_TARGET)
__builtin_btf_type_id(*___bpf_typeof(type), BPF_TYPE_ID_TARGET)
/*
* Convenience macro to check that provided named type
@@ -214,7 +238,7 @@ enum bpf_enum_value_kind {
* 0, if no matching type is found.
*/
#define bpf_core_type_exists(type) \
__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_EXISTS)
__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_EXISTS)
/*
* Convenience macro to check that provided named type
@@ -224,7 +248,7 @@ enum bpf_enum_value_kind {
* 0, if the type does not match any in the target kernel
*/
#define bpf_core_type_matches(type) \
__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_MATCHES)
__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_MATCHES)
/*
* Convenience macro to get the byte size of a provided named type
@@ -234,7 +258,7 @@ enum bpf_enum_value_kind {
* 0, if no matching type is found.
*/
#define bpf_core_type_size(type) \
__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_SIZE)
__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_SIZE)
/*
* Convenience macro to check that provided enumerator value is defined in
@@ -244,8 +268,13 @@ enum bpf_enum_value_kind {
* kernel's BTF;
* 0, if no matching enum and/or enum value within that enum is found.
*/
#ifdef __clang__
#define bpf_core_enum_value_exists(enum_type, enum_value) \
__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_EXISTS)
#else
#define bpf_core_enum_value_exists(enum_type, enum_value) \
__builtin_preserve_enum_value(___bpf_typeof(enum_type), enum_value, BPF_ENUMVAL_EXISTS)
#endif
/*
* Convenience macro to get the integer value of an enumerator value in
@@ -255,8 +284,13 @@ enum bpf_enum_value_kind {
* present in target kernel's BTF;
* 0, if no matching enum and/or enum value within that enum is found.
*/
#ifdef __clang__
#define bpf_core_enum_value(enum_type, enum_value) \
__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_VALUE)
#else
#define bpf_core_enum_value(enum_type, enum_value) \
__builtin_preserve_enum_value(___bpf_typeof(enum_type), enum_value, BPF_ENUMVAL_VALUE)
#endif
/*
* bpf_core_read() abstracts away bpf_probe_read_kernel() call and captures
@@ -268,7 +302,7 @@ enum bpf_enum_value_kind {
* a relocation, which records BTF type ID describing root struct/union and an
* accessor string which describes exact embedded field that was used to take
* an address. See detailed description of this relocation format and
* semantics in comments to struct bpf_field_reloc in libbpf_internal.h.
* semantics in comments to struct bpf_core_relo in include/uapi/linux/bpf.h.
*
* This relocation allows libbpf to adjust BPF instruction to use correct
* actual field offset, based on target kernel BTF type that matches original
@@ -292,6 +326,17 @@ enum bpf_enum_value_kind {
#define bpf_core_read_user_str(dst, sz, src) \
bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
extern void *bpf_rdonly_cast(const void *obj, __u32 btf_id) __ksym __weak;
/*
* Cast provided pointer *ptr* into a pointer to a specified *type* in such
* a way that BPF verifier will become aware of associated kernel-side BTF
* type. This allows to access members of kernel types directly without the
* need to use BPF_CORE_READ() macros.
*/
#define bpf_core_cast(ptr, type) \
((typeof(type) *)bpf_rdonly_cast((ptr), bpf_core_type_id_kernel(type)))
#define ___concat(a, b) a ## b
#define ___apply(fn, n) ___concat(fn, n)
#define ___nth(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, __11, N, ...) N

File diff suppressed because it is too large Load Diff

View File

@@ -13,6 +13,7 @@
#define __uint(name, val) int (*name)[val]
#define __type(name, val) typeof(val) *name
#define __array(name, val) typeof(val) *name[]
#define __ulong(name, val) enum { ___bpf_concat(__unique_value, __COUNTER__) = val } name
/*
* Helper macro to place programs, maps, license in
@@ -136,7 +137,8 @@
/*
* Helper function to perform a tail call with a constant/immediate map slot.
*/
#if __clang_major__ >= 8 && defined(__bpf__)
#if (defined(__clang__) && __clang_major__ >= 8) || (!defined(__clang__) && __GNUC__ > 12)
#if defined(__bpf__)
static __always_inline void
bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
{
@@ -164,6 +166,7 @@ bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
: "r0", "r1", "r2", "r3", "r4", "r5");
}
#endif
#endif
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
@@ -183,13 +186,27 @@ enum libbpf_tristate {
#define __kptr __attribute__((btf_type_tag("kptr")))
#define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr")))
#define bpf_ksym_exists(sym) ({ \
_Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \
!!sym; \
#if defined (__clang__)
#define bpf_ksym_exists(sym) ({ \
_Static_assert(!__builtin_constant_p(!!sym), \
#sym " should be marked as __weak"); \
!!sym; \
})
#elif __GNUC__ > 8
#define bpf_ksym_exists(sym) ({ \
_Static_assert(__builtin_has_attribute (*sym, __weak__), \
#sym " should be marked as __weak"); \
!!sym; \
})
#else
#define bpf_ksym_exists(sym) !!sym
#endif
#define __arg_ctx __attribute__((btf_decl_tag("arg:ctx")))
#define __arg_nonnull __attribute((btf_decl_tag("arg:nonnull")))
#define __arg_nullable __attribute((btf_decl_tag("arg:nullable")))
#define __arg_trusted __attribute((btf_decl_tag("arg:trusted")))
#define __arg_arena __attribute((btf_decl_tag("arg:arena")))
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b

View File

@@ -633,18 +633,18 @@ struct pt_regs;
#endif
#define ___bpf_ctx_cast0() ctx
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8]
#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9]
#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10]
#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11]
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), ctx[8]
#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), ctx[9]
#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), ctx[10]
#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), ctx[11]
#define ___bpf_ctx_cast(args...) ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
/*
@@ -786,14 +786,14 @@ ____##name(unsigned long long *ctx ___bpf_ctx_decl(args))
struct pt_regs;
#define ___bpf_kprobe_args0() ctx
#define ___bpf_kprobe_args1(x) ___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args6(x, args...) ___bpf_kprobe_args5(args), (void *)PT_REGS_PARM6(ctx)
#define ___bpf_kprobe_args7(x, args...) ___bpf_kprobe_args6(args), (void *)PT_REGS_PARM7(ctx)
#define ___bpf_kprobe_args8(x, args...) ___bpf_kprobe_args7(args), (void *)PT_REGS_PARM8(ctx)
#define ___bpf_kprobe_args1(x) ___bpf_kprobe_args0(), (unsigned long long)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (unsigned long long)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (unsigned long long)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (unsigned long long)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (unsigned long long)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args6(x, args...) ___bpf_kprobe_args5(args), (unsigned long long)PT_REGS_PARM6(ctx)
#define ___bpf_kprobe_args7(x, args...) ___bpf_kprobe_args6(args), (unsigned long long)PT_REGS_PARM7(ctx)
#define ___bpf_kprobe_args8(x, args...) ___bpf_kprobe_args7(args), (unsigned long long)PT_REGS_PARM8(ctx)
#define ___bpf_kprobe_args(args...) ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)
/*
@@ -821,7 +821,7 @@ static __always_inline typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
#define ___bpf_kretprobe_args0() ctx
#define ___bpf_kretprobe_args1(x) ___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args1(x) ___bpf_kretprobe_args0(), (unsigned long long)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args(args...) ___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)
/*
@@ -845,24 +845,24 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
/* If kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER, read pt_regs directly */
#define ___bpf_syscall_args0() ctx
#define ___bpf_syscall_args1(x) ___bpf_syscall_args0(), (void *)PT_REGS_PARM1_SYSCALL(regs)
#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (void *)PT_REGS_PARM2_SYSCALL(regs)
#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_SYSCALL(regs)
#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_SYSCALL(regs)
#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_SYSCALL(regs)
#define ___bpf_syscall_args6(x, args...) ___bpf_syscall_args5(args), (void *)PT_REGS_PARM6_SYSCALL(regs)
#define ___bpf_syscall_args7(x, args...) ___bpf_syscall_args6(args), (void *)PT_REGS_PARM7_SYSCALL(regs)
#define ___bpf_syscall_args1(x) ___bpf_syscall_args0(), (unsigned long long)PT_REGS_PARM1_SYSCALL(regs)
#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (unsigned long long)PT_REGS_PARM2_SYSCALL(regs)
#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (unsigned long long)PT_REGS_PARM3_SYSCALL(regs)
#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (unsigned long long)PT_REGS_PARM4_SYSCALL(regs)
#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (unsigned long long)PT_REGS_PARM5_SYSCALL(regs)
#define ___bpf_syscall_args6(x, args...) ___bpf_syscall_args5(args), (unsigned long long)PT_REGS_PARM6_SYSCALL(regs)
#define ___bpf_syscall_args7(x, args...) ___bpf_syscall_args6(args), (unsigned long long)PT_REGS_PARM7_SYSCALL(regs)
#define ___bpf_syscall_args(args...) ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args)
/* If kernel doesn't have CONFIG_ARCH_HAS_SYSCALL_WRAPPER, we have to BPF_CORE_READ from pt_regs */
#define ___bpf_syswrap_args0() ctx
#define ___bpf_syswrap_args1(x) ___bpf_syswrap_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args3(x, args...) ___bpf_syswrap_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args4(x, args...) ___bpf_syswrap_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args5(x, args...) ___bpf_syswrap_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args6(x, args...) ___bpf_syswrap_args5(args), (void *)PT_REGS_PARM6_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args7(x, args...) ___bpf_syswrap_args6(args), (void *)PT_REGS_PARM7_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args1(x) ___bpf_syswrap_args0(), (unsigned long long)PT_REGS_PARM1_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args), (unsigned long long)PT_REGS_PARM2_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args3(x, args...) ___bpf_syswrap_args2(args), (unsigned long long)PT_REGS_PARM3_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args4(x, args...) ___bpf_syswrap_args3(args), (unsigned long long)PT_REGS_PARM4_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args5(x, args...) ___bpf_syswrap_args4(args), (unsigned long long)PT_REGS_PARM5_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args6(x, args...) ___bpf_syswrap_args5(args), (unsigned long long)PT_REGS_PARM6_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args7(x, args...) ___bpf_syswrap_args6(args), (unsigned long long)PT_REGS_PARM7_CORE_SYSCALL(regs)
#define ___bpf_syswrap_args(args...) ___bpf_apply(___bpf_syswrap_args, ___bpf_narg(args))(args)
/*

771
src/btf.c
View File

@@ -116,6 +116,9 @@ struct btf {
/* whether strings are already deduplicated */
bool strs_deduped;
/* whether base_btf should be freed in btf_free for this instance */
bool owns_base;
/* BTF object FD, if loaded into kernel */
int fd;
@@ -598,7 +601,7 @@ static int btf_sanity_check(const struct btf *btf)
__u32 i, n = btf__type_cnt(btf);
int err;
for (i = 1; i < n; i++) {
for (i = btf->start_id; i < n; i++) {
t = btf_type_by_id(btf, i);
err = btf_validate_type(btf, t, i);
if (err)
@@ -969,6 +972,8 @@ void btf__free(struct btf *btf)
free(btf->raw_data);
free(btf->raw_data_swapped);
free(btf->type_offs);
if (btf->owns_base)
btf__free(btf->base_btf);
free(btf);
}
@@ -991,6 +996,7 @@ static struct btf *btf_new_empty(struct btf *base_btf)
btf->base_btf = base_btf;
btf->start_id = btf__type_cnt(base_btf);
btf->start_str_off = base_btf->hdr->str_len;
btf->swapped_endian = base_btf->swapped_endian;
}
/* +1 for empty string at offset 0 */
@@ -1079,16 +1085,91 @@ struct btf *btf__new(const void *data, __u32 size)
return libbpf_ptr(btf_new(data, size, NULL));
}
struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf)
{
return libbpf_ptr(btf_new(data, size, base_btf));
}
struct btf_elf_secs {
Elf_Data *btf_data;
Elf_Data *btf_ext_data;
Elf_Data *btf_base_data;
};
static int btf_find_elf_sections(Elf *elf, const char *path, struct btf_elf_secs *secs)
{
Elf_Scn *scn = NULL;
Elf_Data *data;
GElf_Ehdr ehdr;
size_t shstrndx;
int idx = 0;
if (!gelf_getehdr(elf, &ehdr)) {
pr_warn("failed to get EHDR from %s\n", path);
goto err;
}
if (elf_getshdrstrndx(elf, &shstrndx)) {
pr_warn("failed to get section names section index for %s\n",
path);
goto err;
}
if (!elf_rawdata(elf_getscn(elf, shstrndx), NULL)) {
pr_warn("failed to get e_shstrndx from %s\n", path);
goto err;
}
while ((scn = elf_nextscn(elf, scn)) != NULL) {
Elf_Data **field;
GElf_Shdr sh;
char *name;
idx++;
if (gelf_getshdr(scn, &sh) != &sh) {
pr_warn("failed to get section(%d) header from %s\n",
idx, path);
goto err;
}
name = elf_strptr(elf, shstrndx, sh.sh_name);
if (!name) {
pr_warn("failed to get section(%d) name from %s\n",
idx, path);
goto err;
}
if (strcmp(name, BTF_ELF_SEC) == 0)
field = &secs->btf_data;
else if (strcmp(name, BTF_EXT_ELF_SEC) == 0)
field = &secs->btf_ext_data;
else if (strcmp(name, BTF_BASE_ELF_SEC) == 0)
field = &secs->btf_base_data;
else
continue;
data = elf_getdata(scn, 0);
if (!data) {
pr_warn("failed to get section(%d, %s) data from %s\n",
idx, name, path);
goto err;
}
*field = data;
}
return 0;
err:
return -LIBBPF_ERRNO__FORMAT;
}
static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
struct btf_ext **btf_ext)
{
Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
int err = 0, fd = -1, idx = 0;
struct btf_elf_secs secs = {};
struct btf *dist_base_btf = NULL;
struct btf *btf = NULL;
Elf_Scn *scn = NULL;
int err = 0, fd = -1;
Elf *elf = NULL;
GElf_Ehdr ehdr;
size_t shstrndx;
if (elf_version(EV_CURRENT) == EV_NONE) {
pr_warn("failed to init libelf for %s\n", path);
@@ -1102,73 +1183,48 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
return ERR_PTR(err);
}
err = -LIBBPF_ERRNO__FORMAT;
elf = elf_begin(fd, ELF_C_READ, NULL);
if (!elf) {
pr_warn("failed to open %s as ELF file\n", path);
goto done;
}
if (!gelf_getehdr(elf, &ehdr)) {
pr_warn("failed to get EHDR from %s\n", path);
err = btf_find_elf_sections(elf, path, &secs);
if (err)
goto done;
}
if (elf_getshdrstrndx(elf, &shstrndx)) {
pr_warn("failed to get section names section index for %s\n",
path);
goto done;
}
if (!elf_rawdata(elf_getscn(elf, shstrndx), NULL)) {
pr_warn("failed to get e_shstrndx from %s\n", path);
goto done;
}
while ((scn = elf_nextscn(elf, scn)) != NULL) {
GElf_Shdr sh;
char *name;
idx++;
if (gelf_getshdr(scn, &sh) != &sh) {
pr_warn("failed to get section(%d) header from %s\n",
idx, path);
goto done;
}
name = elf_strptr(elf, shstrndx, sh.sh_name);
if (!name) {
pr_warn("failed to get section(%d) name from %s\n",
idx, path);
goto done;
}
if (strcmp(name, BTF_ELF_SEC) == 0) {
btf_data = elf_getdata(scn, 0);
if (!btf_data) {
pr_warn("failed to get section(%d, %s) data from %s\n",
idx, name, path);
goto done;
}
continue;
} else if (btf_ext && strcmp(name, BTF_EXT_ELF_SEC) == 0) {
btf_ext_data = elf_getdata(scn, 0);
if (!btf_ext_data) {
pr_warn("failed to get section(%d, %s) data from %s\n",
idx, name, path);
goto done;
}
continue;
}
}
if (!btf_data) {
if (!secs.btf_data) {
pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
err = -ENODATA;
goto done;
}
btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf);
err = libbpf_get_error(btf);
if (err)
if (secs.btf_base_data) {
dist_base_btf = btf_new(secs.btf_base_data->d_buf, secs.btf_base_data->d_size,
NULL);
if (IS_ERR(dist_base_btf)) {
err = PTR_ERR(dist_base_btf);
dist_base_btf = NULL;
goto done;
}
}
btf = btf_new(secs.btf_data->d_buf, secs.btf_data->d_size,
dist_base_btf ?: base_btf);
if (IS_ERR(btf)) {
err = PTR_ERR(btf);
goto done;
}
if (dist_base_btf && base_btf) {
err = btf__relocate(btf, base_btf);
if (err)
goto done;
btf__free(dist_base_btf);
dist_base_btf = NULL;
}
if (dist_base_btf)
btf->owns_base = true;
switch (gelf_getclass(elf)) {
case ELFCLASS32:
@@ -1182,11 +1238,12 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
break;
}
if (btf_ext && btf_ext_data) {
*btf_ext = btf_ext__new(btf_ext_data->d_buf, btf_ext_data->d_size);
err = libbpf_get_error(*btf_ext);
if (err)
if (btf_ext && secs.btf_ext_data) {
*btf_ext = btf_ext__new(secs.btf_ext_data->d_buf, secs.btf_ext_data->d_size);
if (IS_ERR(*btf_ext)) {
err = PTR_ERR(*btf_ext);
goto done;
}
} else if (btf_ext) {
*btf_ext = NULL;
}
@@ -1200,6 +1257,7 @@ done:
if (btf_ext)
btf_ext__free(*btf_ext);
btf__free(dist_base_btf);
btf__free(btf);
return ERR_PTR(err);
@@ -1317,7 +1375,9 @@ struct btf *btf__parse_split(const char *path, struct btf *base_btf)
static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 log_level)
int btf_load_into_kernel(struct btf *btf,
char *log_buf, size_t log_sz, __u32 log_level,
int token_fd)
{
LIBBPF_OPTS(bpf_btf_load_opts, opts);
__u32 buf_sz = 0, raw_size;
@@ -1367,6 +1427,10 @@ retry_load:
opts.log_level = log_level;
}
opts.token_fd = token_fd;
if (token_fd)
opts.btf_flags |= BPF_F_TOKEN_FD;
btf->fd = bpf_btf_load(raw_data, raw_size, &opts);
if (btf->fd < 0) {
/* time to turn on verbose mode and try again */
@@ -1394,7 +1458,7 @@ done:
int btf__load_into_kernel(struct btf *btf)
{
return btf_load_into_kernel(btf, NULL, 0, 0);
return btf_load_into_kernel(btf, NULL, 0, 0, 0);
}
int btf__fd(const struct btf *btf)
@@ -1728,9 +1792,8 @@ struct btf_pipe {
struct hashmap *str_off_map; /* map string offsets from src to dst */
};
static int btf_rewrite_str(__u32 *str_off, void *ctx)
static int btf_rewrite_str(struct btf_pipe *p, __u32 *str_off)
{
struct btf_pipe *p = ctx;
long mapped_off;
int off, err;
@@ -1760,10 +1823,11 @@ static int btf_rewrite_str(__u32 *str_off, void *ctx)
return 0;
}
int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type)
static int btf_add_type(struct btf_pipe *p, const struct btf_type *src_type)
{
struct btf_pipe p = { .src = src_btf, .dst = btf };
struct btf_field_iter it;
struct btf_type *t;
__u32 *str_off;
int sz, err;
sz = btf_type_size(src_type);
@@ -1771,35 +1835,33 @@ int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_t
return libbpf_err(sz);
/* deconstruct BTF, if necessary, and invalidate raw_data */
if (btf_ensure_modifiable(btf))
if (btf_ensure_modifiable(p->dst))
return libbpf_err(-ENOMEM);
t = btf_add_type_mem(btf, sz);
t = btf_add_type_mem(p->dst, sz);
if (!t)
return libbpf_err(-ENOMEM);
memcpy(t, src_type, sz);
err = btf_type_visit_str_offs(t, btf_rewrite_str, &p);
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);
if (err)
return libbpf_err(err);
return btf_commit_type(btf, sz);
while ((str_off = btf_field_iter_next(&it))) {
err = btf_rewrite_str(p, str_off);
if (err)
return libbpf_err(err);
}
return btf_commit_type(p->dst, sz);
}
static int btf_rewrite_type_ids(__u32 *type_id, void *ctx)
int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type)
{
struct btf *btf = ctx;
struct btf_pipe p = { .src = src_btf, .dst = btf };
if (!*type_id) /* nothing to do for VOID references */
return 0;
/* we haven't updated btf's type count yet, so
* btf->start_id + btf->nr_types - 1 is the type ID offset we should
* add to all newly added BTF types
*/
*type_id += btf->start_id + btf->nr_types - 1;
return 0;
return btf_add_type(&p, src_type);
}
static size_t btf_dedup_identity_hash_fn(long key, void *ctx);
@@ -1847,6 +1909,9 @@ int btf__add_btf(struct btf *btf, const struct btf *src_btf)
memcpy(t, src_btf->types_data, data_sz);
for (i = 0; i < cnt; i++) {
struct btf_field_iter it;
__u32 *type_id, *str_off;
sz = btf_type_size(t);
if (sz < 0) {
/* unlikely, has to be corrupted src_btf */
@@ -1858,14 +1923,30 @@ int btf__add_btf(struct btf *btf, const struct btf *src_btf)
*off = t - btf->types_data;
/* add, dedup, and remap strings referenced by this BTF type */
err = btf_type_visit_str_offs(t, btf_rewrite_str, &p);
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);
if (err)
goto err_out;
while ((str_off = btf_field_iter_next(&it))) {
err = btf_rewrite_str(&p, str_off);
if (err)
goto err_out;
}
/* remap all type IDs referenced from this BTF type */
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (err)
goto err_out;
/* remap all type IDs referenced from this BTF type */
err = btf_type_visit_type_ids(t, btf_rewrite_type_ids, btf);
if (err)
goto err_out;
while ((type_id = btf_field_iter_next(&it))) {
if (!*type_id) /* nothing to do for VOID references */
continue;
/* we haven't updated btf's type count yet, so
* btf->start_id + btf->nr_types - 1 is the type ID offset we should
* add to all newly added BTF types
*/
*type_id += btf->start_id + btf->nr_types - 1;
}
/* go to next type data and type offset index entry */
t += sz;
@@ -3039,12 +3120,16 @@ done:
return btf_ext;
}
const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size)
const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size)
{
*size = btf_ext->data_size;
return btf_ext->data;
}
__attribute__((alias("btf_ext__raw_data")))
const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);
struct btf_dedup;
static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts);
@@ -3438,11 +3523,19 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_visit_fn fn, void *
int i, r;
for (i = 0; i < d->btf->nr_types; i++) {
struct btf_field_iter it;
struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i);
__u32 *str_off;
r = btf_type_visit_str_offs(t, fn, ctx);
r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);
if (r)
return r;
while ((str_off = btf_field_iter_next(&it))) {
r = fn(str_off, ctx);
if (r)
return r;
}
}
if (!d->btf_ext)
@@ -4904,10 +4997,23 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
for (i = 0; i < d->btf->nr_types; i++) {
struct btf_type *t = btf_type_by_id(d->btf, d->btf->start_id + i);
struct btf_field_iter it;
__u32 *type_id;
r = btf_type_visit_type_ids(t, btf_dedup_remap_type_id, d);
r = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (r)
return r;
while ((type_id = btf_field_iter_next(&it))) {
__u32 resolved_id, new_id;
resolved_id = resolve_type_id(d, *type_id);
new_id = d->hypot_map[resolved_id];
if (new_id > BTF_MAX_NR_TYPES)
return -EINVAL;
*type_id = new_id;
}
}
if (!d->btf_ext)
@@ -4926,10 +5032,9 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
*/
struct btf *btf__load_vmlinux_btf(void)
{
const char *sysfs_btf_path = "/sys/kernel/btf/vmlinux";
/* fall back locations, trying to find vmlinux on disk */
const char *locations[] = {
/* try canonical vmlinux BTF through sysfs first */
"/sys/kernel/btf/vmlinux",
/* fall back to trying to find vmlinux on disk otherwise */
"/boot/vmlinux-%1$s",
"/lib/modules/%1$s/vmlinux-%1$s",
"/lib/modules/%1$s/build/vmlinux",
@@ -4943,8 +5048,23 @@ struct btf *btf__load_vmlinux_btf(void)
struct btf *btf;
int i, err;
uname(&buf);
/* is canonical sysfs location accessible? */
if (faccessat(AT_FDCWD, sysfs_btf_path, F_OK, AT_EACCESS) < 0) {
pr_warn("kernel BTF is missing at '%s', was CONFIG_DEBUG_INFO_BTF enabled?\n",
sysfs_btf_path);
} else {
btf = btf__parse(sysfs_btf_path, NULL);
if (!btf) {
err = -errno;
pr_warn("failed to read kernel BTF from '%s': %d\n", sysfs_btf_path, err);
return libbpf_err_ptr(err);
}
pr_debug("loaded kernel BTF from '%s'\n", sysfs_btf_path);
return btf;
}
/* try fallback locations */
uname(&buf);
for (i = 0; i < ARRAY_SIZE(locations); i++) {
snprintf(path, PATH_MAX, locations[i], buf.release);
@@ -4974,136 +5094,6 @@ struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_bt
return btf__parse_split(path, vmlinux_btf);
}
int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx)
{
int i, n, err;
switch (btf_kind(t)) {
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
return 0;
case BTF_KIND_FWD:
case BTF_KIND_CONST:
case BTF_KIND_VOLATILE:
case BTF_KIND_RESTRICT:
case BTF_KIND_PTR:
case BTF_KIND_TYPEDEF:
case BTF_KIND_FUNC:
case BTF_KIND_VAR:
case BTF_KIND_DECL_TAG:
case BTF_KIND_TYPE_TAG:
return visit(&t->type, ctx);
case BTF_KIND_ARRAY: {
struct btf_array *a = btf_array(t);
err = visit(&a->type, ctx);
err = err ?: visit(&a->index_type, ctx);
return err;
}
case BTF_KIND_STRUCT:
case BTF_KIND_UNION: {
struct btf_member *m = btf_members(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->type, ctx);
if (err)
return err;
}
return 0;
}
case BTF_KIND_FUNC_PROTO: {
struct btf_param *m = btf_params(t);
err = visit(&t->type, ctx);
if (err)
return err;
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->type, ctx);
if (err)
return err;
}
return 0;
}
case BTF_KIND_DATASEC: {
struct btf_var_secinfo *m = btf_var_secinfos(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->type, ctx);
if (err)
return err;
}
return 0;
}
default:
return -EINVAL;
}
}
int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx)
{
int i, n, err;
err = visit(&t->name_off, ctx);
if (err)
return err;
switch (btf_kind(t)) {
case BTF_KIND_STRUCT:
case BTF_KIND_UNION: {
struct btf_member *m = btf_members(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->name_off, ctx);
if (err)
return err;
}
break;
}
case BTF_KIND_ENUM: {
struct btf_enum *m = btf_enum(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->name_off, ctx);
if (err)
return err;
}
break;
}
case BTF_KIND_ENUM64: {
struct btf_enum64 *m = btf_enum64(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->name_off, ctx);
if (err)
return err;
}
break;
}
case BTF_KIND_FUNC_PROTO: {
struct btf_param *m = btf_params(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->name_off, ctx);
if (err)
return err;
}
break;
}
default:
break;
}
return 0;
}
int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx)
{
const struct btf_ext_info *seg;
@@ -5183,3 +5173,328 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
return 0;
}
struct btf_distill {
struct btf_pipe pipe;
int *id_map;
unsigned int split_start_id;
unsigned int split_start_str;
int diff_id;
};
static int btf_add_distilled_type_ids(struct btf_distill *dist, __u32 i)
{
struct btf_type *split_t = btf_type_by_id(dist->pipe.src, i);
struct btf_field_iter it;
__u32 *id;
int err;
err = btf_field_iter_init(&it, split_t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((id = btf_field_iter_next(&it))) {
struct btf_type *base_t;
if (!*id)
continue;
/* split BTF id, not needed */
if (*id >= dist->split_start_id)
continue;
/* already added ? */
if (dist->id_map[*id] > 0)
continue;
/* only a subset of base BTF types should be referenced from
* split BTF; ensure nothing unexpected is referenced.
*/
base_t = btf_type_by_id(dist->pipe.src, *id);
switch (btf_kind(base_t)) {
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_FWD:
case BTF_KIND_ARRAY:
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
case BTF_KIND_TYPEDEF:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_PTR:
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
case BTF_KIND_VOLATILE:
case BTF_KIND_FUNC_PROTO:
case BTF_KIND_TYPE_TAG:
dist->id_map[*id] = *id;
break;
default:
pr_warn("unexpected reference to base type[%u] of kind [%u] when creating distilled base BTF.\n",
*id, btf_kind(base_t));
return -EINVAL;
}
/* If a base type is used, ensure types it refers to are
* marked as used also; so for example if we find a PTR to INT
* we need both the PTR and INT.
*
* The only exception is named struct/unions, since distilled
* base BTF composite types have no members.
*/
if (btf_is_composite(base_t) && base_t->name_off)
continue;
err = btf_add_distilled_type_ids(dist, *id);
if (err)
return err;
}
return 0;
}
static int btf_add_distilled_types(struct btf_distill *dist)
{
bool adding_to_base = dist->pipe.dst->start_id == 1;
int id = btf__type_cnt(dist->pipe.dst);
struct btf_type *t;
int i, err = 0;
/* Add types for each of the required references to either distilled
* base or split BTF, depending on type characteristics.
*/
for (i = 1; i < dist->split_start_id; i++) {
const char *name;
int kind;
if (!dist->id_map[i])
continue;
t = btf_type_by_id(dist->pipe.src, i);
kind = btf_kind(t);
name = btf__name_by_offset(dist->pipe.src, t->name_off);
switch (kind) {
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_FWD:
/* Named int, float, fwd are added to base. */
if (!adding_to_base)
continue;
err = btf_add_type(&dist->pipe, t);
break;
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
/* Named struct/union are added to base as 0-vlen
* struct/union of same size. Anonymous struct/unions
* are added to split BTF as-is.
*/
if (adding_to_base) {
if (!t->name_off)
continue;
err = btf_add_composite(dist->pipe.dst, kind, name, t->size);
} else {
if (t->name_off)
continue;
err = btf_add_type(&dist->pipe, t);
}
break;
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
/* Named enum[64]s are added to base as a sized
* enum; relocation will match with appropriately-named
* and sized enum or enum64.
*
* Anonymous enums are added to split BTF as-is.
*/
if (adding_to_base) {
if (!t->name_off)
continue;
err = btf__add_enum(dist->pipe.dst, name, t->size);
} else {
if (t->name_off)
continue;
err = btf_add_type(&dist->pipe, t);
}
break;
case BTF_KIND_ARRAY:
case BTF_KIND_TYPEDEF:
case BTF_KIND_PTR:
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
case BTF_KIND_VOLATILE:
case BTF_KIND_FUNC_PROTO:
case BTF_KIND_TYPE_TAG:
/* All other types are added to split BTF. */
if (adding_to_base)
continue;
err = btf_add_type(&dist->pipe, t);
break;
default:
pr_warn("unexpected kind when adding base type '%s'[%u] of kind [%u] to distilled base BTF.\n",
name, i, kind);
return -EINVAL;
}
if (err < 0)
break;
dist->id_map[i] = id++;
}
return err;
}
/* Split BTF ids without a mapping will be shifted downwards since distilled
* base BTF is smaller than the original base BTF. For those that have a
* mapping (either to base or updated split BTF), update the id based on
* that mapping.
*/
static int btf_update_distilled_type_ids(struct btf_distill *dist, __u32 i)
{
struct btf_type *t = btf_type_by_id(dist->pipe.dst, i);
struct btf_field_iter it;
__u32 *id;
int err;
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((id = btf_field_iter_next(&it))) {
if (dist->id_map[*id])
*id = dist->id_map[*id];
else if (*id >= dist->split_start_id)
*id -= dist->diff_id;
}
return 0;
}
/* Create updated split BTF with distilled base BTF; distilled base BTF
* consists of BTF information required to clarify the types that split
* BTF refers to, omitting unneeded details. Specifically it will contain
* base types and memberless definitions of named structs, unions and enumerated
* types. Associated reference types like pointers, arrays and anonymous
* structs, unions and enumerated types will be added to split BTF.
* Size is recorded for named struct/unions to help guide matching to the
* target base BTF during later relocation.
*
* The only case where structs, unions or enumerated types are fully represented
* is when they are anonymous; in such cases, the anonymous type is added to
* split BTF in full.
*
* We return newly-created split BTF where the split BTF refers to a newly-created
* distilled base BTF. Both must be freed separately by the caller.
*/
int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
struct btf **new_split_btf)
{
struct btf *new_base = NULL, *new_split = NULL;
const struct btf *old_base;
unsigned int n = btf__type_cnt(src_btf);
struct btf_distill dist = {};
struct btf_type *t;
int i, err = 0;
/* src BTF must be split BTF. */
old_base = btf__base_btf(src_btf);
if (!new_base_btf || !new_split_btf || !old_base)
return libbpf_err(-EINVAL);
new_base = btf__new_empty();
if (!new_base)
return libbpf_err(-ENOMEM);
btf__set_endianness(new_base, btf__endianness(src_btf));
dist.id_map = calloc(n, sizeof(*dist.id_map));
if (!dist.id_map) {
err = -ENOMEM;
goto done;
}
dist.pipe.src = src_btf;
dist.pipe.dst = new_base;
dist.pipe.str_off_map = hashmap__new(btf_dedup_identity_hash_fn, btf_dedup_equal_fn, NULL);
if (IS_ERR(dist.pipe.str_off_map)) {
err = -ENOMEM;
goto done;
}
dist.split_start_id = btf__type_cnt(old_base);
dist.split_start_str = old_base->hdr->str_len;
/* Pass over src split BTF; generate the list of base BTF type ids it
* references; these will constitute our distilled BTF set to be
* distributed over base and split BTF as appropriate.
*/
for (i = src_btf->start_id; i < n; i++) {
err = btf_add_distilled_type_ids(&dist, i);
if (err < 0)
goto done;
}
/* Next add types for each of the required references to base BTF and split BTF
* in turn.
*/
err = btf_add_distilled_types(&dist);
if (err < 0)
goto done;
/* Create new split BTF with distilled base BTF as its base; the final
* state is split BTF with distilled base BTF that represents enough
* about its base references to allow it to be relocated with the base
* BTF available.
*/
new_split = btf__new_empty_split(new_base);
if (!new_split) {
err = -errno;
goto done;
}
dist.pipe.dst = new_split;
/* First add all split types */
for (i = src_btf->start_id; i < n; i++) {
t = btf_type_by_id(src_btf, i);
err = btf_add_type(&dist.pipe, t);
if (err < 0)
goto done;
}
/* Now add distilled types to split BTF that are not added to base. */
err = btf_add_distilled_types(&dist);
if (err < 0)
goto done;
/* All split BTF ids will be shifted downwards since there are less base
* BTF ids in distilled base BTF.
*/
dist.diff_id = dist.split_start_id - btf__type_cnt(new_base);
n = btf__type_cnt(new_split);
/* Now update base/split BTF ids. */
for (i = 1; i < n; i++) {
err = btf_update_distilled_type_ids(&dist, i);
if (err < 0)
break;
}
done:
free(dist.id_map);
hashmap__free(dist.pipe.str_off_map);
if (err) {
btf__free(new_split);
btf__free(new_base);
return libbpf_err(err);
}
*new_base_btf = new_base;
*new_split_btf = new_split;
return 0;
}
const struct btf_header *btf_header(const struct btf *btf)
{
return btf->hdr;
}
void btf_set_base_btf(struct btf *btf, const struct btf *base_btf)
{
btf->base_btf = (struct btf *)base_btf;
btf->start_id = btf__type_cnt(base_btf);
btf->start_str_off = base_btf->hdr->str_len;
}
int btf__relocate(struct btf *btf, const struct btf *base_btf)
{
int err = btf_relocate(btf, base_btf, NULL);
if (!err)
btf->owns_base = false;
return libbpf_err(err);
}

View File

@@ -18,6 +18,7 @@ extern "C" {
#define BTF_ELF_SEC ".BTF"
#define BTF_EXT_ELF_SEC ".BTF.ext"
#define BTF_BASE_ELF_SEC ".BTF.base"
#define MAPS_ELF_SEC ".maps"
struct btf;
@@ -107,6 +108,27 @@ LIBBPF_API struct btf *btf__new_empty(void);
*/
LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
/**
* @brief **btf__distill_base()** creates new versions of the split BTF
* *src_btf* and its base BTF. The new base BTF will only contain the types
* needed to improve robustness of the split BTF to small changes in base BTF.
* When that split BTF is loaded against a (possibly changed) base, this
* distilled base BTF will help update references to that (possibly changed)
* base BTF.
*
* Both the new split and its associated new base BTF must be freed by
* the caller.
*
* If successful, 0 is returned and **new_base_btf** and **new_split_btf**
* will point at new base/split BTF. Both the new split and its associated
* new base BTF must be freed by the caller.
*
* A negative value is returned on error and the thread-local `errno` variable
* is set to the error code as well.
*/
LIBBPF_API int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
struct btf **new_split_btf);
LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
@@ -231,6 +253,20 @@ struct btf_dedup_opts {
LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
/**
* @brief **btf__relocate()** will check the split BTF *btf* for references
* to base BTF kinds, and verify those references are compatible with
* *base_btf*; if they are, *btf* is adjusted such that is re-parented to
* *base_btf* and type ids and strings are adjusted to accommodate this.
*
* If successful, 0 is returned and **btf** now has **base_btf** as its
* base.
*
* A negative value is returned on error and the thread-local `errno` variable
* is set to the error code as well.
*/
LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);
struct btf_dump;
struct btf_dump_opts {

View File

@@ -1559,10 +1559,12 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
* Clang for BPF target generates func_proto with no
* args as a func_proto with a single void arg (e.g.,
* `int (*f)(void)` vs just `int (*f)()`). We are
* going to pretend there are no args for such case.
* going to emit valid empty args (void) syntax for
* such case. Similarly and conveniently, valid
* no args case can be special-cased here as well.
*/
if (vlen == 1 && p->type == 0) {
btf_dump_printf(d, ")");
if (vlen == 0 || (vlen == 1 && p->type == 0)) {
btf_dump_printf(d, "void)");
return;
}
@@ -1929,6 +1931,7 @@ static int btf_dump_int_data(struct btf_dump *d,
if (d->typed_dump->is_array_terminated)
break;
if (*(char *)data == '\0') {
btf_dump_type_values(d, "'\\0'");
d->typed_dump->is_array_terminated = true;
break;
}
@@ -2031,6 +2034,7 @@ static int btf_dump_array_data(struct btf_dump *d,
__u32 i, elem_type_id;
__s64 elem_size;
bool is_array_member;
bool is_array_terminated;
elem_type_id = array->type;
elem_type = skip_mods_and_typedefs(d->btf, elem_type_id, NULL);
@@ -2066,12 +2070,15 @@ static int btf_dump_array_data(struct btf_dump *d,
*/
is_array_member = d->typed_dump->is_array_member;
d->typed_dump->is_array_member = true;
is_array_terminated = d->typed_dump->is_array_terminated;
d->typed_dump->is_array_terminated = false;
for (i = 0; i < array->nelems; i++, data += elem_size) {
if (d->typed_dump->is_array_terminated)
break;
btf_dump_dump_type_data(d, NULL, elem_type, elem_type_id, data, 0, 0);
}
d->typed_dump->is_array_member = is_array_member;
d->typed_dump->is_array_terminated = is_array_terminated;
d->typed_dump->depth--;
btf_dump_data_pfx(d);
btf_dump_type_values(d, "]");

177
src/btf_iter.c Normal file
View File

@@ -0,0 +1,177 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2021 Facebook */
/* Copyright (c) 2024, Oracle and/or its affiliates. */
#ifdef __KERNEL__
#include <linux/bpf.h>
#include <linux/btf.h>
#define btf_var_secinfos(t) (struct btf_var_secinfo *)btf_type_var_secinfo(t)
#else
#include "btf.h"
#include "libbpf_internal.h"
#endif
int btf_field_iter_init(struct btf_field_iter *it, struct btf_type *t,
enum btf_field_iter_kind iter_kind)
{
it->p = NULL;
it->m_idx = -1;
it->off_idx = 0;
it->vlen = 0;
switch (iter_kind) {
case BTF_FIELD_ITER_IDS:
switch (btf_kind(t)) {
case BTF_KIND_UNKN:
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
it->desc = (struct btf_field_desc) {};
break;
case BTF_KIND_FWD:
case BTF_KIND_CONST:
case BTF_KIND_VOLATILE:
case BTF_KIND_RESTRICT:
case BTF_KIND_PTR:
case BTF_KIND_TYPEDEF:
case BTF_KIND_FUNC:
case BTF_KIND_VAR:
case BTF_KIND_DECL_TAG:
case BTF_KIND_TYPE_TAG:
it->desc = (struct btf_field_desc) { 1, {offsetof(struct btf_type, type)} };
break;
case BTF_KIND_ARRAY:
it->desc = (struct btf_field_desc) {
2, {sizeof(struct btf_type) + offsetof(struct btf_array, type),
sizeof(struct btf_type) + offsetof(struct btf_array, index_type)}
};
break;
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
it->desc = (struct btf_field_desc) {
0, {},
sizeof(struct btf_member),
1, {offsetof(struct btf_member, type)}
};
break;
case BTF_KIND_FUNC_PROTO:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, type)},
sizeof(struct btf_param),
1, {offsetof(struct btf_param, type)}
};
break;
case BTF_KIND_DATASEC:
it->desc = (struct btf_field_desc) {
0, {},
sizeof(struct btf_var_secinfo),
1, {offsetof(struct btf_var_secinfo, type)}
};
break;
default:
return -EINVAL;
}
break;
case BTF_FIELD_ITER_STRS:
switch (btf_kind(t)) {
case BTF_KIND_UNKN:
it->desc = (struct btf_field_desc) {};
break;
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_FWD:
case BTF_KIND_ARRAY:
case BTF_KIND_CONST:
case BTF_KIND_VOLATILE:
case BTF_KIND_RESTRICT:
case BTF_KIND_PTR:
case BTF_KIND_TYPEDEF:
case BTF_KIND_FUNC:
case BTF_KIND_VAR:
case BTF_KIND_DECL_TAG:
case BTF_KIND_TYPE_TAG:
case BTF_KIND_DATASEC:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, name_off)}
};
break;
case BTF_KIND_ENUM:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, name_off)},
sizeof(struct btf_enum),
1, {offsetof(struct btf_enum, name_off)}
};
break;
case BTF_KIND_ENUM64:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, name_off)},
sizeof(struct btf_enum64),
1, {offsetof(struct btf_enum64, name_off)}
};
break;
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, name_off)},
sizeof(struct btf_member),
1, {offsetof(struct btf_member, name_off)}
};
break;
case BTF_KIND_FUNC_PROTO:
it->desc = (struct btf_field_desc) {
1, {offsetof(struct btf_type, name_off)},
sizeof(struct btf_param),
1, {offsetof(struct btf_param, name_off)}
};
break;
default:
return -EINVAL;
}
break;
default:
return -EINVAL;
}
if (it->desc.m_sz)
it->vlen = btf_vlen(t);
it->p = t;
return 0;
}
__u32 *btf_field_iter_next(struct btf_field_iter *it)
{
if (!it->p)
return NULL;
if (it->m_idx < 0) {
if (it->off_idx < it->desc.t_off_cnt)
return it->p + it->desc.t_offs[it->off_idx++];
/* move to per-member iteration */
it->m_idx = 0;
it->p += sizeof(struct btf_type);
it->off_idx = 0;
}
/* if type doesn't have members, stop */
if (it->desc.m_sz == 0) {
it->p = NULL;
return NULL;
}
if (it->off_idx >= it->desc.m_off_cnt) {
/* exhausted this member's fields, go to the next member */
it->m_idx++;
it->p += it->desc.m_sz;
it->off_idx = 0;
}
if (it->m_idx < it->vlen)
return it->p + it->desc.m_offs[it->off_idx++];
it->p = NULL;
return NULL;
}

519
src/btf_relocate.c Normal file
View File

@@ -0,0 +1,519 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2024, Oracle and/or its affiliates. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifdef __KERNEL__
#include <linux/bpf.h>
#include <linux/bsearch.h>
#include <linux/btf.h>
#include <linux/sort.h>
#include <linux/string.h>
#include <linux/bpf_verifier.h>
#define btf_type_by_id (struct btf_type *)btf_type_by_id
#define btf__type_cnt btf_nr_types
#define btf__base_btf btf_base_btf
#define btf__name_by_offset btf_name_by_offset
#define btf__str_by_offset btf_str_by_offset
#define btf_kflag btf_type_kflag
#define calloc(nmemb, sz) kvcalloc(nmemb, sz, GFP_KERNEL | __GFP_NOWARN)
#define free(ptr) kvfree(ptr)
#define qsort(base, num, sz, cmp) sort(base, num, sz, cmp, NULL)
#else
#include "btf.h"
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#endif /* __KERNEL__ */
struct btf;
struct btf_relocate {
struct btf *btf;
const struct btf *base_btf;
const struct btf *dist_base_btf;
unsigned int nr_base_types;
unsigned int nr_split_types;
unsigned int nr_dist_base_types;
int dist_str_len;
int base_str_len;
__u32 *id_map;
__u32 *str_map;
};
/* Set temporarily in relocation id_map if distilled base struct/union is
* embedded in a split BTF struct/union; in such a case, size information must
* match between distilled base BTF and base BTF representation of type.
*/
#define BTF_IS_EMBEDDED ((__u32)-1)
/* <name, size, id> triple used in sorting/searching distilled base BTF. */
struct btf_name_info {
const char *name;
/* set when search requires a size match */
bool needs_size: 1;
unsigned int size: 31;
__u32 id;
};
static int btf_relocate_rewrite_type_id(struct btf_relocate *r, __u32 i)
{
struct btf_type *t = btf_type_by_id(r->btf, i);
struct btf_field_iter it;
__u32 *id;
int err;
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((id = btf_field_iter_next(&it)))
*id = r->id_map[*id];
return 0;
}
/* Simple string comparison used for sorting within BTF, since all distilled
* types are named. If strings match, and size is non-zero for both elements
* fall back to using size for ordering.
*/
static int cmp_btf_name_size(const void *n1, const void *n2)
{
const struct btf_name_info *ni1 = n1;
const struct btf_name_info *ni2 = n2;
int name_diff = strcmp(ni1->name, ni2->name);
if (!name_diff && ni1->needs_size && ni2->needs_size)
return ni2->size - ni1->size;
return name_diff;
}
/* Binary search with a small twist; find leftmost element that matches
* so that we can then iterate through all exact matches. So for example
* searching { "a", "bb", "bb", "c" } we would always match on the
* leftmost "bb".
*/
static struct btf_name_info *search_btf_name_size(struct btf_name_info *key,
struct btf_name_info *vals,
int nelems)
{
struct btf_name_info *ret = NULL;
int high = nelems - 1;
int low = 0;
while (low <= high) {
int mid = (low + high)/2;
struct btf_name_info *val = &vals[mid];
int diff = cmp_btf_name_size(key, val);
if (diff == 0)
ret = val;
/* even if found, keep searching for leftmost match */
if (diff <= 0)
high = mid - 1;
else
low = mid + 1;
}
return ret;
}
/* If a member of a split BTF struct/union refers to a base BTF
* struct/union, mark that struct/union id temporarily in the id_map
* with BTF_IS_EMBEDDED. Members can be const/restrict/volatile/typedef
* reference types, but if a pointer is encountered, the type is no longer
* considered embedded.
*/
static int btf_mark_embedded_composite_type_ids(struct btf_relocate *r, __u32 i)
{
struct btf_type *t = btf_type_by_id(r->btf, i);
struct btf_field_iter it;
__u32 *id;
int err;
if (!btf_is_composite(t))
return 0;
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((id = btf_field_iter_next(&it))) {
__u32 next_id = *id;
while (next_id) {
t = btf_type_by_id(r->btf, next_id);
switch (btf_kind(t)) {
case BTF_KIND_CONST:
case BTF_KIND_RESTRICT:
case BTF_KIND_VOLATILE:
case BTF_KIND_TYPEDEF:
case BTF_KIND_TYPE_TAG:
next_id = t->type;
break;
case BTF_KIND_ARRAY: {
struct btf_array *a = btf_array(t);
next_id = a->type;
break;
}
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
if (next_id < r->nr_dist_base_types)
r->id_map[next_id] = BTF_IS_EMBEDDED;
next_id = 0;
break;
default:
next_id = 0;
break;
}
}
}
return 0;
}
/* Build a map from distilled base BTF ids to base BTF ids. To do so, iterate
* through base BTF looking up distilled type (using binary search) equivalents.
*/
static int btf_relocate_map_distilled_base(struct btf_relocate *r)
{
struct btf_name_info *info, *info_end;
struct btf_type *base_t, *dist_t;
__u8 *base_name_cnt = NULL;
int err = 0;
__u32 id;
/* generate a sort index array of name/type ids sorted by name for
* distilled base BTF to speed name-based lookups.
*/
info = calloc(r->nr_dist_base_types, sizeof(*info));
if (!info) {
err = -ENOMEM;
goto done;
}
info_end = info + r->nr_dist_base_types;
for (id = 0; id < r->nr_dist_base_types; id++) {
dist_t = btf_type_by_id(r->dist_base_btf, id);
info[id].name = btf__name_by_offset(r->dist_base_btf, dist_t->name_off);
info[id].id = id;
info[id].size = dist_t->size;
info[id].needs_size = true;
}
qsort(info, r->nr_dist_base_types, sizeof(*info), cmp_btf_name_size);
/* Mark distilled base struct/union members of split BTF structs/unions
* in id_map with BTF_IS_EMBEDDED; this signals that these types
* need to match both name and size, otherwise embedding the base
* struct/union in the split type is invalid.
*/
for (id = r->nr_dist_base_types; id < r->nr_split_types; id++) {
err = btf_mark_embedded_composite_type_ids(r, id);
if (err)
goto done;
}
/* Collect name counts for composite types in base BTF. If multiple
* instances of a struct/union of the same name exist, we need to use
* size to determine which to map to since name alone is ambiguous.
*/
base_name_cnt = calloc(r->base_str_len, sizeof(*base_name_cnt));
if (!base_name_cnt) {
err = -ENOMEM;
goto done;
}
for (id = 1; id < r->nr_base_types; id++) {
base_t = btf_type_by_id(r->base_btf, id);
if (!btf_is_composite(base_t) || !base_t->name_off)
continue;
if (base_name_cnt[base_t->name_off] < 255)
base_name_cnt[base_t->name_off]++;
}
/* Now search base BTF for matching distilled base BTF types. */
for (id = 1; id < r->nr_base_types; id++) {
struct btf_name_info *dist_info, base_info = {};
int dist_kind, base_kind;
base_t = btf_type_by_id(r->base_btf, id);
/* distilled base consists of named types only. */
if (!base_t->name_off)
continue;
base_kind = btf_kind(base_t);
base_info.id = id;
base_info.name = btf__name_by_offset(r->base_btf, base_t->name_off);
switch (base_kind) {
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
/* These types should match both name and size */
base_info.needs_size = true;
base_info.size = base_t->size;
break;
case BTF_KIND_FWD:
/* No size considerations for fwds. */
break;
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
/* Size only needs to be used for struct/union if there
* are multiple types in base BTF with the same name.
* If there are multiple _distilled_ types with the same
* name (a very unlikely scenario), that doesn't matter
* unless corresponding _base_ types to match them are
* missing.
*/
base_info.needs_size = base_name_cnt[base_t->name_off] > 1;
base_info.size = base_t->size;
break;
default:
continue;
}
/* iterate over all matching distilled base types */
for (dist_info = search_btf_name_size(&base_info, info, r->nr_dist_base_types);
dist_info != NULL && dist_info < info_end &&
cmp_btf_name_size(&base_info, dist_info) == 0;
dist_info++) {
if (!dist_info->id || dist_info->id >= r->nr_dist_base_types) {
pr_warn("base BTF id [%d] maps to invalid distilled base BTF id [%d]\n",
id, dist_info->id);
err = -EINVAL;
goto done;
}
dist_t = btf_type_by_id(r->dist_base_btf, dist_info->id);
dist_kind = btf_kind(dist_t);
/* Validate that the found distilled type is compatible.
* Do not error out on mismatch as another match may
* occur for an identically-named type.
*/
switch (dist_kind) {
case BTF_KIND_FWD:
switch (base_kind) {
case BTF_KIND_FWD:
if (btf_kflag(dist_t) != btf_kflag(base_t))
continue;
break;
case BTF_KIND_STRUCT:
if (btf_kflag(base_t))
continue;
break;
case BTF_KIND_UNION:
if (!btf_kflag(base_t))
continue;
break;
default:
continue;
}
break;
case BTF_KIND_INT:
if (dist_kind != base_kind ||
btf_int_encoding(base_t) != btf_int_encoding(dist_t))
continue;
break;
case BTF_KIND_FLOAT:
if (dist_kind != base_kind)
continue;
break;
case BTF_KIND_ENUM:
/* ENUM and ENUM64 are encoded as sized ENUM in
* distilled base BTF.
*/
if (base_kind != dist_kind && base_kind != BTF_KIND_ENUM64)
continue;
break;
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
/* size verification is required for embedded
* struct/unions.
*/
if (r->id_map[dist_info->id] == BTF_IS_EMBEDDED &&
base_t->size != dist_t->size)
continue;
break;
default:
continue;
}
if (r->id_map[dist_info->id] &&
r->id_map[dist_info->id] != BTF_IS_EMBEDDED) {
/* we already have a match; this tells us that
* multiple base types of the same name
* have the same size, since for cases where
* multiple types have the same name we match
* on name and size. In this case, we have
* no way of determining which to relocate
* to in base BTF, so error out.
*/
pr_warn("distilled base BTF type '%s' [%u], size %u has multiple candidates of the same size (ids [%u, %u]) in base BTF\n",
base_info.name, dist_info->id,
base_t->size, id, r->id_map[dist_info->id]);
err = -EINVAL;
goto done;
}
/* map id and name */
r->id_map[dist_info->id] = id;
r->str_map[dist_t->name_off] = base_t->name_off;
}
}
/* ensure all distilled BTF ids now have a mapping... */
for (id = 1; id < r->nr_dist_base_types; id++) {
const char *name;
if (r->id_map[id] && r->id_map[id] != BTF_IS_EMBEDDED)
continue;
dist_t = btf_type_by_id(r->dist_base_btf, id);
name = btf__name_by_offset(r->dist_base_btf, dist_t->name_off);
pr_warn("distilled base BTF type '%s' [%d] is not mapped to base BTF id\n",
name, id);
err = -EINVAL;
break;
}
done:
free(base_name_cnt);
free(info);
return err;
}
/* distilled base should only have named int/float/enum/fwd/struct/union types. */
static int btf_relocate_validate_distilled_base(struct btf_relocate *r)
{
unsigned int i;
for (i = 1; i < r->nr_dist_base_types; i++) {
struct btf_type *t = btf_type_by_id(r->dist_base_btf, i);
int kind = btf_kind(t);
switch (kind) {
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
case BTF_KIND_FWD:
if (t->name_off)
break;
pr_warn("type [%d], kind [%d] is invalid for distilled base BTF; it is anonymous\n",
i, kind);
return -EINVAL;
default:
pr_warn("type [%d] in distilled based BTF has unexpected kind [%d]\n",
i, kind);
return -EINVAL;
}
}
return 0;
}
static int btf_relocate_rewrite_strs(struct btf_relocate *r, __u32 i)
{
struct btf_type *t = btf_type_by_id(r->btf, i);
struct btf_field_iter it;
__u32 *str_off;
int off, err;
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);
if (err)
return err;
while ((str_off = btf_field_iter_next(&it))) {
if (!*str_off)
continue;
if (*str_off >= r->dist_str_len) {
*str_off += r->base_str_len - r->dist_str_len;
} else {
off = r->str_map[*str_off];
if (!off) {
pr_warn("string '%s' [offset %u] is not mapped to base BTF",
btf__str_by_offset(r->btf, off), *str_off);
return -ENOENT;
}
*str_off = off;
}
}
return 0;
}
/* If successful, output of relocation is updated BTF with base BTF pointing
* at base_btf, and type ids, strings adjusted accordingly.
*/
int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **id_map)
{
unsigned int nr_types = btf__type_cnt(btf);
const struct btf_header *dist_base_hdr;
const struct btf_header *base_hdr;
struct btf_relocate r = {};
int err = 0;
__u32 id, i;
r.dist_base_btf = btf__base_btf(btf);
if (!base_btf || r.dist_base_btf == base_btf)
return -EINVAL;
r.nr_dist_base_types = btf__type_cnt(r.dist_base_btf);
r.nr_base_types = btf__type_cnt(base_btf);
r.nr_split_types = nr_types - r.nr_dist_base_types;
r.btf = btf;
r.base_btf = base_btf;
r.id_map = calloc(nr_types, sizeof(*r.id_map));
r.str_map = calloc(btf_header(r.dist_base_btf)->str_len, sizeof(*r.str_map));
dist_base_hdr = btf_header(r.dist_base_btf);
base_hdr = btf_header(r.base_btf);
r.dist_str_len = dist_base_hdr->str_len;
r.base_str_len = base_hdr->str_len;
if (!r.id_map || !r.str_map) {
err = -ENOMEM;
goto err_out;
}
err = btf_relocate_validate_distilled_base(&r);
if (err)
goto err_out;
/* Split BTF ids need to be adjusted as base and distilled base
* have different numbers of types, changing the start id of split
* BTF.
*/
for (id = r.nr_dist_base_types; id < nr_types; id++)
r.id_map[id] = id + r.nr_base_types - r.nr_dist_base_types;
/* Build a map from distilled base ids to actual base BTF ids; it is used
* to update split BTF id references. Also build a str_map mapping from
* distilled base BTF names to base BTF names.
*/
err = btf_relocate_map_distilled_base(&r);
if (err)
goto err_out;
/* Next, rewrite type ids in split BTF, replacing split ids with updated
* ids based on number of types in base BTF, and base ids with
* relocated ids from base_btf.
*/
for (i = 0, id = r.nr_dist_base_types; i < r.nr_split_types; i++, id++) {
err = btf_relocate_rewrite_type_id(&r, id);
if (err)
goto err_out;
}
/* String offsets now need to be updated using the str_map. */
for (i = 0; i < r.nr_split_types; i++) {
err = btf_relocate_rewrite_strs(&r, i + r.nr_dist_base_types);
if (err)
goto err_out;
}
/* Finally reset base BTF to be base_btf */
btf_set_base_btf(btf, base_btf);
if (id_map) {
*id_map = r.id_map;
r.id_map = NULL;
}
err_out:
free(r.id_map);
free(r.str_map);
return err;
}

View File

@@ -11,8 +11,6 @@
#include "libbpf_internal.h"
#include "str_error.h"
#define STRERR_BUFSIZE 128
/* A SHT_GNU_versym section holds 16-bit words. This bit is set if
* the symbol is hidden and can only be seen when referenced using an
* explicit version number. This is a GNU extension.
@@ -30,6 +28,9 @@ int elf_open(const char *binary_path, struct elf_fd *elf_fd)
int fd, ret;
Elf *elf;
elf_fd->elf = NULL;
elf_fd->fd = -1;
if (elf_version(EV_CURRENT) == EV_NONE) {
pr_warn("elf: failed to init libelf for %s\n", binary_path);
return -LIBBPF_ERRNO__LIBELF;

613
src/features.c Normal file
View File

@@ -0,0 +1,613 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
#include <linux/kernel.h>
#include <linux/filter.h>
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_common.h"
#include "libbpf_internal.h"
#include "str_error.h"
static inline __u64 ptr_to_u64(const void *ptr)
{
return (__u64)(unsigned long)ptr;
}
int probe_fd(int fd)
{
if (fd >= 0)
close(fd);
return fd >= 0;
}
static int probe_kern_prog_name(int token_fd)
{
const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
union bpf_attr attr;
int ret;
memset(&attr, 0, attr_sz);
attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
attr.license = ptr_to_u64("GPL");
attr.insns = ptr_to_u64(insns);
attr.insn_cnt = (__u32)ARRAY_SIZE(insns);
attr.prog_token_fd = token_fd;
if (token_fd)
attr.prog_flags |= BPF_F_TOKEN_FD;
libbpf_strlcpy(attr.prog_name, "libbpf_nametest", sizeof(attr.prog_name));
/* make sure loading with name works */
ret = sys_bpf_prog_load(&attr, attr_sz, PROG_LOAD_ATTEMPTS);
return probe_fd(ret);
}
static int probe_kern_global_data(int token_fd)
{
char *cp, errmsg[STRERR_BUFSIZE];
struct bpf_insn insns[] = {
BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),
BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
LIBBPF_OPTS(bpf_map_create_opts, map_opts,
.token_fd = token_fd,
.map_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
LIBBPF_OPTS(bpf_prog_load_opts, prog_opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int ret, map, insn_cnt = ARRAY_SIZE(insns);
map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_global", sizeof(int), 32, 1, &map_opts);
if (map < 0) {
ret = -errno;
cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
__func__, cp, -ret);
return ret;
}
insns[0].imm = map;
ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &prog_opts);
close(map);
return probe_fd(ret);
}
static int probe_kern_btf(int token_fd)
{
static const char strs[] = "\0int";
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_func(int token_fd)
{
static const char strs[] = "\0int\0x\0a";
/* void x(int a) {} */
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* FUNC_PROTO */ /* [2] */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 0),
BTF_PARAM_ENC(7, 1),
/* FUNC x */ /* [3] */
BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 0), 2),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_func_global(int token_fd)
{
static const char strs[] = "\0int\0x\0a";
/* static void x(int a) {} */
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* FUNC_PROTO */ /* [2] */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 0),
BTF_PARAM_ENC(7, 1),
/* FUNC x BTF_FUNC_GLOBAL */ /* [3] */
BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 2),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_datasec(int token_fd)
{
static const char strs[] = "\0x\0.data";
/* static int a; */
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* VAR x */ /* [2] */
BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),
BTF_VAR_STATIC,
/* DATASEC val */ /* [3] */
BTF_TYPE_ENC(3, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4),
BTF_VAR_SECINFO_ENC(2, 0, 4),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_qmark_datasec(int token_fd)
{
static const char strs[] = "\0x\0?.data";
/* static int a; */
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* VAR x */ /* [2] */
BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),
BTF_VAR_STATIC,
/* DATASEC ?.data */ /* [3] */
BTF_TYPE_ENC(3, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4),
BTF_VAR_SECINFO_ENC(2, 0, 4),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_float(int token_fd)
{
static const char strs[] = "\0float";
__u32 types[] = {
/* float */
BTF_TYPE_FLOAT_ENC(1, 4),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_decl_tag(int token_fd)
{
static const char strs[] = "\0tag";
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* VAR x */ /* [2] */
BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),
BTF_VAR_STATIC,
/* attr */
BTF_TYPE_DECL_TAG_ENC(1, 2, -1),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_btf_type_tag(int token_fd)
{
static const char strs[] = "\0tag";
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4), /* [1] */
/* attr */
BTF_TYPE_TYPE_TAG_ENC(1, 1), /* [2] */
/* ptr */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2), /* [3] */
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_array_mmap(int token_fd)
{
LIBBPF_OPTS(bpf_map_create_opts, opts,
.map_flags = BPF_F_MMAPABLE | (token_fd ? BPF_F_TOKEN_FD : 0),
.token_fd = token_fd,
);
int fd;
fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_mmap", sizeof(int), sizeof(int), 1, &opts);
return probe_fd(fd);
}
static int probe_kern_exp_attach_type(int token_fd)
{
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.expected_attach_type = BPF_CGROUP_INET_SOCK_CREATE,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
int fd, insn_cnt = ARRAY_SIZE(insns);
/* use any valid combination of program type and (optional)
* non-zero expected attach type (i.e., not a BPF_CGROUP_INET_INGRESS)
* to see if kernel supports expected_attach_type field for
* BPF_PROG_LOAD command
*/
fd = bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, NULL, "GPL", insns, insn_cnt, &opts);
return probe_fd(fd);
}
static int probe_kern_probe_read_kernel(int token_fd)
{
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
struct bpf_insn insns[] = {
BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), /* r1 = r10 (fp) */
BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8), /* r1 += -8 */
BPF_MOV64_IMM(BPF_REG_2, 8), /* r2 = 8 */
BPF_MOV64_IMM(BPF_REG_3, 0), /* r3 = 0 */
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
BPF_EXIT_INSN(),
};
int fd, insn_cnt = ARRAY_SIZE(insns);
fd = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
return probe_fd(fd);
}
static int probe_prog_bind_map(int token_fd)
{
char *cp, errmsg[STRERR_BUFSIZE];
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
LIBBPF_OPTS(bpf_map_create_opts, map_opts,
.token_fd = token_fd,
.map_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
LIBBPF_OPTS(bpf_prog_load_opts, prog_opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int ret, map, prog, insn_cnt = ARRAY_SIZE(insns);
map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_det_bind", sizeof(int), 32, 1, &map_opts);
if (map < 0) {
ret = -errno;
cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
__func__, cp, -ret);
return ret;
}
prog = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &prog_opts);
if (prog < 0) {
close(map);
return 0;
}
ret = bpf_prog_bind_map(prog, map, NULL);
close(map);
close(prog);
return ret >= 0;
}
static int probe_module_btf(int token_fd)
{
static const char strs[] = "\0int";
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
};
struct bpf_btf_info info;
__u32 len = sizeof(info);
char name[16];
int fd, err;
fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs), token_fd);
if (fd < 0)
return 0; /* BTF not supported at all */
memset(&info, 0, sizeof(info));
info.name = ptr_to_u64(name);
info.name_len = sizeof(name);
/* check that BPF_OBJ_GET_INFO_BY_FD supports specifying name pointer;
* kernel's module BTF support coincides with support for
* name/name_len fields in struct bpf_btf_info.
*/
err = bpf_btf_get_info_by_fd(fd, &info, &len);
close(fd);
return !err;
}
static int probe_perf_link(int token_fd)
{
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int prog_fd, link_fd, err;
prog_fd = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL",
insns, ARRAY_SIZE(insns), &opts);
if (prog_fd < 0)
return -errno;
/* use invalid perf_event FD to get EBADF, if link is supported;
* otherwise EINVAL should be returned
*/
link_fd = bpf_link_create(prog_fd, -1, BPF_PERF_EVENT, NULL);
err = -errno; /* close() can clobber errno */
if (link_fd >= 0)
close(link_fd);
close(prog_fd);
return link_fd < 0 && err == -EBADF;
}
static int probe_uprobe_multi_link(int token_fd)
{
LIBBPF_OPTS(bpf_prog_load_opts, load_opts,
.expected_attach_type = BPF_TRACE_UPROBE_MULTI,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
LIBBPF_OPTS(bpf_link_create_opts, link_opts);
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
};
int prog_fd, link_fd, err;
unsigned long offset = 0;
prog_fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL",
insns, ARRAY_SIZE(insns), &load_opts);
if (prog_fd < 0)
return -errno;
/* Creating uprobe in '/' binary should fail with -EBADF. */
link_opts.uprobe_multi.path = "/";
link_opts.uprobe_multi.offsets = &offset;
link_opts.uprobe_multi.cnt = 1;
link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);
err = -errno; /* close() can clobber errno */
if (link_fd >= 0 || err != -EBADF) {
if (link_fd >= 0)
close(link_fd);
close(prog_fd);
return 0;
}
/* Initial multi-uprobe support in kernel didn't handle PID filtering
* correctly (it was doing thread filtering, not process filtering).
* So now we'll detect if PID filtering logic was fixed, and, if not,
* we'll pretend multi-uprobes are not supported, if not.
* Multi-uprobes are used in USDT attachment logic, and we need to be
* conservative here, because multi-uprobe selection happens early at
* load time, while the use of PID filtering is known late at
* attachment time, at which point it's too late to undo multi-uprobe
* selection.
*
* Creating uprobe with pid == -1 for (invalid) '/' binary will fail
* early with -EINVAL on kernels with fixed PID filtering logic;
* otherwise -ESRCH would be returned if passed correct binary path
* (but we'll just get -BADF, of course).
*/
link_opts.uprobe_multi.pid = -1; /* invalid PID */
link_opts.uprobe_multi.path = "/"; /* invalid path */
link_opts.uprobe_multi.offsets = &offset;
link_opts.uprobe_multi.cnt = 1;
link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);
err = -errno; /* close() can clobber errno */
if (link_fd >= 0)
close(link_fd);
close(prog_fd);
return link_fd < 0 && err == -EINVAL;
}
static int probe_kern_bpf_cookie(int token_fd)
{
struct bpf_insn insns[] = {
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_attach_cookie),
BPF_EXIT_INSN(),
};
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int ret, insn_cnt = ARRAY_SIZE(insns);
ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);
return probe_fd(ret);
}
static int probe_kern_btf_enum64(int token_fd)
{
static const char strs[] = "\0enum64";
__u32 types[] = {
BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8),
};
return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs), token_fd));
}
static int probe_kern_arg_ctx_tag(int token_fd)
{
static const char strs[] = "\0a\0b\0arg:ctx\0";
const __u32 types[] = {
/* [1] INT */
BTF_TYPE_INT_ENC(1 /* "a" */, BTF_INT_SIGNED, 0, 32, 4),
/* [2] PTR -> VOID */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 0),
/* [3] FUNC_PROTO `int(void *a)` */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 1),
BTF_PARAM_ENC(1 /* "a" */, 2),
/* [4] FUNC 'a' -> FUNC_PROTO (main prog) */
BTF_TYPE_ENC(1 /* "a" */, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 3),
/* [5] FUNC_PROTO `int(void *b __arg_ctx)` */
BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 1),
BTF_PARAM_ENC(3 /* "b" */, 2),
/* [6] FUNC 'b' -> FUNC_PROTO (subprog) */
BTF_TYPE_ENC(3 /* "b" */, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 5),
/* [7] DECL_TAG 'arg:ctx' -> func 'b' arg 'b' */
BTF_TYPE_DECL_TAG_ENC(5 /* "arg:ctx" */, 6, 0),
};
const struct bpf_insn insns[] = {
/* main prog */
BPF_CALL_REL(+1),
BPF_EXIT_INSN(),
/* global subprog */
BPF_EMIT_CALL(BPF_FUNC_get_func_ip), /* needs PTR_TO_CTX */
BPF_EXIT_INSN(),
};
const struct bpf_func_info_min func_infos[] = {
{ 0, 4 }, /* main prog -> FUNC 'a' */
{ 2, 6 }, /* subprog -> FUNC 'b' */
};
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.token_fd = token_fd,
.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int prog_fd, btf_fd, insn_cnt = ARRAY_SIZE(insns);
btf_fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs), token_fd);
if (btf_fd < 0)
return 0;
opts.prog_btf_fd = btf_fd;
opts.func_info = &func_infos;
opts.func_info_cnt = ARRAY_SIZE(func_infos);
opts.func_info_rec_size = sizeof(func_infos[0]);
prog_fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, "det_arg_ctx",
"GPL", insns, insn_cnt, &opts);
close(btf_fd);
return probe_fd(prog_fd);
}
typedef int (*feature_probe_fn)(int /* token_fd */);
static struct kern_feature_cache feature_cache;
static struct kern_feature_desc {
const char *desc;
feature_probe_fn probe;
} feature_probes[__FEAT_CNT] = {
[FEAT_PROG_NAME] = {
"BPF program name", probe_kern_prog_name,
},
[FEAT_GLOBAL_DATA] = {
"global variables", probe_kern_global_data,
},
[FEAT_BTF] = {
"minimal BTF", probe_kern_btf,
},
[FEAT_BTF_FUNC] = {
"BTF functions", probe_kern_btf_func,
},
[FEAT_BTF_GLOBAL_FUNC] = {
"BTF global function", probe_kern_btf_func_global,
},
[FEAT_BTF_DATASEC] = {
"BTF data section and variable", probe_kern_btf_datasec,
},
[FEAT_ARRAY_MMAP] = {
"ARRAY map mmap()", probe_kern_array_mmap,
},
[FEAT_EXP_ATTACH_TYPE] = {
"BPF_PROG_LOAD expected_attach_type attribute",
probe_kern_exp_attach_type,
},
[FEAT_PROBE_READ_KERN] = {
"bpf_probe_read_kernel() helper", probe_kern_probe_read_kernel,
},
[FEAT_PROG_BIND_MAP] = {
"BPF_PROG_BIND_MAP support", probe_prog_bind_map,
},
[FEAT_MODULE_BTF] = {
"module BTF support", probe_module_btf,
},
[FEAT_BTF_FLOAT] = {
"BTF_KIND_FLOAT support", probe_kern_btf_float,
},
[FEAT_PERF_LINK] = {
"BPF perf link support", probe_perf_link,
},
[FEAT_BTF_DECL_TAG] = {
"BTF_KIND_DECL_TAG support", probe_kern_btf_decl_tag,
},
[FEAT_BTF_TYPE_TAG] = {
"BTF_KIND_TYPE_TAG support", probe_kern_btf_type_tag,
},
[FEAT_MEMCG_ACCOUNT] = {
"memcg-based memory accounting", probe_memcg_account,
},
[FEAT_BPF_COOKIE] = {
"BPF cookie support", probe_kern_bpf_cookie,
},
[FEAT_BTF_ENUM64] = {
"BTF_KIND_ENUM64 support", probe_kern_btf_enum64,
},
[FEAT_SYSCALL_WRAPPER] = {
"Kernel using syscall wrapper", probe_kern_syscall_wrapper,
},
[FEAT_UPROBE_MULTI_LINK] = {
"BPF multi-uprobe link support", probe_uprobe_multi_link,
},
[FEAT_ARG_CTX_TAG] = {
"kernel-side __arg_ctx tag", probe_kern_arg_ctx_tag,
},
[FEAT_BTF_QMARK_DATASEC] = {
"BTF DATASEC names starting from '?'", probe_kern_btf_qmark_datasec,
},
};
bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_id)
{
struct kern_feature_desc *feat = &feature_probes[feat_id];
int ret;
/* assume global feature cache, unless custom one is provided */
if (!cache)
cache = &feature_cache;
if (READ_ONCE(cache->res[feat_id]) == FEAT_UNKNOWN) {
ret = feat->probe(cache->token_fd);
if (ret > 0) {
WRITE_ONCE(cache->res[feat_id], FEAT_SUPPORTED);
} else if (ret == 0) {
WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);
} else {
pr_warn("Detection of kernel %s support failed: %d\n", feat->desc, ret);
WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);
}
}
return READ_ONCE(cache->res[feat_id]) == FEAT_SUPPORTED;
}

File diff suppressed because it is too large Load Diff

View File

@@ -98,7 +98,10 @@ typedef int (*libbpf_print_fn_t)(enum libbpf_print_level level,
/**
* @brief **libbpf_set_print()** sets user-provided log callback function to
* be used for libbpf warnings and informational messages.
* be used for libbpf warnings and informational messages. If the user callback
* is not set, messages are logged to stderr by default. The verbosity of these
* messages can be controlled by setting the environment variable
* LIBBPF_LOG_LEVEL to either warn, info, or debug.
* @param fn The log print function. If NULL, libbpf won't print anything.
* @return Pointer to old print function.
*
@@ -177,10 +180,29 @@ struct bpf_object_open_opts {
* logs through its print callback.
*/
__u32 kernel_log_level;
/* Path to BPF FS mount point to derive BPF token from.
*
* Created BPF token will be used for all bpf() syscall operations
* that accept BPF token (e.g., map creation, BTF and program loads,
* etc) automatically within instantiated BPF object.
*
* If bpf_token_path is not specified, libbpf will consult
* LIBBPF_BPF_TOKEN_PATH environment variable. If set, it will be
* taken as a value of bpf_token_path option and will force libbpf to
* either create BPF token from provided custom BPF FS path, or will
* disable implicit BPF token creation, if envvar value is an empty
* string. bpf_token_path overrides LIBBPF_BPF_TOKEN_PATH, if both are
* set at the same time.
*
* Setting bpf_token_path option to empty string disables libbpf's
* automatic attempt to create BPF token from default BPF FS mount
* point (/sys/fs/bpf), in case this default behavior is undesirable.
*/
const char *bpf_token_path;
size_t :0;
};
#define bpf_object_open_opts__last_field kernel_log_level
#define bpf_object_open_opts__last_field bpf_token_path
/**
* @brief **bpf_object__open()** creates a bpf_object by opening
@@ -520,10 +542,12 @@ struct bpf_kprobe_multi_opts {
size_t cnt;
/* create return kprobes */
bool retprobe;
/* create session kprobes */
bool session;
size_t :0;
};
#define bpf_kprobe_multi_opts__last_field retprobe
#define bpf_kprobe_multi_opts__last_field session
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
@@ -741,9 +765,20 @@ bpf_program__attach_tracepoint_opts(const struct bpf_program *prog,
const char *tp_name,
const struct bpf_tracepoint_opts *opts);
struct bpf_raw_tracepoint_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u64 cookie;
size_t :0;
};
#define bpf_raw_tracepoint_opts__last_field cookie
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint(const struct bpf_program *prog,
const char *tp_name);
LIBBPF_API struct bpf_link *
bpf_program__attach_raw_tracepoint_opts(const struct bpf_program *prog,
const char *tp_name,
struct bpf_raw_tracepoint_opts *opts);
struct bpf_trace_opts {
/* size of this struct, for forward/backward compatibility */
@@ -765,6 +800,8 @@ bpf_program__attach_cgroup(const struct bpf_program *prog, int cgroup_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_netns(const struct bpf_program *prog, int netns_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_sockmap(const struct bpf_program *prog, int map_fd);
LIBBPF_API struct bpf_link *
bpf_program__attach_xdp(const struct bpf_program *prog, int ifindex);
LIBBPF_API struct bpf_link *
bpf_program__attach_freplace(const struct bpf_program *prog,
@@ -941,6 +978,23 @@ bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *map);
LIBBPF_API int bpf_map__set_autocreate(struct bpf_map *map, bool autocreate);
LIBBPF_API bool bpf_map__autocreate(const struct bpf_map *map);
/**
* @brief **bpf_map__set_autoattach()** sets whether libbpf has to auto-attach
* map during BPF skeleton attach phase.
* @param map the BPF map instance
* @param autoattach whether to attach map during BPF skeleton attach phase
* @return 0 on success; negative error code, otherwise
*/
LIBBPF_API int bpf_map__set_autoattach(struct bpf_map *map, bool autoattach);
/**
* @brief **bpf_map__autoattach()** returns whether BPF map is configured to
* auto-attach during BPF skeleton attach phase.
* @param map the BPF map instance
* @return true if map is set to auto-attach during skeleton attach phase; false, otherwise
*/
LIBBPF_API bool bpf_map__autoattach(const struct bpf_map *map);
/**
* @brief **bpf_map__fd()** gets the file descriptor of the passed
* BPF map
@@ -995,7 +1049,7 @@ LIBBPF_API int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra);
LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size);
LIBBPF_API void *bpf_map__initial_value(struct bpf_map *map, size_t *psize);
LIBBPF_API void *bpf_map__initial_value(const struct bpf_map *map, size_t *psize);
/**
* @brief **bpf_map__is_internal()** tells the caller whether or not the
@@ -1263,6 +1317,7 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__consume_n(struct ring_buffer *rb, size_t n);
LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
/**
@@ -1337,6 +1392,17 @@ LIBBPF_API int ring__map_fd(const struct ring *r);
*/
LIBBPF_API int ring__consume(struct ring *r);
/**
* @brief **ring__consume_n()** consumes up to a requested amount of items from
* a ringbuffer without event polling.
*
* @param r A ringbuffer object.
* @param n Maximum amount of items to consume.
* @return The number of items consumed, or a negative number if any of the
* callbacks return an error.
*/
LIBBPF_API int ring__consume_n(struct ring *r, size_t n);
struct user_ring_buffer_opts {
size_t sz; /* size of this struct, for forward/backward compatibility */
};
@@ -1623,6 +1689,7 @@ struct bpf_map_skeleton {
const char *name;
struct bpf_map **map;
void **mmaped;
struct bpf_link **link;
};
struct bpf_prog_skeleton {

View File

@@ -245,7 +245,6 @@ LIBBPF_0.3.0 {
btf__parse_raw_split;
btf__parse_split;
btf__new_empty_split;
btf__new_split;
ring_buffer__epoll_fd;
} LIBBPF_0.2.0;
@@ -326,7 +325,6 @@ LIBBPF_0.7.0 {
bpf_xdp_detach;
bpf_xdp_query;
bpf_xdp_query_id;
btf_ext__raw_data;
libbpf_probe_bpf_helper;
libbpf_probe_bpf_map_type;
libbpf_probe_bpf_prog_type;
@@ -411,4 +409,21 @@ LIBBPF_1.3.0 {
} LIBBPF_1.2.0;
LIBBPF_1.4.0 {
global:
bpf_program__attach_raw_tracepoint_opts;
bpf_raw_tracepoint_open_opts;
bpf_token_create;
btf__new_split;
btf_ext__raw_data;
} LIBBPF_1.3.0;
LIBBPF_1.5.0 {
global:
btf__distill_base;
btf__relocate;
bpf_map__autoattach;
bpf_map__set_autoattach;
bpf_program__attach_sockmap;
ring__consume_n;
ring_buffer__consume_n;
} LIBBPF_1.4.0;

View File

@@ -15,9 +15,24 @@
#include <linux/err.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <libelf.h>
#include "relo_core.h"
/* Android's libc doesn't support AT_EACCESS in faccessat() implementation
* ([0]), and just returns -EINVAL even if file exists and is accessible.
* See [1] for issues caused by this.
*
* So just redefine it to 0 on Android.
*
* [0] https://android.googlesource.com/platform/bionic/+/refs/heads/android13-release/libc/bionic/faccessat.cpp#50
* [1] https://github.com/libbpf/libbpf-bootstrap/issues/250#issuecomment-1911324250
*/
#ifdef __ANDROID__
#undef AT_EACCESS
#define AT_EACCESS 0
#endif
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
@@ -219,6 +234,9 @@ struct btf_type;
struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id);
const char *btf_kind_str(const struct btf_type *t);
const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
const struct btf_header *btf_header(const struct btf *btf);
void btf_set_base_btf(struct btf *btf, const struct btf *base_btf);
int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **id_map);
static inline enum btf_func_linkage btf_func_linkage(const struct btf_type *t)
{
@@ -357,18 +375,39 @@ enum kern_feature_id {
FEAT_SYSCALL_WRAPPER,
/* BPF multi-uprobe link support */
FEAT_UPROBE_MULTI_LINK,
/* Kernel supports arg:ctx tag (__arg_ctx) for global subprogs natively */
FEAT_ARG_CTX_TAG,
/* Kernel supports '?' at the front of datasec names */
FEAT_BTF_QMARK_DATASEC,
__FEAT_CNT,
};
int probe_memcg_account(void);
enum kern_feature_result {
FEAT_UNKNOWN = 0,
FEAT_SUPPORTED = 1,
FEAT_MISSING = 2,
};
struct kern_feature_cache {
enum kern_feature_result res[__FEAT_CNT];
int token_fd;
};
bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_id);
bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
int probe_kern_syscall_wrapper(int token_fd);
int probe_memcg_account(int token_fd);
int bump_rlimit_memlock(void);
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 log_level);
const char *str_sec, size_t str_len,
int token_fd);
int btf_load_into_kernel(struct btf *btf,
char *log_buf, size_t log_sz, __u32 log_level,
int token_fd);
struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
@@ -472,21 +511,38 @@ struct bpf_line_info_min {
__u32 line_col;
};
enum btf_field_iter_kind {
BTF_FIELD_ITER_IDS,
BTF_FIELD_ITER_STRS,
};
struct btf_field_desc {
/* once-per-type offsets */
int t_off_cnt, t_offs[2];
/* member struct size, or zero, if no members */
int m_sz;
/* repeated per-member offsets */
int m_off_cnt, m_offs[1];
};
struct btf_field_iter {
struct btf_field_desc desc;
void *p;
int m_idx;
int off_idx;
int vlen;
};
int btf_field_iter_init(struct btf_field_iter *it, struct btf_type *t, enum btf_field_iter_kind iter_kind);
__u32 *btf_field_iter_next(struct btf_field_iter *it);
typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx);
typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx);
int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx);
int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx);
int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx);
int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
__u32 kind);
typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
const char *sym_name, void *ctx);
int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *arg);
/* handle direct returned errors */
static inline int libbpf_err(int ret)
{
@@ -532,6 +588,17 @@ static inline bool is_ldimm64_insn(struct bpf_insn *insn)
return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
/* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
* Original FD is not closed or altered in any other way.
* Preserves original FD value, if it's invalid (negative).
*/
static inline int dup_good_fd(int fd)
{
if (fd < 0)
return fd;
return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
/* if fd is stdin, stdout, or stderr, dup to a fd greater than 2
* Takes ownership of the fd passed in, and closes it if calling
* fcntl(fd, F_DUPFD_CLOEXEC, 3).
@@ -543,7 +610,7 @@ static inline int ensure_good_fd(int fd)
if (fd < 0)
return fd;
if (fd < 3) {
fd = fcntl(fd, F_DUPFD_CLOEXEC, 3);
fd = dup_good_fd(fd);
saved_errno = errno;
close(old_fd);
errno = saved_errno;
@@ -555,6 +622,11 @@ static inline int ensure_good_fd(int fd)
return fd;
}
static inline int sys_dup3(int oldfd, int newfd, int flags)
{
return syscall(__NR_dup3, oldfd, newfd, flags);
}
/* Point *fixed_fd* to the same file that *tmp_fd* points to.
* Regardless of success, *tmp_fd* is closed.
* Whatever *fixed_fd* pointed to is closed silently.
@@ -563,7 +635,7 @@ static inline int reuse_fd(int fixed_fd, int tmp_fd)
{
int err;
err = dup2(tmp_fd, fixed_fd);
err = sys_dup3(tmp_fd, fixed_fd, O_CLOEXEC);
err = err < 0 ? -errno : 0;
close(tmp_fd); /* clean up temporary FD */
return err;
@@ -613,4 +685,6 @@ int elf_resolve_syms_offsets(const char *binary_path, int cnt,
int elf_resolve_pattern_offsets(const char *binary_path, const char *pattern,
unsigned long **poffsets, size_t *pcnt);
int probe_fd(int fd);
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

View File

@@ -74,9 +74,6 @@ static __u32 get_debian_kernel_version(struct utsname *info)
if (sscanf(p, "Debian %u.%u.%u", &major, &minor, &patch) != 3)
return 0;
if (major == 4 && minor == 19 && patch > 255)
return KERNEL_VERSION(major, minor, 255);
return KERNEL_VERSION(major, minor, patch);
}
@@ -222,7 +219,8 @@ int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts)
}
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
const char *str_sec, size_t str_len,
int token_fd)
{
struct btf_header hdr = {
.magic = BTF_MAGIC,
@@ -232,6 +230,10 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
.str_off = types_len,
.str_len = str_len,
};
LIBBPF_OPTS(bpf_btf_load_opts, opts,
.token_fd = token_fd,
.btf_flags = token_fd ? BPF_F_TOKEN_FD : 0,
);
int btf_fd, btf_len;
__u8 *raw_btf;
@@ -244,7 +246,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
memcpy(raw_btf + hdr.hdr_len, raw_types, hdr.type_len);
memcpy(raw_btf + hdr.hdr_len + hdr.type_len, str_sec, hdr.str_len);
btf_fd = bpf_btf_load(raw_btf, btf_len, NULL);
btf_fd = bpf_btf_load(raw_btf, btf_len, &opts);
free(raw_btf);
return btf_fd;
@@ -274,7 +276,7 @@ static int load_local_storage_btf(void)
};
return libbpf__load_raw_btf((char *)types, sizeof(types),
strs, sizeof(strs));
strs, sizeof(strs), 0);
}
static int probe_map_create(enum bpf_map_type map_type)
@@ -329,12 +331,20 @@ static int probe_map_create(enum bpf_map_type map_type)
case BPF_MAP_TYPE_STRUCT_OPS:
/* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
opts.btf_vmlinux_value_type_id = 1;
opts.value_type_btf_obj_fd = -1;
exp_err = -524; /* -ENOTSUPP */
break;
case BPF_MAP_TYPE_BLOOM_FILTER:
key_size = 0;
max_entries = 1;
break;
case BPF_MAP_TYPE_ARENA:
key_size = 0;
value_size = 0;
max_entries = 1; /* one page */
opts.map_extra = 0; /* can mmap() at any address */
opts.map_flags = BPF_F_MMAPABLE;
break;
case BPF_MAP_TYPE_HASH:
case BPF_MAP_TYPE_ARRAY:
case BPF_MAP_TYPE_PROG_ARRAY:
@@ -438,7 +448,8 @@ int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helpe
/* If BPF verifier doesn't recognize BPF helper ID (enum bpf_func_id)
* at all, it will emit something like "invalid func unknown#181".
* If BPF verifier recognizes BPF helper but it's not supported for
* given BPF program type, it will emit "unknown func bpf_sys_bpf#166".
* given BPF program type, it will emit "unknown func bpf_sys_bpf#166"
* or "program of this type cannot use helper bpf_sys_bpf#166".
* In both cases, provided combination of BPF program type and BPF
* helper is not supported by the kernel.
* In all other cases, probe_prog_load() above will either succeed (e.g.,
@@ -447,7 +458,8 @@ int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helpe
* that), or we'll get some more specific BPF verifier error about
* some unsatisfied conditions.
*/
if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func ")))
if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func ") ||
strstr(buf, "program of this type cannot use helper ")))
return 0;
return 1; /* assume supported */
}

View File

@@ -4,6 +4,6 @@
#define __LIBBPF_VERSION_H
#define LIBBPF_MAJOR_VERSION 1
#define LIBBPF_MINOR_VERSION 4
#define LIBBPF_MINOR_VERSION 5
#endif /* __LIBBPF_VERSION_H */

View File

@@ -957,19 +957,33 @@ static int check_btf_str_off(__u32 *str_off, void *ctx)
static int linker_sanity_check_btf(struct src_obj *obj)
{
struct btf_type *t;
int i, n, err = 0;
int i, n, err;
if (!obj->btf)
return 0;
n = btf__type_cnt(obj->btf);
for (i = 1; i < n; i++) {
struct btf_field_iter it;
__u32 *type_id, *str_off;
t = btf_type_by_id(obj->btf, i);
err = err ?: btf_type_visit_type_ids(t, check_btf_type_id, obj->btf);
err = err ?: btf_type_visit_str_offs(t, check_btf_str_off, obj->btf);
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((type_id = btf_field_iter_next(&it))) {
if (*type_id >= n)
return -EINVAL;
}
err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);
if (err)
return err;
while ((str_off = btf_field_iter_next(&it))) {
if (!btf__str_by_offset(obj->btf, *str_off))
return -EINVAL;
}
}
return 0;
@@ -2213,10 +2227,17 @@ static int linker_fixup_btf(struct src_obj *obj)
vi = btf_var_secinfos(t);
for (j = 0, m = btf_vlen(t); j < m; j++, vi++) {
const struct btf_type *vt = btf__type_by_id(obj->btf, vi->type);
const char *var_name = btf__str_by_offset(obj->btf, vt->name_off);
int var_linkage = btf_var(vt)->linkage;
const char *var_name;
int var_linkage;
Elf64_Sym *sym;
/* could be a variable or function */
if (!btf_is_var(vt))
continue;
var_name = btf__str_by_offset(obj->btf, vt->name_off);
var_linkage = btf_var(vt)->linkage;
/* no need to patch up static or extern vars */
if (var_linkage != BTF_VAR_GLOBAL_ALLOCATED)
continue;
@@ -2234,26 +2255,10 @@ static int linker_fixup_btf(struct src_obj *obj)
return 0;
}
static int remap_type_id(__u32 *type_id, void *ctx)
{
int *id_map = ctx;
int new_id = id_map[*type_id];
/* Error out if the type wasn't remapped. Ignore VOID which stays VOID. */
if (new_id == 0 && *type_id != 0) {
pr_warn("failed to find new ID mapping for original BTF type ID %u\n", *type_id);
return -EINVAL;
}
*type_id = id_map[*type_id];
return 0;
}
static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj)
{
const struct btf_type *t;
int i, j, n, start_id, id;
int i, j, n, start_id, id, err;
const char *name;
if (!obj->btf)
@@ -2324,9 +2329,25 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj)
n = btf__type_cnt(linker->btf);
for (i = start_id; i < n; i++) {
struct btf_type *dst_t = btf_type_by_id(linker->btf, i);
struct btf_field_iter it;
__u32 *type_id;
if (btf_type_visit_type_ids(dst_t, remap_type_id, obj->btf_type_map))
return -EINVAL;
err = btf_field_iter_init(&it, dst_t, BTF_FIELD_ITER_IDS);
if (err)
return err;
while ((type_id = btf_field_iter_next(&it))) {
int new_id = obj->btf_type_map[*type_id];
/* Error out if the type wasn't remapped. Ignore VOID which stays VOID. */
if (new_id == 0 && *type_id != 0) {
pr_warn("failed to find new ID mapping for original BTF type ID %u\n",
*type_id);
return -EINVAL;
}
*type_id = obj->btf_type_map[*type_id];
}
}
/* Rewrite VAR/FUNC underlying types (i.e., FUNC's FUNC_PROTO and VAR's
@@ -2732,7 +2753,7 @@ static int finalize_btf(struct bpf_linker *linker)
/* Emit .BTF.ext section */
if (linker->btf_ext) {
raw_data = btf_ext__get_raw_data(linker->btf_ext, &raw_sz);
raw_data = btf_ext__raw_data(linker->btf_ext, &raw_sz);
if (!raw_data)
return -ENOMEM;

View File

@@ -496,8 +496,8 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts)
if (err)
return libbpf_err(err);
opts->feature_flags = md.flags;
opts->xdp_zc_max_segs = md.xdp_zc_max_segs;
OPTS_SET(opts, feature_flags, md.flags);
OPTS_SET(opts, xdp_zc_max_segs, md.xdp_zc_max_segs);
skip_feature_flags:
return 0;

View File

@@ -231,7 +231,7 @@ static inline int roundup_len(__u32 len)
return (len + 7) / 8 * 8;
}
static int64_t ringbuf_process_ring(struct ring *r)
static int64_t ringbuf_process_ring(struct ring *r, size_t n)
{
int *len_ptr, len, err;
/* 64-bit to avoid overflow in case of extreme application behavior */
@@ -268,12 +268,42 @@ static int64_t ringbuf_process_ring(struct ring *r)
}
smp_store_release(r->consumer_pos, cons_pos);
if (cnt >= n)
goto done;
}
} while (got_new_data);
done:
return cnt;
}
/* Consume available ring buffer(s) data without event polling, up to n
* records.
*
* Returns number of records consumed across all registered ring buffers (or
* n, whichever is less), or negative number if any of the callbacks return
* error.
*/
int ring_buffer__consume_n(struct ring_buffer *rb, size_t n)
{
int64_t err, res = 0;
int i;
for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = rb->rings[i];
err = ringbuf_process_ring(ring, n);
if (err < 0)
return libbpf_err(err);
res += err;
n -= err;
if (n == 0)
break;
}
return res > INT_MAX ? INT_MAX : res;
}
/* Consume available ring buffer(s) data without event polling.
* Returns number of records consumed across all registered ring buffers (or
* INT_MAX, whichever is less), or negative number if any of the callbacks
@@ -287,13 +317,15 @@ int ring_buffer__consume(struct ring_buffer *rb)
for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = rb->rings[i];
err = ringbuf_process_ring(ring);
err = ringbuf_process_ring(ring, INT_MAX);
if (err < 0)
return libbpf_err(err);
res += err;
if (res > INT_MAX) {
res = INT_MAX;
break;
}
}
if (res > INT_MAX)
return INT_MAX;
return res;
}
@@ -314,13 +346,13 @@ int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
__u32 ring_id = rb->events[i].data.fd;
struct ring *ring = rb->rings[ring_id];
err = ringbuf_process_ring(ring);
err = ringbuf_process_ring(ring, INT_MAX);
if (err < 0)
return libbpf_err(err);
res += err;
}
if (res > INT_MAX)
return INT_MAX;
res = INT_MAX;
return res;
}
@@ -371,17 +403,22 @@ int ring__map_fd(const struct ring *r)
return r->map_fd;
}
int ring__consume(struct ring *r)
int ring__consume_n(struct ring *r, size_t n)
{
int64_t res;
res = ringbuf_process_ring(r);
res = ringbuf_process_ring(r, n);
if (res < 0)
return libbpf_err(res);
return res > INT_MAX ? INT_MAX : res;
}
int ring__consume(struct ring *r)
{
return ring__consume_n(r, INT_MAX);
}
static void user_ringbuf_unmap_ring(struct user_ring_buffer *rb)
{
if (rb->consumer_pos) {

View File

@@ -2,6 +2,7 @@
#undef _GNU_SOURCE
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include "str_error.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
@@ -15,7 +16,18 @@
char *libbpf_strerror_r(int err, char *dst, int len)
{
int ret = strerror_r(err < 0 ? -err : err, dst, len);
if (ret)
snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
/* on glibc <2.13, ret == -1 and errno is set, if strerror_r() can't
* handle the error, on glibc >=2.13 *positive* (errno-like) error
* code is returned directly
*/
if (ret == -1)
ret = errno;
if (ret) {
if (ret == EINVAL)
/* strerror_r() doesn't recognize this specific error */
snprintf(dst, len, "unknown error (%d)", err < 0 ? err : -err);
else
snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
}
return dst;
}

View File

@@ -2,5 +2,8 @@
#ifndef __LIBBPF_STR_ERROR_H
#define __LIBBPF_STR_ERROR_H
#define STRERR_BUFSIZE 128
char *libbpf_strerror_r(int err, char *dst, int len);
#endif /* __LIBBPF_STR_ERROR_H */

View File

@@ -214,18 +214,18 @@ long bpf_usdt_cookie(struct pt_regs *ctx)
/* we rely on ___bpf_apply() and ___bpf_narg() macros already defined in bpf_tracing.h */
#define ___bpf_usdt_args0() ctx
#define ___bpf_usdt_args1(x) ___bpf_usdt_args0(), ({ long _x; bpf_usdt_arg(ctx, 0, &_x); (void *)_x; })
#define ___bpf_usdt_args2(x, args...) ___bpf_usdt_args1(args), ({ long _x; bpf_usdt_arg(ctx, 1, &_x); (void *)_x; })
#define ___bpf_usdt_args3(x, args...) ___bpf_usdt_args2(args), ({ long _x; bpf_usdt_arg(ctx, 2, &_x); (void *)_x; })
#define ___bpf_usdt_args4(x, args...) ___bpf_usdt_args3(args), ({ long _x; bpf_usdt_arg(ctx, 3, &_x); (void *)_x; })
#define ___bpf_usdt_args5(x, args...) ___bpf_usdt_args4(args), ({ long _x; bpf_usdt_arg(ctx, 4, &_x); (void *)_x; })
#define ___bpf_usdt_args6(x, args...) ___bpf_usdt_args5(args), ({ long _x; bpf_usdt_arg(ctx, 5, &_x); (void *)_x; })
#define ___bpf_usdt_args7(x, args...) ___bpf_usdt_args6(args), ({ long _x; bpf_usdt_arg(ctx, 6, &_x); (void *)_x; })
#define ___bpf_usdt_args8(x, args...) ___bpf_usdt_args7(args), ({ long _x; bpf_usdt_arg(ctx, 7, &_x); (void *)_x; })
#define ___bpf_usdt_args9(x, args...) ___bpf_usdt_args8(args), ({ long _x; bpf_usdt_arg(ctx, 8, &_x); (void *)_x; })
#define ___bpf_usdt_args10(x, args...) ___bpf_usdt_args9(args), ({ long _x; bpf_usdt_arg(ctx, 9, &_x); (void *)_x; })
#define ___bpf_usdt_args11(x, args...) ___bpf_usdt_args10(args), ({ long _x; bpf_usdt_arg(ctx, 10, &_x); (void *)_x; })
#define ___bpf_usdt_args12(x, args...) ___bpf_usdt_args11(args), ({ long _x; bpf_usdt_arg(ctx, 11, &_x); (void *)_x; })
#define ___bpf_usdt_args1(x) ___bpf_usdt_args0(), ({ long _x; bpf_usdt_arg(ctx, 0, &_x); _x; })
#define ___bpf_usdt_args2(x, args...) ___bpf_usdt_args1(args), ({ long _x; bpf_usdt_arg(ctx, 1, &_x); _x; })
#define ___bpf_usdt_args3(x, args...) ___bpf_usdt_args2(args), ({ long _x; bpf_usdt_arg(ctx, 2, &_x); _x; })
#define ___bpf_usdt_args4(x, args...) ___bpf_usdt_args3(args), ({ long _x; bpf_usdt_arg(ctx, 3, &_x); _x; })
#define ___bpf_usdt_args5(x, args...) ___bpf_usdt_args4(args), ({ long _x; bpf_usdt_arg(ctx, 4, &_x); _x; })
#define ___bpf_usdt_args6(x, args...) ___bpf_usdt_args5(args), ({ long _x; bpf_usdt_arg(ctx, 5, &_x); _x; })
#define ___bpf_usdt_args7(x, args...) ___bpf_usdt_args6(args), ({ long _x; bpf_usdt_arg(ctx, 6, &_x); _x; })
#define ___bpf_usdt_args8(x, args...) ___bpf_usdt_args7(args), ({ long _x; bpf_usdt_arg(ctx, 7, &_x); _x; })
#define ___bpf_usdt_args9(x, args...) ___bpf_usdt_args8(args), ({ long _x; bpf_usdt_arg(ctx, 8, &_x); _x; })
#define ___bpf_usdt_args10(x, args...) ___bpf_usdt_args9(args), ({ long _x; bpf_usdt_arg(ctx, 9, &_x); _x; })
#define ___bpf_usdt_args11(x, args...) ___bpf_usdt_args10(args), ({ long _x; bpf_usdt_arg(ctx, 10, &_x); _x; })
#define ___bpf_usdt_args12(x, args...) ___bpf_usdt_args11(args), ({ long _x; bpf_usdt_arg(ctx, 11, &_x); _x; })
#define ___bpf_usdt_args(args...) ___bpf_apply(___bpf_usdt_args, ___bpf_narg(args))(args)
/*