Compare commits

...

150 Commits

Author SHA1 Message Date
Mykyta Yatsenko
02bdeb7a2c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   8e64c387c942229c551d0f23de4d9993d3a2acb6
Checkpoint bpf-next commit: 9325d53fe9adff354b6a93fda5f38c165947da0f
Baseline bpf commit:        b4432656b36e5cc1d50a1f2dc15357543add530e
Checkpoint bpf commit:      b4432656b36e5cc1d50a1f2dc15357543add530e

Andrii Nakryiko (1):
  libbpf: Improve BTF dedup handling of "identical" BTF types

Anton Protopopov (3):
  libbpf: Use proper errno value in linker
  bpf: Fix uninitialized values in BPF_{CORE,PROBE}_READ
  libbpf: Use proper errno value in nlattr

Jiri Olsa (1):
  bpf: Add support to retrieve ref_ctr_offset for uprobe perf link

Mykyta Yatsenko (1):
  libbpf: Check bpf_map_skeleton link for NULL

 include/uapi/linux/bpf.h |   1 +
 src/bpf_core_read.h      |   6 ++
 src/btf.c                | 137 +++++++++++++++++++++++++--------------
 src/libbpf.c             |   6 ++
 src/linker.c             |   4 +-
 src/nlattr.c             |  15 ++---
 6 files changed, 111 insertions(+), 58 deletions(-)

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
2025-05-19 10:07:42 -07:00
Mykyta Yatsenko
453601a65a libbpf: Check bpf_map_skeleton link for NULL
Avoid dereferencing bpf_map_skeleton's link field if it's NULL.
If BPF map skeleton is created with the size, that indicates containing
link field, but the field was not actually initialized with valid
bpf_link pointer, libbpf crashes. This may happen when using libbpf-rs
skeleton.
Skeleton loading may still progress, but user needs to attach struct_ops
map separately.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250514113220.219095-1-mykyta.yatsenko5@gmail.com
2025-05-19 10:07:42 -07:00
Anton Protopopov
8e34ca4e8f libbpf: Use proper errno value in nlattr
Return value of the validate_nla() function can be propagated all the
way up to users of libbpf API. In case of error this libbpf version
of validate_nla returns -1 which will be seen as -EPERM from user's
point of view. Instead, return a more reasonable -EINVAL.

Fixes: bbf48c18ee0c ("libbpf: add error reporting in XDP")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250510182011.2246631-1-a.s.protopopov@gmail.com
2025-05-19 10:07:42 -07:00
Jiri Olsa
eda0e4ca46 bpf: Add support to retrieve ref_ctr_offset for uprobe perf link
Adding support to retrieve ref_ctr_offset for uprobe perf link,
which got somehow omitted from the initial uprobe link info changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20250509153539.779599-2-jolsa@kernel.org
2025-05-19 10:07:42 -07:00
Andrii Nakryiko
5ee9fbf7d7 libbpf: Improve BTF dedup handling of "identical" BTF types
BTF dedup has a strong assumption that compiler with deduplicate identical
types within any given compilation unit (i.e., .c file). This property
is used when establishing equilvalence of two subgraphs of types.

Unfortunately, this property doesn't always holds in practice. We've
seen cases of having truly identical structs, unions, array definitions,
and, most recently, even pointers to the same type being duplicated
within CU.

Previously, we mitigated this on a case-by-case basis, adding a few
simple heuristics for validating that two BTF types (having two
different type IDs) are structurally the same. But this approach scales
poorly, and we can have more weird cases come up in the future.

So let's take a half-step back, and implement a bit more generic
structural equivalence check, recursively. We still limit it to
reasonable depth to avoid long reference loops. Depth-wise limiting of
potentially cyclical graph isn't great, but as I mentioned below doesn't
seem to be detrimental performance-wise. We can always improve this in
the future with per-type visited markers, if necessary.

Performance-wise this doesn't seem too affect vmlinux BTF dedup, which
makes sense because this logic kicks in not so frequently and only if we
already established a canonical candidate type match, but suddenly find
a different (but probably identical) type.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20250501235231.1339822-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-19 10:07:42 -07:00
Anton Protopopov
d88ca95133 bpf: Fix uninitialized values in BPF_{CORE,PROBE}_READ
With the latest LLVM bpf selftests build will fail with
the following error message:

    progs/profiler.inc.h:710:31: error: default initialization of an object of type 'typeof ((parent_task)->real_cred->uid.val)' (aka 'const unsigned int') leaves the object uninitialized and is incompatible with C++ [-Werror,-Wdefault-const-init-unsafe]
      710 |         proc_exec_data->parent_uid = BPF_CORE_READ(parent_task, real_cred, uid.val);
          |                                      ^
    tools/testing/selftests/bpf/tools/include/bpf/bpf_core_read.h:520:35: note: expanded from macro 'BPF_CORE_READ'
      520 |         ___type((src), a, ##__VA_ARGS__) __r;                               \
          |                                          ^

This happens because BPF_CORE_READ (and other macro) declare the
variable __r using the ___type macro which can inherit const modifier
from intermediate types.

Fix this by using __typeof_unqual__, when supported. (And when it
is not supported, the problem shouldn't appear, as older compilers
haven't complained.)

Fixes: 792001f4f7aa ("libbpf: Add user-space variants of BPF_CORE_READ() family of macros")
Fixes: a4b09a9ef945 ("libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family")
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250502193031.3522715-1-a.s.protopopov@gmail.com
2025-05-19 10:07:42 -07:00
Anton Protopopov
28deee2663 libbpf: Use proper errno value in linker
Return values of the linker_append_sec_data() and the
linker_append_elf_relos() functions are propagated all the
way up to users of libbpf API. In some error cases these
functions return -1 which will be seen as -EPERM from user's
point of view. Instead, return a more reasonable -EINVAL.

Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250430120820.2262053-1-a.s.protopopov@gmail.com
2025-05-19 10:07:42 -07:00
Andrii Nakryiko
374f7807e1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   25601e85441dd91cf7973b002f27af4c5b8691ea
Checkpoint bpf-next commit: 8e64c387c942229c551d0f23de4d9993d3a2acb6
Baseline bpf commit:        3f8ad18f8184
Checkpoint bpf commit:      b4432656b36e5cc1d50a1f2dc15357543add530e

Alan Maguire (1):
  libbpf: Add identical pointer detection to btf_dedup_is_equiv()

Anton Protopopov (2):
  bpf: Fix a comment describing bpf_attr
  libbpf: Add likely/unlikely macros and use them in selftests

Carlos Llamas (1):
  libbpf: Fix implicit memfd_create() for bionic

Feng Yang (1):
  libbpf: Fix event name too long error

Ihor Solodrai (1):
  libbpf: Verify section type in btf_find_elf_sections

Jonathan Wiepert (1):
  Use thread-safe function pointer in libbpf_print

Mykyta Yatsenko (1):
  libbpf: Add getters for BTF.ext func and line info

Paul Chaignon (2):
  bpf: Clarify role of BPF_F_RECOMPUTE_CSUM
  bpf: Clarify the meaning of BPF_F_PSEUDO_HDR

Tao Chen (1):
  libbpf: Remove sample_period init in perf_buffer

Viktor Malik (1):
  libbpf: Fix buffer overflow in bpf_object__init_prog

 include/uapi/linux/bpf.h | 18 +++++----
 src/bpf_helpers.h        |  8 ++++
 src/btf.c                | 22 +++++++++++
 src/libbpf.c             | 81 +++++++++++++++++++++-------------------
 src/libbpf.h             |  6 +++
 src/libbpf.map           |  4 ++
 src/libbpf_internal.h    |  9 +++++
 src/linker.c             |  2 +-
 8 files changed, 103 insertions(+), 47 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-04-29 11:33:37 -07:00
Andrii Nakryiko
27dc274f68 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-04-29 11:33:37 -07:00
Alan Maguire
c8a28812fb libbpf: Add identical pointer detection to btf_dedup_is_equiv()
Recently as a side-effect of

commit ac053946f5c4 ("compiler.h: introduce TYPEOF_UNQUAL() macro")

issues were observed in deduplication between modules and kernel BTF
such that a large number of kernel types were not deduplicated so
were found in module BTF (task_struct, bpf_prog etc).  The root cause
appeared to be a failure to dedup struct types, specifically those
with members that were pointers with __percpu annotations.

The issue in dedup is at the point that we are deduplicating structures,
we have not yet deduplicated reference types like pointers.  If multiple
copies of a pointer point at the same (deduplicated) integer as in this
case, we do not see them as identical.  Special handling already exists
to deal with structures and arrays, so add pointer handling here too.

Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250429161042.2069678-1-alan.maguire@oracle.com
2025-04-29 11:33:37 -07:00
Jonathan Wiepert
88ae865423 Use thread-safe function pointer in libbpf_print
This patch fixes a thread safety bug where libbpf_print uses the
global variable storing the print function pointer rather than the local
variable that had the print function set via __atomic_load_n.

Fixes: f1cb927cdb62 ("libbpf: Ensure print callback usage is thread-safe")
Signed-off-by: Jonathan Wiepert <jonathan.wiepert@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
Link: https://lore.kernel.org/bpf/20250424221457.793068-1-jonathan.wiepert@gmail.com
2025-04-29 11:33:37 -07:00
Tao Chen
a2dc135196 libbpf: Remove sample_period init in perf_buffer
It seems that sample_period is not used in perf buffer. Actually, only
wakeup_events are meaningful to enable events aggregation for wakeup notification.
Remove sample_period setting code to avoid confusion.

Fixes: fb84b8224655 ("libbpf: add perf buffer API")
Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/bpf/20250423163901.2983689-1-chen.dylane@linux.dev
2025-04-29 11:33:37 -07:00
Feng Yang
715808d3e2 libbpf: Fix event name too long error
When the binary path is excessively long, the generated probe_name in libbpf
exceeds the kernel's MAX_EVENT_NAME_LEN limit (64 bytes).
This causes legacy uprobe event attachment to fail with error code -22.

The fix reorders the fields to place the unique ID before the name.
This ensures that even if truncation occurs via snprintf, the unique ID
remains intact, preserving event name uniqueness. Additionally, explicit
checks with MAX_EVENT_NAME_LEN are added to enforce length constraints.

Before Fix:
	./test_progs -t attach_probe/kprobe-long_name
	......
	libbpf: failed to add legacy kprobe event for 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0': -EINVAL
	libbpf: prog 'handle_kprobe': failed to create kprobe 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0' perf event: -EINVAL
	test_attach_kprobe_long_event_name:FAIL:attach_kprobe_long_event_name unexpected error: -22
	test_attach_probe:PASS:uprobe_ref_ctr_cleanup 0 nsec
	#13/11   attach_probe/kprobe-long_name:FAIL
	#13      attach_probe:FAIL

	./test_progs -t attach_probe/uprobe-long_name
	......
	libbpf: failed to add legacy uprobe event for /root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9: -EINVAL
	libbpf: prog 'handle_uprobe': failed to create uprobe '/root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9' perf event: -EINVAL
	test_attach_uprobe_long_event_name:FAIL:attach_uprobe_long_event_name unexpected error: -22
	#13/10   attach_probe/uprobe-long_name:FAIL
	#13      attach_probe:FAIL
After Fix:
	./test_progs -t attach_probe/uprobe-long_name
	#13/10   attach_probe/uprobe-long_name:OK
	#13      attach_probe:OK
	Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

	./test_progs -t attach_probe/kprobe-long_name
	#13/11   attach_probe/kprobe-long_name:OK
	#13      attach_probe:OK
	Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Fixes: 46ed5fc33db9 ("libbpf: Refactor and simplify legacy kprobe code")
Fixes: cc10623c6810 ("libbpf: Add legacy uprobe attaching support")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250417014848.59321-2-yangfeng59949@163.com
2025-04-29 11:33:37 -07:00
Ihor Solodrai
02bc656f90 libbpf: Verify section type in btf_find_elf_sections
A valid ELF file may contain a SHT_NOBITS .BTF section. This case is
not handled correctly in btf_parse_elf, which leads to a segfault.

Before attempting to load BTF section data, check that the section
type is SHT_PROGBITS, which is the expected type for BTF data.  Fail
with an error if the type is different.

Bug report: https://github.com/libbpf/libbpf/issues/894
v1: https://lore.kernel.org/bpf/20250408184104.3962949-1-ihor.solodrai@linux.dev/

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250410182823.1591681-1-ihor.solodrai@linux.dev
2025-04-29 11:33:37 -07:00
Viktor Malik
806b4e0a9f libbpf: Fix buffer overflow in bpf_object__init_prog
As shown in [1], it is possible to corrupt a BPF ELF file such that
arbitrary BPF instructions are loaded by libbpf. This can be done by
setting a symbol (BPF program) section offset to a large (unsigned)
number such that <section start + symbol offset> overflows and points
before the section data in the memory.

Consider the situation below where:
- prog_start = sec_start + symbol_offset    <-- size_t overflow here
- prog_end   = prog_start + prog_size

    prog_start        sec_start        prog_end        sec_end
        |                |                 |              |
        v                v                 v              v
    .....................|################################|............

The report in [1] also provides a corrupted BPF ELF which can be used as
a reproducer:

    $ readelf -S crash
    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
    ...
      [ 2] uretprobe.mu[...] PROGBITS         0000000000000000  00000040
           0000000000000068  0000000000000000  AX       0     0     8

    $ readelf -s crash
    Symbol table '.symtab' contains 8 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
    ...
         6: ffffffffffffffb8   104 FUNC    GLOBAL DEFAULT    2 handle_tp

Here, the handle_tp prog has section offset ffffffffffffffb8, i.e. will
point before the actual memory where section 2 is allocated.

This is also reported by AddressSanitizer:

    =================================================================
    ==1232==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7c7302fe0000 at pc 0x7fc3046e4b77 bp 0x7ffe64677cd0 sp 0x7ffe64677490
    READ of size 104 at 0x7c7302fe0000 thread T0
        #0 0x7fc3046e4b76 in memcpy (/lib64/libasan.so.8+0xe4b76)
        #1 0x00000040df3e in bpf_object__init_prog /src/libbpf/src/libbpf.c:856
        #2 0x00000040df3e in bpf_object__add_programs /src/libbpf/src/libbpf.c:928
        #3 0x00000040df3e in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3930
        #4 0x00000040df3e in bpf_object_open /src/libbpf/src/libbpf.c:8067
        #5 0x00000040f176 in bpf_object__open_file /src/libbpf/src/libbpf.c:8090
        #6 0x000000400c16 in main /poc/poc.c:8
        #7 0x7fc3043d25b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4)
        #8 0x7fc3043d2667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667)
        #9 0x000000400b34 in _start (/poc/poc+0x400b34)

    0x7c7302fe0000 is located 64 bytes before 104-byte region [0x7c7302fe0040,0x7c7302fe00a8)
    allocated by thread T0 here:
        #0 0x7fc3046e716b in malloc (/lib64/libasan.so.8+0xe716b)
        #1 0x7fc3045ee600 in __libelf_set_rawdata_wrlock (/lib64/libelf.so.1+0xb600)
        #2 0x7fc3045ef018 in __elf_getdata_rdlock (/lib64/libelf.so.1+0xc018)
        #3 0x00000040642f in elf_sec_data /src/libbpf/src/libbpf.c:3740

The problem here is that currently, libbpf only checks that the program
end is within the section bounds. There used to be a check
`while (sec_off < sec_sz)` in bpf_object__add_programs, however, it was
removed by commit 6245947c1b3c ("libbpf: Allow gaps in BPF program
sections to support overriden weak functions").

Add a check for detecting the overflow of `sec_off + prog_sz` to
bpf_object__init_prog to fix this issue.

[1] https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md

Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions")
Reported-by: lmarch2 <2524158037@qq.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Link: https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md
Link: https://lore.kernel.org/bpf/20250415155014.397603-1-vmalik@redhat.com
2025-04-29 11:33:37 -07:00
Paul Chaignon
0c85f5a154 bpf: Clarify the meaning of BPF_F_PSEUDO_HDR
In the bpf_l4_csum_replace helper, the BPF_F_PSEUDO_HDR flag should only
be set if the modified header field is part of the pseudo-header.

If you modify for example the UDP ports and pass BPF_F_PSEUDO_HDR,
inet_proto_csum_replace4 will update skb->csum even though it shouldn't
(the port and the UDP checksum updates null each other).

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/5126ef84ba75425b689482cbc98bffe75e5d8ab0.1744102490.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-29 11:33:37 -07:00
Paul Chaignon
7de6a44a0f bpf: Clarify role of BPF_F_RECOMPUTE_CSUM
BPF_F_RECOMPUTE_CSUM doesn't update the actual L3 and L4 checksums in
the packet, but simply updates skb->csum (according to skb->ip_summed).
This patch clarifies that to avoid confusions.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/ff6895d42936f03dbb82334d8bcfd50e00c79086.1744102490.git.paul.chaignon@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-29 11:33:37 -07:00
Mykyta Yatsenko
abdb15bedd libbpf: Add getters for BTF.ext func and line info
Introducing new libbpf API getters for BTF.ext func and line info,
namely:
  bpf_program__func_info
  bpf_program__func_info_cnt
  bpf_program__line_info
  bpf_program__line_info_cnt

This change enables scenarios, when user needs to load bpf_program
directly using `bpf_prog_load`, instead of higher-level
`bpf_object__load`. Line and func info are required for checking BTF
info in verifier; verification may fail without these fields if, for
example, program calls `bpf_obj_new`.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250408234417.452565-2-mykyta.yatsenko5@gmail.com
2025-04-29 11:33:37 -07:00
Anton Protopopov
7a1388d55f libbpf: Add likely/unlikely macros and use them in selftests
A few selftests and, more importantly, consequent changes to the
bpf_helpers.h file, use likely/unlikely macros, so define them here
and remove duplicate definitions from existing selftests.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250331203618.1973691-3-a.s.protopopov@gmail.com
2025-04-29 11:33:37 -07:00
Anton Protopopov
4687560af9 bpf: Fix a comment describing bpf_attr
The map_fd field of the bpf_attr union is used in the BPF_MAP_FREEZE
syscall.  Explicitly mention this in the comments.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250331203618.1973691-2-a.s.protopopov@gmail.com
2025-04-29 11:33:37 -07:00
Carlos Llamas
732d6c011f libbpf: Fix implicit memfd_create() for bionic
Since memfd_create() is not consistently available across different
bionic libc implementations, using memfd_create() directly can break
some Android builds:

  tools/lib/bpf/linker.c:576:7: error: implicit declaration of function 'memfd_create' [-Werror,-Wimplicit-function-declaration]
    576 |         fd = memfd_create(filename, 0);
        |              ^

To fix this, relocate and inline the sys_memfd_create() helper so that
it can be used in "linker.c". Similar issues were previously fixed by
commit 9fa5e1a180aa ("libbpf: Call memfd_create() syscall directly").

Fixes: 6d5e5e5d7ce1 ("libbpf: Extend linker API to support in-memory ELF files")
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250330211325.530677-1-cmllamas@google.com
2025-04-29 11:33:37 -07:00
Mykyta Yatsenko
4659eaafa4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   488a8544f839048064ea0647596af2aaca7ecc25
Checkpoint bpf-next commit: 25601e85441dd91cf7973b002f27af4c5b8691ea
Baseline bpf commit:        46e88299d19695c2b21e245c52a86ed26ed5cfee
Checkpoint bpf commit:      0c2623cef4f49e1ef6a908a389eea86130d11057

David Wei (1):
  netdev: add io_uring memory provider info

Ian Rogers (1):
  libbpf: Add namespace for errstr making it libbpf_errstr

Jason Xing (6):
  bpf: Add networking timestamping support to bpf_get/setsockopt()
  bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback

Joe Damato (1):
  netdev-genl: Add an XSK attribute to queues

Kan Liang (1):
  perf: Extend per event callchain limit to branch stack

Mykyta Yatsenko (2):
  bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
  libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID

Song Yoong Siang (1):
  xsk: Add launch time hardware offload support to XDP Tx metadata

 include/uapi/linux/bpf.h        | 31 +++++++++++++++++++++++++++++++
 include/uapi/linux/if_xdp.h     | 10 ++++++++++
 include/uapi/linux/netdev.h     | 16 ++++++++++++++++
 include/uapi/linux/perf_event.h |  2 ++
 src/bpf.c                       |  3 ++-
 src/bpf.h                       |  3 ++-
 src/btf.c                       | 15 +++++++++++++--
 src/libbpf.c                    | 10 +++++-----
 src/libbpf_internal.h           |  1 +
 src/str_error.c                 |  2 +-
 src/str_error.h                 |  7 +++++--
 11 files changed, 88 insertions(+), 12 deletions(-)

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
2025-04-02 14:24:25 -07:00
Ian Rogers
2a228c7885 libbpf: Add namespace for errstr making it libbpf_errstr
When statically linking symbols can be replaced with those from other
statically linked libraries depending on the link order and the hoped
for "multiple definition" error may not appear. To avoid conflicts it
is good practice to namespace symbols, this change renames errstr to
libbpf_errstr. To avoid churn a #define is used to turn use of
errstr(err) to libbpf_errstr(err).

Fixes: 1633a83bf993 ("libbpf: Introduce errstr() for stringifying errno")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250320222439.1350187-1-irogers@google.com
2025-04-02 14:24:25 -07:00
Mykyta Yatsenko
90844e28dc libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID
Pass BPF token from bpf_program__set_attach_target to
BPF_BTF_GET_FD_BY_ID bpf command.
When freplace program attaches to target program, it needs to look up
for BTF of the target, this may require BPF token, if, for example,
running from user namespace.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250317174039.161275-4-mykyta.yatsenko5@gmail.com
2025-04-02 14:24:25 -07:00
Mykyta Yatsenko
009a8cb452 bpf: BPF token support for BPF_BTF_GET_FD_BY_ID
Currently BPF_BTF_GET_FD_BY_ID requires CAP_SYS_ADMIN, which does not
allow running it from user namespace. This creates a problem when
freplace program running from user namespace needs to query target
program BTF.
This patch relaxes capable check from CAP_SYS_ADMIN to CAP_BPF and adds
support for BPF token that can be passed in attributes to syscall.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250317174039.161275-2-mykyta.yatsenko5@gmail.com
2025-04-02 14:24:25 -07:00
Song Yoong Siang
89cad6a160 xsk: Add launch time hardware offload support to XDP Tx metadata
Extend the XDP Tx metadata framework so that user can requests launch time
hardware offload, where the Ethernet device will schedule the packet for
transmission at a pre-determined time called launch time. The value of
launch time is communicated from user space to Ethernet driver via
launch_time field of struct xsk_tx_metadata.

Suggested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250216093430.957880-2-yoong.siang.song@intel.com
2025-04-02 14:24:25 -07:00
Jason Xing
5cbd13ee02 bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback
This patch introduces a new callback in tcp_tx_timestamp() to correlate
tcp_sendmsg timestamp with timestamps from other tx timestamping
callbacks (e.g., SND/SW/ACK).

Without this patch, BPF program wouldn't know which timestamps belong
to which flow because of no socket lock protection. This new callback
is inserted in tcp_tx_timestamp() to address this issue because
tcp_tx_timestamp() still owns the same socket lock with
tcp_sendmsg_locked() in the meanwhile tcp_tx_timestamp() initializes
the timestamping related fields for the skb, especially tskey. The
tskey is the bridge to do the correlation.

For TCP, BPF program hooks the beginning of tcp_sendmsg_locked() and
then stores the sendmsg timestamp at the bpf_sk_storage, correlating
this timestamp with its tskey that are later used in other sending
timestamping callbacks.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-11-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Jason Xing
79e19bb62b bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback
Support the ACK case for bpf timestamping.

Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_ACK_CB. This
callback will occur at the same timestamping point as the user
space's SCM_TSTAMP_ACK. The BPF program can use it to get the
same SCM_TSTAMP_ACK timestamp without modifying the user-space
application.

This patch extends txstamp_ack to two bits: 1 stands for
SO_TIMESTAMPING mode, 2 bpf extension.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-10-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Jason Xing
253b5ce758 bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback
Support hw SCM_TSTAMP_SND case for bpf timestamping.

Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_HW_CB. This
callback will occur at the same timestamping point as the user
space's hardware SCM_TSTAMP_SND. The BPF program can use it to
get the same SCM_TSTAMP_SND timestamp without modifying the
user-space application.

To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP
with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers
from driver side using SKBTX_HW_TSTAMP. The new definition of
SKBTX_HW_TSTAMP means the combination tests of socket timestamping
and bpf timestamping. After this patch, drivers can work under the
bpf timestamping.

Considering some drivers don't assign the skb with hardware
timestamp, this patch does the assignment and then BPF program
can acquire the hwstamp from skb directly.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-9-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Jason Xing
d855493df1 bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback
Support sw SCM_TSTAMP_SND case for bpf timestamping.

Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_SW_CB. This
callback will occur at the same timestamping point as the user
space's software SCM_TSTAMP_SND. The BPF program can use it to
get the same SCM_TSTAMP_SND timestamp without modifying the
user-space application.

Based on this patch, BPF program will get the software
timestamp when the driver is ready to send the skb. In the
sebsequent patch, the hardware timestamp will be supported.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-8-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Jason Xing
7ea10cfba8 bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback
Support SCM_TSTAMP_SCHED case for bpf timestamping.

Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SCHED_CB. This
callback will occur at the same timestamping point as the user
space's SCM_TSTAMP_SCHED. The BPF program can use it to get the
same SCM_TSTAMP_SCHED timestamp without modifying the user-space
application.

A new SKBTX_BPF flag is added to mark skb_shinfo(skb)->tx_flags,
ensuring that the new BPF timestamping and the current user
space's SO_TIMESTAMPING do not interfere with each other.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-7-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Jason Xing
43b6c2cd70 bpf: Add networking timestamping support to bpf_get/setsockopt()
The new SK_BPF_CB_FLAGS and new SK_BPF_CB_TX_TIMESTAMPING are
added to bpf_get/setsockopt. The later patches will implement the
BPF networking timestamping. The BPF program will use
bpf_setsockopt(SK_BPF_CB_FLAGS, SK_BPF_CB_TX_TIMESTAMPING) to
enable the BPF networking timestamping on a socket.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250220072940.99994-2-kerneljasonxing@gmail.com
2025-04-02 14:24:25 -07:00
Joe Damato
b1cb441916 netdev-genl: Add an XSK attribute to queues
Expose a new per-queue nest attribute, xsk, which will be present for
queues that are being used for AF_XDP. If the queue is not being used for
AF_XDP, the nest will not be present.

In the future, this attribute can be extended to include more data about
XSK as it is needed.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250214211255.14194-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-02 14:24:25 -07:00
David Wei
01500813ad netdev: add io_uring memory provider info
Add a nested attribute for io_uring memory provider info. For now it is
empty and its presence indicates that a particular page pool or queue
has an io_uring memory provider attached.

$ ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get
[{'id': 80,
  'ifindex': 2,
  'inflight': 64,
  'inflight-mem': 262144,
  'napi-id': 525},
 {'id': 79,
  'ifindex': 2,
  'inflight': 320,
  'inflight-mem': 1310720,
  'io_uring': {},
  'napi-id': 525},
...

$ ./cli.py --spec netlink/specs/netdev.yaml --dump queue-get
[{'id': 0, 'ifindex': 1, 'type': 'rx'},
 {'id': 0, 'ifindex': 1, 'type': 'tx'},
 {'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'rx'},
 {'id': 1, 'ifindex': 2, 'napi-id': 514, 'type': 'rx'},
...
 {'id': 12, 'ifindex': 2, 'io_uring': {}, 'napi-id': 525, 'type': 'rx'},
...

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250204215622.695511-6-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-02 14:24:25 -07:00
Kan Liang
59171f49e9 perf: Extend per event callchain limit to branch stack
The commit 97c79a38cd45 ("perf core: Per event callchain limit")
introduced a per-event term to allow finer tuning of the depth of
callchains to save space.

It should be applied to the branch stack as well. For example, autoFDO
collections require maximum LBR entries. In the meantime, other
system-wide LBR users may only be interested in the latest a few number
of LBRs. A per-event LBR depth would save the perf output buffer.

The patch simply drops the uninterested branches, but HW still collects
the maximum branches. There may be a model-specific optimization that
can reduce the HW depth for some cases to reduce the overhead further.
But it isn't included in the patch set. Because it's not useful for all
cases. For example, ARCH LBR can utilize the PEBS and XSAVE to collect
LBRs. The depth should have less impact on the collecting overhead.
The model-specific optimization may be implemented later separately.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250310181536.3645382-1-kan.liang@linux.intel.com
2025-04-02 14:24:25 -07:00
Ihor Solodrai
1b8768339f ci: add temporary patches for selftests
* https://lore.kernel.org/all/20250327185528.1740787-1-song@kernel.org/
* https://lore.kernel.org/bpf/20250328193124.808784-1-song@kernel.org/
* https://lore.kernel.org/bpf/20250331033828.365077-1-yonghong.song@linux.dev/

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-04-02 10:23:23 -07:00
Ihor Solodrai
374036c9f1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   239860828f8660e2be487e2fbdae2640cce3fd67
Checkpoint bpf-next commit: 79d93c8ff35855d3283ee7d82dfe0c54f90b9986
Baseline bpf commit:        319fc77f8f45a1b3dba15b0cc1a869778fd222f7
Checkpoint bpf commit:      6ccf6adb05d0fe3dbb1a77ab90bf054da8a2198d

Ihor Solodrai (1):
  libbpf: Implement bpf_usdt_arg_size BPF function

Mykyta Yatsenko (3):
  libbpf: Use map_is_created helper in map setters
  libbpf: Introduce more granular state for bpf_object
  libbpf: Split bpf object load into prepare/load

Nandakumar Edamana (1):
  libbpf: Fix out-of-bound read

Peilin Ye (1):
  bpf: Introduce load-acquire and store-release instructions

Yonghong Song (1):
  bpf: Allow pre-ordering for bpf cgroup progs

 include/uapi/linux/bpf.h |   4 +
 src/libbpf.c             | 201 ++++++++++++++++++++++++++-------------
 src/libbpf.h             |  13 +++
 src/libbpf.map           |   1 +
 src/usdt.bpf.h           |  32 +++++++
 5 files changed, 183 insertions(+), 68 deletions(-)

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Peilin Ye
bf62e0dcfd bpf: Introduce load-acquire and store-release instructions
Introduce BPF instructions with load-acquire and store-release
semantics, as discussed in [1].  Define 2 new flags:

  #define BPF_LOAD_ACQ    0x100
  #define BPF_STORE_REL   0x110

A "load-acquire" is a BPF_STX | BPF_ATOMIC instruction with the 'imm'
field set to BPF_LOAD_ACQ (0x100).

Similarly, a "store-release" is a BPF_STX | BPF_ATOMIC instruction with
the 'imm' field set to BPF_STORE_REL (0x110).

Unlike existing atomic read-modify-write operations that only support
BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and
store-releases also support BPF_B (8-bit) and BPF_H (16-bit).  As an
exception, however, 64-bit load-acquires/store-releases are not
supported on 32-bit architectures (to fix a build error reported by the
kernel test robot).

An 8- or 16-bit load-acquire zero-extends the value before writing it to
a 32-bit register, just like ARM64 instruction LDARH and friends.

Similar to existing atomic read-modify-write operations, misaligned
load-acquires/store-releases are not allowed (even if
BPF_F_ANY_ALIGNMENT is set).

As an example, consider the following 64-bit load-acquire BPF
instruction (assuming little-endian):

  db 10 00 00 00 01 00 00  r0 = load_acquire((u64 *)(r1 + 0x0))

  opcode (0xdb): BPF_ATOMIC | BPF_DW | BPF_STX
  imm (0x00000100): BPF_LOAD_ACQ

Similarly, a 16-bit BPF store-release:

  cb 21 00 00 10 01 00 00  store_release((u16 *)(r1 + 0x0), w2)

  opcode (0xcb): BPF_ATOMIC | BPF_H | BPF_STX
  imm (0x00000110): BPF_STORE_REL

In arch/{arm64,s390,x86}/net/bpf_jit_comp.c, have
bpf_jit_supports_insn(..., /*in_arena=*/true) return false for the new
instructions, until the corresponding JIT compiler supports them in
arena.

[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Peilin Ye <yepeilin@google.com>
Link: https://lore.kernel.org/r/a217f46f0e445fbd573a1a024be5c6bf1d5fe716.1741049567.git.yepeilin@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Mykyta Yatsenko
855a5d7904 libbpf: Split bpf object load into prepare/load
Introduce bpf_object__prepare API: additional intermediate preparation
step that performs ELF processing, relocations, prepares final state of
BPF program instructions (accessible with bpf_program__insns()), creates
and (potentially) pins maps, and stops short of loading BPF programs.

We anticipate few use cases for this API, such as:
* Use prepare to initialize bpf_token, without loading freplace
programs, unlocking possibility to lookup BTF of other programs.
* Execute prepare to obtain finalized BPF program instructions without
loading programs, enabling tools like veristat to process one program at
a time, without incurring cost of ELF parsing and processing.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-4-mykyta.yatsenko5@gmail.com
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Mykyta Yatsenko
16c58c33c8 libbpf: Introduce more granular state for bpf_object
We are going to split bpf_object loading into 2 stages: preparation and
loading. This will increase flexibility when working with bpf_object
and unlock some optimizations and use cases.
This patch substitutes a boolean flag (loaded) by more finely-grained
state for bpf_object.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Mykyta Yatsenko
e14bb3629f libbpf: Use map_is_created helper in map setters
Refactoring: use map_is_created helper in map setters that need to check
the state of the map. This helps to reduce the number of the places that
depend explicitly on the loaded flag, simplifying refactoring in the
next patch of this set.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250303135752.158343-2-mykyta.yatsenko5@gmail.com
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Yonghong Song
fbda5d7d2f bpf: Allow pre-ordering for bpf cgroup progs
Currently for bpf progs in a cgroup hierarchy, the effective prog array
is computed from bottom cgroup to upper cgroups (post-ordering). For
example, the following cgroup hierarchy
    root cgroup: p1, p2
        subcgroup: p3, p4
have BPF_F_ALLOW_MULTI for both cgroup levels.
The effective cgroup array ordering looks like
    p3 p4 p1 p2
and at run time, progs will execute based on that order.

But in some cases, it is desirable to have root prog executes earlier than
children progs (pre-ordering). For example,
  - prog p1 intends to collect original pkt dest addresses.
  - prog p3 will modify original pkt dest addresses to a proxy address for
    security reason.
The end result is that prog p1 gets proxy address which is not what it
wants. Putting p1 to every child cgroup is not desirable either as it
will duplicate itself in many child cgroups. And this is exactly a use case
we are encountering in Meta.

To fix this issue, let us introduce a flag BPF_F_PREORDER. If the flag
is specified at attachment time, the prog has higher priority and the
ordering with that flag will be from top to bottom (pre-ordering).
For example, in the above example,
    root cgroup: p1, p2
        subcgroup: p3, p4
Let us say p2 and p4 are marked with BPF_F_PREORDER. The final
effective array ordering will be
    p2 p4 p3 p1

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250224230116.283071-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Ihor Solodrai
be18fdb16a libbpf: Implement bpf_usdt_arg_size BPF function
Information about USDT argument size is implicitly stored in
__bpf_usdt_arg_spec, but currently it's not accessbile to BPF programs
that use USDT.

Implement bpf_sdt_arg_size() that returns the size of an USDT argument
in bytes.

v1->v2:
  * do not add __bpf_usdt_arg_spec() helper

v1: https://lore.kernel.org/bpf/20250220215904.3362709-1-ihor.solodrai@linux.dev/

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20250224235756.2612606-1-ihor.solodrai@linux.dev
2025-03-10 15:35:17 -07:00
Nandakumar Edamana
82f60c9b5e libbpf: Fix out-of-bound read
In `set_kcfg_value_str`, an untrusted string is accessed with the assumption
that it will be at least two characters long due to the presence of checks for
opening and closing quotes. But the check for the closing quote
(value[len - 1] != '"') misses the fact that it could be checking the opening
quote itself in case of an invalid input that consists of just the opening
quote.

This commit adds an explicit check to make sure the string is at least two
characters long.

Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250221210110.3182084-1-nandakumar@nandakumar.co.in
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-03-10 15:35:17 -07:00
Andrii Nakryiko
4c893341f5 Makefile: detect pkg-config availability
Detect whether build system has pkg-config tool, and if not, fallback to
manually specifying -lelf -lz as dependency.

Closes: https://github.com/libbpf/libbpf/issues/885
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-03-03 18:59:47 -08:00
Ihor Solodrai
42a6ef6316 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   01f3ce5328c405179b2c69ea047c423dad2bfa6d
Checkpoint bpf-next commit: 239860828f8660e2be487e2fbdae2640cce3fd67
Baseline bpf commit:        c45323b7560ec87c37c729b703c86ee65f136d75
Checkpoint bpf commit:      319fc77f8f45a1b3dba15b0cc1a869778fd222f7

Andrii Nakryiko (2):
  libbpf: fix LDX/STX/ST CO-RE relocation size adjustment logic
  libbpf: Fix hypothetical STT_SECTION extern NULL deref case

Daniel Borkmann (1):
  netkit: Allow for configuring needed_{head,tail}room

Ihor Solodrai (3):
  libbpf: Introduce kflag for type_tags and decl_tags in BTF
  docs/bpf: Document the semantics of BTF tags with kind_flag
  libbpf: Check the kflag of type tags in btf_dump

Tao Chen (1):
  libbpf: Wrap libbpf API direct err with libbpf_err

Tony Ambardar (1):
  libbpf: Fix accessing BTF.ext core_relo header

Yonghong Song (1):
  bpf: Sync uapi bpf.h header for the tooling infra

 include/uapi/linux/bpf.h     |  5 +-
 include/uapi/linux/btf.h     |  3 +-
 include/uapi/linux/if_link.h |  2 +
 src/btf.c                    | 90 ++++++++++++++++++++++++++----------
 src/btf.h                    |  3 ++
 src/btf_dump.c               |  5 +-
 src/libbpf.c                 | 26 +++++------
 src/libbpf.map               |  2 +
 src/linker.c                 |  2 +-
 src/relo_core.c              | 24 ++++++++--
 10 files changed, 116 insertions(+), 46 deletions(-)

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Andrii Nakryiko
041d5948f3 libbpf: Fix hypothetical STT_SECTION extern NULL deref case
Fix theoretical NULL dereference in linker when resolving *extern*
STT_SECTION symbol against not-yet-existing ELF section. Not sure if
it's possible in practice for valid ELF object files (this would require
embedded assembly manipulations, at which point BTF will be missing),
but fix the s/dst_sym/dst_sec/ typo guarding this condition anyways.

Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250220002821.834400-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Tao Chen
39a589c74e libbpf: Wrap libbpf API direct err with libbpf_err
Just wrap the direct err with libbpf_err, keep consistency
with other APIs.

Signed-off-by: Tao Chen <chen.dylane@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20250219153711.29651-1-chen.dylane@linux.dev
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Andrii Nakryiko
d7a4ab1548 libbpf: fix LDX/STX/ST CO-RE relocation size adjustment logic
Libbpf has a somewhat obscure feature of automatically adjusting the
"size" of LDX/STX/ST instruction (memory store and load instructions),
based on originally recorded access size (u8, u16, u32, or u64) and the
actual size of the field on target kernel. This is meant to facilitate
using BPF CO-RE on 32-bit architectures (pointers are always 64-bit in
BPF, but host kernel's BTF will have it as 32-bit type), as well as
generally supporting safe type changes (unsigned integer type changes
can be transparently "relocated").

One issue that surfaced only now, 5 years after this logic was
implemented, is how this all works when dealing with fields that are
arrays. This isn't all that easy and straightforward to hit (see
selftests that reproduce this condition), but one of sched_ext BPF
programs did hit it with innocent looking loop.

Long story short, libbpf used to calculate entire array size, instead of
making sure to only calculate array's element size. But it's the element
that is loaded by LDX/STX/ST instructions (1, 2, 4, or 8 bytes), so
that's what libbpf should check. This patch adjusts the logic for
arrays and fixed the issue.

Reported-by: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250207014809.1573841-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Yonghong Song
4eed43c229 bpf: Sync uapi bpf.h header for the tooling infra
Commit 0abff462d802 ("bpf: Add comment about helper freeze") missed the
tooling header sync. Fix it.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250213050427.2788837-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Ihor Solodrai
cc278ff7c0 libbpf: Check the kflag of type tags in btf_dump
If the kflag is set for a BTF type tag, then the tag represents an
arbitrary __attribute__. Change btf_dump accordingly.

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250130201239.1429648-4-ihor.solodrai@linux.dev
2025-02-24 15:10:59 -08:00
Ihor Solodrai
2b8b896bca docs/bpf: Document the semantics of BTF tags with kind_flag
Explain the meaning of kind_flag in BTF type_tags and decl_tags.
Update uapi btf.h kind_flag comment to reflect the changes.

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250130201239.1429648-3-ihor.solodrai@linux.dev
2025-02-24 15:10:59 -08:00
Ihor Solodrai
32bda80136 libbpf: Introduce kflag for type_tags and decl_tags in BTF
Add the following functions to libbpf API:
  * btf__add_type_attr()
  * btf__add_decl_attr()

These functions allow to add to BTF the type tags and decl tags with
info->kflag set to 1. The kflag indicates that the tag directly
encodes an __attribute__ and not a normal tag.

See Documentation/bpf/btf.rst changes in the subsequent patch for
details on the semantics.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250130201239.1429648-2-ihor.solodrai@linux.dev
2025-02-24 15:10:59 -08:00
Tony Ambardar
71208c3362 libbpf: Fix accessing BTF.ext core_relo header
Update btf_ext_parse_info() to ensure the core_relo header is present
before reading its fields. This avoids a potential buffer read overflow
reported by the OSS Fuzz project.

Fixes: cf579164e9ea ("libbpf: Support BTF.ext loading and output in either endianness")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://issues.oss-fuzz.com/issues/388905046
Link: https://lore.kernel.org/bpf/20250125065236.2603346-1-itugrok@yahoo.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Daniel Borkmann
9544a909f1 netkit: Allow for configuring needed_{head,tail}room
Allow the user to configure needed_{head,tail}room for both netkit
devices. The idea is similar to 163e529200af ("veth: implement
ndo_set_rx_headroom") with the difference that the two parameters
can be specified upon device creation. By default the current behavior
stays as is which is needed_{head,tail}room is 0.

In case of Cilium, for example, the netkit devices are not enslaved
into a bridge or openvswitch device (rather, BPF-based redirection
is used out of tcx), and as such these parameters are not propagated
into the Pod's netns via peer device.

Given Cilium can run in vxlan/geneve tunneling mode (needed_headroom)
and/or be used in combination with WireGuard (needed_{head,tail}room),
allow the Cilium CNI plugin to specify these two upon netkit device
creation.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/bpf/20241220234658.490686-1-daniel@iogearbox.net
Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-02-24 15:10:59 -08:00
Ihor Solodrai
d4a841a32b ci: remove dependency on run-on-arch-action
run-on-arch-action is simply a wrapper around docker. There is no
value in using it in libbpf, as it is not complicated to run
non-native arch docker images directly on github-hosted runners.

Docker relies on qemu-user-static installed on the system to emulate
different architectures.

Recently there were various reports about multi-arch docker builds
failing with seemingly random issues, and it appears to boil down to
qemu [1]. I stumbled on this problem while updating s390x runners [2]
for BPF CI, and setting up more recent version of qemu helped.

This change addresses recent build failures on s390x and ppc64le.

[1] https://github.com/docker/setup-qemu-action/issues/188
[2] https://github.com/kernel-patches/runner/pull/69
[3] https://docs.docker.com/build/buildkit/#getting-started

Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>
2025-01-31 22:30:52 -08:00
Ihor Solodrai
324f3c3846 ci: run coverty scan on push to master
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2025-01-17 13:53:19 -08:00
Ihor Solodrai
63528b7a4d ci: remove sourcing helpers.sh from coverity workflow
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2025-01-17 13:53:19 -08:00
Andrii Nakryiko
7abfe520df sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f44275e7155dc310d36516fc25be503da099781c
Checkpoint bpf-next commit: 01f3ce5328c405179b2c69ea047c423dad2bfa6d
Baseline bpf commit:        9d89551994a430b50c4fffcb1e617a057fa76e20
Checkpoint bpf commit:      c45323b7560ec87c37c729b703c86ee65f136d75

Andrii Nakryiko (1):
  libbpf: Work around kernel inconsistently stripping '.llvm.' suffix

Pu Lehui (2):
  libbpf: Fix return zero when elf_begin failed
  libbpf: Fix incorrect traversal end type ID when marking
    BTF_IS_EMBEDDED

Vishal Chourasia (1):
  tools: Sync if_xdp.h uapi tooling header

Yonghong Song (1):
  libbpf: Add unique_match option for multi kprobe

 include/uapi/linux/if_xdp.h |  4 ++--
 src/btf.c                   |  1 +
 src/btf_relocate.c          |  2 +-
 src/libbpf.c                | 39 +++++++++++++++++++++++++++++++++++--
 src/libbpf.h                |  4 +++-
 5 files changed, 44 insertions(+), 6 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2025-01-17 12:31:44 -08:00
Vishal Chourasia
d76c770473 tools: Sync if_xdp.h uapi tooling header
Sync if_xdp.h uapi header to remove following warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h'
  differs from latest version at 'include/uapi/linux/if_xdp.h'

Fixes: 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support")
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250115032248.125742-1-yoong.siang.song@intel.com
2025-01-17 12:31:44 -08:00
Andrii Nakryiko
444f3c0e7a libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
Some versions of kernel were stripping out '.llvm.<hash>' suffix from
kerne symbols (produced by Clang LTO compilation) from function names
reported in available_filter_functions, while kallsyms reported full
original name. This confuses libbpf's multi-kprobe logic of finding all
matching kernel functions for specified user glob pattern by joining
available_filter_functions and kallsyms contents, because joining by
full symbol name won't work for symbols containing '.llvm.<hash>' suffix.

This was eventually fixed by [0] in the kernel, but we'd like to not
regress multi-kprobe experience and add a work around for this bug on
libbpf side, stripping kallsym's name if it matches user pattern and
contains '.llvm.' suffix.

  [0] fb6a421fb615 ("kallsyms: Match symbols exactly with CONFIG_LTO_CLANG")

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20250117003957.179331-1-andrii@kernel.org
2025-01-17 12:31:44 -08:00
Pu Lehui
719aeb7a6e libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
When redirecting the split BTF to the vmlinux base BTF, we need to mark
the distilled base struct/union members of split BTF structs/unions in
id_map with BTF_IS_EMBEDDED. This indicates that these types must match
both name and size later. Therefore, we need to traverse the entire
split BTF, which involves traversing type IDs from nr_dist_base_types to
nr_types. However, the current implementation uses an incorrect
traversal end type ID, so let's correct it.

Fixes: 19e00c897d50 ("libbpf: Split BTF relocation")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250115100241.4171581-3-pulehui@huaweicloud.com
2025-01-17 12:31:44 -08:00
Pu Lehui
a7edf4aec8 libbpf: Fix return zero when elf_begin failed
The error number of elf_begin is omitted when encapsulating the
btf_find_elf_sections function.

Fixes: c86f180ffc99 ("libbpf: Make btf_parse_elf process .BTF.base transparently")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250115100241.4171581-2-pulehui@huaweicloud.com
2025-01-17 12:31:44 -08:00
Yonghong Song
32792ec66c libbpf: Add unique_match option for multi kprobe
Jordan reported an issue in Meta production environment where func
try_to_wake_up() is renamed to try_to_wake_up.llvm.<hash>() by clang
compiler at lto mode. The original 'kprobe/try_to_wake_up' does not
work any more since try_to_wake_up() does not match the actual func
name in /proc/kallsyms.

There are a couple of ways to resolve this issue. For example, in
attach_kprobe(), we could do lookup in /proc/kallsyms so try_to_wake_up()
can be replaced by try_to_wake_up.llvm.<hach>(). Or we can force users
to use bpf_program__attach_kprobe() where they need to lookup
/proc/kallsyms to find out try_to_wake_up.llvm.<hach>(). But these two
approaches requires extra work by either libbpf or user.

Luckily, suggested by Andrii, multi kprobe already supports wildcard ('*')
for symbol matching. In the above example, 'try_to_wake_up*' can match
to try_to_wake_up() or try_to_wake_up.llvm.<hash>() and this allows
bpf prog works for different kernels as some kernels may have
try_to_wake_up() and some others may have try_to_wake_up.llvm.<hash>().

The original intention is to kprobe try_to_wake_up() only, so an optional
field unique_match is added to struct bpf_kprobe_multi_opts. If the
field is set to true, the number of matched functions must be one.
Otherwise, the attachment will fail. In the above case, multi kprobe
with 'try_to_wake_up*' and unique_match preserves user functionality.

Reported-by: Jordan Rome <linux@jordanrome.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250109174023.3368432-1-yonghong.song@linux.dev
2025-01-17 12:31:44 -08:00
Ihor Solodrai
c924f8d3dd ci: sync with libbpf/ci@v3
* vmtest.yml
  * use v3 of libbpf/ci actions
  * remove unnecessary selftests preparation steps
* ci/vmtest
  * remove unnecessary scripts and configs
  * add libbpf-specific run-vmtest.env [1]

[1] https://github.com/libbpf/ci/pull/166

Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2025-01-15 19:30:48 -08:00
Daniel Müller
0ff2f8e0ee sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a1087da9d11e5bcacc706002bc0f84b790881f69
Checkpoint bpf-next commit: f44275e7155dc310d36516fc25be503da099781c
Baseline bpf commit:        fb86c42a2a5d44e849ddfbc98b8d2f4f40d36ee3
Checkpoint bpf commit:      9d89551994a430b50c4fffcb1e617a057fa76e20

Adrian Hunter (1):
  perf/core: Add aux_pause, aux_resume, aux_start_paused

Alastair Robertson (2):
  libbpf: Pull file-opening logic up to top-level functions
  libbpf: Extend linker API to support in-memory ELF files

Andrii Nakryiko (1):
  libbpf: don't adjust USDT semaphore address if .stapsdt.base addr is
    missing

Anton Protopopov (2):
  bpf: Add fd_array_cnt attribute for prog_load
  libbpf: prog load: Allow to use fd_array_cnt

Ben Olson (1):
  libbpf: Improve debug message when the base BTF cannot be found

Daniel Borkmann (1):
  tools: Sync if_link.h uapi tooling header

Daniel Xu (1):
  libbpf: Set MFD_NOEXEC_SEAL when creating memfd

Eric Dumazet (1):
  net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute

Jiri Olsa (1):
  libbpf: Fix memory leak in bpf_program__attach_uprobe_multi

Joe Damato (3):
  netdev-genl: Dump napi_defer_hard_irqs
  netdev-genl: Dump gro_flush_timeout
  netdev-genl: Support setting per-NAPI config values

Martin Karsten (1):
  net: Add napi_struct parameter irq_suspend_timeout

Quentin Monnet (1):
  libbpf: Fix segfault due to libelf functions not setting errno

Sidong Yang (1):
  libbpf: Change hash_combine parameters from long to unsigned long

 include/uapi/linux/bpf.h        |  10 +
 include/uapi/linux/if_link.h    | 554 +++++++++++++++++++++++++++++++-
 include/uapi/linux/netdev.h     |   4 +
 include/uapi/linux/perf_event.h |  11 +-
 src/bpf.c                       |   3 +-
 src/bpf.h                       |   5 +-
 src/btf.c                       |   4 +-
 src/libbpf.c                    |  25 +-
 src/libbpf.h                    |   5 +
 src/libbpf.map                  |   4 +
 src/linker.c                    | 248 ++++++++++----
 src/usdt.c                      |   2 +-
 12 files changed, 794 insertions(+), 81 deletions(-)

Signed-off-by: Daniel Müller <deso@posteo.net>
2025-01-08 14:58:04 -08:00
Daniel Xu
f468e83c85 libbpf: Set MFD_NOEXEC_SEAL when creating memfd
Starting from 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and
MFD_EXEC") and until 1717449b4417 ("memfd: drop warning for missing
exec-related flags"), the kernel would print a warning if neither
MFD_NOEXEC_SEAL nor MFD_EXEC is set in memfd_create().

If libbpf runs on on a kernel between these two commits (eg. on an
improperly backported system), it'll trigger this warning.

To avoid this warning (and also be more secure), explicitly set
MFD_NOEXEC_SEAL. But since libbpf can be run on potentially very old
kernels, leave a fallback for kernels without MFD_NOEXEC_SEAL support.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/6e62c2421ad7eb1da49cbf16da95aaaa7f94d394.1735594195.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-01-08 14:58:04 -08:00
Anton Protopopov
48c771c4ce libbpf: prog load: Allow to use fd_array_cnt
Add new fd_array_cnt field to bpf_prog_load_opts
and pass it in bpf_attr, if set.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241213130934.1087929-6-aspsk@isovalent.com
2025-01-08 14:58:04 -08:00
Anton Protopopov
266da73237 bpf: Add fd_array_cnt attribute for prog_load
The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
of file descriptors: maps or btfs. This field was introduced as a
sparse array. Introduce a new attribute, fd_array_cnt, which, if
present, indicates that the fd_array is a continuous array of the
corresponding length.

If fd_array_cnt is non-zero, then every map in the fd_array will be
bound to the program, as if it was used by the program. This
functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
maps can be used by the verifier during the program load.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241213130934.1087929-5-aspsk@isovalent.com
2025-01-08 14:58:04 -08:00
Alastair Robertson
d2f1f4490b libbpf: Extend linker API to support in-memory ELF files
The new_fd and add_fd functions correspond to the original new and
add_file functions, but accept an FD instead of a file name. This
gives API consumers the option of using anonymous files/memfds to
avoid writing ELFs to disk.

This new API will be useful for performing linking as part of
bpftrace's JIT compilation.

The add_buf function is a convenience wrapper that does the work of
creating a memfd for the caller.

Signed-off-by: Alastair Robertson <ajor@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241211164030.573042-3-ajor@meta.com
2025-01-08 14:58:04 -08:00
Alastair Robertson
f00fad0951 libbpf: Pull file-opening logic up to top-level functions
Move the filename arguments and file-descriptor handling from
init_output_elf() and linker_load_obj_file() and instead handle them
at the top-level in bpf_linker__new() and bpf_linker__add_file().

This will allow the inner functions to be shared with a new,
non-filename-based, API in the next commit.

Signed-off-by: Alastair Robertson <ajor@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241211164030.573042-2-ajor@meta.com
2025-01-08 14:58:04 -08:00
Quentin Monnet
984dcc97ae libbpf: Fix segfault due to libelf functions not setting errno
Libelf functions do not set errno on failure. Instead, it relies on its
internal _elf_errno value, that can be retrieved via elf_errno (or the
corresponding message via elf_errmsg()). From "man libelf":

    If a libelf function encounters an error it will set an internal
    error code that can be retrieved with elf_errno. Each thread
    maintains its own separate error code. The meaning of each error
    code can be determined with elf_errmsg, which returns a string
    describing the error.

As a consequence, libbpf should not return -errno when a function from
libelf fails, because an empty value will not be interpreted as an error
and won't prevent the program to stop. This is visible in
bpf_linker__add_file(), for example, where we call a succession of
functions that rely on libelf:

    err = err ?: linker_load_obj_file(linker, filename, opts, &obj);
    err = err ?: linker_append_sec_data(linker, &obj);
    err = err ?: linker_append_elf_syms(linker, &obj);
    err = err ?: linker_append_elf_relos(linker, &obj);
    err = err ?: linker_append_btf(linker, &obj);
    err = err ?: linker_append_btf_ext(linker, &obj);

If the object file that we try to process is not, in fact, a correct
object file, linker_load_obj_file() may fail with errno not being set,
and return 0. In this case we attempt to run linker_append_elf_sysms()
and may segfault.

This can happen (and was discovered) with bpftool:

    $ bpftool gen object output.o sample_ret0.bpf.c
    libbpf: failed to get ELF header for sample_ret0.bpf.c: invalid `Elf' handle
    zsh: segmentation fault (core dumped)  bpftool gen object output.o sample_ret0.bpf.c

Fix the issue by returning a non-null error code (-EINVAL) when libelf
functions fail.

Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Quentin Monnet <qmo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241205135942.65262-1-qmo@kernel.org
2025-01-08 14:58:04 -08:00
Ben Olson
3ed57f68e5 libbpf: Improve debug message when the base BTF cannot be found
When running `bpftool` on a kernel module installed in `/lib/modules...`,
this error is encountered if the user does not specify `--base-btf` to
point to a valid base BTF (e.g. usually in `/sys/kernel/btf/vmlinux`).
However, looking at the debug output to determine the cause of the error
simply says `Invalid BTF string section`, which does not point to the
actual source of the error. This just improves that debug message to tell
users what happened.

Signed-off-by: Ben Olson <matthew.olson@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/Z0YqzQ5lNz7obQG7@bolson-desk
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-01-08 14:58:04 -08:00
Andrii Nakryiko
69d85c5fb3 libbpf: don't adjust USDT semaphore address if .stapsdt.base addr is missing
USDT ELF note optionally can record an offset of .stapsdt.base, which is
used to make adjustments to USDT target attach address. Currently,
libbpf will do this address adjustment unconditionally if it finds
.stapsdt.base ELF section in target binary. But there is a corner case
where .stapsdt.base ELF section is present, but specific USDT note
doesn't reference it. In such case, libbpf will basically just add base
address and end up with absolutely incorrect USDT target address.

This adjustment has to be done only if both .stapsdt.sema section is
present and USDT note is recording a reference to it.

Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20241121224558.796110-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-01-08 14:58:04 -08:00
Martin Karsten
0d822312fa net: Add napi_struct parameter irq_suspend_timeout
Add a per-NAPI IRQ suspension parameter, which can be get/set with
netdev-genl.

This patch doesn't change any behavior but prepares the code for other
changes in the following commits which use irq_suspend_timeout as a
timeout for IRQ suspension.

Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Co-developed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Link: https://patch.msgid.link/20241109050245.191288-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08 14:58:04 -08:00
Daniel Borkmann
ba2ba19f6d tools: Sync if_link.h uapi tooling header
Sync if_link uapi header to the latest version as we need the refresher
in tooling for netkit device. Given it's been a while since the last sync
and the diff is fairly big, it has been done as its own commit.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20241004101335.117711-4-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-01-08 14:58:04 -08:00
Joe Damato
c9a728c329 netdev-genl: Support setting per-NAPI config values
Add support to set per-NAPI defer_hard_irqs and gro_flush_timeout.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241011184527.16393-7-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08 14:58:04 -08:00
Joe Damato
adf7973417 netdev-genl: Dump gro_flush_timeout
Support dumping gro_flush_timeout for a NAPI ID.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241011184527.16393-5-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08 14:58:04 -08:00
Joe Damato
4f5f2597ce netdev-genl: Dump napi_defer_hard_irqs
Support dumping defer_hard_irqs for a NAPI ID.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241011184527.16393-3-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08 14:58:04 -08:00
Eric Dumazet
114881acba net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute
Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.

Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.

This patch adds dev->max_pacing_offload_horizon expressing
this timing wheel horizon in nsec units.

This is a read-only attribute.

Unless a driver sets it, dev->max_pacing_offload_horizon
is zero.

v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 )
v3: added yaml doc (also per Jakub feedback)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08 14:58:04 -08:00
Sidong Yang
b1f223b5b8 libbpf: Change hash_combine parameters from long to unsigned long
The hash_combine() could be trapped when compiled with sanitizer like "zig cc"
or clang with signed-integer-overflow option. This patch parameters and return
type to unsigned long to remove the potential overflow.

Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20241116081054.65195-1-sidong.yang@furiosa.ai
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-01-08 14:58:04 -08:00
Jiri Olsa
05333afec1 libbpf: Fix memory leak in bpf_program__attach_uprobe_multi
Andrii reported memory leak detected by Coverity on error path
in bpf_program__attach_uprobe_multi. Fixing that by moving
the check earlier before the offsets allocations.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241115115843.694337-1-jolsa@kernel.org
2025-01-08 14:58:04 -08:00
Adrian Hunter
d22e0d8721 perf/core: Add aux_pause, aux_resume, aux_start_paused
Hardware traces, such as instruction traces, can produce a vast amount of
trace data, so being able to reduce tracing to more specific circumstances
can be useful.

The ability to pause or resume tracing when another event happens, can do
that.

Add ability for an event to "pause" or "resume" AUX area tracing.

Add aux_pause bit to perf_event_attr to indicate that, if the event
happens, the associated AUX area tracing should be paused. Ditto
aux_resume. Do not allow aux_pause and aux_resume to be set together.

Add aux_start_paused bit to perf_event_attr to indicate to an AUX area
event that it should start in a "paused" state.

Add aux_paused to struct hw_perf_event for AUX area events to keep track of
the "paused" state. aux_paused is initialized to aux_start_paused.

Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start()
callbacks. Call as needed, during __perf_event_output(). Add
aux_in_pause_resume to struct perf_buffer to prevent races with the NMI
handler. Pause/resume in NMI context will miss out if it coincides with
another pause/resume.

To use aux_pause or aux_resume, an event must be in a group with the AUX
area event as the group leader.

Example (requires Intel PT and tools patches also):

 $ perf record --kcore -e intel_pt/aux-action=start-paused/k,syscalls:sys_enter_newuname/aux-action=resume/,syscalls:sys_exit_newuname/aux-action=pause/ uname
 Linux
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.043 MB perf.data ]
 $ perf script --call-trace
 uname   30805 [000] 24001.058782799: name: 0x7ffc9c1865b0
 uname   30805 [000] 24001.058784424:  psb offs: 0
 uname   30805 [000] 24001.058784424:  cbr: 39 freq: 3904 MHz (139%)
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])        debug_smp_processor_id
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])        __x64_sys_newuname
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])            down_read
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])                __cond_resched
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])                preempt_count_add
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])                    in_lock_functions
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])                preempt_count_sub
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])            up_read
 uname   30805 [000] 24001.058784629: ([kernel.kallsyms])                preempt_count_add
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])                    in_lock_functions
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])                preempt_count_sub
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])            _copy_to_user
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])        syscall_exit_to_user_mode
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])            syscall_exit_work
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])                perf_syscall_exit
 uname   30805 [000] 24001.058784838: ([kernel.kallsyms])                    debug_smp_processor_id
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                    perf_trace_buf_alloc
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                        perf_swevent_get_recursion_context
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                            debug_smp_processor_id
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                        debug_smp_processor_id
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                    perf_tp_event
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                        perf_trace_buf_update
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                            tracing_gen_ctx_irq_test
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                        perf_swevent_event
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                            __perf_event_account_interrupt
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                                __this_cpu_preempt_check
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                            perf_event_output_forward
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                                perf_event_aux_pause
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                                    ring_buffer_get
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                                        __rcu_read_lock
 uname   30805 [000] 24001.058785046: ([kernel.kallsyms])                                        __rcu_read_unlock
 uname   30805 [000] 24001.058785254: ([kernel.kallsyms])                                    pt_event_stop
 uname   30805 [000] 24001.058785254: ([kernel.kallsyms])                                        debug_smp_processor_id
 uname   30805 [000] 24001.058785254: ([kernel.kallsyms])                                        debug_smp_processor_id
 uname   30805 [000] 24001.058785254: ([kernel.kallsyms])                                        native_write_msr
 uname   30805 [000] 24001.058785463: ([kernel.kallsyms])                                        native_write_msr
 uname   30805 [000] 24001.058785639: 0x0

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Clark <james.clark@arm.com>
Link: https://lkml.kernel.org/r/20241022155920.17511-3-adrian.hunter@intel.com
2025-01-08 14:58:04 -08:00
Ihor Solodrai
c5f22aca0f ci: remove llvm-17 variant of the workflow
Also try prettifying the job names.

Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-20 10:23:35 -08:00
Ihor Solodrai
bfc9770b24 ci: switch to libbpf/ci actions @v2
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
cd73a17321 ci: configure CI test jobs
* Don't run pahole@tmp.master + llvm-17 combination.
* Use descriptive name of for vmtest jobs
* Don't run test_progs_cpuv4 when LLVM_VERSION < 18 (same as on BPF CI)
* Add some logging to prepare-selftests-run.sh

Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
39e4e86263 ci: cleanup now unused local actions and workflows
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
c1f8925561 ci: bump llvm version to 18
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
c7bf7b8977 ci: update temporary kernel patches
Remove old patches applied to kernel source for CI. They haven't been
applied in a while.

Add a fix for token/obj_priv_implicit_token_envvar

Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
e0687f9f54 ci: add vmtest as a reusable workflow
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
dcf6ad6c70 ci: switch to libbpf/ci/build-selftests action
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
a453ffb7ea ci: add a vmtest step for setting up selftests run
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Ihor Solodrai
779cb2b65b ci: use libbpf/ci/run-vmtest action to run selftests
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-11-18 22:00:09 -08:00
Andrii Nakryiko
244485ce72 ci: don't fail test kmodule builds on older kernels
We are now getting:

  WARNING: Module.symvers is missing.
           Modules may not have dependencies or modversions.
           You may get many unresolved symbol errors.
           You can set KBUILD_MODPOST_WARN=1 to turn errors into warning
           if you want to proceed at your own risk.

So let's set KBUILD_MODPOST_WARN=1.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-13 19:25:37 -08:00
Andrii Nakryiko
713f5b0779 libbpf: bump version to v1.6.0 for new dev cycle
We are now in v1.6.0 dev cycles, reflect that in Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-13 19:25:37 -08:00
Andrii Nakryiko
6c25f7dcb5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c6fb8030b4baa01c850f99fc6da051b1017edc46
Checkpoint bpf-next commit: a1087da9d11e5bcacc706002bc0f84b790881f69
Baseline bpf commit:        d5fb316e2af1d947f0f6c3666e373a54d9f27c6f
Checkpoint bpf commit:      fb86c42a2a5d44e849ddfbc98b8d2f4f40d36ee3

Andrii Nakryiko (1):
  libbpf: start v1.6 development cycle

Jiri Olsa (2):
  bpf: Add support for uprobe multi session attach
  libbpf: Add support for uprobe multi session attach

Mykyta Yatsenko (4):
  libbpf: Introduce errstr() for stringifying errno
  libbpf: Stringify errno in log messages in libbpf.c
  libbpf: Stringify errno in log messages in btf*.c
  libbpf: Stringify errno in log messages in the remaining code

 include/uapi/linux/bpf.h |   1 +
 src/bpf.c                |   1 +
 src/btf.c                |  26 +--
 src/btf_dump.c           |   3 +-
 src/elf.c                |   4 +-
 src/features.c           |  15 +-
 src/gen_loader.c         |   3 +-
 src/libbpf.c             | 375 ++++++++++++++++++---------------------
 src/libbpf.h             |   4 +-
 src/libbpf.map           |   3 +
 src/libbpf_version.h     |   2 +-
 src/linker.c             |  21 ++-
 src/ringbuf.c            |  34 ++--
 src/str_error.c          |  71 ++++++++
 src/str_error.h          |   7 +
 src/usdt.c               |  32 ++--
 16 files changed, 332 insertions(+), 270 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-13 19:25:37 -08:00
Andrii Nakryiko
8576598d64 sync: update .mailmap
Update .mailmap based on libbpf's list of contributors and on the latest
.mailmap version in the upstream repository.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-13 19:25:37 -08:00
Mykyta Yatsenko
db73e46709 libbpf: Stringify errno in log messages in the remaining code
Convert numeric error codes into the string representations in log
messages in the rest of libbpf source files.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241111212919.368971-5-mykyta.yatsenko5@gmail.com
2024-11-13 19:25:37 -08:00
Mykyta Yatsenko
da7d63a690 libbpf: Stringify errno in log messages in btf*.c
Convert numeric error codes into the string representations in log
messages in btf.c and btf_dump.c.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241111212919.368971-4-mykyta.yatsenko5@gmail.com
2024-11-13 19:25:37 -08:00
Mykyta Yatsenko
6d8af6175e libbpf: Stringify errno in log messages in libbpf.c
Convert numeric error codes into the string representations in log
messages in libbpf.c.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241111212919.368971-3-mykyta.yatsenko5@gmail.com
2024-11-13 19:25:37 -08:00
Mykyta Yatsenko
8232da46d6 libbpf: Introduce errstr() for stringifying errno
Add function errstr(int err) that allows converting numeric error codes
into string representations.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241111212919.368971-2-mykyta.yatsenko5@gmail.com
2024-11-13 19:25:37 -08:00
Jiri Olsa
c975e02612 libbpf: Add support for uprobe multi session attach
Adding support to attach program in uprobe session mode
with bpf_program__attach_uprobe_multi function.

Adding session bool to bpf_uprobe_multi_opts struct that allows
to load and attach the bpf program via uprobe session.
the attachment to create uprobe multi session.

Also adding new program loader section that allows:
  SEC("uprobe.session/bpf_fentry_test*")

and loads/attaches uprobe program as uprobe session.

Adding sleepable hook (uprobe.session.s) as well.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241108134544.480660-6-jolsa@kernel.org
2024-11-13 19:25:37 -08:00
Jiri Olsa
183d84803a bpf: Add support for uprobe multi session attach
Adding support to attach BPF program for entry and return probe
of the same function. This is common use case which at the moment
requires to create two uprobe multi links.

Adding new BPF_TRACE_UPROBE_SESSION attach type that instructs
kernel to attach single link program to both entry and exit probe.

It's possible to control execution of the BPF program on return
probe simply by returning zero or non zero from the entry BPF
program execution to execute or not the BPF program on return
probe respectively.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241108134544.480660-4-jolsa@kernel.org
2024-11-13 19:25:37 -08:00
Andrii Nakryiko
6210515c78 libbpf: start v1.6 development cycle
With libbpf v1.5.0 release out, start v1.6 dev cycle.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20241029184045.581537-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-13 19:25:37 -08:00
Ihor Solodrai
94610d4c27 ci: remove CI jobs for 4.9.0 and 5.5.0 kernels
Signed-off-by: Ihor Solodrai <isolodrai@meta.com>
2024-11-13 19:22:53 -08:00
Andrii Nakryiko
09b9e83102 ci: bump uraimo/run-on-arch-action version
Bump to latest uraimo/run-on-arch-action@v2.8.1 version, hoping that
fixes the CI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-24 14:34:52 -07:00
Andrii Nakryiko
891438c086 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   989a29cfed9b5092c3e18be14e9032c51bb1c9f6
Checkpoint bpf-next commit: c6fb8030b4baa01c850f99fc6da051b1017edc46
Baseline bpf commit:        b836cbdf3b81a4a22b3452186efa2e5105a77e10
Checkpoint bpf commit:      d5fb316e2af1d947f0f6c3666e373a54d9f27c6f

Andrii Nakryiko (1):
  libbpf: move global data mmap()'ing into bpf_object__load()

Eder Zulian (1):
  libbpf: Prevent compiler warnings/errors

Hou Tao (1):
  bpf: Add the missing BPF_LINK_TYPE invocation for sockmap

Kui-Feng Lee (1):
  libbpf: define __uptr.

 include/uapi/linux/bpf.h |  3 ++
 src/bpf_helpers.h        |  1 +
 src/btf_dump.c           |  4 +-
 src/libbpf.c             | 83 +++++++++++++++++++---------------------
 4 files changed, 46 insertions(+), 45 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-24 14:34:52 -07:00
Hou Tao
2d7a79a984 bpf: Add the missing BPF_LINK_TYPE invocation for sockmap
There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap
link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for
sockmap link

Also add comments for bpf_link_type to prevent missing updates in the
future.

Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com
2024-10-24 14:34:52 -07:00
Kui-Feng Lee
ee92f521ab libbpf: define __uptr.
Make __uptr available to BPF programs to enable them to define uptrs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20241023234759.860539-8-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-24 14:34:52 -07:00
Andrii Nakryiko
2dea4b86ee libbpf: move global data mmap()'ing into bpf_object__load()
Since BPF skeleton inception libbpf has been doing mmap()'ing of global
data ARRAY maps in bpf_object__load_skeleton() API, which is used by
code generated .skel.h files (i.e., by BPF skeletons only).

This is wrong because if BPF object is loaded through generic
bpf_object__load() API, global data maps won't be re-mmap()'ed after
load step, and memory pointers returned from bpf_map__initial_value()
would be wrong and won't reflect the actual memory shared between BPF
program and user space.

bpf_map__initial_value() return result is rarely used after load, so
this went unnoticed for a really long time, until bpftrace project
attempted to load BPF object through generic bpf_object__load() API and
then used BPF subskeleton instantiated from such bpf_object. It turned
out that .data/.rodata/.bss data updates through such subskeleton was
"blackholed", all because libbpf wouldn't re-mmap() those maps during
bpf_object__load() phase.

Long story short, this step should be done by libbpf regardless of BPF
skeleton usage, right after BPF map is created in the kernel. This patch
moves this functionality into bpf_object__populate_internal_map() to
achieve this. And bpf_object__load_skeleton() is now simple and almost
trivial, only propagating these mmap()'ed pointers into user-supplied
skeleton structs.

We also do trivial adjustments to error reporting inside
bpf_object__populate_internal_map() for consistency with the rest of
libbpf's map-handling code.

Reported-by: Alastair Robertson <ajor@meta.com>
Reported-by: Jonathan Wiepert <jwiepert@meta.com>
Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20241023043908.3834423-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-24 14:34:52 -07:00
Eder Zulian
fdbdbb6b8a libbpf: Prevent compiler warnings/errors
Initialize 'new_off' and 'pad_bits' to 0 and 'pad_type' to  NULL in
btf_dump_emit_bit_padding to prevent compiler warnings/errors which are
observed when compiling with 'EXTRA_CFLAGS=-g -Og' options, but do not
happen when compiling with current default options.

For example, when compiling libbpf with

  $ make "EXTRA_CFLAGS=-g -Og" -C tools/lib/bpf/ clean all

Clang version 17.0.6 and GCC 13.3.1 fail to compile btf_dump.c due to
following errors:

  btf_dump.c: In function ‘btf_dump_emit_bit_padding’:
  btf_dump.c:903:42: error: ‘new_off’ may be used uninitialized [-Werror=maybe-uninitialized]
    903 |         if (new_off > cur_off && new_off <= next_off) {
        |                                  ~~~~~~~~^~~~~~~~~~~
  btf_dump.c:870:13: note: ‘new_off’ was declared here
    870 |         int new_off, pad_bits, bits, i;
        |             ^~~~~~~
  btf_dump.c:917:25: error: ‘pad_type’ may be used uninitialized [-Werror=maybe-uninitialized]
    917 |                         btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type,
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    918 |                                         in_bitfield ? new_off - cur_off : 0);
        |                                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  btf_dump.c:871:21: note: ‘pad_type’ was declared here
    871 |         const char *pad_type;
        |                     ^~~~~~~~
  btf_dump.c:930:20: error: ‘pad_bits’ may be used uninitialized [-Werror=maybe-uninitialized]
    930 |                 if (bits == pad_bits) {
        |                    ^
  btf_dump.c:870:22: note: ‘pad_bits’ was declared here
    870 |         int new_off, pad_bits, bits, i;
        |                      ^~~~~~~~
  cc1: all warnings being treated as errors

Signed-off-by: Eder Zulian <ezulian@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20241022172329.3871958-3-ezulian@redhat.com
2024-10-24 14:34:52 -07:00
Andrii Nakryiko
fc064eb41e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b24d7f0da6ef5a23456a301eaf51b170f961d4ae
Checkpoint bpf-next commit: 989a29cfed9b5092c3e18be14e9032c51bb1c9f6
Baseline bpf commit:        b24d7f0da6ef5a23456a301eaf51b170f961d4ae
Checkpoint bpf commit:      b836cbdf3b81a4a22b3452186efa2e5105a77e10

Andrii Nakryiko (2):
  libbpf: fix sym_is_subprog() logic for weak global subprogs
  libbpf: never interpret subprogs in .text as entry programs

Chen Ni (1):
  libbpf: Remove unneeded semicolon

Eduard Zingerman (1):
  bpf: __bpf_fastcall for bpf_get_smp_processor_id in uapi

Eric Long (1):
  libbpf: Do not resolve size on duplicate FUNCs

Ihor Solodrai (1):
  libbpf: Change log level of BTF loading error message

Martin Kelly (1):
  bpf: Update bpf_override_return() comment

Matteo Croce (1):
  bpf: fix argument type in bpf_loop documentation

Namhyung Kim (1):
  libbpf: Fix possible compiler warnings in hashmap

Tao Chen (1):
  libbpf: Fix expected_attach_type set handling in program load callback

Tony Ambardar (7):
  libbpf: Improve log message formatting
  libbpf: Fix header comment typos for BTF.ext
  libbpf: Fix output .symtab byte-order during linking
  libbpf: Support BTF.ext loading and output in either endianness
  libbpf: Support opening bpf objects of either endianness
  libbpf: Support linking bpf objects of either endianness
  libbpf: Support creating light skeleton of either endianness

 include/uapi/linux/bpf.h |   8 +-
 src/bpf_gen_internal.h   |   1 +
 src/btf.c                | 280 ++++++++++++++++++++++++++++++---------
 src/btf.h                |   3 +
 src/btf_dump.c           |   2 +-
 src/btf_relocate.c       |   2 +-
 src/gen_loader.c         | 187 ++++++++++++++++++--------
 src/hashmap.h            |  20 +--
 src/libbpf.c             |  81 ++++++++---
 src/libbpf.map           |   2 +
 src/libbpf_internal.h    |  43 +++++-
 src/linker.c             |  84 +++++++++---
 src/relo_core.c          |   2 +-
 src/skel_internal.h      |   3 +-
 src/zip.c                |   2 +-
 15 files changed, 550 insertions(+), 170 deletions(-)

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-11 14:12:43 -07:00
Andrii Nakryiko
db8a210964 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-11 14:12:43 -07:00
Namhyung Kim
f69995d909 libbpf: Fix possible compiler warnings in hashmap
The hashmap__for_each_entry[_safe] is accessing 'map' as a pointer.
But it does without parentheses so passing a static hash map with an
ampersand (like '&slab_hash') will cause compiler warnings due
to unmatched types as '->' operator has a higher precedence.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241011170021.1490836-1-namhyung@kernel.org
2024-10-11 14:12:43 -07:00
Andrii Nakryiko
ac9ced9eb3 libbpf: never interpret subprogs in .text as entry programs
Libbpf pre-1.0 had a legacy logic of allowing singular non-annotated
(i.e., not having explicit SEC() annotation) function to be treated as
sole entry BPF program (unless there were other explicit entry
programs).

This behavior was dropped during libbpf 1.0 transition period (unless
LIBBPF_STRICT_SEC_NAME flag was unset in libbpf_mode). When 1.0 was
released and all the legacy behavior was removed, the bug slipped
through leaving this legacy behavior around.

Fix this for good, as it actually causes very confusing behavior if BPF
object file only has subprograms, but no entry programs.

Fixes: bd054102a8c7 ("libbpf: enforce strict libbpf 1.0 behaviors")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20241010211731.4121837-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Martin Kelly
ba8bd24bbb bpf: Update bpf_override_return() comment
The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only
on x86. This was presumably true at the time of writing, but it's now
supported on many other architectures too. Drop this statement, since
it's not correct anymore and it fits better in other documentation
anyway.

Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com>
Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Matteo Croce
8ea6e12372 bpf: fix argument type in bpf_loop documentation
The `index` argument to bpf_loop() is threaded as an u64.
This lead in a subtle verifier denial where clang cloned the argument
in another register[1].

[1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895

Signed-off-by: Matteo Croce <teknoraver@meta.com>
Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Andrii Nakryiko
0e3971339f libbpf: fix sym_is_subprog() logic for weak global subprogs
sym_is_subprog() is incorrectly rejecting relocations against *weak*
global subprogs. Fix that by realizing that STB_WEAK is also a global
function.

While it seems like verifier doesn't support taking an address of
non-static subprog right now, it's still best to fix support for it on
libbpf side, otherwise users will get a very confusing error during BPF
skeleton generation or static linking due to misinterpreted relocation:

  libbpf: prog 'handle_tp': bad map relo against 'foo' in section '.text'
  Error: failed to open BPF object file: Relocation failed

It's clearly not a map relocation, but is treated and reported as such
without this fix.

Fixes: 53eddb5e04ac ("libbpf: Support subprog address relocation")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20241009011554.880168-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Eric Long
ecf998ed8f libbpf: Do not resolve size on duplicate FUNCs
FUNCs do not have sizes, thus currently btf__resolve_size will fail
with -EINVAL. Add conditions so that we only update size when the BTF
object is not function or function prototype.

Signed-off-by: Eric Long <i@hack3r.moe>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241002-libbpf-dup-extern-funcs-v4-1-560eb460ff90@hack3r.moe
2024-10-11 14:12:43 -07:00
Eduard Zingerman
89df6536bf bpf: __bpf_fastcall for bpf_get_smp_processor_id in uapi
Since [1] kernel supports __bpf_fastcall attribute for helper function
bpf_get_smp_processor_id(). Update uapi definition for this helper in
order to have this attribute in the generated bpf_helper_defs.h

[1] commit 91b7fbf3936f ("bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()")

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240916091712.2929279-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
8244006267 libbpf: Support creating light skeleton of either endianness
Track target endianness in 'struct bpf_gen' and process in-memory data in
native byte-order, but on finalization convert the embedded loader BPF
insns to target endianness.

The light skeleton also includes a target-accessed data blob which is
heterogeneous and thus difficult to convert to target byte-order on
finalization. Add support functions to convert data to target endianness
as it is added to the blob.

Also add additional debug logging for data blob structure details and
skeleton loading.

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/569562e1d5bf1cce80a1f1a3882461ee2da1ffd5.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
6ac8762ecd libbpf: Support linking bpf objects of either endianness
Allow static linking object files of either endianness, checking that input
files have consistent byte-order, and setting output endianness from input.

Linking requires in-memory processing of programs, relocations, sections,
etc. in native endianness, and output conversion to target byte-order. This
is enabled by built-in ELF translation and recent BTF/BTF.ext endianness
functions. Further add local functions for swapping byte-order of sections
containing BPF insns.

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/b47ca686d02664843fc99b96262fe3259650bc43.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
628b21dbcd libbpf: Support opening bpf objects of either endianness
Allow bpf_object__open() to access files of either endianness, and convert
included BPF programs to native byte-order in-memory for introspection.
Loading BPF objects of non-native byte-order is still disallowed however.

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/26353c1a1887a54400e1acd6c138fa90c99cdd40.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
5ae8432d15 libbpf: Support BTF.ext loading and output in either endianness
Support for handling BTF data of either endianness was added in [1], but
did not include BTF.ext data for lack of use cases. Later, support for
static linking [2] provided a use case, but this feature and later ones
were restricted to native-endian usage.

Add support for BTF.ext handling in either endianness. Convert BTF.ext data
to native endianness when read into memory for further processing, and
support raw data access that restores the original byte-order for output.
Add internal header functions for byte-swapping func, line, and core info
records.

Add new API functions btf_ext__endianness() and btf_ext__set_endianness()
for query and setting byte-order, as already exist for BTF data.

[1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness")
[2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/133407ab20e0dd5c07cab2a6fa7879dee1ffa4bc.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
f2668a0a71 libbpf: Fix output .symtab byte-order during linking
Object linking output data uses the default ELF_T_BYTE type for '.symtab'
section data, which disables any libelf-based translation. Explicitly set
the ELF_T_SYM type for output to restore libelf's byte-order conversion,
noting that input '.symtab' data is already correctly translated.

Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/87868bfeccf3f51aec61260073f8778e9077050a.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
5060f172cc libbpf: Fix header comment typos for BTF.ext
Mention struct btf_ext_info_sec rather than non-existent btf_sec_func_info
in BTF.ext struct documentation.

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/cde65e01a5f2945c578485fab265ef711e2daeb6.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tony Ambardar
ceeb7211c9 libbpf: Improve log message formatting
Fix missing newlines and extraneous terminal spaces in messages.

Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/086884b7cbf87e524d584f9bf87f7a580e378b2b.1726475448.git.tony.ambardar@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Chen Ni
3fb92e63e0 libbpf: Remove unneeded semicolon
Remove unneeded semicolon in zip_archive_open().

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240926023823.3632993-1-nichen@iscas.ac.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Tao Chen
ad633fb142 libbpf: Fix expected_attach_type set handling in program load callback
Referenced commit broke the logic of resetting expected_attach_type to
zero for allowed program types if kernel doesn't yet support such field.
We do need to overwrite and preserve expected_attach_type for
multi-uprobe though, but that can be done explicitly in
libbpf_prepare_prog_load().

Fixes: 5902da6d8a52 ("libbpf: Add uprobe multi link support to bpf_program__attach_usdt")
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240925153012.212866-1-chen.dylane@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Ihor Solodrai
3ea36843b3 libbpf: Change log level of BTF loading error message
Reduce log level of BTF loading error to INFO if BTF is not required.

Andrii says:

  Nowadays the expectation is that the BPF program will have a valid
  .BTF section, so even though .BTF is "optional", I think it's fine
  to emit a warning for that case (any reasonably recent Clang will
  produce valid BTF).

  Ihor's patch is fixing the situation with an outdated host kernel
  that doesn't understand BTF. libbpf will try to "upload" the
  program's BTF, but if that fails and the BPF object doesn't use
  any features that require having BTF uploaded, then it's just an
  information message to the user, but otherwise can be ignored.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-11 14:12:43 -07:00
Jordan Rome
80b16457cb ci: add temporary patch for failing upstream BPF uprobe selftest
Signed-off-by: Jordan Rome <linux@jordanrome.com>
2024-10-09 14:13:35 -07:00
Jordan Rome
7827ca87d1 ci: regenerate vmlinux.h
Regenerate latest vmlinux.h for old kernel CI tests.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
2024-10-09 14:13:35 -07:00
Jordan Rome
91ccd57ca9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b
Checkpoint bpf-next commit: b24d7f0da6ef5a23456a301eaf51b170f961d4ae
Baseline bpf commit:        b408473ea01b2e499d23503e2bf898416da9d7ac
Checkpoint bpf commit:      b24d7f0da6ef5a23456a301eaf51b170f961d4ae

Alan Maguire (1):
  bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags

Daniel Borkmann (1):
  bpf: Sync uapi bpf.h header to tools directory

Donald Hunter (1):
  docs/bpf: Add missing BPF program types to docs

Ihor Solodrai (1):
  libbpf: Add bpf_object__token_fd accessor

Jiri Olsa (1):
  libbpf: Fix uretprobe.multi.s programs auto attachment

Lin Yikai (1):
  libbpf: fix some typos in libbpf

Mina Almasry (2):
  net: netdev netlink api to bind dma-buf to a net device
  netdev: add dmabuf introspection

Pu Lehui (3):
  libbpf: Access first syscall argument with CO-RE direct read on s390
  libbpf: Access first syscall argument with CO-RE direct read on arm64
  libbpf: Fix accessing first syscall argument on RV64

Sam James (1):
  libbpf: Workaround (another) -Wmaybe-uninitialized false positive

Shuyi Cheng (1):
  libbpf: Fixed getting wrong return address on arm64 architecture

Yusheng Zheng (1):
  libbpf: Fix some typos in comments

 docs/program_types.rst      | 30 ++++++++++++++++++++++++++----
 include/uapi/linux/bpf.h    | 25 ++++++++++++-------------
 include/uapi/linux/netdev.h | 13 +++++++++++++
 src/bpf.h                   |  4 ++--
 src/bpf_helpers.h           |  2 +-
 src/bpf_tracing.h           | 25 ++++++++++++++++---------
 src/btf.c                   |  4 ++--
 src/btf.h                   |  2 +-
 src/btf_dump.c              |  2 +-
 src/libbpf.c                | 13 +++++++++----
 src/libbpf.h                | 18 +++++++++++++-----
 src/libbpf.map              |  1 +
 src/libbpf_legacy.h         |  4 ++--
 src/linker.c                |  4 ++--
 src/skel_internal.h         |  2 +-
 src/usdt.bpf.h              |  2 +-
 16 files changed, 103 insertions(+), 48 deletions(-)

Signed-off-by: Jordan Rome <linux@jordanrome.com>
2024-10-09 14:13:35 -07:00
Jordan Rome
f0a307f61c sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.

Signed-off-by: Jordan Rome <linux@jordanrome.com>
2024-10-09 14:13:35 -07:00
Daniel Borkmann
80b97bd0b8 bpf: Sync uapi bpf.h header to tools directory
There is a delta between kernel UAPI bpf.h and tools UAPI bpf.h,
thus sync them again.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2024-10-09 14:13:35 -07:00
Ihor Solodrai
7c2f492a88 libbpf: Add bpf_object__token_fd accessor
Add a LIBBPF_API function to retrieve the token_fd from a bpf_object.

Without this accessor, if user needs a token FD they have to get it
manually via bpf_token_create, even though a token might have been
already created by bpf_object__load.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240913001858.3345583-1-ihor.solodrai@pm.me
2024-10-09 14:13:35 -07:00
Donald Hunter
114f6ce2fd docs/bpf: Add missing BPF program types to docs
Update the table of program types in the libbpf documentation with the
recently added program types.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240912095944.6386-1-donald.hunter@gmail.com
2024-10-09 14:13:35 -07:00
Jiri Olsa
69671302df libbpf: Fix uretprobe.multi.s programs auto attachment
As reported by Andrii we don't currently recognize uretprobe.multi.s
programs as return probes due to using (wrong) strcmp function.

Using str_has_pfx() instead to match uretprobe.multi prefix.

Tests are passing, because the return program was executed
as entry program and all counts were incremented properly.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240910125336.3056271-1-jolsa@kernel.org
2024-10-09 14:13:35 -07:00
Yusheng Zheng
e1833cff9c libbpf: Fix some typos in comments
Fix some spelling errors in the code comments of libbpf:

betwen -> between
paremeters -> parameters
knowning -> knowing
definiton -> definition
compatiblity -> compatibility
overriden -> overridden
occured -> occurred
proccess -> process
managment -> management
nessary -> necessary

Signed-off-by: Yusheng Zheng <yunwei356@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240909225952.30324-1-yunwei356@gmail.com
2024-10-09 14:13:35 -07:00
Shuyi Cheng
81ac790dc8 libbpf: Fixed getting wrong return address on arm64 architecture
ARM64 has a separate lr register to store the return address, so here
you only need to read the lr register to get the return address, no need
to dereference it again.

Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1725787433-77262-1-git-send-email-chengshuyi@linux.alibaba.com
2024-10-09 14:13:35 -07:00
Sam James
3b301cf75d libbpf: Workaround (another) -Wmaybe-uninitialized false positive
We get this with GCC 15 -O3 (at least):
```
libbpf.c: In function ‘bpf_map__init_kern_struct_ops’:
libbpf.c:1109:18: error: ‘mod_btf’ may be used uninitialized [-Werror=maybe-uninitialized]
 1109 |         kern_btf = mod_btf ? mod_btf->btf : obj->btf_vmlinux;
      |         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libbpf.c:1094:28: note: ‘mod_btf’ was declared here
 1094 |         struct module_btf *mod_btf;
      |                            ^~~~~~~
In function ‘find_struct_ops_kern_types’,
    inlined from ‘bpf_map__init_kern_struct_ops’ at libbpf.c:1102:8:
libbpf.c:982:21: error: ‘btf’ may be used uninitialized [-Werror=maybe-uninitialized]
  982 |         kern_type = btf__type_by_id(btf, kern_type_id);
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
libbpf.c: In function ‘bpf_map__init_kern_struct_ops’:
libbpf.c:967:21: note: ‘btf’ was declared here
  967 |         struct btf *btf;
      |                     ^~~
```

This is similar to the other libbpf fix from a few weeks ago for
the same modelling-errno issue (fab45b962749184e1a1a57c7c583782b78fad539).

Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://bugs.gentoo.org/939106
Link: https://lore.kernel.org/bpf/f6962729197ae7cdf4f6d1512625bd92f2322d31.1725630494.git.sam@gentoo.org
2024-10-09 14:13:35 -07:00
Lin Yikai
6c8dde3554 libbpf: fix some typos in libbpf
Hi, fix some spelling errors in libbpf, the details are as follows:

-in the code comments:
	termintaing->terminating
	architecutre->architecture
	requring->requiring
	recored->recoded
	sanitise->sanities
	allowd->allowed
	abover->above
	see bpf_udst_arg()->see bpf_usdt_arg()

Signed-off-by: Lin Yikai <yikai.lin@vivo.com>
Link: https://lore.kernel.org/r/20240905110354.3274546-3-yikai.lin@vivo.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09 14:13:35 -07:00
Pu Lehui
9045c3ab53 libbpf: Fix accessing first syscall argument on RV64
On RV64, as Ilya mentioned before [0], the first syscall parameter should be
accessed through orig_a0 (see arch/riscv64/include/asm/syscall.h),
otherwise it will cause selftests like bpf_syscall_macro, vmlinux,
test_lsm, etc. to fail on RV64. Let's fix it by using the struct pt_regs
style CO-RE direct access.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-1-iii@linux.ibm.com [0]
Link: https://lore.kernel.org/bpf/20240831041934.1629216-5-pulehui@huaweicloud.com
2024-10-09 14:13:35 -07:00
Pu Lehui
53a645402f libbpf: Access first syscall argument with CO-RE direct read on arm64
Currently PT_REGS_PARM1 SYSCALL(x) is consistent with PT_REGS_PARM1_CORE
SYSCALL(x), which will introduce the overhead of BPF_CORE_READ(), taking
into account the read pt_regs comes directly from the context, let's use
CO-RE direct read to access the first system call argument.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/bpf/20240831041934.1629216-3-pulehui@huaweicloud.com
2024-10-09 14:13:35 -07:00
Pu Lehui
6d01681b02 libbpf: Access first syscall argument with CO-RE direct read on s390
Currently PT_REGS_PARM1 SYSCALL(x) is consistent with PT_REGS_PARM1_CORE
SYSCALL(x), which will introduce the overhead of BPF_CORE_READ(), taking
into account the read pt_regs comes directly from the context, let's use
CO-RE direct read to access the first system call argument.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240831041934.1629216-2-pulehui@huaweicloud.com
2024-10-09 14:13:35 -07:00
Mina Almasry
9a37057800 netdev: add dmabuf introspection
Add dmabuf information to page_pool stats:

$ ./cli.py --spec ../netlink/specs/netdev.yaml --dump page-pool-get
...
 {'dmabuf': 10,
  'id': 456,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 455,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 454,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 453,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 452,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 451,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 450,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},
 {'dmabuf': 10,
  'id': 449,
  'ifindex': 3,
  'inflight': 1023,
  'inflight-mem': 4190208},

And queue stats:

$ ./cli.py --spec ../netlink/specs/netdev.yaml --dump queue-get
...
{'dmabuf': 10, 'id': 8, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 9, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 10, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 11, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 12, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 13, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 14, 'ifindex': 3, 'type': 'rx'},
{'dmabuf': 10, 'id': 15, 'ifindex': 3, 'type': 'rx'},

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240910171458.219195-14-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09 14:13:35 -07:00
Mina Almasry
3578ab89fb net: netdev netlink api to bind dma-buf to a net device
API takes the dma-buf fd as input, and binds it to the netdevice. The
user can specify the rx queues to bind the dma-buf to.

Suggested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240910171458.219195-3-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09 14:13:35 -07:00
Alan Maguire
178df3d885 bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags
Currently the only opportunity to set sock ops flags dictating
which callbacks fire for a socket is from within a TCP-BPF sockops
program.  This is problematic if the connection is already set up
as there is no further chance to specify callbacks for that socket.
Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt()
to allow users to specify callbacks later, either via an iterator
over sockets or via a socket-specific program triggered by a
setsockopt() on the socket.

Previous discussion on this here [1].

[1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20240808150558.1035626-2-alan.maguire@oracle.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-10-09 14:13:35 -07:00
Ihor Solodrai
1f98105e54 ci: bump actions/upload-artifact to v4
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
2024-10-07 15:38:01 -07:00
Andrii Nakryiko
a4161e00f9 ci: get rid of s390x kernel tests
Kernel/libbpf code is very well tested on s390x in BPF CI, so get rid of
it here as it often is a source of trouble and noise, without really
benefiting us much.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-07 15:38:01 -07:00
70 changed files with 12936 additions and 6312 deletions

View File

@@ -1,36 +0,0 @@
name: 'build-selftests'
description: 'Build BPF selftests'
inputs:
repo-path:
description: 'where is the source code'
required: true
kernel:
description: 'kernel version or LATEST'
required: true
default: 'LATEST'
vmlinux:
description: 'where is vmlinux file'
required: true
default: '${{ github.workspace }}/vmlinux'
llvm-version:
description: 'llvm version'
required: true
runs:
using: "composite"
steps:
- shell: bash
run: |
source $GITHUB_ACTION_PATH/../../../ci/vmtest/helpers.sh
foldable start "Setup Env"
sudo apt-get install -y qemu-kvm zstd binutils-dev elfutils libcap-dev libelf-dev libdw-dev python3-docutils
foldable end
- shell: bash
run: |
export KERNEL=${{ inputs.kernel }}
export REPO_ROOT="${{ github.workspace }}"
export REPO_PATH="${{ inputs.repo-path }}"
export VMLINUX_BTF="${{ inputs.vmlinux }}"
export LLVM_VERSION="${{ inputs.llvm-version }}"
${{ github.action_path }}/build_selftests.sh

View File

@@ -1,59 +0,0 @@
#!/bin/bash
set -euo pipefail
THISDIR="$(cd $(dirname $0) && pwd)"
source ${THISDIR}/helpers.sh
foldable start prepare_selftests "Building selftests"
LIBBPF_PATH="${REPO_ROOT}"
llvm_latest_version() {
echo "19"
}
if [[ "${LLVM_VERSION}" == $(llvm_latest_version) ]]; then
REPO_DISTRO_SUFFIX=""
else
REPO_DISTRO_SUFFIX="-${LLVM_VERSION}"
fi
DISTRIB_CODENAME="noble"
test -f /etc/lsb-release && . /etc/lsb-release
echo "${DISTRIB_CODENAME}"
echo "deb https://apt.llvm.org/${DISTRIB_CODENAME}/ llvm-toolchain-${DISTRIB_CODENAME}${REPO_DISTRO_SUFFIX} main" \
| sudo tee /etc/apt/sources.list.d/llvm.list
PREPARE_SELFTESTS_SCRIPT=${THISDIR}/prepare_selftests-${KERNEL}.sh
if [ -f "${PREPARE_SELFTESTS_SCRIPT}" ]; then
(cd "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" && ${PREPARE_SELFTESTS_SCRIPT})
fi
if [[ "${KERNEL}" = 'LATEST' ]]; then
VMLINUX_H=
else
VMLINUX_H=${THISDIR}/vmlinux.h
fi
cd ${REPO_ROOT}/${REPO_PATH}
make headers
make \
CLANG=clang-${LLVM_VERSION} \
LLC=llc-${LLVM_VERSION} \
LLVM_STRIP=llvm-strip-${LLVM_VERSION} \
VMLINUX_BTF="${VMLINUX_BTF}" \
VMLINUX_H=${VMLINUX_H} \
-C "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
-j $((4*$(nproc))) > /dev/null
cd -
mkdir ${LIBBPF_PATH}/selftests
cp -R "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
${LIBBPF_PATH}/selftests
cd ${LIBBPF_PATH}
rm selftests/bpf/.gitignore
git add selftests
foldable end prepare_selftests

View File

@@ -1,38 +0,0 @@
# shellcheck shell=bash
# $1 - start or end
# $2 - fold identifier, no spaces
# $3 - fold section description
foldable() {
local YELLOW='\033[1;33m'
local NOCOLOR='\033[0m'
if [ $1 = "start" ]; then
line="::group::$2"
if [ ! -z "${3:-}" ]; then
line="$line - ${YELLOW}$3${NOCOLOR}"
fi
else
line="::endgroup::"
fi
echo -e "$line"
}
__print() {
local TITLE=""
if [[ -n $2 ]]; then
TITLE=" title=$2"
fi
echo "::$1${TITLE}::$3"
}
# $1 - title
# $2 - message
print_error() {
__print error $1 $2
}
# $1 - title
# $2 - message
print_notice() {
__print notice $1 $2
}

View File

@@ -1,5 +0,0 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile
printf "all:\n\ttouch bpf_test_no_cfi.ko\n\nclean:\n" > bpf_test_no_cfi/Makefile

View File

@@ -1,5 +0,0 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile
printf "all:\n\ttouch bpf_test_no_cfi.ko\n\nclean:\n" > bpf_test_no_cfi/Makefile

File diff suppressed because it is too large Load Diff

View File

@@ -1,124 +0,0 @@
name: 'vmtest'
description: 'Build + run vmtest'
inputs:
kernel:
description: 'kernel version or LATEST'
required: true
default: 'LATEST'
arch:
description: 'what arch to test'
required: true
default: 'x86_64'
pahole:
description: 'pahole rev or master'
required: true
default: 'master'
llvm-version:
description: 'llvm version'
required: false
default: '17'
runs:
using: "composite"
steps:
# Allow CI user to access /dev/kvm (via qemu) w/o group change/relogin
# by changing permissions set by udev.
- name: Set /dev/kvm permissions
shell: bash
run: |
if [ -e /dev/kvm ]; then
echo "/dev/kvm exists"
if [ $(id -u) != 0 ]; then
echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' \
| sudo tee /etc/udev/rules.d/99-kvm4all.rules > /dev/null
sudo udevadm control --reload-rules
sudo udevadm trigger --name-match=kvm
fi
else
echo "/dev/kvm does not exist"
fi
# setup environment
- name: Setup environment
uses: libbpf/ci/setup-build-env@main
with:
pahole: ${{ inputs.pahole }}
arch: ${{ inputs.arch }}
llvm-version: ${{ inputs.llvm-version }}
# 1. download CHECKPOINT kernel source
- name: Get checkpoint commit
shell: bash
run: |
cat CHECKPOINT-COMMIT
echo "CHECKPOINT=$(cat CHECKPOINT-COMMIT)" >> $GITHUB_ENV
- name: Get kernel source at checkpoint
uses: libbpf/ci/get-linux-source@main
with:
repo: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
rev: ${{ env.CHECKPOINT }}
dest: '${{ github.workspace }}/.kernel'
- name: Patch kernel source
uses: libbpf/ci/patch-kernel@main
with:
patches-root: '${{ github.workspace }}/ci/diffs'
repo-root: '.kernel'
- name: Prepare to build BPF selftests
shell: bash
run: |
source $GITHUB_ACTION_PATH/../../../ci/vmtest/helpers.sh
foldable start "Prepare building selftest"
cd .kernel
cat tools/testing/selftests/bpf/config \
tools/testing/selftests/bpf/config.${{ inputs.arch }} > .config
# this file might or mihgt not exist depending on kernel version
cat tools/testing/selftests/bpf/config.vm >> .config || :
make olddefconfig && make prepare
cd -
foldable end
# 2. if kernel == LATEST, build kernel image from tree
- name: Build kernel image
if: ${{ inputs.kernel == 'LATEST' }}
shell: bash
run: |
source $GITHUB_ACTION_PATH/../../../ci/vmtest/helpers.sh
foldable start "Build Kernel Image"
cd .kernel
make -j $((4*$(nproc))) all > /dev/null
cp vmlinux ${{ github.workspace }}
cd -
foldable end
# else, just download prebuilt kernel image
- name: Download prebuilt kernel
if: ${{ inputs.kernel != 'LATEST' }}
uses: libbpf/ci/download-vmlinux@main
with:
kernel: ${{ inputs.kernel }}
arch: ${{ inputs.arch }}
# 3. build selftests
- name: Build BPF selftests
uses: ./.github/actions/build-selftests
with:
repo-path: '.kernel'
kernel: ${{ inputs.kernel }}
llvm-version: ${{ inputs.llvm-version }}
# 4. prepare rootfs
- name: prepare rootfs
uses: libbpf/ci/prepare-rootfs@main
env:
KBUILD_OUTPUT: '.kernel'
with:
project-name: 'libbpf'
arch: ${{ inputs.arch }}
kernel: ${{ inputs.kernel }}
kernel-root: '.kernel'
kbuild-output: ${{ env.KBUILD_OUTPUT }}
image-output: '/tmp/root.img'
# 5. run selftest in QEMU
- name: Run selftests
env:
KERNEL: ${{ inputs.kernel }}
REPO_ROOT: ${{ github.workspace }}
uses: libbpf/ci/run-qemu@main
with:
arch: ${{ inputs.arch }}
img: '/tmp/root.img'
vmlinuz: 'vmlinuz'
kernel-root: '.kernel'

View File

@@ -61,31 +61,32 @@ jobs:
- arch: aarch64
- arch: ppc64le
- arch: s390x
- arch: x86
- arch: amd64
steps:
- uses: actions/checkout@v4
name: Checkout
- name: Setup QEMU
uses: docker/setup-qemu-action@v3
with:
image: tonistiigi/binfmt:qemu-v8.1.5
- uses: ./.github/actions/setup
name: Pre-Setup
- run: source /tmp/ci_setup && sudo -E $CI_ROOT/managers/ubuntu.sh
if: matrix.arch == 'x86'
if: matrix.arch == 'amd64'
name: Setup
- uses: uraimo/run-on-arch-action@v2.7.1
name: Build in docker
if: matrix.arch != 'x86'
with:
distro:
ubuntu22.04
arch:
${{ matrix.arch }}
setup:
cp /tmp/ci_setup $GITHUB_WORKSPACE
dockerRunArgs: |
--volume "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}"
shell: /bin/bash
install: |
export DEBIAN_FRONTEND=noninteractive
export TZ="America/Los_Angeles"
apt-get update -y
apt-get install -y tzdata build-essential sudo
run: source ${GITHUB_WORKSPACE}/ci_setup && $CI_ROOT/managers/ubuntu.sh
- name: Build in docker
if: matrix.arch != 'amd64'
run: |
cp /tmp/ci_setup ${GITHUB_WORKSPACE}
docker run --rm \
--platform linux/${{ matrix.arch }} \
-v ${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE} \
-e GITHUB_WORKSPACE=${GITHUB_WORKSPACE} \
-w /ci/workspace \
ubuntu:noble \
${GITHUB_WORKSPACE}/ci/build-in-docker.sh

View File

@@ -33,7 +33,7 @@ jobs:
dry-run: false
sanitizer: ${{ matrix.sanitizer }}
- name: Upload Crash
uses: actions/upload-artifact@v1
uses: actions/upload-artifact@v4
if: failure() && steps.build.outcome == 'success'
with:
name: ${{ matrix.sanitizer }}-artifacts

View File

@@ -1,30 +1,30 @@
name: libbpf-ci-coverity
on:
push:
branches:
- master
schedule:
- cron: '0 18 * * *'
jobs:
coverity:
runs-on: ubuntu-latest
if: github.repository == 'libbpf/libbpf'
name: Coverity
env:
COVERITY_SCAN_TOKEN: ${{ secrets.COVERITY_SCAN_TOKEN }}
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- name: Run coverity
if: ${{ env.COVERITY_SCAN_TOKEN }}
run: |
source "${GITHUB_WORKSPACE}"/ci/vmtest/helpers.sh
foldable start "Setup CI env"
source /tmp/ci_setup
export COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
export COVERITY_SCAN_BRANCH_PATTERN=${GITHUB_REF##refs/*/}
export TRAVIS_BRANCH=${COVERITY_SCAN_BRANCH_PATTERN}
foldable end
scripts/coverity.sh
env:
COVERITY_SCAN_TOKEN: ${{ secrets.COVERITY_SCAN_TOKEN }}
COVERITY_SCAN_PROJECT_NAME: libbpf
COVERITY_SCAN_BUILD_COMMAND_PREPEND: 'cd src/'
COVERITY_SCAN_BUILD_COMMAND: 'make'

View File

@@ -3,34 +3,29 @@ name: ondemand
on:
workflow_dispatch:
inputs:
kernel-origin:
description: 'git repo for linux kernel'
default: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
arch:
default: 'x86_64'
required: true
kernel-rev:
description: 'rev/tag/branch for linux kernel'
llvm-version:
default: '18'
required: true
kernel:
default: 'LATEST'
required: true
pahole:
default: "master"
required: true
pahole-origin:
description: 'git repo for pahole'
default: 'https://git.kernel.org/pub/scm/devel/pahole/pahole.git'
required: true
pahole-rev:
description: 'ref/tag/branch for pahole'
default: "master"
runs-on:
default: 'ubuntu-24.04'
required: true
jobs:
vmtest:
runs-on: ubuntu-latest
name: vmtest with customized pahole/Kernel
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:
kernel: 'LATEST'
kernel-rev: ${{ github.event.inputs.kernel-rev }}
kernel-origin: ${{ github.event.inputs.kernel-origin }}
pahole: ${{ github.event.inputs.pahole-rev }}
pahole-origin: ${{ github.event.inputs.pahole-origin }}
name: ${{ inputs.kernel }} kernel llvm-${{ inputs.llvm-version }} pahole@${{ inputs.pahole }}
uses: ./.github/workflows/vmtest.yml
with:
runs_on: ${{ inputs.runs-on }}
kernel: ${{ inputs.kernel }}
arch: ${{ inputs.arch }}
llvm-version: ${{ inputs.llvm-version }}
pahole: ${{ inputs.pahole }}

View File

@@ -1,20 +0,0 @@
name: pahole-staging
on:
workflow_dispatch:
schedule:
- cron: '0 18 * * *'
jobs:
vmtest:
runs-on: ubuntu-20.04
name: Kernel LATEST + staging pahole
env:
STAGING: tmp.master
steps:
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:
kernel: LATEST
pahole: $STAGING

View File

@@ -1,42 +1,36 @@
name: libbpf-ci
on:
on:
pull_request:
push:
schedule:
- cron: '0 18 * * *'
concurrency:
concurrency:
group: ci-test-${{ github.head_ref }}
cancel-in-progress: true
jobs:
vmtest:
runs-on: ${{ matrix.runs_on }}
name: Kernel ${{ matrix.kernel }} on ${{ matrix.arch }} + selftests
strategy:
fail-fast: false
matrix:
include:
- kernel: 'LATEST'
runs_on: ubuntu-24.04
arch: 'x86_64'
- kernel: '5.5.0'
runs_on: ubuntu-24.04
arch: 'x86_64'
- kernel: '4.9.0'
runs_on: ubuntu-24.04
runs_on: 'ubuntu-24.04'
arch: 'x86_64'
llvm-version: '18'
pahole: 'master'
- kernel: 'LATEST'
runs_on: ["s390x", "docker-noble-main"]
arch: 's390x'
steps:
- uses: actions/checkout@v4
name: Checkout
- uses: ./.github/actions/setup
name: Setup
- uses: ./.github/actions/vmtest
name: vmtest
with:
kernel: ${{ matrix.kernel }}
arch: ${{ matrix.arch }}
runs_on: 'ubuntu-24.04'
arch: 'x86_64'
llvm-version: '18'
pahole: 'tmp.master'
name: Linux ${{ matrix.kernel }} llvm-${{ matrix.llvm-version }}
uses: ./.github/workflows/vmtest.yml
with:
runs_on: ${{ matrix.runs_on }}
kernel: ${{ matrix.kernel }}
arch: ${{ matrix.arch }}
llvm-version: ${{ matrix.llvm-version }}
pahole: ${{ matrix.pahole }}

117
.github/workflows/vmtest.yml vendored Normal file
View File

@@ -0,0 +1,117 @@
name: 'Build kernel and selftests/bpf, run selftests via vmtest'
on:
workflow_call:
inputs:
runs_on:
required: true
default: 'ubuntu-24.04'
type: string
arch:
description: 'what arch to test'
required: true
default: 'x86_64'
type: string
kernel:
description: 'kernel version or LATEST'
required: true
default: 'LATEST'
type: string
pahole:
description: 'pahole rev or branch'
required: false
default: 'master'
type: string
llvm-version:
description: 'llvm version'
required: false
default: '18'
type: string
jobs:
vmtest:
name: pahole@${{ inputs.pahole }}
runs-on: ${{ inputs.runs_on }}
steps:
- uses: actions/checkout@v4
- name: Setup environment
uses: libbpf/ci/setup-build-env@v3
with:
pahole: ${{ inputs.pahole }}
arch: ${{ inputs.arch }}
llvm-version: ${{ inputs.llvm-version }}
- name: Get checkpoint commit
shell: bash
run: |
cat CHECKPOINT-COMMIT
echo "CHECKPOINT=$(cat CHECKPOINT-COMMIT)" >> $GITHUB_ENV
- name: Get kernel source at checkpoint
uses: libbpf/ci/get-linux-source@v3
with:
repo: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
rev: ${{ env.CHECKPOINT }}
dest: '${{ github.workspace }}/.kernel'
- name: Patch kernel source
uses: libbpf/ci/patch-kernel@v3
with:
patches-root: '${{ github.workspace }}/ci/diffs'
repo-root: '.kernel'
- name: Configure kernel build
shell: bash
run: |
cd .kernel
cat tools/testing/selftests/bpf/config \
tools/testing/selftests/bpf/config.${{ inputs.arch }} > .config
# this file might or might not exist depending on kernel version
cat tools/testing/selftests/bpf/config.vm >> .config || :
make olddefconfig && make prepare
cd -
- name: Build kernel image
if: ${{ inputs.kernel == 'LATEST' }}
shell: bash
run: |
cd .kernel
make -j $((4*$(nproc))) all
cp vmlinux ${{ github.workspace }}
cd -
- name: Download prebuilt kernel
if: ${{ inputs.kernel != 'LATEST' }}
uses: libbpf/ci/download-vmlinux@v3
with:
kernel: ${{ inputs.kernel }}
arch: ${{ inputs.arch }}
- name: Build selftests/bpf
uses: libbpf/ci/build-selftests@v3
env:
MAX_MAKE_JOBS: 32
VMLINUX_BTF: ${{ github.workspace }}/vmlinux
VMLINUX_H: ${{ inputs.kernel != 'LATEST' && format('{0}/.github/actions/build-selftests/vmlinux.h', github.workspace) || '' }}
with:
arch: ${{ inputs.arch }}
kernel-root: ${{ github.workspace }}/.kernel
llvm-version: ${{ inputs.llvm-version }}
- name: Run selftests
env:
ALLOWLIST_FILE: /tmp/allowlist
DENYLIST_FILE: /tmp/denylist
KERNEL: ${{ inputs.kernel }}
VMLINUX: ${{ github.workspace }}/vmlinux
LLVM_VERSION: ${{ inputs.llvm-version }}
SELFTESTS_BPF: ${{ github.workspace }}/.kernel/tools/testing/selftests/bpf
VMTEST_CONFIGS: ${{ github.workspace }}/ci/vmtest/configs
uses: libbpf/ci/run-vmtest@v3
with:
arch: ${{ inputs.arch }}
kbuild-output: ${{ github.workspace }}/.kernel
kernel-root: ${{ github.workspace }}/.kernel
vmlinuz: ${{ inputs.arch }}/vmlinuz-${{ inputs.kernel }}

View File

@@ -8,6 +8,7 @@ Dan Carpenter <error27@gmail.com> <dan.carpenter@oracle.com>
Geliang Tang <geliang@kernel.org> <geliang.tang@suse.com>
Herbert Xu <herbert@gondor.apana.org.au>
Jakub Kicinski <kuba@kernel.org> <jakub.kicinski@netronome.com>
Jesper Dangaard Brouer <hawk@kernel.org> <brouer@redhat.com>
Kees Cook <kees@kernel.org> <keescook@chromium.org>
Leo Yan <leo.yan@linux.dev> <leo.yan@linaro.org>
Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>

View File

@@ -1 +1 @@
b408473ea01b2e499d23503e2bf898416da9d7ac
b4432656b36e5cc1d50a1f2dc15357543add530e

View File

@@ -1 +1 @@
2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b
9325d53fe9adff354b6a93fda5f38c165947da0f

14
ci/build-in-docker.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/bash
set -euo pipefail
export DEBIAN_FRONTEND=noninteractive
export TZ="America/Los_Angeles"
apt-get update -y
apt-get install -y tzdata build-essential sudo
source ${GITHUB_WORKSPACE}/ci_setup
$CI_ROOT/managers/ubuntu.sh
exit 0

View File

@@ -1,69 +0,0 @@
From c71766e8ff7a7f950522d25896fba758585500df Mon Sep 17 00:00:00 2001
From: Song Liu <song@kernel.org>
Date: Mon, 22 Apr 2024 21:14:40 -0700
Subject: [PATCH] arch/Kconfig: Move SPECULATION_MITIGATIONS to arch/Kconfig
SPECULATION_MITIGATIONS is currently defined only for x86. As a result,
IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) is always false for other
archs. f337a6a21e2f effectively set "mitigations=off" by default on
non-x86 archs, which is not desired behavior. Jakub observed this
change when running bpf selftests on s390 and arm64.
Fix this by moving SPECULATION_MITIGATIONS to arch/Kconfig so that it is
available in all archs and thus can be used safely in kernel/cpu.c
Fixes: f337a6a21e2f ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n")
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
---
arch/Kconfig | 10 ++++++++++
arch/x86/Kconfig | 10 ----------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 9f066785bb71..8f4af75005f8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1609,4 +1609,14 @@ config CC_HAS_SANE_FUNCTION_ALIGNMENT
# strict alignment always, even with -falign-functions.
def_bool CC_HAS_MIN_FUNCTION_ALIGNMENT || CC_IS_CLANG
+menuconfig SPECULATION_MITIGATIONS
+ bool "Mitigations for speculative execution vulnerabilities"
+ default y
+ help
+ Say Y here to enable options which enable mitigations for
+ speculative execution hardware vulnerabilities.
+
+ If you say N, all mitigations will be disabled. You really
+ should know what you are doing to say so.
+
endmenu
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 39886bab943a..50c890fce5e0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2486,16 +2486,6 @@ config PREFIX_SYMBOLS
def_bool y
depends on CALL_PADDING && !CFI_CLANG
-menuconfig SPECULATION_MITIGATIONS
- bool "Mitigations for speculative execution vulnerabilities"
- default y
- help
- Say Y here to enable options which enable mitigations for
- speculative execution hardware vulnerabilities.
-
- If you say N, all mitigations will be disabled. You really
- should know what you are doing to say so.
-
if SPECULATION_MITIGATIONS
config MITIGATION_PAGE_TABLE_ISOLATION
--
2.43.0

View File

@@ -1,32 +0,0 @@
From 0daad0a615e687e1247230f3d0c31ae60ba32314 Mon Sep 17 00:00:00 2001
From: Andrii Nakryiko <andrii@kernel.org>
Date: Tue, 28 May 2024 15:29:38 -0700
Subject: [PATCH bpf-next] selftests/bpf: fix inet_csk_accept prototype in
test_sk_storage_tracing.c
Recent kernel change ([0]) changed inet_csk_accept() prototype. Adapt
progs/test_sk_storage_tracing.c to take that into account.
[0] 92ef0fd55ac8 ("net: change proto and proto_ops accept type")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c b/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
index 02e718f06e0f..40531e56776e 100644
--- a/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
+++ b/tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c
@@ -84,7 +84,7 @@ int BPF_PROG(trace_tcp_connect, struct sock *sk)
}
SEC("fexit/inet_csk_accept")
-int BPF_PROG(inet_csk_accept, struct sock *sk, int flags, int *err, bool kern,
+int BPF_PROG(inet_csk_accept, struct sock *sk, struct proto_accept_arg *arg,
struct sock *accepted_sk)
{
set_task_info(accepted_sk);
--
2.43.0

View File

@@ -0,0 +1,85 @@
From e3a4f5092e847ec00e2b66c060f2cef52b8d0177 Mon Sep 17 00:00:00 2001
From: Ihor Solodrai <ihor.solodrai@pm.me>
Date: Thu, 14 Nov 2024 12:49:34 -0800
Subject: [PATCH bpf-next] selftests/bpf: set test path for
token/obj_priv_implicit_token_envvar
token/obj_priv_implicit_token_envvar test may fail in an environment
where the process executing tests can not write to the root path.
Example:
https://github.com/libbpf/libbpf/actions/runs/11844507007/job/33007897936
Change default path used by the test to /tmp/bpf-token-fs, and make it
runtime configurable via an environment variable.
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
---
tools/testing/selftests/bpf/prog_tests/token.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c
index fe86e4fdb89c..39f5414b674b 100644
--- a/tools/testing/selftests/bpf/prog_tests/token.c
+++ b/tools/testing/selftests/bpf/prog_tests/token.c
@@ -828,8 +828,11 @@ static int userns_obj_priv_btf_success(int mnt_fd, struct token_lsm *lsm_skel)
return validate_struct_ops_load(mnt_fd, true /* should succeed */);
}
+static const char* token_bpffs_custom_dir() {
+ return getenv("BPF_SELFTESTS_BPF_TOKEN_DIR") ? : "/tmp/bpf-token-fs";
+}
+
#define TOKEN_ENVVAR "LIBBPF_BPF_TOKEN_PATH"
-#define TOKEN_BPFFS_CUSTOM "/bpf-token-fs"
static int userns_obj_priv_implicit_token(int mnt_fd, struct token_lsm *lsm_skel)
{
@@ -892,6 +895,7 @@ static int userns_obj_priv_implicit_token(int mnt_fd, struct token_lsm *lsm_skel
static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *lsm_skel)
{
+ const char *custom_dir = token_bpffs_custom_dir();
LIBBPF_OPTS(bpf_object_open_opts, opts);
struct dummy_st_ops_success *skel;
int err;
@@ -909,10 +913,10 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l
* BPF token implicitly, unless pointed to it through
* LIBBPF_BPF_TOKEN_PATH envvar
*/
- rmdir(TOKEN_BPFFS_CUSTOM);
- if (!ASSERT_OK(mkdir(TOKEN_BPFFS_CUSTOM, 0777), "mkdir_bpffs_custom"))
+ rmdir(custom_dir);
+ if (!ASSERT_OK(mkdir(custom_dir, 0777), "mkdir_bpffs_custom"))
goto err_out;
- err = sys_move_mount(mnt_fd, "", AT_FDCWD, TOKEN_BPFFS_CUSTOM, MOVE_MOUNT_F_EMPTY_PATH);
+ err = sys_move_mount(mnt_fd, "", AT_FDCWD, custom_dir, MOVE_MOUNT_F_EMPTY_PATH);
if (!ASSERT_OK(err, "move_mount_bpffs"))
goto err_out;
@@ -925,7 +929,7 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l
goto err_out;
}
- err = setenv(TOKEN_ENVVAR, TOKEN_BPFFS_CUSTOM, 1 /*overwrite*/);
+ err = setenv(TOKEN_ENVVAR, custom_dir, 1 /*overwrite*/);
if (!ASSERT_OK(err, "setenv_token_path"))
goto err_out;
@@ -951,11 +955,11 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l
if (!ASSERT_ERR(err, "obj_empty_token_path_load"))
goto err_out;
- rmdir(TOKEN_BPFFS_CUSTOM);
+ rmdir(custom_dir);
unsetenv(TOKEN_ENVVAR);
return 0;
err_out:
- rmdir(TOKEN_BPFFS_CUSTOM);
+ rmdir(custom_dir);
unsetenv(TOKEN_ENVVAR);
return -EINVAL;
}
--
2.47.0

View File

@@ -1,56 +0,0 @@
From f267f262815033452195f46c43b572159262f533 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue, 5 Mar 2024 10:08:28 +0100
Subject: [PATCH 2/2] xdp, bonding: Fix feature flags when there are no slave
devs anymore
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Commit 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
changed the driver from reporting everything as supported before a device
was bonded into having the driver report that no XDP feature is supported
until a real device is bonded as it seems to be more truthful given
eventually real underlying devices decide what XDP features are supported.
The change however did not take into account when all slave devices get
removed from the bond device. In this case after 9b0ed890ac2a, the driver
keeps reporting a feature mask of 0x77, that is, NETDEV_XDP_ACT_MASK &
~NETDEV_XDP_ACT_XSK_ZEROCOPY whereas it should have reported a feature
mask of 0.
Fix it by resetting XDP feature flags in the same way as if no XDP program
is attached to the bond device. This was uncovered by the XDP bond selftest
which let BPF CI fail. After adjusting the starting masks on the latter
to 0 instead of NETDEV_XDP_ACT_MASK the test passes again together with
this fix.
Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Prashant Batra <prbatra.mail@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Message-ID: <20240305090829.17131-1-daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
drivers/net/bonding/bond_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index a11748b8d69b..cd0683bcca03 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1811,7 +1811,7 @@ void bond_xdp_set_features(struct net_device *bond_dev)
ASSERT_RTNL();
- if (!bond_xdp_check(bond)) {
+ if (!bond_xdp_check(bond) || !bond_has_slaves(bond)) {
xdp_clear_features_flag(bond_dev);
return;
}
--
2.43.0

View File

@@ -0,0 +1,69 @@
From bd06a13f44e15e2e83561ea165061c445a15bd9e Mon Sep 17 00:00:00 2001
From: Song Liu <song@kernel.org>
Date: Thu, 27 Mar 2025 11:55:28 -0700
Subject: [PATCH 4000/4002] selftests/bpf: Fix tests after fields reorder in
struct file
The change in struct file [1] moved f_ref to the 3rd cache line.
It made *(u64 *)file dereference invalid from the verifier point of view,
because btf_struct_walk() walks into f_lock field, which is 4-byte long.
Fix the selftests to deference the file pointer as a 4-byte access.
[1] commit e249056c91a2 ("fs: place f_ref to 3rd cache line in struct file to resolve false sharing")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250327185528.1740787-1-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/progs/test_module_attach.c | 2 +-
tools/testing/selftests/bpf/progs/test_subprogs_extable.c | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/test_module_attach.c b/tools/testing/selftests/bpf/progs/test_module_attach.c
index fb07f5773888..7f3c233943b3 100644
--- a/tools/testing/selftests/bpf/progs/test_module_attach.c
+++ b/tools/testing/selftests/bpf/progs/test_module_attach.c
@@ -117,7 +117,7 @@ int BPF_PROG(handle_fexit_ret, int arg, struct file *ret)
bpf_probe_read_kernel(&buf, 8, ret);
bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);
- *(volatile long long *)ret;
+ *(volatile int *)ret;
*(volatile int *)&ret->f_mode;
return 0;
}
diff --git a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
index e2a21fbd4e44..dcac69f5928a 100644
--- a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
+++ b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c
@@ -21,7 +21,7 @@ static __u64 test_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *data)
SEC("fexit/bpf_testmod_return_ptr")
int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)
{
- *(volatile long *)ret;
+ *(volatile int *)ret;
*(volatile int *)&ret->f_mode;
bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);
triggered++;
@@ -31,7 +31,7 @@ int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)
SEC("fexit/bpf_testmod_return_ptr")
int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)
{
- *(volatile long *)ret;
+ *(volatile int *)ret;
*(volatile int *)&ret->f_mode;
bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);
triggered++;
@@ -41,7 +41,7 @@ int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)
SEC("fexit/bpf_testmod_return_ptr")
int BPF_PROG(handle_fexit_ret_subprogs3, int arg, struct file *ret)
{
- *(volatile long *)ret;
+ *(volatile int *)ret;
*(volatile int *)&ret->f_mode;
bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);
triggered++;
--
2.49.0

View File

@@ -0,0 +1,71 @@
From 8be3a12f9f266aaf3f06f0cfe0e90cfe4d956f3d Mon Sep 17 00:00:00 2001
From: Song Liu <song@kernel.org>
Date: Fri, 28 Mar 2025 12:31:24 -0700
Subject: [PATCH 4001/4002] selftests/bpf: Fix verifier_bpf_fastcall test
Commit [1] moves percpu data on x86 from address 0x000... to address
0xfff...
Before [1]:
159020: 0000000000030700 0 OBJECT GLOBAL DEFAULT 23 pcpu_hot
After [1]:
152602: ffffffff83a3e034 4 OBJECT GLOBAL DEFAULT 35 pcpu_hot
As a result, verifier_bpf_fastcall tests should now expect a negative
value for pcpu_hot, IOW, the disassemble should show "r=" instead of
"w=".
Fix this in the test.
Note that, a later change created a new variable "cpu_number" for
bpf_get_smp_processor_id() [2]. The inlining logic is updated properly
as part of this change, so there is no need to fix anything on the
kernel side.
[1] commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
[2] commit 01c7bc5198e9 ("x86/smp: Move cpu number to percpu hot section")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250328193124.808784-1-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
index a9be6ae49454..c258b0722e04 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
@@ -12,7 +12,7 @@ SEC("raw_tp")
__arch_x86_64
__log_level(4) __msg("stack depth 8")
__xlated("4: r5 = 5")
-__xlated("5: w0 = ")
+__xlated("5: r0 = ")
__xlated("6: r0 = &(void __percpu *)(r0)")
__xlated("7: r0 = *(u32 *)(r0 +0)")
__xlated("8: exit")
@@ -704,7 +704,7 @@ SEC("raw_tp")
__arch_x86_64
__log_level(4) __msg("stack depth 32+0")
__xlated("2: r1 = 1")
-__xlated("3: w0 =")
+__xlated("3: r0 =")
__xlated("4: r0 = &(void __percpu *)(r0)")
__xlated("5: r0 = *(u32 *)(r0 +0)")
/* bpf_loop params setup */
@@ -753,7 +753,7 @@ __arch_x86_64
__log_level(4) __msg("stack depth 40+0")
/* call bpf_get_smp_processor_id */
__xlated("2: r1 = 42")
-__xlated("3: w0 =")
+__xlated("3: r0 =")
__xlated("4: r0 = &(void __percpu *)(r0)")
__xlated("5: r0 = *(u32 *)(r0 +0)")
/* call bpf_get_prandom_u32 */
--
2.49.0

View File

@@ -0,0 +1,71 @@
From 07be1f644ff9eeb842fd0490ddd824df0828cb0e Mon Sep 17 00:00:00 2001
From: Yonghong Song <yonghong.song@linux.dev>
Date: Sun, 30 Mar 2025 20:38:28 -0700
Subject: [PATCH 4002/4002] selftests/bpf: Fix verifier_private_stack test
failure
Several verifier_private_stack tests failed with latest bpf-next.
For example, for 'Private stack, single prog' subtest, the
jitted code:
func #0:
0: f3 0f 1e fa endbr64
4: 0f 1f 44 00 00 nopl (%rax,%rax)
9: 0f 1f 00 nopl (%rax)
c: 55 pushq %rbp
d: 48 89 e5 movq %rsp, %rbp
10: f3 0f 1e fa endbr64
14: 49 b9 58 74 8a 8f 7d 60 00 00 movabsq $0x607d8f8a7458, %r9
1e: 65 4c 03 0c 25 28 c0 48 87 addq %gs:-0x78b73fd8, %r9
27: bf 2a 00 00 00 movl $0x2a, %edi
2c: 49 89 b9 00 ff ff ff movq %rdi, -0x100(%r9)
33: 31 c0 xorl %eax, %eax
35: c9 leave
36: e9 20 5d 0f e1 jmp 0xffffffffe10f5d5b
The insn 'addq %gs:-0x78b73fd8, %r9' does not match the expected
regex 'addq %gs:0x{{.*}}, %r9' and this caused test failure.
Fix it by changing '%gs:0x{{.*}}' to '%gs:{{.*}}' to accommodate the
possible negative offset. A few other subtests are fixed in a similar way.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250331033828.365077-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
tools/testing/selftests/bpf/progs/verifier_private_stack.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
index b1fbdf119553..fc91b414364e 100644
--- a/tools/testing/selftests/bpf/progs/verifier_private_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
@@ -27,7 +27,7 @@ __description("Private stack, single prog")
__success
__arch_x86_64
__jited(" movabsq $0x{{.*}}, %r9")
-__jited(" addq %gs:0x{{.*}}, %r9")
+__jited(" addq %gs:{{.*}}, %r9")
__jited(" movl $0x2a, %edi")
__jited(" movq %rdi, -0x100(%r9)")
__naked void private_stack_single_prog(void)
@@ -74,7 +74,7 @@ __success
__arch_x86_64
/* private stack fp for the main prog */
__jited(" movabsq $0x{{.*}}, %r9")
-__jited(" addq %gs:0x{{.*}}, %r9")
+__jited(" addq %gs:{{.*}}, %r9")
__jited(" movl $0x2a, %edi")
__jited(" movq %rdi, -0x200(%r9)")
__jited(" pushq %r9")
@@ -122,7 +122,7 @@ __jited(" pushq %rbp")
__jited(" movq %rsp, %rbp")
__jited(" endbr64")
__jited(" movabsq $0x{{.*}}, %r9")
-__jited(" addq %gs:0x{{.*}}, %r9")
+__jited(" addq %gs:{{.*}}, %r9")
__jited(" pushq %r9")
__jited(" callq")
__jited(" popq %r9")
--
2.49.0

View File

@@ -1,8 +0,0 @@
# btf_dump -- need to disable data dump sub-tests
core_retro
cpu_mask
hashmap
legacy_printk
perf_buffer
section_names

View File

@@ -1,49 +0,0 @@
# attach_probe
autoload
bpf_verif_scale
cgroup_attach_autodetach
cgroup_attach_override
core_autosize
core_extern
core_read_macros
core_reloc
core_retro
cpu_mask
endian
get_branch_snapshot
get_stackid_cannot_attach
global_data
global_data_init
global_func_args
hashmap
legacy_printk
linked_funcs
linked_maps
map_lock
obj_name
perf_buffer
perf_event_stackmap
pinning
pkt_md_access
probe_user
queue_stack_map
raw_tp_writable_reject_nbd_invalid
raw_tp_writable_test_run
rdonly_maps
section_names
signal_pending
sockmap_ktls
spinlock
stacktrace_map
stacktrace_map_raw_tp
static_linked
task_fd_query_rawtp
task_fd_query_tp
tc_bpf
tcp_estats
test_global_funcs/arg_tag_ctx*
tp_attach_query
usdt/urand_pid_attach
xdp
xdp_noinline
xdp_perf

View File

@@ -1,5 +0,0 @@
# This complements ALLOWLIST-5.5.0 but excludes subtest that can't work on 5.5
btf # "size check test", "func (Non zero vlen)"
tailcalls # tailcall_bpf2bpf_1, tailcall_bpf2bpf_2, tailcall_bpf2bpf_3
tc_bpf/tc_bpf_non_root

View File

@@ -0,0 +1,37 @@
#!/bin/bash
# This file is sourced by libbpf/ci/run-vmtest Github Action scripts.
# $SELFTESTS_BPF and $VMTEST_CONFIGS are set in the workflow, before
# libbpf/ci/run-vmtest action is called
# See .github/workflows/kernel-test.yml
ALLOWLIST_FILES=(
"${SELFTESTS_BPF}/ALLOWLIST"
"${SELFTESTS_BPF}/ALLOWLIST.${ARCH}"
"${VMTEST_CONFIGS}/ALLOWLIST"
"${VMTEST_CONFIGS}/ALLOWLIST-${KERNEL}"
"${VMTEST_CONFIGS}/ALLOWLIST-${KERNEL}.${ARCH}"
)
DENYLIST_FILES=(
"${SELFTESTS_BPF}/DENYLIST"
"${SELFTESTS_BPF}/DENYLIST.${ARCH}"
"${VMTEST_CONFIGS}/DENYLIST"
"${VMTEST_CONFIGS}/DENYLIST-${KERNEL}"
"${VMTEST_CONFIGS}/DENYLIST-${KERNEL}.${ARCH}"
)
# Export pipe-separated strings, because bash doesn't support array export
export SELFTESTS_BPF_ALLOWLIST_FILES=$(IFS="|"; echo "${ALLOWLIST_FILES[*]}")
export SELFTESTS_BPF_DENYLIST_FILES=$(IFS="|"; echo "${DENYLIST_FILES[*]}")
if [[ "${LLVM_VERSION}" -lt 18 ]]; then
echo "KERNEL_TEST=test_progs test_progs_no_alu32 test_maps test_verifier" >> $GITHUB_ENV
else # all
echo "KERNEL_TEST=test_progs test_progs_cpuv4 test_progs_no_alu32 test_maps test_verifier" >> $GITHUB_ENV
fi
echo "cp -R ${SELFTESTS_BPF} ${GITHUB_WORKSPACE}/selftests"
mkdir -p "${GITHUB_WORKSPACE}/selftests"
cp -R "${SELFTESTS_BPF}" "${GITHUB_WORKSPACE}/selftests"

View File

@@ -1,38 +0,0 @@
# shellcheck shell=bash
# $1 - start or end
# $2 - fold identifier, no spaces
# $3 - fold section description
foldable() {
local YELLOW='\033[1;33m'
local NOCOLOR='\033[0m'
if [ $1 = "start" ]; then
line="::group::$2"
if [ ! -z "${3:-}" ]; then
line="$line - ${YELLOW}$3${NOCOLOR}"
fi
else
line="::endgroup::"
fi
echo -e "$line"
}
__print() {
local TITLE=""
if [[ -n $2 ]]; then
TITLE=" title=$2"
fi
echo "::$1${TITLE}::$3"
}
# $1 - title
# $2 - message
print_error() {
__print error $1 $2
}
# $1 - title
# $2 - message
print_notice() {
__print notice $1 $2
}

View File

@@ -1,96 +0,0 @@
#!/bin/bash
set -euo pipefail
source $(cd $(dirname $0) && pwd)/helpers.sh
ARCH=$(uname -m)
STATUS_FILE=/exitstatus
read_lists() {
(for path in "$@"; do
if [[ -s "$path" ]]; then
cat "$path"
fi;
done) | cut -d'#' -f1 | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | tr -s '\n' ','
}
test_progs() {
if [[ "${KERNEL}" != '4.9.0' ]]; then
foldable start test_progs "Testing test_progs"
# "&& true" does not change the return code (it is not executed
# if the Python script fails), but it prevents exiting on a
# failure due to the "set -e".
./test_progs ${DENYLIST:+-d"$DENYLIST"} ${ALLOWLIST:+-a"$ALLOWLIST"} && true
echo "test_progs:$?" >> "${STATUS_FILE}"
foldable end test_progs
fi
}
test_progs_no_alu32() {
foldable start test_progs-no_alu32 "Testing test_progs-no_alu32"
./test_progs-no_alu32 ${DENYLIST:+-d"$DENYLIST"} ${ALLOWLIST:+-a"$ALLOWLIST"} && true
echo "test_progs-no_alu32:$?" >> "${STATUS_FILE}"
foldable end test_progs-no_alu32
}
test_maps() {
if [[ "${KERNEL}" == 'latest' ]]; then
foldable start test_maps "Testing test_maps"
./test_maps && true
echo "test_maps:$?" >> "${STATUS_FILE}"
foldable end test_maps
fi
}
test_verifier() {
if [[ "${KERNEL}" == 'latest' ]]; then
foldable start test_verifier "Testing test_verifier"
./test_verifier && true
echo "test_verifier:$?" >> "${STATUS_FILE}"
foldable end test_verifier
fi
}
foldable end vm_init
foldable start kernel_config "Kconfig"
zcat /proc/config.gz
foldable end kernel_config
configs_path=/${PROJECT_NAME}/selftests/bpf
local_configs_path=${PROJECT_NAME}/vmtest/configs
DENYLIST=$(read_lists \
"$configs_path/DENYLIST" \
"$configs_path/DENYLIST.${ARCH}" \
"$local_configs_path/DENYLIST" \
"$local_configs_path/DENYLIST-${KERNEL}" \
"$local_configs_path/DENYLIST-${KERNEL}.${ARCH}" \
)
ALLOWLIST=$(read_lists \
"$configs_path/ALLOWLIST" \
"$configs_path/ALLOWLIST.${ARCH}" \
"$local_configs_path/ALLOWLIST" \
"$local_configs_path/ALLOWLIST-${KERNEL}" \
"$local_configs_path/ALLOWLIST-${KERNEL}.${ARCH}" \
)
echo "DENYLIST: ${DENYLIST}"
echo "ALLOWLIST: ${ALLOWLIST}"
cd ${PROJECT_NAME}/selftests/bpf
if [ $# -eq 0 ]; then
test_progs
test_progs_no_alu32
# test_maps
test_verifier
else
for test_name in "$@"; do
"${test_name}"
done
fi

View File

@@ -121,6 +121,8 @@ described in more detail in the footnotes.
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_LWT_XMIT`` | | ``lwt_xmit`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_NETFILTER`` | | ``netfilter`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_PERF_EVENT`` | | ``perf_event`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE`` | | ``raw_tp.w+`` [#rawtp]_ | |
@@ -131,11 +133,23 @@ described in more detail in the footnotes.
+ + +----------------------------------+-----------+
| | | ``raw_tracepoint+`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_SCHED_ACT`` | | ``action`` | |
| ``BPF_PROG_TYPE_SCHED_ACT`` | | ``action`` [#tc_legacy]_ | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_SCHED_CLS`` | | ``classifier`` | |
| ``BPF_PROG_TYPE_SCHED_CLS`` | | ``classifier`` [#tc_legacy]_ | |
+ + +----------------------------------+-----------+
| | | ``tc`` | |
| | | ``tc`` [#tc_legacy]_ | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_NETKIT_PRIMARY`` | ``netkit/primary`` | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_NETKIT_PEER`` | ``netkit/peer`` | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TCX_INGRESS`` | ``tc/ingress`` | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TCX_EGRESS`` | ``tc/egress`` | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TCX_INGRESS`` | ``tcx/ingress`` | |
+ +----------------------------------------+----------------------------------+-----------+
| | ``BPF_TCX_EGRESS`` | ``tcx/egress`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_SK_LOOKUP`` | ``BPF_SK_LOOKUP`` | ``sk_lookup`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
@@ -155,7 +169,9 @@ described in more detail in the footnotes.
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_SOCK_OPS`` | ``BPF_CGROUP_SOCK_OPS`` | ``sockops`` | |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_STRUCT_OPS`` | | ``struct_ops+`` | |
| ``BPF_PROG_TYPE_STRUCT_OPS`` | | ``struct_ops+`` [#struct_ops]_ | |
+ + +----------------------------------+-----------+
| | | ``struct_ops.s+`` [#struct_ops]_ | Yes |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
| ``BPF_PROG_TYPE_SYSCALL`` | | ``syscall`` | Yes |
+-------------------------------------------+----------------------------------------+----------------------------------+-----------+
@@ -209,5 +225,11 @@ described in more detail in the footnotes.
``a-zA-Z0-9_.*?``.
.. [#lsm] The ``lsm`` attachment format is ``lsm[.s]/<hook>``.
.. [#rawtp] The ``raw_tp`` attach format is ``raw_tracepoint[.w]/<tracepoint>``.
.. [#tc_legacy] The ``tc``, ``classifier`` and ``action`` attach types are deprecated, use
``tcx/*`` instead.
.. [#struct_ops] The ``struct_ops`` attach format supports ``struct_ops[.s]/<name>`` convention,
but ``name`` is ignored and it is recommended to just use plain
``SEC("struct_ops[.s]")``. The attachments are defined in a struct initializer
that is tagged with ``SEC(".struct_ops[.link]")``.
.. [#tp] The ``tracepoint`` attach format is ``tracepoint/<category>/<name>``.
.. [#iter] The ``iter`` attach format is ``iter[.s]/<struct-name>``.

View File

@@ -51,6 +51,9 @@
#define BPF_XCHG (0xe0 | BPF_FETCH) /* atomic exchange */
#define BPF_CMPXCHG (0xf0 | BPF_FETCH) /* atomic compare-and-write */
#define BPF_LOAD_ACQ 0x100 /* load-acquire */
#define BPF_STORE_REL 0x110 /* store-release */
enum bpf_cond_pseudo_jmp {
BPF_MAY_GOTO = 0,
};
@@ -1116,11 +1119,15 @@ enum bpf_attach_type {
BPF_NETKIT_PRIMARY,
BPF_NETKIT_PEER,
BPF_TRACE_KPROBE_SESSION,
BPF_TRACE_UPROBE_SESSION,
__MAX_BPF_ATTACH_TYPE
};
#define MAX_BPF_ATTACH_TYPE __MAX_BPF_ATTACH_TYPE
/* Add BPF_LINK_TYPE(type, name) in bpf_types.h to keep bpf_link_type_strs[]
* in sync with the definitions below.
*/
enum bpf_link_type {
BPF_LINK_TYPE_UNSPEC = 0,
BPF_LINK_TYPE_RAW_TRACEPOINT = 1,
@@ -1203,6 +1210,7 @@ enum bpf_perf_event_type {
#define BPF_F_BEFORE (1U << 3)
#define BPF_F_AFTER (1U << 4)
#define BPF_F_ID (1U << 5)
#define BPF_F_PREORDER (1U << 6)
#define BPF_F_LINK BPF_F_LINK /* 1 << 13 */
/* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
@@ -1498,7 +1506,7 @@ union bpf_attr {
__s32 map_token_fd;
};
struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
struct { /* anonymous struct used by BPF_MAP_*_ELEM and BPF_MAP_FREEZE commands */
__u32 map_fd;
__aligned_u64 key;
union {
@@ -1569,6 +1577,16 @@ union bpf_attr {
* If provided, prog_flags should have BPF_F_TOKEN_FD flag set.
*/
__s32 prog_token_fd;
/* The fd_array_cnt can be used to pass the length of the
* fd_array array. In this case all the [map] file descriptors
* passed in this array will be bound to the program, even if
* the maps are not referenced directly. The functionality is
* similar to the BPF_PROG_BIND_MAP syscall, but maps can be
* used by the verifier during the program load. If provided,
* then the fd_array[0,...,fd_array_cnt-1] is expected to be
* continuous.
*/
__u32 fd_array_cnt;
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -1634,6 +1652,7 @@ union bpf_attr {
};
__u32 next_id;
__u32 open_flags;
__s32 fd_by_id_token_fd;
};
struct { /* anonymous struct used by BPF_OBJ_GET_INFO_BY_FD */
@@ -1970,15 +1989,21 @@ union bpf_attr {
* program.
* Return
* The SMP id of the processor running the program.
* Attributes
* __bpf_fastcall
*
* long bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from, u32 len, u64 flags)
* Description
* Store *len* bytes from address *from* into the packet
* associated to *skb*, at *offset*. *flags* are a combination of
* **BPF_F_RECOMPUTE_CSUM** (automatically recompute the
* checksum for the packet after storing the bytes) and
* **BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
* **->swhash** and *skb*\ **->l4hash** to 0).
* associated to *skb*, at *offset*. The *flags* are a combination
* of the following values:
*
* **BPF_F_RECOMPUTE_CSUM**
* Automatically update *skb*\ **->csum** after storing the
* bytes.
* **BPF_F_INVALIDATE_HASH**
* Set *skb*\ **->hash**, *skb*\ **->swhash** and *skb*\
* **->l4hash** to 0.
*
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
@@ -2030,7 +2055,7 @@ union bpf_attr {
* untouched (unless **BPF_F_MARK_ENFORCE** is added as well), and
* for updates resulting in a null checksum the value is set to
* **CSUM_MANGLED_0** instead. Flag **BPF_F_PSEUDO_HDR** indicates
* the checksum is to be computed against a pseudo-header.
* that the modified header field is part of the pseudo-header.
*
* This helper works in combination with **bpf_csum_diff**\ (),
* which does not update the checksum in-place, but offers more
@@ -2851,7 +2876,7 @@ union bpf_attr {
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
* **TCP_BPF_RTO_MIN**.
* **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -3101,10 +3126,6 @@ union bpf_attr {
* with the **CONFIG_BPF_KPROBE_OVERRIDE** configuration
* option, and in this case it only works on functions tagged with
* **ALLOW_ERROR_INJECTION** in the kernel code.
*
* Also, the helper is only available for the architectures having
* the CONFIG_FUNCTION_ERROR_INJECTION option. As of this writing,
* x86 architecture is the only one to support this feature.
* Return
* 0
*
@@ -5369,7 +5390,7 @@ union bpf_attr {
* Currently, the **flags** must be 0. Currently, nr_loops is
* limited to 1 << 23 (~8 million) loops.
*
* long (\*callback_fn)(u32 index, void \*ctx);
* long (\*callback_fn)(u64 index, void \*ctx);
*
* where **index** is the current index in the loop. The index
* is zero-indexed.
@@ -5519,11 +5540,12 @@ union bpf_attr {
* **-EOPNOTSUPP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
*
* void *bpf_kptr_xchg(void *map_value, void *ptr)
* void *bpf_kptr_xchg(void *dst, void *ptr)
* Description
* Exchange kptr at pointer *map_value* with *ptr*, and return the
* old value. *ptr* can be NULL, otherwise it must be a referenced
* pointer which will be released when this helper is called.
* Exchange kptr at pointer *dst* with *ptr*, and return the old value.
* *dst* can be map value or local kptr. *ptr* can be NULL, otherwise
* it must be a referenced pointer which will be released when this helper
* is called.
* Return
* The old value of kptr (which can be NULL). The returned pointer
* if not NULL, is a reference which must be released using its
@@ -6006,7 +6028,10 @@ union bpf_attr {
FN(user_ringbuf_drain, 209, ##ctx) \
FN(cgrp_storage_get, 210, ##ctx) \
FN(cgrp_storage_delete, 211, ##ctx) \
/* */
/* This helper list is effectively frozen. If you are trying to \
* add a new helper, you should add a kfunc instead which has \
* less stability guarantees. See Documentation/bpf/kfuncs.rst \
*/
/* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
* know or care about integer value that is now passed as second argument
@@ -6046,11 +6071,6 @@ enum {
BPF_F_MARK_ENFORCE = (1ULL << 6),
};
/* BPF_FUNC_clone_redirect and BPF_FUNC_redirect flags. */
enum {
BPF_F_INGRESS = (1ULL << 0),
};
/* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */
enum {
BPF_F_TUNINFO_IPV6 = (1ULL << 0),
@@ -6197,10 +6217,12 @@ enum {
BPF_F_BPRM_SECUREEXEC = (1ULL << 0),
};
/* Flags for bpf_redirect_map helper */
/* Flags for bpf_redirect and bpf_redirect_map helpers */
enum {
BPF_F_BROADCAST = (1ULL << 3),
BPF_F_EXCLUDE_INGRESS = (1ULL << 4),
BPF_F_INGRESS = (1ULL << 0), /* used for skb path */
BPF_F_BROADCAST = (1ULL << 3), /* used for XDP path */
BPF_F_EXCLUDE_INGRESS = (1ULL << 4), /* used for XDP path */
#define BPF_F_REDIRECT_FLAGS (BPF_F_INGRESS | BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS)
};
#define __bpf_md_ptr(type, name) \
@@ -6702,6 +6724,7 @@ struct bpf_link_info {
__u32 name_len;
__u32 offset; /* offset from file_name */
__u64 cookie;
__u64 ref_ctr_offset;
} uprobe; /* BPF_PERF_EVENT_UPROBE, BPF_PERF_EVENT_URETPROBE */
struct {
__aligned_u64 func_name; /* in/out */
@@ -6903,6 +6926,12 @@ enum {
BPF_SOCK_OPS_ALL_CB_FLAGS = 0x7F,
};
enum {
SK_BPF_CB_TX_TIMESTAMPING = 1<<0,
SK_BPF_CB_MASK = (SK_BPF_CB_TX_TIMESTAMPING - 1) |
SK_BPF_CB_TX_TIMESTAMPING
};
/* List of known BPF sock_ops operators.
* New entries can only be added at the end
*/
@@ -7015,6 +7044,29 @@ enum {
* by the kernel or the
* earlier bpf-progs.
*/
BPF_SOCK_OPS_TSTAMP_SCHED_CB, /* Called when skb is passing
* through dev layer when
* SK_BPF_CB_TX_TIMESTAMPING
* feature is on.
*/
BPF_SOCK_OPS_TSTAMP_SND_SW_CB, /* Called when skb is about to send
* to the nic when SK_BPF_CB_TX_TIMESTAMPING
* feature is on.
*/
BPF_SOCK_OPS_TSTAMP_SND_HW_CB, /* Called in hardware phase when
* SK_BPF_CB_TX_TIMESTAMPING feature
* is on.
*/
BPF_SOCK_OPS_TSTAMP_ACK_CB, /* Called when all the skbs in the
* same sendmsg call are acked
* when SK_BPF_CB_TX_TIMESTAMPING
* feature is on.
*/
BPF_SOCK_OPS_TSTAMP_SENDMSG_CB, /* Called when every sendmsg syscall
* is triggered. It's used to correlate
* sendmsg timestamp with corresponding
* tskey.
*/
};
/* List of TCP states. There is a build check in net/ipv4/tcp.c to detect
@@ -7080,6 +7132,8 @@ enum {
TCP_BPF_SYN = 1005, /* Copy the TCP header */
TCP_BPF_SYN_IP = 1006, /* Copy the IP[46] and TCP header */
TCP_BPF_SYN_MAC = 1007, /* Copy the MAC, IP[46], and TCP header */
TCP_BPF_SOCK_OPS_CB_FLAGS = 1008, /* Get or Set TCP sock ops flags */
SK_BPF_CB_FLAGS = 1009, /* Get or set sock ops flags in socket */
};
enum {

View File

@@ -36,7 +36,8 @@ struct btf_type {
* bits 24-28: kind (e.g. int, ptr, array...etc)
* bits 29-30: unused
* bit 31: kind_flag, currently used by
* struct, union, enum, fwd and enum64
* struct, union, enum, fwd, enum64,
* decl_tag and type_tag
*/
__u32 info;
/* "size" is used by INT, ENUM, STRUCT, UNION, DATASEC and ENUM64.

View File

@@ -377,6 +377,7 @@ enum {
IFLA_GSO_IPV4_MAX_SIZE,
IFLA_GRO_IPV4_MAX_SIZE,
IFLA_DPLL_PIN,
IFLA_MAX_PACING_OFFLOAD_HORIZON,
__IFLA_MAX
};
@@ -461,6 +462,286 @@ enum in6_addr_gen_mode {
/* Bridge section */
/**
* DOC: Bridge enum definition
*
* Please *note* that the timer values in the following section are expected
* in clock_t format, which is seconds multiplied by USER_HZ (generally
* defined as 100).
*
* @IFLA_BR_FORWARD_DELAY
* The bridge forwarding delay is the time spent in LISTENING state
* (before moving to LEARNING) and in LEARNING state (before moving
* to FORWARDING). Only relevant if STP is enabled.
*
* The valid values are between (2 * USER_HZ) and (30 * USER_HZ).
* The default value is (15 * USER_HZ).
*
* @IFLA_BR_HELLO_TIME
* The time between hello packets sent by the bridge, when it is a root
* bridge or a designated bridge. Only relevant if STP is enabled.
*
* The valid values are between (1 * USER_HZ) and (10 * USER_HZ).
* The default value is (2 * USER_HZ).
*
* @IFLA_BR_MAX_AGE
* The hello packet timeout is the time until another bridge in the
* spanning tree is assumed to be dead, after reception of its last hello
* message. Only relevant if STP is enabled.
*
* The valid values are between (6 * USER_HZ) and (40 * USER_HZ).
* The default value is (20 * USER_HZ).
*
* @IFLA_BR_AGEING_TIME
* Configure the bridge's FDB entries aging time. It is the time a MAC
* address will be kept in the FDB after a packet has been received from
* that address. After this time has passed, entries are cleaned up.
* Allow values outside the 802.1 standard specification for special cases:
*
* * 0 - entry never ages (all permanent)
* * 1 - entry disappears (no persistence)
*
* The default value is (300 * USER_HZ).
*
* @IFLA_BR_STP_STATE
* Turn spanning tree protocol on (*IFLA_BR_STP_STATE* > 0) or off
* (*IFLA_BR_STP_STATE* == 0) for this bridge.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_PRIORITY
* Set this bridge's spanning tree priority, used during STP root bridge
* election.
*
* The valid values are between 0 and 65535.
*
* @IFLA_BR_VLAN_FILTERING
* Turn VLAN filtering on (*IFLA_BR_VLAN_FILTERING* > 0) or off
* (*IFLA_BR_VLAN_FILTERING* == 0). When disabled, the bridge will not
* consider the VLAN tag when handling packets.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_VLAN_PROTOCOL
* Set the protocol used for VLAN filtering.
*
* The valid values are 0x8100(802.1Q) or 0x88A8(802.1AD). The default value
* is 0x8100(802.1Q).
*
* @IFLA_BR_GROUP_FWD_MASK
* The group forwarding mask. This is the bitmask that is applied to
* decide whether to forward incoming frames destined to link-local
* addresses (of the form 01:80:C2:00:00:0X).
*
* The default value is 0, which means the bridge does not forward any
* link-local frames coming on this port.
*
* @IFLA_BR_ROOT_ID
* The bridge root id, read only.
*
* @IFLA_BR_BRIDGE_ID
* The bridge id, read only.
*
* @IFLA_BR_ROOT_PORT
* The bridge root port, read only.
*
* @IFLA_BR_ROOT_PATH_COST
* The bridge root path cost, read only.
*
* @IFLA_BR_TOPOLOGY_CHANGE
* The bridge topology change, read only.
*
* @IFLA_BR_TOPOLOGY_CHANGE_DETECTED
* The bridge topology change detected, read only.
*
* @IFLA_BR_HELLO_TIMER
* The bridge hello timer, read only.
*
* @IFLA_BR_TCN_TIMER
* The bridge tcn timer, read only.
*
* @IFLA_BR_TOPOLOGY_CHANGE_TIMER
* The bridge topology change timer, read only.
*
* @IFLA_BR_GC_TIMER
* The bridge gc timer, read only.
*
* @IFLA_BR_GROUP_ADDR
* Set the MAC address of the multicast group this bridge uses for STP.
* The address must be a link-local address in standard Ethernet MAC address
* format. It is an address of the form 01:80:C2:00:00:0X, with X in [0, 4..f].
*
* The default value is 0.
*
* @IFLA_BR_FDB_FLUSH
* Flush bridge's fdb dynamic entries.
*
* @IFLA_BR_MCAST_ROUTER
* Set bridge's multicast router if IGMP snooping is enabled.
* The valid values are:
*
* * 0 - disabled.
* * 1 - automatic (queried).
* * 2 - permanently enabled.
*
* The default value is 1.
*
* @IFLA_BR_MCAST_SNOOPING
* Turn multicast snooping on (*IFLA_BR_MCAST_SNOOPING* > 0) or off
* (*IFLA_BR_MCAST_SNOOPING* == 0).
*
* The default value is 1.
*
* @IFLA_BR_MCAST_QUERY_USE_IFADDR
* If enabled use the bridge's own IP address as source address for IGMP
* queries (*IFLA_BR_MCAST_QUERY_USE_IFADDR* > 0) or the default of 0.0.0.0
* (*IFLA_BR_MCAST_QUERY_USE_IFADDR* == 0).
*
* The default value is 0 (disabled).
*
* @IFLA_BR_MCAST_QUERIER
* Enable (*IFLA_BR_MULTICAST_QUERIER* > 0) or disable
* (*IFLA_BR_MULTICAST_QUERIER* == 0) IGMP querier, ie sending of multicast
* queries by the bridge.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_MCAST_HASH_ELASTICITY
* Set multicast database hash elasticity, It is the maximum chain length in
* the multicast hash table. This attribute is *deprecated* and the value
* is always 16.
*
* @IFLA_BR_MCAST_HASH_MAX
* Set maximum size of the multicast hash table
*
* The default value is 4096, the value must be a power of 2.
*
* @IFLA_BR_MCAST_LAST_MEMBER_CNT
* The Last Member Query Count is the number of Group-Specific Queries
* sent before the router assumes there are no local members. The Last
* Member Query Count is also the number of Group-and-Source-Specific
* Queries sent before the router assumes there are no listeners for a
* particular source.
*
* The default value is 2.
*
* @IFLA_BR_MCAST_STARTUP_QUERY_CNT
* The Startup Query Count is the number of Queries sent out on startup,
* separated by the Startup Query Interval.
*
* The default value is 2.
*
* @IFLA_BR_MCAST_LAST_MEMBER_INTVL
* The Last Member Query Interval is the Max Response Time inserted into
* Group-Specific Queries sent in response to Leave Group messages, and
* is also the amount of time between Group-Specific Query messages.
*
* The default value is (1 * USER_HZ).
*
* @IFLA_BR_MCAST_MEMBERSHIP_INTVL
* The interval after which the bridge will leave a group, if no membership
* reports for this group are received.
*
* The default value is (260 * USER_HZ).
*
* @IFLA_BR_MCAST_QUERIER_INTVL
* The interval between queries sent by other routers. if no queries are
* seen after this delay has passed, the bridge will start to send its own
* queries (as if *IFLA_BR_MCAST_QUERIER_INTVL* was enabled).
*
* The default value is (255 * USER_HZ).
*
* @IFLA_BR_MCAST_QUERY_INTVL
* The Query Interval is the interval between General Queries sent by
* the Querier.
*
* The default value is (125 * USER_HZ). The minimum value is (1 * USER_HZ).
*
* @IFLA_BR_MCAST_QUERY_RESPONSE_INTVL
* The Max Response Time used to calculate the Max Resp Code inserted
* into the periodic General Queries.
*
* The default value is (10 * USER_HZ).
*
* @IFLA_BR_MCAST_STARTUP_QUERY_INTVL
* The interval between queries in the startup phase.
*
* The default value is (125 * USER_HZ) / 4. The minimum value is (1 * USER_HZ).
*
* @IFLA_BR_NF_CALL_IPTABLES
* Enable (*NF_CALL_IPTABLES* > 0) or disable (*NF_CALL_IPTABLES* == 0)
* iptables hooks on the bridge.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_NF_CALL_IP6TABLES
* Enable (*NF_CALL_IP6TABLES* > 0) or disable (*NF_CALL_IP6TABLES* == 0)
* ip6tables hooks on the bridge.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_NF_CALL_ARPTABLES
* Enable (*NF_CALL_ARPTABLES* > 0) or disable (*NF_CALL_ARPTABLES* == 0)
* arptables hooks on the bridge.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_VLAN_DEFAULT_PVID
* VLAN ID applied to untagged and priority-tagged incoming packets.
*
* The default value is 1. Setting to the special value 0 makes all ports of
* this bridge not have a PVID by default, which means that they will
* not accept VLAN-untagged traffic.
*
* @IFLA_BR_PAD
* Bridge attribute padding type for netlink message.
*
* @IFLA_BR_VLAN_STATS_ENABLED
* Enable (*IFLA_BR_VLAN_STATS_ENABLED* == 1) or disable
* (*IFLA_BR_VLAN_STATS_ENABLED* == 0) per-VLAN stats accounting.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_MCAST_STATS_ENABLED
* Enable (*IFLA_BR_MCAST_STATS_ENABLED* > 0) or disable
* (*IFLA_BR_MCAST_STATS_ENABLED* == 0) multicast (IGMP/MLD) stats
* accounting.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_MCAST_IGMP_VERSION
* Set the IGMP version.
*
* The valid values are 2 and 3. The default value is 2.
*
* @IFLA_BR_MCAST_MLD_VERSION
* Set the MLD version.
*
* The valid values are 1 and 2. The default value is 1.
*
* @IFLA_BR_VLAN_STATS_PER_PORT
* Enable (*IFLA_BR_VLAN_STATS_PER_PORT* == 1) or disable
* (*IFLA_BR_VLAN_STATS_PER_PORT* == 0) per-VLAN per-port stats accounting.
* Can be changed only when there are no port VLANs configured.
*
* The default value is 0 (disabled).
*
* @IFLA_BR_MULTI_BOOLOPT
* The multi_boolopt is used to control new boolean options to avoid adding
* new netlink attributes. You can look at ``enum br_boolopt_id`` for those
* options.
*
* @IFLA_BR_MCAST_QUERIER_STATE
* Bridge mcast querier states, read only.
*
* @IFLA_BR_FDB_N_LEARNED
* The number of dynamically learned FDB entries for the current bridge,
* read only.
*
* @IFLA_BR_FDB_MAX_LEARNED
* Set the number of max dynamically learned FDB entries for the current
* bridge.
*/
enum {
IFLA_BR_UNSPEC,
IFLA_BR_FORWARD_DELAY,
@@ -510,6 +791,8 @@ enum {
IFLA_BR_VLAN_STATS_PER_PORT,
IFLA_BR_MULTI_BOOLOPT,
IFLA_BR_MCAST_QUERIER_STATE,
IFLA_BR_FDB_N_LEARNED,
IFLA_BR_FDB_MAX_LEARNED,
__IFLA_BR_MAX,
};
@@ -520,11 +803,252 @@ struct ifla_bridge_id {
__u8 addr[6]; /* ETH_ALEN */
};
/**
* DOC: Bridge mode enum definition
*
* @BRIDGE_MODE_HAIRPIN
* Controls whether traffic may be sent back out of the port on which it
* was received. This option is also called reflective relay mode, and is
* used to support basic VEPA (Virtual Ethernet Port Aggregator)
* capabilities. By default, this flag is turned off and the bridge will
* not forward traffic back out of the receiving port.
*/
enum {
BRIDGE_MODE_UNSPEC,
BRIDGE_MODE_HAIRPIN,
};
/**
* DOC: Bridge port enum definition
*
* @IFLA_BRPORT_STATE
* The operation state of the port. Here are the valid values.
*
* * 0 - port is in STP *DISABLED* state. Make this port completely
* inactive for STP. This is also called BPDU filter and could be used
* to disable STP on an untrusted port, like a leaf virtual device.
* The traffic forwarding is also stopped on this port.
* * 1 - port is in STP *LISTENING* state. Only valid if STP is enabled
* on the bridge. In this state the port listens for STP BPDUs and
* drops all other traffic frames.
* * 2 - port is in STP *LEARNING* state. Only valid if STP is enabled on
* the bridge. In this state the port will accept traffic only for the
* purpose of updating MAC address tables.
* * 3 - port is in STP *FORWARDING* state. Port is fully active.
* * 4 - port is in STP *BLOCKING* state. Only valid if STP is enabled on
* the bridge. This state is used during the STP election process.
* In this state, port will only process STP BPDUs.
*
* @IFLA_BRPORT_PRIORITY
* The STP port priority. The valid values are between 0 and 255.
*
* @IFLA_BRPORT_COST
* The STP path cost of the port. The valid values are between 1 and 65535.
*
* @IFLA_BRPORT_MODE
* Set the bridge port mode. See *BRIDGE_MODE_HAIRPIN* for more details.
*
* @IFLA_BRPORT_GUARD
* Controls whether STP BPDUs will be processed by the bridge port. By
* default, the flag is turned off to allow BPDU processing. Turning this
* flag on will disable the bridge port if a STP BPDU packet is received.
*
* If the bridge has Spanning Tree enabled, hostile devices on the network
* may send BPDU on a port and cause network failure. Setting *guard on*
* will detect and stop this by disabling the port. The port will be
* restarted if the link is brought down, or removed and reattached.
*
* @IFLA_BRPORT_PROTECT
* Controls whether a given port is allowed to become a root port or not.
* Only used when STP is enabled on the bridge. By default the flag is off.
*
* This feature is also called root port guard. If BPDU is received from a
* leaf (edge) port, it should not be elected as root port. This could
* be used if using STP on a bridge and the downstream bridges are not fully
* trusted; this prevents a hostile guest from rerouting traffic.
*
* @IFLA_BRPORT_FAST_LEAVE
* This flag allows the bridge to immediately stop multicast traffic
* forwarding on a port that receives an IGMP Leave message. It is only used
* when IGMP snooping is enabled on the bridge. By default the flag is off.
*
* @IFLA_BRPORT_LEARNING
* Controls whether a given port will learn *source* MAC addresses from
* received traffic or not. Also controls whether dynamic FDB entries
* (which can also be added by software) will be refreshed by incoming
* traffic. By default this flag is on.
*
* @IFLA_BRPORT_UNICAST_FLOOD
* Controls whether unicast traffic for which there is no FDB entry will
* be flooded towards this port. By default this flag is on.
*
* @IFLA_BRPORT_PROXYARP
* Enable proxy ARP on this port.
*
* @IFLA_BRPORT_LEARNING_SYNC
* Controls whether a given port will sync MAC addresses learned on device
* port to bridge FDB.
*
* @IFLA_BRPORT_PROXYARP_WIFI
* Enable proxy ARP on this port which meets extended requirements by
* IEEE 802.11 and Hotspot 2.0 specifications.
*
* @IFLA_BRPORT_ROOT_ID
*
* @IFLA_BRPORT_BRIDGE_ID
*
* @IFLA_BRPORT_DESIGNATED_PORT
*
* @IFLA_BRPORT_DESIGNATED_COST
*
* @IFLA_BRPORT_ID
*
* @IFLA_BRPORT_NO
*
* @IFLA_BRPORT_TOPOLOGY_CHANGE_ACK
*
* @IFLA_BRPORT_CONFIG_PENDING
*
* @IFLA_BRPORT_MESSAGE_AGE_TIMER
*
* @IFLA_BRPORT_FORWARD_DELAY_TIMER
*
* @IFLA_BRPORT_HOLD_TIMER
*
* @IFLA_BRPORT_FLUSH
* Flush bridge ports' fdb dynamic entries.
*
* @IFLA_BRPORT_MULTICAST_ROUTER
* Configure the port's multicast router presence. A port with
* a multicast router will receive all multicast traffic.
* The valid values are:
*
* * 0 disable multicast routers on this port
* * 1 let the system detect the presence of routers (default)
* * 2 permanently enable multicast traffic forwarding on this port
* * 3 enable multicast routers temporarily on this port, not depending
* on incoming queries.
*
* @IFLA_BRPORT_PAD
*
* @IFLA_BRPORT_MCAST_FLOOD
* Controls whether a given port will flood multicast traffic for which
* there is no MDB entry. By default this flag is on.
*
* @IFLA_BRPORT_MCAST_TO_UCAST
* Controls whether a given port will replicate packets using unicast
* instead of multicast. By default this flag is off.
*
* This is done by copying the packet per host and changing the multicast
* destination MAC to a unicast one accordingly.
*
* *mcast_to_unicast* works on top of the multicast snooping feature of the
* bridge. Which means unicast copies are only delivered to hosts which
* are interested in unicast and signaled this via IGMP/MLD reports previously.
*
* This feature is intended for interface types which have a more reliable
* and/or efficient way to deliver unicast packets than broadcast ones
* (e.g. WiFi).
*
* However, it should only be enabled on interfaces where no IGMPv2/MLDv1
* report suppression takes place. IGMP/MLD report suppression issue is
* usually overcome by the network daemon (supplicant) enabling AP isolation
* and by that separating all STAs.
*
* Delivery of STA-to-STA IP multicast is made possible again by enabling
* and utilizing the bridge hairpin mode, which considers the incoming port
* as a potential outgoing port, too (see *BRIDGE_MODE_HAIRPIN* option).
* Hairpin mode is performed after multicast snooping, therefore leading
* to only deliver reports to STAs running a multicast router.
*
* @IFLA_BRPORT_VLAN_TUNNEL
* Controls whether vlan to tunnel mapping is enabled on the port.
* By default this flag is off.
*
* @IFLA_BRPORT_BCAST_FLOOD
* Controls flooding of broadcast traffic on the given port. By default
* this flag is on.
*
* @IFLA_BRPORT_GROUP_FWD_MASK
* Set the group forward mask. This is a bitmask that is applied to
* decide whether to forward incoming frames destined to link-local
* addresses. The addresses of the form are 01:80:C2:00:00:0X (defaults
* to 0, which means the bridge does not forward any link-local frames
* coming on this port).
*
* @IFLA_BRPORT_NEIGH_SUPPRESS
* Controls whether neighbor discovery (arp and nd) proxy and suppression
* is enabled on the port. By default this flag is off.
*
* @IFLA_BRPORT_ISOLATED
* Controls whether a given port will be isolated, which means it will be
* able to communicate with non-isolated ports only. By default this
* flag is off.
*
* @IFLA_BRPORT_BACKUP_PORT
* Set a backup port. If the port loses carrier all traffic will be
* redirected to the configured backup port. Set the value to 0 to disable
* it.
*
* @IFLA_BRPORT_MRP_RING_OPEN
*
* @IFLA_BRPORT_MRP_IN_OPEN
*
* @IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT
* The number of per-port EHT hosts limit. The default value is 512.
* Setting to 0 is not allowed.
*
* @IFLA_BRPORT_MCAST_EHT_HOSTS_CNT
* The current number of tracked hosts, read only.
*
* @IFLA_BRPORT_LOCKED
* Controls whether a port will be locked, meaning that hosts behind the
* port will not be able to communicate through the port unless an FDB
* entry with the unit's MAC address is in the FDB. The common use case is
* that hosts are allowed access through authentication with the IEEE 802.1X
* protocol or based on whitelists. By default this flag is off.
*
* Please note that secure 802.1X deployments should always use the
* *BR_BOOLOPT_NO_LL_LEARN* flag, to not permit the bridge to populate its
* FDB based on link-local (EAPOL) traffic received on the port.
*
* @IFLA_BRPORT_MAB
* Controls whether a port will use MAC Authentication Bypass (MAB), a
* technique through which select MAC addresses may be allowed on a locked
* port, without using 802.1X authentication. Packets with an unknown source
* MAC address generates a "locked" FDB entry on the incoming bridge port.
* The common use case is for user space to react to these bridge FDB
* notifications and optionally replace the locked FDB entry with a normal
* one, allowing traffic to pass for whitelisted MAC addresses.
*
* Setting this flag also requires *IFLA_BRPORT_LOCKED* and
* *IFLA_BRPORT_LEARNING*. *IFLA_BRPORT_LOCKED* ensures that unauthorized
* data packets are dropped, and *IFLA_BRPORT_LEARNING* allows the dynamic
* FDB entries installed by user space (as replacements for the locked FDB
* entries) to be refreshed and/or aged out.
*
* @IFLA_BRPORT_MCAST_N_GROUPS
*
* @IFLA_BRPORT_MCAST_MAX_GROUPS
* Sets the maximum number of MDB entries that can be registered for a
* given port. Attempts to register more MDB entries at the port than this
* limit allows will be rejected, whether they are done through netlink
* (e.g. the bridge tool), or IGMP or MLD membership reports. Setting a
* limit of 0 disables the limit. The default value is 0.
*
* @IFLA_BRPORT_NEIGH_VLAN_SUPPRESS
* Controls whether neighbor discovery (arp and nd) proxy and suppression is
* enabled for a given port. By default this flag is off.
*
* Note that this option only takes effect when *IFLA_BRPORT_NEIGH_SUPPRESS*
* is enabled for a given port.
*
* @IFLA_BRPORT_BACKUP_NHID
* The FDB nexthop object ID to attach to packets being redirected to a
* backup port that has VLAN tunnel mapping enabled (via the
* *IFLA_BRPORT_VLAN_TUNNEL* option). Setting a value of 0 (default) has
* the effect of not attaching any ID.
*/
enum {
IFLA_BRPORT_UNSPEC,
IFLA_BRPORT_STATE, /* Spanning tree state */
@@ -769,6 +1293,19 @@ enum netkit_mode {
NETKIT_L3,
};
/* NETKIT_SCRUB_NONE leaves clearing skb->{mark,priority} up to
* the BPF program if attached. This also means the latter can
* consume the two fields if they were populated earlier.
*
* NETKIT_SCRUB_DEFAULT zeroes skb->{mark,priority} fields before
* invoking the attached BPF program when the peer device resides
* in a different network namespace. This is the default behavior.
*/
enum netkit_scrub {
NETKIT_SCRUB_NONE,
NETKIT_SCRUB_DEFAULT,
};
enum {
IFLA_NETKIT_UNSPEC,
IFLA_NETKIT_PEER_INFO,
@@ -776,6 +1313,10 @@ enum {
IFLA_NETKIT_POLICY,
IFLA_NETKIT_PEER_POLICY,
IFLA_NETKIT_MODE,
IFLA_NETKIT_SCRUB,
IFLA_NETKIT_PEER_SCRUB,
IFLA_NETKIT_HEADROOM,
IFLA_NETKIT_TAILROOM,
__IFLA_NETKIT_MAX,
};
#define IFLA_NETKIT_MAX (__IFLA_NETKIT_MAX - 1)
@@ -854,6 +1395,7 @@ enum {
IFLA_VXLAN_DF,
IFLA_VXLAN_VNIFILTER, /* only applicable with COLLECT_METADATA mode */
IFLA_VXLAN_LOCALBYPASS,
IFLA_VXLAN_LABEL_POLICY, /* IPv6 flow label policy; ifla_vxlan_label_policy */
__IFLA_VXLAN_MAX
};
#define IFLA_VXLAN_MAX (__IFLA_VXLAN_MAX - 1)
@@ -871,6 +1413,13 @@ enum ifla_vxlan_df {
VXLAN_DF_MAX = __VXLAN_DF_END - 1,
};
enum ifla_vxlan_label_policy {
VXLAN_LABEL_FIXED = 0,
VXLAN_LABEL_INHERIT = 1,
__VXLAN_LABEL_END,
VXLAN_LABEL_MAX = __VXLAN_LABEL_END - 1,
};
/* GENEVE section */
enum {
IFLA_GENEVE_UNSPEC,
@@ -935,6 +1484,8 @@ enum {
IFLA_GTP_ROLE,
IFLA_GTP_CREATE_SOCKETS,
IFLA_GTP_RESTART_COUNT,
IFLA_GTP_LOCAL,
IFLA_GTP_LOCAL6,
__IFLA_GTP_MAX,
};
#define IFLA_GTP_MAX (__IFLA_GTP_MAX - 1)
@@ -1240,6 +1791,7 @@ enum {
IFLA_HSR_PROTOCOL, /* Indicate different protocol than
* HSR. For example PRP.
*/
IFLA_HSR_INTERLINK, /* HSR interlink network device */
__IFLA_HSR_MAX,
};
@@ -1417,7 +1969,9 @@ enum {
enum {
IFLA_DSA_UNSPEC,
IFLA_DSA_MASTER,
IFLA_DSA_CONDUIT,
/* Deprecated, use IFLA_DSA_CONDUIT instead */
IFLA_DSA_MASTER = IFLA_DSA_CONDUIT,
__IFLA_DSA_MAX,
};

View File

@@ -117,16 +117,22 @@ struct xdp_options {
((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)
/* Request transmit timestamp. Upon completion, put it into tx_timestamp
* field of union xsk_tx_metadata.
* field of struct xsk_tx_metadata.
*/
#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0)
/* Request transmit checksum offload. Checksum start position and offset
* are communicated via csum_start and csum_offset fields of union
* are communicated via csum_start and csum_offset fields of struct
* xsk_tx_metadata.
*/
#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
/* Request launch time hardware offload. The device will schedule the packet for
* transmission at a pre-determined time called launch time. The value of
* launch time is communicated via launch_time field of struct xsk_tx_metadata.
*/
#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2)
/* AF_XDP offloads request. 'request' union member is consumed by the driver
* when the packet is being transmitted. 'completion' union member is
* filled by the driver when the transmit completion arrives.
@@ -142,6 +148,10 @@ struct xsk_tx_metadata {
__u16 csum_start;
/* Offset from csum_start where checksum should be stored. */
__u16 csum_offset;
/* XDP_TXMD_FLAGS_LAUNCH_TIME */
/* Launch time in nanosecond against the PTP HW Clock */
__u64 launch_time;
} request;
struct {

View File

@@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata {
* by the driver.
* @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the
* driver.
* @NETDEV_XSK_FLAGS_TX_LAUNCH_TIME_FIFO: Launch time HW offload is supported
* by the driver.
*/
enum netdev_xsk_flags {
NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,
NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,
NETDEV_XSK_FLAGS_TX_LAUNCH_TIME_FIFO = 4,
};
enum netdev_queue_type {
@@ -86,6 +89,11 @@ enum {
NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)
};
enum {
__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,
NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)
};
enum {
NETDEV_A_PAGE_POOL_ID = 1,
NETDEV_A_PAGE_POOL_IFINDEX,
@@ -93,6 +101,8 @@ enum {
NETDEV_A_PAGE_POOL_INFLIGHT,
NETDEV_A_PAGE_POOL_INFLIGHT_MEM,
NETDEV_A_PAGE_POOL_DETACH_TIME,
NETDEV_A_PAGE_POOL_DMABUF,
NETDEV_A_PAGE_POOL_IO_URING,
__NETDEV_A_PAGE_POOL_MAX,
NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)
@@ -121,16 +131,27 @@ enum {
NETDEV_A_NAPI_ID,
NETDEV_A_NAPI_IRQ,
NETDEV_A_NAPI_PID,
NETDEV_A_NAPI_DEFER_HARD_IRQS,
NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,
NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,
__NETDEV_A_NAPI_MAX,
NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)
};
enum {
__NETDEV_A_XSK_INFO_MAX,
NETDEV_A_XSK_INFO_MAX = (__NETDEV_A_XSK_INFO_MAX - 1)
};
enum {
NETDEV_A_QUEUE_ID = 1,
NETDEV_A_QUEUE_IFINDEX,
NETDEV_A_QUEUE_TYPE,
NETDEV_A_QUEUE_NAPI_ID,
NETDEV_A_QUEUE_DMABUF,
NETDEV_A_QUEUE_IO_URING,
NETDEV_A_QUEUE_XSK,
__NETDEV_A_QUEUE_MAX,
NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
@@ -173,6 +194,16 @@ enum {
NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)
};
enum {
NETDEV_A_DMABUF_IFINDEX = 1,
NETDEV_A_DMABUF_QUEUES,
NETDEV_A_DMABUF_FD,
NETDEV_A_DMABUF_ID,
__NETDEV_A_DMABUF_MAX,
NETDEV_A_DMABUF_MAX = (__NETDEV_A_DMABUF_MAX - 1)
};
enum {
NETDEV_CMD_DEV_GET = 1,
NETDEV_CMD_DEV_ADD_NTF,
@@ -186,6 +217,8 @@ enum {
NETDEV_CMD_QUEUE_GET,
NETDEV_CMD_NAPI_GET,
NETDEV_CMD_QSTATS_GET,
NETDEV_CMD_BIND_RX,
NETDEV_CMD_NAPI_SET,
__NETDEV_CMD_MAX,
NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)

View File

@@ -385,6 +385,8 @@ enum perf_event_read_format {
*
* @sample_max_stack: Max number of frame pointers in a callchain,
* should be < /proc/sys/kernel/perf_event_max_stack
* Max number of entries of branch stack
* should be < hardware limit
*/
struct perf_event_attr {
@@ -511,7 +513,16 @@ struct perf_event_attr {
__u16 sample_max_stack;
__u16 __reserved_2;
__u32 aux_sample_size;
__u32 __reserved_3;
union {
__u32 aux_action;
struct {
__u32 aux_start_paused : 1, /* start AUX area tracing paused */
aux_pause : 1, /* on overflow, pause AUX area tracing */
aux_resume : 1, /* on overflow, resume AUX area tracing */
__reserved_3 : 29;
};
};
/*
* User provided data if sigtrap=1, passed back to user via

View File

@@ -9,7 +9,7 @@ else
endif
LIBBPF_MAJOR_VERSION := 1
LIBBPF_MINOR_VERSION := 5
LIBBPF_MINOR_VERSION := 6
LIBBPF_PATCH_VERSION := 0
LIBBPF_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).$(LIBBPF_PATCH_VERSION)
LIBBPF_MAJMIN_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).0
@@ -26,6 +26,7 @@ endef
$(call allow-override,CC,$(CROSS_COMPILE)cc)
$(call allow-override,LD,$(CROSS_COMPILE)ld)
PKG_CONFIG ?= pkg-config
TOPDIR = ..
@@ -41,10 +42,12 @@ ALL_CFLAGS += $(CFLAGS) \
$(EXTRA_CFLAGS)
ALL_LDFLAGS += $(LDFLAGS) $(EXTRA_LDFLAGS)
ifeq ($(shell command -v $(PKG_CONFIG) 2> /dev/null),)
NO_PKG_CONFIG := 1
endif
ifdef NO_PKG_CONFIG
ALL_LDFLAGS += -lelf -lz
else
PKG_CONFIG ?= pkg-config
ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf zlib)
ALL_LDFLAGS += $(shell $(PKG_CONFIG) --libs libelf zlib)
endif

View File

@@ -238,7 +238,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
const struct bpf_insn *insns, size_t insn_cnt,
struct bpf_prog_load_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
const size_t attr_sz = offsetofend(union bpf_attr, fd_array_cnt);
void *finfo = NULL, *linfo = NULL;
const char *func_info, *line_info;
__u32 log_size, log_level, attach_prog_fd, attach_btf_obj_fd;
@@ -311,6 +311,7 @@ int bpf_prog_load(enum bpf_prog_type prog_type,
attr.line_info_cnt = OPTS_GET(opts, line_info_cnt, 0);
attr.fd_array = ptr_to_u64(OPTS_GET(opts, fd_array, NULL));
attr.fd_array_cnt = OPTS_GET(opts, fd_array_cnt, 0);
if (log_level) {
attr.log_buf = ptr_to_u64(log_buf);
@@ -776,6 +777,7 @@ int bpf_link_create(int prog_fd, int target_fd,
return libbpf_err(-EINVAL);
break;
case BPF_TRACE_UPROBE_MULTI:
case BPF_TRACE_UPROBE_SESSION:
attr.link_create.uprobe_multi.flags = OPTS_GET(opts, uprobe_multi.flags, 0);
attr.link_create.uprobe_multi.cnt = OPTS_GET(opts, uprobe_multi.cnt, 0);
attr.link_create.uprobe_multi.path = ptr_to_u64(OPTS_GET(opts, uprobe_multi.path, 0));
@@ -1095,7 +1097,7 @@ int bpf_map_get_fd_by_id(__u32 id)
int bpf_btf_get_fd_by_id_opts(__u32 id,
const struct bpf_get_fd_by_id_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, open_flags);
const size_t attr_sz = offsetofend(union bpf_attr, fd_by_id_token_fd);
union bpf_attr attr;
int fd;
@@ -1105,6 +1107,7 @@ int bpf_btf_get_fd_by_id_opts(__u32 id,
memset(&attr, 0, attr_sz);
attr.btf_id = id;
attr.open_flags = OPTS_GET(opts, open_flags, 0);
attr.fd_by_id_token_fd = OPTS_GET(opts, token_fd, 0);
fd = sys_bpf_fd(BPF_BTF_GET_FD_BY_ID, &attr, attr_sz);
return libbpf_err_errno(fd);

View File

@@ -100,16 +100,19 @@ struct bpf_prog_load_opts {
__u32 log_level;
__u32 log_size;
char *log_buf;
/* output: actual total log contents size (including termintaing zero).
/* output: actual total log contents size (including terminating zero).
* It could be both larger than original log_size (if log was
* truncated), or smaller (if log buffer wasn't filled completely).
* If kernel doesn't support this feature, log_size is left unchanged.
*/
__u32 log_true_size;
__u32 token_fd;
/* if set, provides the length of fd_array */
__u32 fd_array_cnt;
size_t :0;
};
#define bpf_prog_load_opts__last_field token_fd
#define bpf_prog_load_opts__last_field fd_array_cnt
LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
@@ -129,7 +132,7 @@ struct bpf_btf_load_opts {
char *log_buf;
__u32 log_level;
__u32 log_size;
/* output: actual total log contents size (including termintaing zero).
/* output: actual total log contents size (including terminating zero).
* It could be both larger than original log_size (if log was
* truncated), or smaller (if log buffer wasn't filled completely).
* If kernel doesn't support this feature, log_size is left unchanged.
@@ -484,9 +487,10 @@ LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);
struct bpf_get_fd_by_id_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 open_flags; /* permissions requested for the operation on fd */
__u32 token_fd;
size_t :0;
};
#define bpf_get_fd_by_id_opts__last_field open_flags
#define bpf_get_fd_by_id_opts__last_field token_fd
LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_prog_get_fd_by_id_opts(__u32 id,

View File

@@ -388,7 +388,13 @@ extern void *bpf_rdonly_cast(const void *obj, __u32 btf_id) __ksym __weak;
#define ___arrow10(a, b, c, d, e, f, g, h, i, j) a->b->c->d->e->f->g->h->i->j
#define ___arrow(...) ___apply(___arrow, ___narg(__VA_ARGS__))(__VA_ARGS__)
#if defined(__clang__) && (__clang_major__ >= 19)
#define ___type(...) __typeof_unqual__(___arrow(__VA_ARGS__))
#elif defined(__GNUC__) && (__GNUC__ >= 14)
#define ___type(...) __typeof_unqual__(___arrow(__VA_ARGS__))
#else
#define ___type(...) typeof(___arrow(__VA_ARGS__))
#endif
#define ___read(read_fn, dst, src_type, src, accessor) \
read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)

View File

@@ -34,6 +34,7 @@ struct bpf_gen {
void *data_cur;
void *insn_start;
void *insn_cur;
bool swapped_endian;
ssize_t cleanup_label;
__u32 nr_progs;
__u32 nr_maps;

View File

@@ -44,6 +44,14 @@ struct bpf_dynptr;
struct iphdr;
struct ipv6hdr;
#ifndef __bpf_fastcall
#if __has_attribute(bpf_fastcall)
#define __bpf_fastcall __attribute__((bpf_fastcall))
#else
#define __bpf_fastcall
#endif
#endif
/*
* bpf_map_lookup_elem
*
@@ -203,17 +211,21 @@ static __u32 (* const bpf_get_prandom_u32)(void) = (void *) 7;
* Returns
* The SMP id of the processor running the program.
*/
static __u32 (* const bpf_get_smp_processor_id)(void) = (void *) 8;
static __bpf_fastcall __u32 (* const bpf_get_smp_processor_id)(void) = (void *) 8;
/*
* bpf_skb_store_bytes
*
* Store *len* bytes from address *from* into the packet
* associated to *skb*, at *offset*. *flags* are a combination of
* **BPF_F_RECOMPUTE_CSUM** (automatically recompute the
* checksum for the packet after storing the bytes) and
* **BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
* **->swhash** and *skb*\ **->l4hash** to 0).
* associated to *skb*, at *offset*. The *flags* are a combination
* of the following values:
*
* **BPF_F_RECOMPUTE_CSUM**
* Automatically update *skb*\ **->csum** after storing the
* bytes.
* **BPF_F_INVALIDATE_HASH**
* Set *skb*\ **->hash**, *skb*\ **->swhash** and *skb*\
* **->l4hash** to 0.
*
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
@@ -273,7 +285,7 @@ static long (* const bpf_l3_csum_replace)(struct __sk_buff *skb, __u32 offset, _
* untouched (unless **BPF_F_MARK_ENFORCE** is added as well), and
* for updates resulting in a null checksum the value is set to
* **CSUM_MANGLED_0** instead. Flag **BPF_F_PSEUDO_HDR** indicates
* the checksum is to be computed against a pseudo-header.
* that the modified header field is part of the pseudo-header.
*
* This helper works in combination with **bpf_csum_diff**\ (),
* which does not update the checksum in-place, but offers more
@@ -1224,7 +1236,7 @@ static long (* const bpf_set_hash)(struct __sk_buff *skb, __u32 hash) = (void *)
* **TCP_SYNCNT**, **TCP_USER_TIMEOUT**, **TCP_NOTSENT_LOWAT**,
* **TCP_NODELAY**, **TCP_MAXSEG**, **TCP_WINDOW_CLAMP**,
* **TCP_THIN_LINEAR_TIMEOUTS**, **TCP_BPF_DELACK_MAX**,
* **TCP_BPF_RTO_MIN**.
* **TCP_BPF_RTO_MIN**, **TCP_BPF_SOCK_OPS_CB_FLAGS**.
* * **IPPROTO_IP**, which supports *optname* **IP_TOS**.
* * **IPPROTO_IPV6**, which supports the following *optname*\ s:
* **IPV6_TCLASS**, **IPV6_AUTOFLOWLABEL**.
@@ -1511,10 +1523,6 @@ static long (* const bpf_getsockopt)(void *bpf_socket, int level, int optname, v
* option, and in this case it only works on functions tagged with
* **ALLOW_ERROR_INJECTION** in the kernel code.
*
* Also, the helper is only available for the architectures having
* the CONFIG_FUNCTION_ERROR_INJECTION option. As of this writing,
* x86 architecture is the only one to support this feature.
*
* Returns
* 0
*/
@@ -4220,7 +4228,7 @@ static long (* const bpf_find_vma)(struct task_struct *task, __u64 addr, void *c
* Currently, the **flags** must be 0. Currently, nr_loops is
* limited to 1 << 23 (~8 million) loops.
*
* long (\*callback_fn)(u32 index, void \*ctx);
* long (\*callback_fn)(u64 index, void \*ctx);
*
* where **index** is the current index in the loop. The index
* is zero-indexed.
@@ -4424,9 +4432,10 @@ static long (* const bpf_ima_file_hash)(struct file *file, void *dst, __u32 size
/*
* bpf_kptr_xchg
*
* Exchange kptr at pointer *map_value* with *ptr*, and return the
* old value. *ptr* can be NULL, otherwise it must be a referenced
* pointer which will be released when this helper is called.
* Exchange kptr at pointer *dst* with *ptr*, and return the old value.
* *dst* can be map value or local kptr. *ptr* can be NULL, otherwise
* it must be a referenced pointer which will be released when this helper
* is called.
*
* Returns
* The old value of kptr (which can be NULL). The returned pointer
@@ -4434,7 +4443,7 @@ static long (* const bpf_ima_file_hash)(struct file *file, void *dst, __u32 size
* corresponding release function, or moved into a BPF map before
* program exit.
*/
static void *(* const bpf_kptr_xchg)(void *map_value, void *ptr) = (void *) 194;
static void *(* const bpf_kptr_xchg)(void *dst, void *ptr) = (void *) 194;
/*
* bpf_map_lookup_percpu_elem

View File

@@ -15,6 +15,14 @@
#define __array(name, val) typeof(val) *name[]
#define __ulong(name, val) enum { ___bpf_concat(__unique_value, __COUNTER__) = val } name
#ifndef likely
#define likely(x) (__builtin_expect(!!(x), 1))
#endif
#ifndef unlikely
#define unlikely(x) (__builtin_expect(!!(x), 0))
#endif
/*
* Helper macro to place programs, maps, license in
* different sections in elf_bpf file. Section names
@@ -185,6 +193,7 @@ enum libbpf_tristate {
#define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted")))
#define __kptr __attribute__((btf_type_tag("kptr")))
#define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr")))
#define __uptr __attribute__((btf_type_tag("uptr")))
#if defined (__clang__)
#define bpf_ksym_exists(sym) ({ \
@@ -341,7 +350,7 @@ extern void bpf_iter_num_destroy(struct bpf_iter_num *it) __weak __ksym;
* I.e., it looks almost like high-level for each loop in other languages,
* supports continue/break, and is verifiable by BPF verifier.
*
* For iterating integers, the difference betwen bpf_for_each(num, i, N, M)
* For iterating integers, the difference between bpf_for_each(num, i, N, M)
* and bpf_for(i, N, M) is in that bpf_for() provides additional proof to
* verifier that i is in [N, M) range, and in bpf_for_each() case i is `int
* *`, not just `int`. So for integers bpf_for() is more convenient.

View File

@@ -163,7 +163,7 @@
struct pt_regs___s390 {
unsigned long orig_gpr2;
};
} __attribute__((preserve_access_index));
/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const user_pt_regs *)(x))
@@ -179,7 +179,7 @@ struct pt_regs___s390 {
#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG
#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG
#define __PT_PARM6_SYSCALL_REG gprs[7]
#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x)
#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___s390 *)(x))->__PT_PARM1_SYSCALL_REG)
#define PT_REGS_PARM1_CORE_SYSCALL(x) \
BPF_CORE_READ((const struct pt_regs___s390 *)(x), __PT_PARM1_SYSCALL_REG)
@@ -222,7 +222,7 @@ struct pt_regs___s390 {
struct pt_regs___arm64 {
unsigned long orig_x0;
};
} __attribute__((preserve_access_index));
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x))
@@ -241,7 +241,7 @@ struct pt_regs___arm64 {
#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG
#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG
#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG
#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x)
#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___arm64 *)(x))->__PT_PARM1_SYSCALL_REG)
#define PT_REGS_PARM1_CORE_SYSCALL(x) \
BPF_CORE_READ((const struct pt_regs___arm64 *)(x), __PT_PARM1_SYSCALL_REG)
@@ -351,6 +351,10 @@ struct pt_regs___arm64 {
* https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#risc-v-calling-conventions
*/
struct pt_regs___riscv {
unsigned long orig_a0;
} __attribute__((preserve_access_index));
/* riscv provides struct user_regs_struct instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x))
#define __PT_PARM1_REG a0
@@ -362,12 +366,15 @@ struct pt_regs___arm64 {
#define __PT_PARM7_REG a6
#define __PT_PARM8_REG a7
#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG
#define __PT_PARM1_SYSCALL_REG orig_a0
#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG
#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG
#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG
#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG
#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG
#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___riscv *)(x))->__PT_PARM1_SYSCALL_REG)
#define PT_REGS_PARM1_CORE_SYSCALL(x) \
BPF_CORE_READ((const struct pt_regs___riscv *)(x), __PT_PARM1_SYSCALL_REG)
#define __PT_RET_REG ra
#define __PT_FP_REG s0
@@ -473,7 +480,7 @@ struct pt_regs;
#endif
/*
* Similarly, syscall-specific conventions might differ between function call
* conventions within each architecutre. All supported architectures pass
* conventions within each architecture. All supported architectures pass
* either 6 or 7 syscall arguments in registers.
*
* See syscall(2) manpage for succinct table with information on each arch.
@@ -515,7 +522,7 @@ struct pt_regs;
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#elif defined(bpf_target_sparc)
#elif defined(bpf_target_sparc) || defined(bpf_target_arm64)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
@@ -651,7 +658,7 @@ struct pt_regs;
* BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and
* similar kinds of BPF programs, that accept input arguments as a single
* pointer to untyped u64 array, where each u64 can actually be a typed
* pointer or integer of different size. Instead of requring user to write
* pointer or integer of different size. Instead of requiring user to write
* manual casts and work with array elements by index, BPF_PROG macro
* allows user to declare a list of named and typed input arguments in the
* same syntax as for normal C function. All the casting is hidden and
@@ -801,7 +808,7 @@ struct pt_regs;
* tp_btf/fentry/fexit BPF programs. It hides the underlying platform-specific
* low-level way of getting kprobe input arguments from struct pt_regs, and
* provides a familiar typed and named function arguments syntax and
* semantics of accessing kprobe input paremeters.
* semantics of accessing kprobe input parameters.
*
* Original struct pt_regs* context is preserved as 'ctx' argument. This might
* be necessary when using BPF helpers like bpf_perf_event_output().

561
src/btf.c
View File

@@ -22,6 +22,7 @@
#include "libbpf_internal.h"
#include "hashmap.h"
#include "strset.h"
#include "str_error.h"
#define BTF_MAX_NR_TYPES 0x7fffffffU
#define BTF_MAX_STR_OFFSET 0x7fffffffU
@@ -282,7 +283,7 @@ static int btf_parse_str_sec(struct btf *btf)
return -EINVAL;
}
if (!btf->base_btf && start[0]) {
pr_debug("Invalid BTF string section\n");
pr_debug("Malformed BTF string section, did you forget to provide base BTF?\n");
return -EINVAL;
}
return 0;
@@ -1147,6 +1148,12 @@ static int btf_find_elf_sections(Elf *elf, const char *path, struct btf_elf_secs
else
continue;
if (sh.sh_type != SHT_PROGBITS) {
pr_warn("unexpected section type (%d) of section(%d, %s) from %s\n",
sh.sh_type, idx, name, path);
goto err;
}
data = elf_getdata(scn, 0);
if (!data) {
pr_warn("failed to get section(%d, %s) data from %s\n",
@@ -1179,12 +1186,13 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
fd = open(path, O_RDONLY | O_CLOEXEC);
if (fd < 0) {
err = -errno;
pr_warn("failed to open %s: %s\n", path, strerror(errno));
pr_warn("failed to open %s: %s\n", path, errstr(err));
return ERR_PTR(err);
}
elf = elf_begin(fd, ELF_C_READ, NULL);
if (!elf) {
err = -LIBBPF_ERRNO__FORMAT;
pr_warn("failed to open %s as ELF file\n", path);
goto done;
}
@@ -1445,7 +1453,7 @@ retry_load:
goto retry_load;
err = -errno;
pr_warn("BTF loading error: %d\n", err);
pr_warn("BTF loading error: %s\n", errstr(err));
/* don't print out contents of custom log_buf */
if (!log_buf && buf[0])
pr_warn("-- BEGIN BTF LOAD LOG ---\n%s\n-- END BTF LOAD LOG --\n", buf);
@@ -1617,12 +1625,18 @@ exit_free:
return btf;
}
struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf)
struct btf *btf_load_from_kernel(__u32 id, struct btf *base_btf, int token_fd)
{
struct btf *btf;
int btf_fd;
LIBBPF_OPTS(bpf_get_fd_by_id_opts, opts);
btf_fd = bpf_btf_get_fd_by_id(id);
if (token_fd) {
opts.open_flags |= BPF_F_TOKEN_FD;
opts.token_fd = token_fd;
}
btf_fd = bpf_btf_get_fd_by_id_opts(id, &opts);
if (btf_fd < 0)
return libbpf_err_ptr(-errno);
@@ -1632,6 +1646,11 @@ struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf)
return libbpf_ptr(btf);
}
struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf)
{
return btf_load_from_kernel(id, base_btf, 0);
}
struct btf *btf__load_from_kernel_by_id(__u32 id)
{
return btf__load_from_kernel_by_id_split(id, NULL);
@@ -2088,7 +2107,7 @@ static int validate_type_id(int id)
}
/* generic append function for PTR, TYPEDEF, CONST/VOLATILE/RESTRICT */
static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref_type_id)
static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref_type_id, int kflag)
{
struct btf_type *t;
int sz, name_off = 0;
@@ -2111,7 +2130,7 @@ static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref
}
t->name_off = name_off;
t->info = btf_type_info(kind, 0, 0);
t->info = btf_type_info(kind, 0, kflag);
t->type = ref_type_id;
return btf_commit_type(btf, sz);
@@ -2126,7 +2145,7 @@ static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref
*/
int btf__add_ptr(struct btf *btf, int ref_type_id)
{
return btf_add_ref_kind(btf, BTF_KIND_PTR, NULL, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_PTR, NULL, ref_type_id, 0);
}
/*
@@ -2504,7 +2523,7 @@ int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind)
struct btf_type *t;
int id;
id = btf_add_ref_kind(btf, BTF_KIND_FWD, name, 0);
id = btf_add_ref_kind(btf, BTF_KIND_FWD, name, 0, 0);
if (id <= 0)
return id;
t = btf_type_by_id(btf, id);
@@ -2534,7 +2553,7 @@ int btf__add_typedef(struct btf *btf, const char *name, int ref_type_id)
if (!name || !name[0])
return libbpf_err(-EINVAL);
return btf_add_ref_kind(btf, BTF_KIND_TYPEDEF, name, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_TYPEDEF, name, ref_type_id, 0);
}
/*
@@ -2546,7 +2565,7 @@ int btf__add_typedef(struct btf *btf, const char *name, int ref_type_id)
*/
int btf__add_volatile(struct btf *btf, int ref_type_id)
{
return btf_add_ref_kind(btf, BTF_KIND_VOLATILE, NULL, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_VOLATILE, NULL, ref_type_id, 0);
}
/*
@@ -2558,7 +2577,7 @@ int btf__add_volatile(struct btf *btf, int ref_type_id)
*/
int btf__add_const(struct btf *btf, int ref_type_id)
{
return btf_add_ref_kind(btf, BTF_KIND_CONST, NULL, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_CONST, NULL, ref_type_id, 0);
}
/*
@@ -2570,7 +2589,7 @@ int btf__add_const(struct btf *btf, int ref_type_id)
*/
int btf__add_restrict(struct btf *btf, int ref_type_id)
{
return btf_add_ref_kind(btf, BTF_KIND_RESTRICT, NULL, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_RESTRICT, NULL, ref_type_id, 0);
}
/*
@@ -2586,7 +2605,24 @@ int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id)
if (!value || !value[0])
return libbpf_err(-EINVAL);
return btf_add_ref_kind(btf, BTF_KIND_TYPE_TAG, value, ref_type_id);
return btf_add_ref_kind(btf, BTF_KIND_TYPE_TAG, value, ref_type_id, 0);
}
/*
* Append new BTF_KIND_TYPE_TAG type with:
* - *value*, non-empty/non-NULL tag value;
* - *ref_type_id* - referenced type ID, it might not exist yet;
* Set info->kflag to 1, indicating this tag is an __attribute__
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_type_attr(struct btf *btf, const char *value, int ref_type_id)
{
if (!value || !value[0])
return libbpf_err(-EINVAL);
return btf_add_ref_kind(btf, BTF_KIND_TYPE_TAG, value, ref_type_id, 1);
}
/*
@@ -2608,7 +2644,7 @@ int btf__add_func(struct btf *btf, const char *name,
linkage != BTF_FUNC_EXTERN)
return libbpf_err(-EINVAL);
id = btf_add_ref_kind(btf, BTF_KIND_FUNC, name, proto_type_id);
id = btf_add_ref_kind(btf, BTF_KIND_FUNC, name, proto_type_id, 0);
if (id > 0) {
struct btf_type *t = btf_type_by_id(btf, id);
@@ -2843,18 +2879,8 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __
return 0;
}
/*
* Append new BTF_KIND_DECL_TAG type with:
* - *value* - non-empty/non-NULL string;
* - *ref_type_id* - referenced type ID, it might not exist yet;
* - *component_idx* - -1 for tagging reference type, otherwise struct/union
* member or function argument index;
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx)
static int btf_add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx, int kflag)
{
struct btf_type *t;
int sz, value_off;
@@ -2878,14 +2904,47 @@ int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
return value_off;
t->name_off = value_off;
t->info = btf_type_info(BTF_KIND_DECL_TAG, 0, false);
t->info = btf_type_info(BTF_KIND_DECL_TAG, 0, kflag);
t->type = ref_type_id;
btf_decl_tag(t)->component_idx = component_idx;
return btf_commit_type(btf, sz);
}
struct btf_ext_sec_setup_param {
/*
* Append new BTF_KIND_DECL_TAG type with:
* - *value* - non-empty/non-NULL string;
* - *ref_type_id* - referenced type ID, it might not exist yet;
* - *component_idx* - -1 for tagging reference type, otherwise struct/union
* member or function argument index;
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx)
{
return btf_add_decl_tag(btf, value, ref_type_id, component_idx, 0);
}
/*
* Append new BTF_KIND_DECL_TAG type with:
* - *value* - non-empty/non-NULL string;
* - *ref_type_id* - referenced type ID, it might not exist yet;
* - *component_idx* - -1 for tagging reference type, otherwise struct/union
* member or function argument index;
* Set info->kflag to 1, indicating this tag is an __attribute__
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_decl_attr(struct btf *btf, const char *value, int ref_type_id,
int component_idx)
{
return btf_add_decl_tag(btf, value, ref_type_id, component_idx, 1);
}
struct btf_ext_sec_info_param {
__u32 off;
__u32 len;
__u32 min_rec_size;
@@ -2893,14 +2952,20 @@ struct btf_ext_sec_setup_param {
const char *desc;
};
static int btf_ext_setup_info(struct btf_ext *btf_ext,
struct btf_ext_sec_setup_param *ext_sec)
/*
* Parse a single info subsection of the BTF.ext info data:
* - validate subsection structure and elements
* - save info subsection start and sizing details in struct btf_ext
* - endian-independent operation, for calling before byte-swapping
*/
static int btf_ext_parse_sec_info(struct btf_ext *btf_ext,
struct btf_ext_sec_info_param *ext_sec,
bool is_native)
{
const struct btf_ext_info_sec *sinfo;
struct btf_ext_info *ext_info;
__u32 info_left, record_size;
size_t sec_cnt = 0;
/* The start of the info sec (including the __u32 record_size). */
void *info;
if (ext_sec->len == 0)
@@ -2912,6 +2977,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
return -EINVAL;
}
/* The start of the info sec (including the __u32 record_size). */
info = btf_ext->data + btf_ext->hdr->hdr_len + ext_sec->off;
info_left = ext_sec->len;
@@ -2927,9 +2993,13 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
return -EINVAL;
}
/* The record size needs to meet the minimum standard */
record_size = *(__u32 *)info;
/* The record size needs to meet either the minimum standard or, when
* handling non-native endianness data, the exact standard so as
* to allow safe byte-swapping.
*/
record_size = is_native ? *(__u32 *)info : bswap_32(*(__u32 *)info);
if (record_size < ext_sec->min_rec_size ||
(!is_native && record_size != ext_sec->min_rec_size) ||
record_size & 0x03) {
pr_debug("%s section in .BTF.ext has invalid record size %u\n",
ext_sec->desc, record_size);
@@ -2941,7 +3011,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
/* If no records, return failure now so .BTF.ext won't be used. */
if (!info_left) {
pr_debug("%s section in .BTF.ext has no records", ext_sec->desc);
pr_debug("%s section in .BTF.ext has no records\n", ext_sec->desc);
return -EINVAL;
}
@@ -2956,7 +3026,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
return -EINVAL;
}
num_records = sinfo->num_info;
num_records = is_native ? sinfo->num_info : bswap_32(sinfo->num_info);
if (num_records == 0) {
pr_debug("%s section has incorrect num_records in .BTF.ext\n",
ext_sec->desc);
@@ -2984,64 +3054,157 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
return 0;
}
static int btf_ext_setup_func_info(struct btf_ext *btf_ext)
/* Parse all info secs in the BTF.ext info data */
static int btf_ext_parse_info(struct btf_ext *btf_ext, bool is_native)
{
struct btf_ext_sec_setup_param param = {
struct btf_ext_sec_info_param func_info = {
.off = btf_ext->hdr->func_info_off,
.len = btf_ext->hdr->func_info_len,
.min_rec_size = sizeof(struct bpf_func_info_min),
.ext_info = &btf_ext->func_info,
.desc = "func_info"
};
return btf_ext_setup_info(btf_ext, &param);
}
static int btf_ext_setup_line_info(struct btf_ext *btf_ext)
{
struct btf_ext_sec_setup_param param = {
struct btf_ext_sec_info_param line_info = {
.off = btf_ext->hdr->line_info_off,
.len = btf_ext->hdr->line_info_len,
.min_rec_size = sizeof(struct bpf_line_info_min),
.ext_info = &btf_ext->line_info,
.desc = "line_info",
};
return btf_ext_setup_info(btf_ext, &param);
}
static int btf_ext_setup_core_relos(struct btf_ext *btf_ext)
{
struct btf_ext_sec_setup_param param = {
.off = btf_ext->hdr->core_relo_off,
.len = btf_ext->hdr->core_relo_len,
struct btf_ext_sec_info_param core_relo = {
.min_rec_size = sizeof(struct bpf_core_relo),
.ext_info = &btf_ext->core_relo_info,
.desc = "core_relo",
};
int err;
return btf_ext_setup_info(btf_ext, &param);
err = btf_ext_parse_sec_info(btf_ext, &func_info, is_native);
if (err)
return err;
err = btf_ext_parse_sec_info(btf_ext, &line_info, is_native);
if (err)
return err;
if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return 0; /* skip core relos parsing */
core_relo.off = btf_ext->hdr->core_relo_off;
core_relo.len = btf_ext->hdr->core_relo_len;
err = btf_ext_parse_sec_info(btf_ext, &core_relo, is_native);
if (err)
return err;
return 0;
}
static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
/* Swap byte-order of BTF.ext header with any endianness */
static void btf_ext_bswap_hdr(struct btf_ext_header *h)
{
const struct btf_ext_header *hdr = (struct btf_ext_header *)data;
bool is_native = h->magic == BTF_MAGIC;
__u32 hdr_len;
if (data_size < offsetofend(struct btf_ext_header, hdr_len) ||
data_size < hdr->hdr_len) {
pr_debug("BTF.ext header not found");
hdr_len = is_native ? h->hdr_len : bswap_32(h->hdr_len);
h->magic = bswap_16(h->magic);
h->hdr_len = bswap_32(h->hdr_len);
h->func_info_off = bswap_32(h->func_info_off);
h->func_info_len = bswap_32(h->func_info_len);
h->line_info_off = bswap_32(h->line_info_off);
h->line_info_len = bswap_32(h->line_info_len);
if (hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
h->core_relo_off = bswap_32(h->core_relo_off);
h->core_relo_len = bswap_32(h->core_relo_len);
}
/* Swap byte-order of generic info subsection */
static void btf_ext_bswap_info_sec(void *info, __u32 len, bool is_native,
info_rec_bswap_fn bswap_fn)
{
struct btf_ext_info_sec *sec;
__u32 info_left, rec_size, *rs;
if (len == 0)
return;
rs = info; /* info record size */
rec_size = is_native ? *rs : bswap_32(*rs);
*rs = bswap_32(*rs);
sec = info + sizeof(__u32); /* info sec #1 */
info_left = len - sizeof(__u32);
while (info_left) {
unsigned int sec_hdrlen = sizeof(struct btf_ext_info_sec);
__u32 i, num_recs;
void *p;
num_recs = is_native ? sec->num_info : bswap_32(sec->num_info);
sec->sec_name_off = bswap_32(sec->sec_name_off);
sec->num_info = bswap_32(sec->num_info);
p = sec->data; /* info rec #1 */
for (i = 0; i < num_recs; i++, p += rec_size)
bswap_fn(p);
sec = p;
info_left -= sec_hdrlen + (__u64)rec_size * num_recs;
}
}
/*
* Swap byte-order of all info data in a BTF.ext section
* - requires BTF.ext hdr in native endianness
*/
static void btf_ext_bswap_info(struct btf_ext *btf_ext, void *data)
{
const bool is_native = btf_ext->swapped_endian;
const struct btf_ext_header *h = data;
void *info;
/* Swap func_info subsection byte-order */
info = data + h->hdr_len + h->func_info_off;
btf_ext_bswap_info_sec(info, h->func_info_len, is_native,
(info_rec_bswap_fn)bpf_func_info_bswap);
/* Swap line_info subsection byte-order */
info = data + h->hdr_len + h->line_info_off;
btf_ext_bswap_info_sec(info, h->line_info_len, is_native,
(info_rec_bswap_fn)bpf_line_info_bswap);
/* Swap core_relo subsection byte-order (if present) */
if (h->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
return;
info = data + h->hdr_len + h->core_relo_off;
btf_ext_bswap_info_sec(info, h->core_relo_len, is_native,
(info_rec_bswap_fn)bpf_core_relo_bswap);
}
/* Parse hdr data and info sections: check and convert to native endianness */
static int btf_ext_parse(struct btf_ext *btf_ext)
{
__u32 hdr_len, data_size = btf_ext->data_size;
struct btf_ext_header *hdr = btf_ext->hdr;
bool swapped_endian = false;
int err;
if (data_size < offsetofend(struct btf_ext_header, hdr_len)) {
pr_debug("BTF.ext header too short\n");
return -EINVAL;
}
hdr_len = hdr->hdr_len;
if (hdr->magic == bswap_16(BTF_MAGIC)) {
pr_warn("BTF.ext in non-native endianness is not supported\n");
return -ENOTSUP;
swapped_endian = true;
hdr_len = bswap_32(hdr_len);
} else if (hdr->magic != BTF_MAGIC) {
pr_debug("Invalid BTF.ext magic:%x\n", hdr->magic);
return -EINVAL;
}
if (hdr->version != BTF_VERSION) {
/* Ensure known version of structs, current BTF_VERSION == 1 */
if (hdr->version != 1) {
pr_debug("Unsupported BTF.ext version:%u\n", hdr->version);
return -ENOTSUP;
}
@@ -3051,11 +3214,39 @@ static int btf_ext_parse_hdr(__u8 *data, __u32 data_size)
return -ENOTSUP;
}
if (data_size == hdr->hdr_len) {
if (data_size < hdr_len) {
pr_debug("BTF.ext header not found\n");
return -EINVAL;
} else if (data_size == hdr_len) {
pr_debug("BTF.ext has no data\n");
return -EINVAL;
}
/* Verify mandatory hdr info details present */
if (hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
pr_warn("BTF.ext header missing func_info, line_info\n");
return -EINVAL;
}
/* Keep hdr native byte-order in memory for introspection */
if (swapped_endian)
btf_ext_bswap_hdr(btf_ext->hdr);
/* Validate info subsections and cache key metadata */
err = btf_ext_parse_info(btf_ext, !swapped_endian);
if (err)
return err;
/* Keep infos native byte-order in memory for introspection */
if (swapped_endian)
btf_ext_bswap_info(btf_ext, btf_ext->data);
/*
* Set btf_ext->swapped_endian only after all header and info data has
* been swapped, helping bswap functions determine if their data are
* in native byte-order when called.
*/
btf_ext->swapped_endian = swapped_endian;
return 0;
}
@@ -3067,6 +3258,7 @@ void btf_ext__free(struct btf_ext *btf_ext)
free(btf_ext->line_info.sec_idxs);
free(btf_ext->core_relo_info.sec_idxs);
free(btf_ext->data);
free(btf_ext->data_swapped);
free(btf_ext);
}
@@ -3087,29 +3279,7 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size)
}
memcpy(btf_ext->data, data, size);
err = btf_ext_parse_hdr(btf_ext->data, size);
if (err)
goto done;
if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, line_info_len)) {
err = -EINVAL;
goto done;
}
err = btf_ext_setup_func_info(btf_ext);
if (err)
goto done;
err = btf_ext_setup_line_info(btf_ext);
if (err)
goto done;
if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
goto done; /* skip core relos parsing */
err = btf_ext_setup_core_relos(btf_ext);
if (err)
goto done;
err = btf_ext_parse(btf_ext);
done:
if (err) {
@@ -3120,15 +3290,66 @@ done:
return btf_ext;
}
static void *btf_ext_raw_data(const struct btf_ext *btf_ext_ro, bool swap_endian)
{
struct btf_ext *btf_ext = (struct btf_ext *)btf_ext_ro;
const __u32 data_sz = btf_ext->data_size;
void *data;
/* Return native data (always present) or swapped data if present */
if (!swap_endian)
return btf_ext->data;
else if (btf_ext->data_swapped)
return btf_ext->data_swapped;
/* Recreate missing swapped data, then cache and return */
data = calloc(1, data_sz);
if (!data)
return NULL;
memcpy(data, btf_ext->data, data_sz);
btf_ext_bswap_info(btf_ext, data);
btf_ext_bswap_hdr(data);
btf_ext->data_swapped = data;
return data;
}
const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size)
{
void *data;
data = btf_ext_raw_data(btf_ext, btf_ext->swapped_endian);
if (!data)
return errno = ENOMEM, NULL;
*size = btf_ext->data_size;
return btf_ext->data;
return data;
}
__attribute__((alias("btf_ext__raw_data")))
const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);
enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext)
{
if (is_host_big_endian())
return btf_ext->swapped_endian ? BTF_LITTLE_ENDIAN : BTF_BIG_ENDIAN;
else
return btf_ext->swapped_endian ? BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
}
int btf_ext__set_endianness(struct btf_ext *btf_ext, enum btf_endianness endian)
{
if (endian != BTF_LITTLE_ENDIAN && endian != BTF_BIG_ENDIAN)
return libbpf_err(-EINVAL);
btf_ext->swapped_endian = is_host_big_endian() != (endian == BTF_BIG_ENDIAN);
if (!btf_ext->swapped_endian) {
free(btf_ext->data_swapped);
btf_ext->data_swapped = NULL;
}
return 0;
}
struct btf_dedup;
@@ -3291,7 +3512,7 @@ int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts)
d = btf_dedup_new(btf, opts);
if (IS_ERR(d)) {
pr_debug("btf_dedup_new failed: %ld", PTR_ERR(d));
pr_debug("btf_dedup_new failed: %ld\n", PTR_ERR(d));
return libbpf_err(-EINVAL);
}
@@ -3302,42 +3523,42 @@ int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts)
err = btf_dedup_prep(d);
if (err) {
pr_debug("btf_dedup_prep failed:%d\n", err);
pr_debug("btf_dedup_prep failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_strings(d);
if (err < 0) {
pr_debug("btf_dedup_strings failed:%d\n", err);
pr_debug("btf_dedup_strings failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_prim_types(d);
if (err < 0) {
pr_debug("btf_dedup_prim_types failed:%d\n", err);
pr_debug("btf_dedup_prim_types failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_struct_types(d);
if (err < 0) {
pr_debug("btf_dedup_struct_types failed:%d\n", err);
pr_debug("btf_dedup_struct_types failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_resolve_fwds(d);
if (err < 0) {
pr_debug("btf_dedup_resolve_fwds failed:%d\n", err);
pr_debug("btf_dedup_resolve_fwds failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_ref_types(d);
if (err < 0) {
pr_debug("btf_dedup_ref_types failed:%d\n", err);
pr_debug("btf_dedup_ref_types failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_compact_types(d);
if (err < 0) {
pr_debug("btf_dedup_compact_types failed:%d\n", err);
pr_debug("btf_dedup_compact_types failed: %s\n", errstr(err));
goto done;
}
err = btf_dedup_remap_types(d);
if (err < 0) {
pr_debug("btf_dedup_remap_types failed:%d\n", err);
pr_debug("btf_dedup_remap_types failed: %s\n", errstr(err));
goto done;
}
@@ -3385,7 +3606,7 @@ struct btf_dedup {
struct strset *strs_set;
};
static long hash_combine(long h, long value)
static unsigned long hash_combine(unsigned long h, unsigned long value)
{
return h * 31 + value;
}
@@ -4135,46 +4356,109 @@ static inline __u16 btf_fwd_kind(struct btf_type *t)
return btf_kflag(t) ? BTF_KIND_UNION : BTF_KIND_STRUCT;
}
/* Check if given two types are identical ARRAY definitions */
static bool btf_dedup_identical_arrays(struct btf_dedup *d, __u32 id1, __u32 id2)
static bool btf_dedup_identical_types(struct btf_dedup *d, __u32 id1, __u32 id2, int depth)
{
struct btf_type *t1, *t2;
t1 = btf_type_by_id(d->btf, id1);
t2 = btf_type_by_id(d->btf, id2);
if (!btf_is_array(t1) || !btf_is_array(t2))
int k1, k2;
recur:
if (depth <= 0)
return false;
return btf_equal_array(t1, t2);
}
/* Check if given two types are identical STRUCT/UNION definitions */
static bool btf_dedup_identical_structs(struct btf_dedup *d, __u32 id1, __u32 id2)
{
const struct btf_member *m1, *m2;
struct btf_type *t1, *t2;
int n, i;
t1 = btf_type_by_id(d->btf, id1);
t2 = btf_type_by_id(d->btf, id2);
if (!btf_is_composite(t1) || btf_kind(t1) != btf_kind(t2))
k1 = btf_kind(t1);
k2 = btf_kind(t2);
if (k1 != k2)
return false;
if (!btf_shallow_equal_struct(t1, t2))
return false;
m1 = btf_members(t1);
m2 = btf_members(t2);
for (i = 0, n = btf_vlen(t1); i < n; i++, m1++, m2++) {
if (m1->type != m2->type &&
!btf_dedup_identical_arrays(d, m1->type, m2->type) &&
!btf_dedup_identical_structs(d, m1->type, m2->type))
switch (k1) {
case BTF_KIND_UNKN: /* VOID */
return true;
case BTF_KIND_INT:
return btf_equal_int_tag(t1, t2);
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
return btf_compat_enum(t1, t2);
case BTF_KIND_FWD:
case BTF_KIND_FLOAT:
return btf_equal_common(t1, t2);
case BTF_KIND_CONST:
case BTF_KIND_VOLATILE:
case BTF_KIND_RESTRICT:
case BTF_KIND_PTR:
case BTF_KIND_TYPEDEF:
case BTF_KIND_FUNC:
case BTF_KIND_TYPE_TAG:
if (t1->info != t2->info || t1->name_off != t2->name_off)
return false;
id1 = t1->type;
id2 = t2->type;
goto recur;
case BTF_KIND_ARRAY: {
struct btf_array *a1, *a2;
if (!btf_compat_array(t1, t2))
return false;
a1 = btf_array(t1);
a2 = btf_array(t1);
if (a1->index_type != a2->index_type &&
!btf_dedup_identical_types(d, a1->index_type, a2->index_type, depth - 1))
return false;
if (a1->type != a2->type &&
!btf_dedup_identical_types(d, a1->type, a2->type, depth - 1))
return false;
return true;
}
case BTF_KIND_STRUCT:
case BTF_KIND_UNION: {
const struct btf_member *m1, *m2;
int i, n;
if (!btf_shallow_equal_struct(t1, t2))
return false;
m1 = btf_members(t1);
m2 = btf_members(t2);
for (i = 0, n = btf_vlen(t1); i < n; i++, m1++, m2++) {
if (m1->type == m2->type)
continue;
if (!btf_dedup_identical_types(d, m1->type, m2->type, depth - 1))
return false;
}
return true;
}
case BTF_KIND_FUNC_PROTO: {
const struct btf_param *p1, *p2;
int i, n;
if (!btf_compat_fnproto(t1, t2))
return false;
if (t1->type != t2->type &&
!btf_dedup_identical_types(d, t1->type, t2->type, depth - 1))
return false;
p1 = btf_params(t1);
p2 = btf_params(t2);
for (i = 0, n = btf_vlen(t1); i < n; i++, p1++, p2++) {
if (p1->type == p2->type)
continue;
if (!btf_dedup_identical_types(d, p1->type, p2->type, depth - 1))
return false;
}
return true;
}
default:
return false;
}
return true;
}
/*
* Check equivalence of BTF type graph formed by candidate struct/union (we'll
* call it "candidate graph" in this description for brevity) to a type graph
@@ -4192,7 +4476,7 @@ static bool btf_dedup_identical_structs(struct btf_dedup *d, __u32 id1, __u32 id
* and canonical graphs are not compatible structurally, whole graphs are
* incompatible. If types are structurally equivalent (i.e., all information
* except referenced type IDs is exactly the same), a mapping from `canon_id` to
* a `cand_id` is recored in hypothetical mapping (`btf_dedup->hypot_map`).
* a `cand_id` is recoded in hypothetical mapping (`btf_dedup->hypot_map`).
* If a type references other types, then those referenced types are checked
* for equivalence recursively.
*
@@ -4230,7 +4514,7 @@ static bool btf_dedup_identical_structs(struct btf_dedup *d, __u32 id1, __u32 id
* consists of portions of the graph that come from multiple compilation units.
* This is due to the fact that types within single compilation unit are always
* deduplicated and FWDs are already resolved, if referenced struct/union
* definiton is available. So, if we had unresolved FWD and found corresponding
* definition is available. So, if we had unresolved FWD and found corresponding
* STRUCT/UNION, they will be from different compilation units. This
* consequently means that when we "link" FWD to corresponding STRUCT/UNION,
* type graph will likely have at least two different BTF types that describe
@@ -4293,19 +4577,13 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
* different fields within the *same* struct. This breaks type
* equivalence check, which makes an assumption that candidate
* types sub-graph has a consistent and deduped-by-compiler
* types within a single CU. So work around that by explicitly
* allowing identical array types here.
* types within a single CU. And similar situation can happen
* with struct/union sometimes, and event with pointers.
* So accommodate cases like this doing a structural
* comparison recursively, but avoiding being stuck in endless
* loops by limiting the depth up to which we check.
*/
if (btf_dedup_identical_arrays(d, hypot_type_id, cand_id))
return 1;
/* It turns out that similar situation can happen with
* struct/union sometimes, sigh... Handle the case where
* structs/unions are exactly the same, down to the referenced
* type IDs. Anything more complicated (e.g., if referenced
* types are different, but equivalent) is *way more*
* complicated and requires a many-to-many equivalence mapping.
*/
if (btf_dedup_identical_structs(d, hypot_type_id, cand_id))
if (btf_dedup_identical_types(d, hypot_type_id, cand_id, 16))
return 1;
return 0;
}
@@ -5056,7 +5334,8 @@ struct btf *btf__load_vmlinux_btf(void)
btf = btf__parse(sysfs_btf_path, NULL);
if (!btf) {
err = -errno;
pr_warn("failed to read kernel BTF from '%s': %d\n", sysfs_btf_path, err);
pr_warn("failed to read kernel BTF from '%s': %s\n",
sysfs_btf_path, errstr(err));
return libbpf_err_ptr(err);
}
pr_debug("loaded kernel BTF from '%s'\n", sysfs_btf_path);
@@ -5073,7 +5352,7 @@ struct btf *btf__load_vmlinux_btf(void)
btf = btf__parse(path, NULL);
err = libbpf_get_error(btf);
pr_debug("loading kernel BTF '%s': %d\n", path, err);
pr_debug("loading kernel BTF '%s': %s\n", path, errstr(err));
if (err)
continue;

View File

@@ -167,6 +167,9 @@ LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);
LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);
LIBBPF_API enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext);
LIBBPF_API int btf_ext__set_endianness(struct btf_ext *btf_ext,
enum btf_endianness endian);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
@@ -224,6 +227,7 @@ LIBBPF_API int btf__add_volatile(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_const(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_restrict(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id);
LIBBPF_API int btf__add_type_attr(struct btf *btf, const char *value, int ref_type_id);
/* func and func_proto construction APIs */
LIBBPF_API int btf__add_func(struct btf *btf, const char *name,
@@ -240,6 +244,8 @@ LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id,
/* tag construction API */
LIBBPF_API int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx);
LIBBPF_API int btf__add_decl_attr(struct btf *btf, const char *value, int ref_type_id,
int component_idx);
struct btf_dedup_opts {
size_t sz;
@@ -286,7 +292,7 @@ LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
struct btf_dump_emit_type_decl_opts {
/* size of this struct, for forward/backward compatiblity */
/* size of this struct, for forward/backward compatibility */
size_t sz;
/* optional field name for type declaration, e.g.:
* - struct my_struct <FNAME>

View File

@@ -21,6 +21,7 @@
#include "hashmap.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#include "str_error.h"
static const char PREFIXES[] = "\t\t\t\t\t\t\t\t\t\t\t\t\t";
static const size_t PREFIX_CNT = sizeof(PREFIXES) - 1;
@@ -304,7 +305,7 @@ int btf_dump__dump_type(struct btf_dump *d, __u32 id)
* definition, in which case they have to be declared inline as part of field
* type declaration; or as a top-level anonymous enum, typically used for
* declaring global constants. It's impossible to distinguish between two
* without knowning whether given enum type was referenced from other type:
* without knowing whether given enum type was referenced from other type:
* top-level anonymous enum won't be referenced by anything, while embedded
* one will.
*/
@@ -867,8 +868,8 @@ static void btf_dump_emit_bit_padding(const struct btf_dump *d,
} pads[] = {
{"long", d->ptr_sz * 8}, {"int", 32}, {"short", 16}, {"char", 8}
};
int new_off, pad_bits, bits, i;
const char *pad_type;
int new_off = 0, pad_bits = 0, bits, i;
const char *pad_type = NULL;
if (cur_off >= next_off)
return; /* no gap */
@@ -1304,7 +1305,7 @@ static void btf_dump_emit_type_decl(struct btf_dump *d, __u32 id,
* chain, restore stack, emit warning, and try to
* proceed nevertheless
*/
pr_warn("not enough memory for decl stack:%d", err);
pr_warn("not enough memory for decl stack: %s\n", errstr(err));
d->decl_stack_cnt = stack_start;
return;
}
@@ -1493,7 +1494,10 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
case BTF_KIND_TYPE_TAG:
btf_dump_emit_mods(d, decls);
name = btf_name_of(d, t->name_off);
btf_dump_printf(d, " __attribute__((btf_type_tag(\"%s\")))", name);
if (btf_kflag(t))
btf_dump_printf(d, " __attribute__((%s))", name);
else
btf_dump_printf(d, " __attribute__((btf_type_tag(\"%s\")))", name);
break;
case BTF_KIND_ARRAY: {
const struct btf_array *a = btf_array(t);

View File

@@ -212,7 +212,7 @@ static int btf_relocate_map_distilled_base(struct btf_relocate *r)
* need to match both name and size, otherwise embedding the base
* struct/union in the split type is invalid.
*/
for (id = r->nr_dist_base_types; id < r->nr_split_types; id++) {
for (id = r->nr_dist_base_types; id < r->nr_dist_base_types + r->nr_split_types; id++) {
err = btf_mark_embedded_composite_type_ids(r, id);
if (err)
goto done;
@@ -428,7 +428,7 @@ static int btf_relocate_rewrite_strs(struct btf_relocate *r, __u32 i)
} else {
off = r->str_map[*str_off];
if (!off) {
pr_warn("string '%s' [offset %u] is not mapped to base BTF",
pr_warn("string '%s' [offset %u] is not mapped to base BTF\n",
btf__str_by_offset(r->btf, off), *str_off);
return -ENOENT;
}

View File

@@ -24,7 +24,6 @@
int elf_open(const char *binary_path, struct elf_fd *elf_fd)
{
char errmsg[STRERR_BUFSIZE];
int fd, ret;
Elf *elf;
@@ -38,8 +37,7 @@ int elf_open(const char *binary_path, struct elf_fd *elf_fd)
fd = open(binary_path, O_RDONLY | O_CLOEXEC);
if (fd < 0) {
ret = -errno;
pr_warn("elf: failed to open %s: %s\n", binary_path,
libbpf_strerror_r(ret, errmsg, sizeof(errmsg)));
pr_warn("elf: failed to open %s: %s\n", binary_path, errstr(ret));
return ret;
}
elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);

View File

@@ -47,7 +47,6 @@ static int probe_kern_prog_name(int token_fd)
static int probe_kern_global_data(int token_fd)
{
char *cp, errmsg[STRERR_BUFSIZE];
struct bpf_insn insns[] = {
BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),
BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),
@@ -67,9 +66,8 @@ static int probe_kern_global_data(int token_fd)
map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_global", sizeof(int), 32, 1, &map_opts);
if (map < 0) {
ret = -errno;
cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
__func__, cp, -ret);
pr_warn("Error in %s(): %s. Couldn't create simple array map.\n",
__func__, errstr(ret));
return ret;
}
@@ -267,7 +265,6 @@ static int probe_kern_probe_read_kernel(int token_fd)
static int probe_prog_bind_map(int token_fd)
{
char *cp, errmsg[STRERR_BUFSIZE];
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN(),
@@ -285,9 +282,8 @@ static int probe_prog_bind_map(int token_fd)
map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_det_bind", sizeof(int), 32, 1, &map_opts);
if (map < 0) {
ret = -errno;
cp = libbpf_strerror_r(ret, errmsg, sizeof(errmsg));
pr_warn("Error in %s():%s(%d). Couldn't create simple array map.\n",
__func__, cp, -ret);
pr_warn("Error in %s(): %s. Couldn't create simple array map.\n",
__func__, errstr(ret));
return ret;
}
@@ -604,7 +600,8 @@ bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_
} else if (ret == 0) {
WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);
} else {
pr_warn("Detection of kernel %s support failed: %d\n", feat->desc, ret);
pr_warn("Detection of kernel %s support failed: %s\n",
feat->desc, errstr(ret));
WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);
}
}

View File

@@ -14,6 +14,7 @@
#include "bpf_gen_internal.h"
#include "skel_internal.h"
#include <asm/byteorder.h>
#include "str_error.h"
#define MAX_USED_MAPS 64
#define MAX_USED_PROGS 32
@@ -393,7 +394,7 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
blob_fd_array_off(gen, i));
emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
emit(gen, BPF_EXIT_INSN());
pr_debug("gen: finish %d\n", gen->error);
pr_debug("gen: finish %s\n", errstr(gen->error));
if (!gen->error) {
struct gen_loader_opts *opts = gen->opts;
@@ -401,6 +402,15 @@ int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps)
opts->insns_sz = gen->insn_cur - gen->insn_start;
opts->data = gen->data_start;
opts->data_sz = gen->data_cur - gen->data_start;
/* use target endianness for embedded loader */
if (gen->swapped_endian) {
struct bpf_insn *insn = (struct bpf_insn *)opts->insns;
int insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
for (i = 0; i < insn_cnt; i++)
bpf_insn_bswap(insn++);
}
}
return gen->error;
}
@@ -414,6 +424,28 @@ void bpf_gen__free(struct bpf_gen *gen)
free(gen);
}
/*
* Fields of bpf_attr are set to values in native byte-order before being
* written to the target-bound data blob, and may need endian conversion.
* This macro allows providing the correct value in situ more simply than
* writing a separate converter for *all fields* of *all records* included
* in union bpf_attr. Note that sizeof(rval) should match the assignment
* target to avoid runtime problems.
*/
#define tgt_endian(rval) ({ \
typeof(rval) _val = (rval); \
if (gen->swapped_endian) { \
switch (sizeof(_val)) { \
case 1: break; \
case 2: _val = bswap_16(_val); break; \
case 4: _val = bswap_32(_val); break; \
case 8: _val = bswap_64(_val); break; \
default: pr_warn("unsupported bswap size!\n"); \
} \
} \
_val; \
})
void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
__u32 btf_raw_size)
{
@@ -422,11 +454,12 @@ void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data,
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: load_btf: size %d\n", btf_raw_size);
btf_data = add_data(gen, btf_raw_data, btf_raw_size);
attr.btf_size = btf_raw_size;
attr.btf_size = tgt_endian(btf_raw_size);
btf_load_attr = add_data(gen, &attr, attr_size);
pr_debug("gen: load_btf: off %d size %d, attr: off %d size %d\n",
btf_data, btf_raw_size, btf_load_attr, attr_size);
/* populate union bpf_attr with user provided log details */
move_ctx2blob(gen, attr_field(btf_load_attr, btf_log_level), 4,
@@ -457,28 +490,29 @@ void bpf_gen__map_create(struct bpf_gen *gen,
union bpf_attr attr;
memset(&attr, 0, attr_size);
attr.map_type = map_type;
attr.key_size = key_size;
attr.value_size = value_size;
attr.map_flags = map_attr->map_flags;
attr.map_extra = map_attr->map_extra;
attr.map_type = tgt_endian(map_type);
attr.key_size = tgt_endian(key_size);
attr.value_size = tgt_endian(value_size);
attr.map_flags = tgt_endian(map_attr->map_flags);
attr.map_extra = tgt_endian(map_attr->map_extra);
if (map_name)
libbpf_strlcpy(attr.map_name, map_name, sizeof(attr.map_name));
attr.numa_node = map_attr->numa_node;
attr.map_ifindex = map_attr->map_ifindex;
attr.max_entries = max_entries;
attr.btf_key_type_id = map_attr->btf_key_type_id;
attr.btf_value_type_id = map_attr->btf_value_type_id;
pr_debug("gen: map_create: %s idx %d type %d value_type_id %d\n",
attr.map_name, map_idx, map_type, attr.btf_value_type_id);
attr.numa_node = tgt_endian(map_attr->numa_node);
attr.map_ifindex = tgt_endian(map_attr->map_ifindex);
attr.max_entries = tgt_endian(max_entries);
attr.btf_key_type_id = tgt_endian(map_attr->btf_key_type_id);
attr.btf_value_type_id = tgt_endian(map_attr->btf_value_type_id);
map_create_attr = add_data(gen, &attr, attr_size);
if (attr.btf_value_type_id)
pr_debug("gen: map_create: %s idx %d type %d value_type_id %d, attr: off %d size %d\n",
map_name, map_idx, map_type, map_attr->btf_value_type_id,
map_create_attr, attr_size);
if (map_attr->btf_value_type_id)
/* populate union bpf_attr with btf_fd saved in the stack earlier */
move_stack2blob(gen, attr_field(map_create_attr, btf_fd), 4,
stack_off(btf_fd));
switch (attr.map_type) {
switch (map_type) {
case BPF_MAP_TYPE_ARRAY_OF_MAPS:
case BPF_MAP_TYPE_HASH_OF_MAPS:
move_stack2blob(gen, attr_field(map_create_attr, inner_map_fd), 4,
@@ -498,8 +532,8 @@ void bpf_gen__map_create(struct bpf_gen *gen,
/* emit MAP_CREATE command */
emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
debug_ret(gen, "map_create %s idx %d type %d value_size %d value_btf_id %d",
attr.map_name, map_idx, map_type, value_size,
attr.btf_value_type_id);
map_name, map_idx, map_type, value_size,
map_attr->btf_value_type_id);
emit_check_err(gen);
/* remember map_fd in the stack, if successful */
if (map_idx < 0) {
@@ -784,12 +818,12 @@ log:
emit_ksym_relo_log(gen, relo, kdesc->ref);
}
static __u32 src_reg_mask(void)
static __u32 src_reg_mask(struct bpf_gen *gen)
{
#if defined(__LITTLE_ENDIAN_BITFIELD)
return 0x0f; /* src_reg,dst_reg,... */
#elif defined(__BIG_ENDIAN_BITFIELD)
return 0xf0; /* dst_reg,src_reg,... */
#if defined(__LITTLE_ENDIAN_BITFIELD) /* src_reg,dst_reg,... */
return gen->swapped_endian ? 0xf0 : 0x0f;
#elif defined(__BIG_ENDIAN_BITFIELD) /* dst_reg,src_reg,... */
return gen->swapped_endian ? 0x0f : 0xf0;
#else
#error "Unsupported bit endianness, cannot proceed"
#endif
@@ -840,7 +874,7 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo,
emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 3));
clear_src_reg:
/* clear bpf_object__relocate_data's src_reg assignment, otherwise we get a verifier failure */
reg_mask = src_reg_mask();
reg_mask = src_reg_mask(gen);
emit(gen, BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_8, offsetofend(struct bpf_insn, code)));
emit(gen, BPF_ALU32_IMM(BPF_AND, BPF_REG_9, reg_mask));
emit(gen, BPF_STX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, offsetofend(struct bpf_insn, code)));
@@ -931,48 +965,94 @@ static void cleanup_relos(struct bpf_gen *gen, int insns)
cleanup_core_relo(gen);
}
/* Convert func, line, and core relo info blobs to target endianness */
static void info_blob_bswap(struct bpf_gen *gen, int func_info, int line_info,
int core_relos, struct bpf_prog_load_opts *load_attr)
{
struct bpf_func_info *fi = gen->data_start + func_info;
struct bpf_line_info *li = gen->data_start + line_info;
struct bpf_core_relo *cr = gen->data_start + core_relos;
int i;
for (i = 0; i < load_attr->func_info_cnt; i++)
bpf_func_info_bswap(fi++);
for (i = 0; i < load_attr->line_info_cnt; i++)
bpf_line_info_bswap(li++);
for (i = 0; i < gen->core_relo_cnt; i++)
bpf_core_relo_bswap(cr++);
}
void bpf_gen__prog_load(struct bpf_gen *gen,
enum bpf_prog_type prog_type, const char *prog_name,
const char *license, struct bpf_insn *insns, size_t insn_cnt,
struct bpf_prog_load_opts *load_attr, int prog_idx)
{
int func_info_tot_sz = load_attr->func_info_cnt *
load_attr->func_info_rec_size;
int line_info_tot_sz = load_attr->line_info_cnt *
load_attr->line_info_rec_size;
int core_relo_tot_sz = gen->core_relo_cnt *
sizeof(struct bpf_core_relo);
int prog_load_attr, license_off, insns_off, func_info, line_info, core_relos;
int attr_size = offsetofend(union bpf_attr, core_relo_rec_size);
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: prog_load: type %d insns_cnt %zd progi_idx %d\n",
prog_type, insn_cnt, prog_idx);
/* add license string to blob of bytes */
license_off = add_data(gen, license, strlen(license) + 1);
/* add insns to blob of bytes */
insns_off = add_data(gen, insns, insn_cnt * sizeof(struct bpf_insn));
pr_debug("gen: prog_load: prog_idx %d type %d insn off %d insns_cnt %zd license off %d\n",
prog_idx, prog_type, insns_off, insn_cnt, license_off);
attr.prog_type = prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
attr.attach_btf_id = load_attr->attach_btf_id;
attr.prog_ifindex = load_attr->prog_ifindex;
/* convert blob insns to target endianness */
if (gen->swapped_endian) {
struct bpf_insn *insn = gen->data_start + insns_off;
int i;
for (i = 0; i < insn_cnt; i++, insn++)
bpf_insn_bswap(insn);
}
attr.prog_type = tgt_endian(prog_type);
attr.expected_attach_type = tgt_endian(load_attr->expected_attach_type);
attr.attach_btf_id = tgt_endian(load_attr->attach_btf_id);
attr.prog_ifindex = tgt_endian(load_attr->prog_ifindex);
attr.kern_version = 0;
attr.insn_cnt = (__u32)insn_cnt;
attr.prog_flags = load_attr->prog_flags;
attr.insn_cnt = tgt_endian((__u32)insn_cnt);
attr.prog_flags = tgt_endian(load_attr->prog_flags);
attr.func_info_rec_size = load_attr->func_info_rec_size;
attr.func_info_cnt = load_attr->func_info_cnt;
func_info = add_data(gen, load_attr->func_info,
attr.func_info_cnt * attr.func_info_rec_size);
attr.func_info_rec_size = tgt_endian(load_attr->func_info_rec_size);
attr.func_info_cnt = tgt_endian(load_attr->func_info_cnt);
func_info = add_data(gen, load_attr->func_info, func_info_tot_sz);
pr_debug("gen: prog_load: func_info: off %d cnt %d rec size %d\n",
func_info, load_attr->func_info_cnt,
load_attr->func_info_rec_size);
attr.line_info_rec_size = load_attr->line_info_rec_size;
attr.line_info_cnt = load_attr->line_info_cnt;
line_info = add_data(gen, load_attr->line_info,
attr.line_info_cnt * attr.line_info_rec_size);
attr.line_info_rec_size = tgt_endian(load_attr->line_info_rec_size);
attr.line_info_cnt = tgt_endian(load_attr->line_info_cnt);
line_info = add_data(gen, load_attr->line_info, line_info_tot_sz);
pr_debug("gen: prog_load: line_info: off %d cnt %d rec size %d\n",
line_info, load_attr->line_info_cnt,
load_attr->line_info_rec_size);
attr.core_relo_rec_size = sizeof(struct bpf_core_relo);
attr.core_relo_cnt = gen->core_relo_cnt;
core_relos = add_data(gen, gen->core_relos,
attr.core_relo_cnt * attr.core_relo_rec_size);
attr.core_relo_rec_size = tgt_endian((__u32)sizeof(struct bpf_core_relo));
attr.core_relo_cnt = tgt_endian(gen->core_relo_cnt);
core_relos = add_data(gen, gen->core_relos, core_relo_tot_sz);
pr_debug("gen: prog_load: core_relos: off %d cnt %d rec size %zd\n",
core_relos, gen->core_relo_cnt,
sizeof(struct bpf_core_relo));
/* convert all info blobs to target endianness */
if (gen->swapped_endian)
info_blob_bswap(gen, func_info, line_info, core_relos, load_attr);
libbpf_strlcpy(attr.prog_name, prog_name, sizeof(attr.prog_name));
prog_load_attr = add_data(gen, &attr, attr_size);
pr_debug("gen: prog_load: attr: off %d size %d\n",
prog_load_attr, attr_size);
/* populate union bpf_attr with a pointer to license */
emit_rel_store(gen, attr_field(prog_load_attr, license), license_off);
@@ -1040,7 +1120,6 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
int zero = 0;
memset(&attr, 0, attr_size);
pr_debug("gen: map_update_elem: idx %d\n", map_idx);
value = add_data(gen, pvalue, value_size);
key = add_data(gen, &zero, sizeof(zero));
@@ -1068,6 +1147,8 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));
map_update_attr = add_data(gen, &attr, attr_size);
pr_debug("gen: map_update_elem: idx %d, value: off %d size %d, attr: off %d size %d\n",
map_idx, value, value_size, map_update_attr, attr_size);
move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
blob_fd_array_off(gen, map_idx));
emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1084,14 +1165,16 @@ void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int slo
int attr_size = offsetofend(union bpf_attr, flags);
int map_update_attr, key;
union bpf_attr attr;
int tgt_slot;
memset(&attr, 0, attr_size);
pr_debug("gen: populate_outer_map: outer %d key %d inner %d\n",
outer_map_idx, slot, inner_map_idx);
key = add_data(gen, &slot, sizeof(slot));
tgt_slot = tgt_endian(slot);
key = add_data(gen, &tgt_slot, sizeof(tgt_slot));
map_update_attr = add_data(gen, &attr, attr_size);
pr_debug("gen: populate_outer_map: outer %d key %d inner %d, attr: off %d size %d\n",
outer_map_idx, slot, inner_map_idx, map_update_attr, attr_size);
move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
blob_fd_array_off(gen, outer_map_idx));
emit_rel_store(gen, attr_field(map_update_attr, key), key);
@@ -1112,8 +1195,9 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
union bpf_attr attr;
memset(&attr, 0, attr_size);
pr_debug("gen: map_freeze: idx %d\n", map_idx);
map_freeze_attr = add_data(gen, &attr, attr_size);
pr_debug("gen: map_freeze: idx %d, attr: off %d size %d\n",
map_idx, map_freeze_attr, attr_size);
move_blob2blob(gen, attr_field(map_freeze_attr, map_fd), 4,
blob_fd_array_off(gen, map_idx));
/* emit MAP_FREEZE command */

View File

@@ -166,8 +166,8 @@ bool hashmap_find(const struct hashmap *map, long key, long *value);
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry(map, cur, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; cur; cur = cur->next)
for (bkt = 0; bkt < (map)->cap; bkt++) \
for (cur = (map)->buckets[bkt]; cur; cur = cur->next)
/*
* hashmap__for_each_entry_safe - iterate over all entries in hashmap, safe
@@ -178,8 +178,8 @@ bool hashmap_find(const struct hashmap *map, long key, long *value);
* @bkt: integer used as a bucket loop cursor
*/
#define hashmap__for_each_entry_safe(map, cur, tmp, bkt) \
for (bkt = 0; bkt < map->cap; bkt++) \
for (cur = map->buckets[bkt]; \
for (bkt = 0; bkt < (map)->cap; bkt++) \
for (cur = (map)->buckets[bkt]; \
cur && ({tmp = cur->next; true; }); \
cur = tmp)
@@ -190,19 +190,19 @@ bool hashmap_find(const struct hashmap *map, long key, long *value);
* @key: key to iterate entries for
*/
#define hashmap__for_each_key_entry(map, cur, _key) \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
for (cur = (map)->buckets \
? (map)->buckets[hash_bits((map)->hash_fn((_key), (map)->ctx), (map)->cap_bits)] \
: NULL; \
cur; \
cur = cur->next) \
if (map->equal_fn(cur->key, (_key), map->ctx))
if ((map)->equal_fn(cur->key, (_key), (map)->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
for (cur = (map)->buckets \
? (map)->buckets[hash_bits((map)->hash_fn((_key), (map)->ctx), (map)->cap_bits)] \
: NULL; \
cur && ({ tmp = cur->next; true; }); \
cur = tmp) \
if (map->equal_fn(cur->key, (_key), map->ctx))
if ((map)->equal_fn(cur->key, (_key), (map)->ctx))
#endif /* __LIBBPF_HASHMAP_H */

File diff suppressed because it is too large Load Diff

View File

@@ -152,7 +152,7 @@ struct bpf_object_open_opts {
* log_buf and log_level settings.
*
* If specified, this log buffer will be passed for:
* - each BPF progral load (BPF_PROG_LOAD) attempt, unless overriden
* - each BPF progral load (BPF_PROG_LOAD) attempt, unless overridden
* with bpf_program__set_log() on per-program level, to get
* BPF verifier log output.
* - during BPF object's BTF load into kernel (BPF_BTF_LOAD) to get
@@ -241,6 +241,19 @@ LIBBPF_API struct bpf_object *
bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
const struct bpf_object_open_opts *opts);
/**
* @brief **bpf_object__prepare()** prepares BPF object for loading:
* performs ELF processing, relocations, prepares final state of BPF program
* instructions (accessible with bpf_program__insns()), creates and
* (potentially) pins maps. Leaves BPF object in the state ready for program
* loading.
* @param obj Pointer to a valid BPF object instance returned by
* **bpf_object__open*()** API
* @return 0, on success; negative error code, otherwise, error code is
* stored in errno
*/
int bpf_object__prepare(struct bpf_object *obj);
/**
* @brief **bpf_object__load()** loads BPF object into kernel.
* @param obj Pointer to a valid BPF object instance returned by
@@ -294,6 +307,14 @@ LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj);
LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj);
LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version);
/**
* @brief **bpf_object__token_fd** is an accessor for BPF token FD associated
* with BPF object.
* @param obj Pointer to a valid BPF object
* @return BPF token FD or -1, if it wasn't set
*/
LIBBPF_API int bpf_object__token_fd(const struct bpf_object *obj);
struct btf;
LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
@@ -455,7 +476,7 @@ LIBBPF_API int bpf_link__destroy(struct bpf_link *link);
/**
* @brief **bpf_program__attach()** is a generic function for attaching
* a BPF program based on auto-detection of program type, attach type,
* and extra paremeters, where applicable.
* and extra parameters, where applicable.
*
* @param prog BPF program to attach
* @return Reference to the newly created BPF link; or NULL is returned on error,
@@ -544,10 +565,12 @@ struct bpf_kprobe_multi_opts {
bool retprobe;
/* create session kprobes */
bool session;
/* enforce unique match */
bool unique_match;
size_t :0;
};
#define bpf_kprobe_multi_opts__last_field session
#define bpf_kprobe_multi_opts__last_field unique_match
LIBBPF_API struct bpf_link *
bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
@@ -569,10 +592,12 @@ struct bpf_uprobe_multi_opts {
size_t cnt;
/* create return uprobes */
bool retprobe;
/* create session kprobes */
bool session;
size_t :0;
};
#define bpf_uprobe_multi_opts__last_field retprobe
#define bpf_uprobe_multi_opts__last_field session
/**
* @brief **bpf_program__attach_uprobe_multi()** attaches a BPF program
@@ -679,7 +704,7 @@ struct bpf_uprobe_opts {
/**
* @brief **bpf_program__attach_uprobe()** attaches a BPF program
* to the userspace function which is found by binary path and
* offset. You can optionally specify a particular proccess to attach
* offset. You can optionally specify a particular process to attach
* to. You can also optionally attach the program to the function
* exit instead of entry.
*
@@ -915,6 +940,12 @@ LIBBPF_API int bpf_program__set_log_level(struct bpf_program *prog, __u32 log_le
LIBBPF_API const char *bpf_program__log_buf(const struct bpf_program *prog, size_t *log_size);
LIBBPF_API int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log_size);
LIBBPF_API struct bpf_func_info *bpf_program__func_info(const struct bpf_program *prog);
LIBBPF_API __u32 bpf_program__func_info_cnt(const struct bpf_program *prog);
LIBBPF_API struct bpf_line_info *bpf_program__line_info(const struct bpf_program *prog);
LIBBPF_API __u32 bpf_program__line_info_cnt(const struct bpf_program *prog);
/**
* @brief **bpf_program__set_attach_target()** sets BTF-based attach target
* for supported BPF program types:
@@ -1593,11 +1624,11 @@ LIBBPF_API int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_i
* memory region of the ring buffer.
* This ring buffer can be used to implement a custom events consumer.
* The ring buffer starts with the *struct perf_event_mmap_page*, which
* holds the ring buffer managment fields, when accessing the header
* holds the ring buffer management fields, when accessing the header
* structure it's important to be SMP aware.
* You can refer to *perf_event_read_simple* for a simple example.
* @param pb the perf buffer structure
* @param buf_idx the buffer index to retreive
* @param buf_idx the buffer index to retrieve
* @param buf (out) gets the base pointer of the mmap()'ed memory
* @param buf_size (out) gets the size of the mmap()'ed region
* @return 0 on success, negative error code for failure
@@ -1786,9 +1817,14 @@ struct bpf_linker_file_opts {
struct bpf_linker;
LIBBPF_API struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts *opts);
LIBBPF_API struct bpf_linker *bpf_linker__new_fd(int fd, struct bpf_linker_opts *opts);
LIBBPF_API int bpf_linker__add_file(struct bpf_linker *linker,
const char *filename,
const struct bpf_linker_file_opts *opts);
LIBBPF_API int bpf_linker__add_fd(struct bpf_linker *linker, int fd,
const struct bpf_linker_file_opts *opts);
LIBBPF_API int bpf_linker__add_buf(struct bpf_linker *linker, void *buf, size_t buf_sz,
const struct bpf_linker_file_opts *opts);
LIBBPF_API int bpf_linker__finalize(struct bpf_linker *linker);
LIBBPF_API void bpf_linker__free(struct bpf_linker *linker);

View File

@@ -421,9 +421,26 @@ LIBBPF_1.5.0 {
global:
btf__distill_base;
btf__relocate;
btf_ext__endianness;
btf_ext__set_endianness;
bpf_map__autoattach;
bpf_map__set_autoattach;
bpf_object__token_fd;
bpf_program__attach_sockmap;
ring__consume_n;
ring_buffer__consume_n;
} LIBBPF_1.4.0;
LIBBPF_1.6.0 {
global:
bpf_linker__add_buf;
bpf_linker__add_fd;
bpf_linker__new_fd;
bpf_object__prepare;
bpf_program__func_info;
bpf_program__func_info_cnt;
bpf_program__line_info;
bpf_program__line_info_cnt;
btf__add_decl_attr;
btf__add_type_attr;
} LIBBPF_1.5.0;

View File

@@ -10,6 +10,7 @@
#define __LIBBPF_LIBBPF_INTERNAL_H
#include <stdlib.h>
#include <byteswap.h>
#include <limits.h>
#include <errno.h>
#include <linux/err.h>
@@ -408,6 +409,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
int btf_load_into_kernel(struct btf *btf,
char *log_buf, size_t log_sz, __u32 log_level,
int token_fd);
struct btf *btf_load_from_kernel(__u32 id, struct btf *base_btf, int token_fd);
struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
@@ -448,11 +450,11 @@ struct btf_ext_info {
*
* The func_info subsection layout:
* record size for struct bpf_func_info in the func_info subsection
* struct btf_sec_func_info for section #1
* struct btf_ext_info_sec for section #1
* a list of bpf_func_info records for section #1
* where struct bpf_func_info mimics one in include/uapi/linux/bpf.h
* but may not be identical
* struct btf_sec_func_info for section #2
* struct btf_ext_info_sec for section #2
* a list of bpf_func_info records for section #2
* ......
*
@@ -484,6 +486,8 @@ struct btf_ext {
struct btf_ext_header *hdr;
void *data;
};
void *data_swapped;
bool swapped_endian;
struct btf_ext_info func_info;
struct btf_ext_info line_info;
struct btf_ext_info core_relo_info;
@@ -511,6 +515,32 @@ struct bpf_line_info_min {
__u32 line_col;
};
/* Functions to byte-swap info records */
typedef void (*info_rec_bswap_fn)(void *);
static inline void bpf_func_info_bswap(struct bpf_func_info *i)
{
i->insn_off = bswap_32(i->insn_off);
i->type_id = bswap_32(i->type_id);
}
static inline void bpf_line_info_bswap(struct bpf_line_info *i)
{
i->insn_off = bswap_32(i->insn_off);
i->file_name_off = bswap_32(i->file_name_off);
i->line_off = bswap_32(i->line_off);
i->line_col = bswap_32(i->line_col);
}
static inline void bpf_core_relo_bswap(struct bpf_core_relo *i)
{
i->insn_off = bswap_32(i->insn_off);
i->type_id = bswap_32(i->type_id);
i->access_str_off = bswap_32(i->access_str_off);
i->kind = bswap_32(i->kind);
}
enum btf_field_iter_kind {
BTF_FIELD_ITER_IDS,
BTF_FIELD_ITER_STRS,
@@ -588,6 +618,16 @@ static inline bool is_ldimm64_insn(struct bpf_insn *insn)
return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
static inline void bpf_insn_bswap(struct bpf_insn *insn)
{
__u8 tmp_reg = insn->dst_reg;
insn->dst_reg = insn->src_reg;
insn->src_reg = tmp_reg;
insn->off = bswap_16(insn->off);
insn->imm = bswap_32(insn->imm);
}
/* Unconditionally dup FD, ensuring it doesn't use [0, 2] range.
* Original FD is not closed or altered in any other way.
* Preserves original FD value, if it's invalid (negative).
@@ -627,6 +667,15 @@ static inline int sys_dup3(int oldfd, int newfd, int flags)
return syscall(__NR_dup3, oldfd, newfd, flags);
}
/* Some versions of Android don't provide memfd_create() in their libc
* implementation, so avoid complications and just go straight to Linux
* syscall.
*/
static inline int sys_memfd_create(const char *name, unsigned flags)
{
return syscall(__NR_memfd_create, name, flags);
}
/* Point *fixed_fd* to the same file that *tmp_fd* points to.
* Regardless of success, *tmp_fd* is closed.
* Whatever *fixed_fd* pointed to is closed silently.

View File

@@ -76,7 +76,7 @@ enum libbpf_strict_mode {
* first BPF program or map creation operation. This is done only if
* kernel is too old to support memcg-based memory accounting for BPF
* subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY,
* but it can be overriden with libbpf_set_memlock_rlim() API.
* but it can be overridden with libbpf_set_memlock_rlim() API.
* Note that libbpf_set_memlock_rlim() needs to be called before
* the very first bpf_prog_load(), bpf_map_create() or bpf_object__load()
* operation.
@@ -97,7 +97,7 @@ LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
* @brief **libbpf_get_error()** extracts the error code from the passed
* pointer
* @param ptr pointer returned from libbpf API function
* @return error code; or 0 if no error occured
* @return error code; or 0 if no error occurred
*
* Note, as of libbpf 1.0 this function is not necessary and not recommended
* to be used. Libbpf doesn't return error code embedded into the pointer

View File

@@ -4,6 +4,6 @@
#define __LIBBPF_VERSION_H
#define LIBBPF_MAJOR_VERSION 1
#define LIBBPF_MINOR_VERSION 5
#define LIBBPF_MINOR_VERSION 6
#endif /* __LIBBPF_VERSION_H */

View File

@@ -4,6 +4,10 @@
*
* Copyright (c) 2021 Facebook
*/
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
@@ -16,10 +20,12 @@
#include <elf.h>
#include <libelf.h>
#include <fcntl.h>
#include <sys/mman.h>
#include "libbpf.h"
#include "btf.h"
#include "libbpf_internal.h"
#include "strset.h"
#include "str_error.h"
#define BTF_EXTERN_SEC ".extern"
@@ -135,6 +141,7 @@ struct bpf_linker {
int fd;
Elf *elf;
Elf64_Ehdr *elf_hdr;
bool swapped_endian;
/* Output sections metadata */
struct dst_sec *secs;
@@ -150,15 +157,19 @@ struct bpf_linker {
/* global (including extern) ELF symbols */
int glob_sym_cnt;
struct glob_sym *glob_syms;
bool fd_is_owned;
};
#define pr_warn_elf(fmt, ...) \
libbpf_print(LIBBPF_WARN, "libbpf: " fmt ": %s\n", ##__VA_ARGS__, elf_errmsg(-1))
static int init_output_elf(struct bpf_linker *linker, const char *file);
static int init_output_elf(struct bpf_linker *linker);
static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
const struct bpf_linker_file_opts *opts,
static int bpf_linker_add_file(struct bpf_linker *linker, int fd,
const char *filename);
static int linker_load_obj_file(struct bpf_linker *linker,
struct src_obj *obj);
static int linker_sanity_check_elf(struct src_obj *obj);
static int linker_sanity_check_elf_symtab(struct src_obj *obj, struct src_sec *sec);
@@ -189,7 +200,7 @@ void bpf_linker__free(struct bpf_linker *linker)
if (linker->elf)
elf_end(linker->elf);
if (linker->fd >= 0)
if (linker->fd >= 0 && linker->fd_is_owned)
close(linker->fd);
strset__free(linker->strtab_strs);
@@ -231,9 +242,63 @@ struct bpf_linker *bpf_linker__new(const char *filename, struct bpf_linker_opts
if (!linker)
return errno = ENOMEM, NULL;
linker->fd = -1;
linker->filename = strdup(filename);
if (!linker->filename) {
err = -ENOMEM;
goto err_out;
}
err = init_output_elf(linker, filename);
linker->fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
if (linker->fd < 0) {
err = -errno;
pr_warn("failed to create '%s': %d\n", filename, err);
goto err_out;
}
linker->fd_is_owned = true;
err = init_output_elf(linker);
if (err)
goto err_out;
return linker;
err_out:
bpf_linker__free(linker);
return errno = -err, NULL;
}
struct bpf_linker *bpf_linker__new_fd(int fd, struct bpf_linker_opts *opts)
{
struct bpf_linker *linker;
char filename[32];
int err;
if (fd < 0)
return errno = EINVAL, NULL;
if (!OPTS_VALID(opts, bpf_linker_opts))
return errno = EINVAL, NULL;
if (elf_version(EV_CURRENT) == EV_NONE) {
pr_warn_elf("libelf initialization failed");
return errno = EINVAL, NULL;
}
linker = calloc(1, sizeof(*linker));
if (!linker)
return errno = ENOMEM, NULL;
snprintf(filename, sizeof(filename), "fd:%d", fd);
linker->filename = strdup(filename);
if (!linker->filename) {
err = -ENOMEM;
goto err_out;
}
linker->fd = fd;
linker->fd_is_owned = false;
err = init_output_elf(linker);
if (err)
goto err_out;
@@ -292,23 +357,12 @@ static Elf64_Sym *add_new_sym(struct bpf_linker *linker, size_t *sym_idx)
return sym;
}
static int init_output_elf(struct bpf_linker *linker, const char *file)
static int init_output_elf(struct bpf_linker *linker)
{
int err, str_off;
Elf64_Sym *init_sym;
struct dst_sec *sec;
linker->filename = strdup(file);
if (!linker->filename)
return -ENOMEM;
linker->fd = open(file, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
if (linker->fd < 0) {
err = -errno;
pr_warn("failed to create '%s': %d\n", file, err);
return err;
}
linker->elf = elf_begin(linker->fd, ELF_C_WRITE, NULL);
if (!linker->elf) {
pr_warn_elf("failed to create ELF object");
@@ -324,13 +378,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
linker->elf_hdr->e_machine = EM_BPF;
linker->elf_hdr->e_type = ET_REL;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
linker->elf_hdr->e_ident[EI_DATA] = ELFDATA2MSB;
#else
#error "Unknown __BYTE_ORDER__"
#endif
/* Set unknown ELF endianness, assign later from input files */
linker->elf_hdr->e_ident[EI_DATA] = ELFDATANONE;
/* STRTAB */
/* initialize strset with an empty string to conform to ELF */
@@ -396,6 +445,8 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
pr_warn_elf("failed to create SYMTAB data");
return -EINVAL;
}
/* Ensure libelf translates byte-order of symbol records */
sec->data->d_type = ELF_T_SYM;
str_off = strset__add_str(linker->strtab_strs, sec->sec_name);
if (str_off < 0)
@@ -437,19 +488,16 @@ static int init_output_elf(struct bpf_linker *linker, const char *file)
return 0;
}
int bpf_linker__add_file(struct bpf_linker *linker, const char *filename,
const struct bpf_linker_file_opts *opts)
static int bpf_linker_add_file(struct bpf_linker *linker, int fd,
const char *filename)
{
struct src_obj obj = {};
int err = 0;
if (!OPTS_VALID(opts, bpf_linker_file_opts))
return libbpf_err(-EINVAL);
obj.filename = filename;
obj.fd = fd;
if (!linker->elf)
return libbpf_err(-EINVAL);
err = err ?: linker_load_obj_file(linker, filename, opts, &obj);
err = err ?: linker_load_obj_file(linker, &obj);
err = err ?: linker_append_sec_data(linker, &obj);
err = err ?: linker_append_elf_syms(linker, &obj);
err = err ?: linker_append_elf_relos(linker, &obj);
@@ -464,12 +512,91 @@ int bpf_linker__add_file(struct bpf_linker *linker, const char *filename,
free(obj.sym_map);
if (obj.elf)
elf_end(obj.elf);
if (obj.fd >= 0)
close(obj.fd);
return err;
}
int bpf_linker__add_file(struct bpf_linker *linker, const char *filename,
const struct bpf_linker_file_opts *opts)
{
int fd, err;
if (!OPTS_VALID(opts, bpf_linker_file_opts))
return libbpf_err(-EINVAL);
if (!linker->elf)
return libbpf_err(-EINVAL);
fd = open(filename, O_RDONLY | O_CLOEXEC);
if (fd < 0) {
err = -errno;
pr_warn("failed to open file '%s': %s\n", filename, errstr(err));
return libbpf_err(err);
}
err = bpf_linker_add_file(linker, fd, filename);
close(fd);
return libbpf_err(err);
}
int bpf_linker__add_fd(struct bpf_linker *linker, int fd,
const struct bpf_linker_file_opts *opts)
{
char filename[32];
int err;
if (!OPTS_VALID(opts, bpf_linker_file_opts))
return libbpf_err(-EINVAL);
if (!linker->elf)
return libbpf_err(-EINVAL);
if (fd < 0)
return libbpf_err(-EINVAL);
snprintf(filename, sizeof(filename), "fd:%d", fd);
err = bpf_linker_add_file(linker, fd, filename);
return libbpf_err(err);
}
int bpf_linker__add_buf(struct bpf_linker *linker, void *buf, size_t buf_sz,
const struct bpf_linker_file_opts *opts)
{
char filename[32];
int fd, written, ret;
if (!OPTS_VALID(opts, bpf_linker_file_opts))
return libbpf_err(-EINVAL);
if (!linker->elf)
return libbpf_err(-EINVAL);
snprintf(filename, sizeof(filename), "mem:%p+%zu", buf, buf_sz);
fd = sys_memfd_create(filename, 0);
if (fd < 0) {
ret = -errno;
pr_warn("failed to create memfd '%s': %s\n", filename, errstr(ret));
return libbpf_err(ret);
}
written = 0;
while (written < buf_sz) {
ret = write(fd, buf, buf_sz);
if (ret < 0) {
ret = -errno;
pr_warn("failed to write '%s': %s\n", filename, errstr(ret));
goto err_out;
}
written += ret;
}
ret = bpf_linker_add_file(linker, fd, filename);
err_out:
close(fd);
return libbpf_err(ret);
}
static bool is_dwarf_sec_name(const char *name)
{
/* approximation, but the actual list is too long */
@@ -535,65 +662,69 @@ static struct src_sec *add_src_sec(struct src_obj *obj, const char *sec_name)
return sec;
}
static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
const struct bpf_linker_file_opts *opts,
static int linker_load_obj_file(struct bpf_linker *linker,
struct src_obj *obj)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
const int host_endianness = ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
const int host_endianness = ELFDATA2MSB;
#else
#error "Unknown __BYTE_ORDER__"
#endif
int err = 0;
Elf_Scn *scn;
Elf_Data *data;
Elf64_Ehdr *ehdr;
Elf64_Shdr *shdr;
struct src_sec *sec;
unsigned char obj_byteorder;
unsigned char link_byteorder = linker->elf_hdr->e_ident[EI_DATA];
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
const unsigned char host_byteorder = ELFDATA2LSB;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
const unsigned char host_byteorder = ELFDATA2MSB;
#else
#error "Unknown __BYTE_ORDER__"
#endif
pr_debug("linker: adding object file '%s'...\n", filename);
pr_debug("linker: adding object file '%s'...\n", obj->filename);
obj->filename = filename;
obj->fd = open(filename, O_RDONLY | O_CLOEXEC);
if (obj->fd < 0) {
err = -errno;
pr_warn("failed to open file '%s': %d\n", filename, err);
return err;
}
obj->elf = elf_begin(obj->fd, ELF_C_READ_MMAP, NULL);
if (!obj->elf) {
err = -errno;
pr_warn_elf("failed to parse ELF file '%s'", filename);
return err;
pr_warn_elf("failed to parse ELF file '%s'", obj->filename);
return -EINVAL;
}
/* Sanity check ELF file high-level properties */
ehdr = elf64_getehdr(obj->elf);
if (!ehdr) {
err = -errno;
pr_warn_elf("failed to get ELF header for %s", filename);
return err;
pr_warn_elf("failed to get ELF header for %s", obj->filename);
return -EINVAL;
}
if (ehdr->e_ident[EI_DATA] != host_endianness) {
/* Linker output endianness set by first input object */
obj_byteorder = ehdr->e_ident[EI_DATA];
if (obj_byteorder != ELFDATA2LSB && obj_byteorder != ELFDATA2MSB) {
err = -EOPNOTSUPP;
pr_warn_elf("unsupported byte order of ELF file %s", filename);
pr_warn("unknown byte order of ELF file %s\n", obj->filename);
return err;
}
if (link_byteorder == ELFDATANONE) {
linker->elf_hdr->e_ident[EI_DATA] = obj_byteorder;
linker->swapped_endian = obj_byteorder != host_byteorder;
pr_debug("linker: set %s-endian output byte order\n",
obj_byteorder == ELFDATA2MSB ? "big" : "little");
} else if (link_byteorder != obj_byteorder) {
err = -EOPNOTSUPP;
pr_warn("byte order mismatch with ELF file %s\n", obj->filename);
return err;
}
if (ehdr->e_type != ET_REL
|| ehdr->e_machine != EM_BPF
|| ehdr->e_ident[EI_CLASS] != ELFCLASS64) {
err = -EOPNOTSUPP;
pr_warn_elf("unsupported kind of ELF file %s", filename);
pr_warn_elf("unsupported kind of ELF file %s", obj->filename);
return err;
}
if (elf_getshdrstrndx(obj->elf, &obj->shstrs_sec_idx)) {
err = -errno;
pr_warn_elf("failed to get SHSTRTAB section index for %s", filename);
return err;
pr_warn_elf("failed to get SHSTRTAB section index for %s", obj->filename);
return -EINVAL;
}
scn = NULL;
@@ -603,26 +734,23 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
shdr = elf64_getshdr(scn);
if (!shdr) {
err = -errno;
pr_warn_elf("failed to get section #%zu header for %s",
sec_idx, filename);
return err;
sec_idx, obj->filename);
return -EINVAL;
}
sec_name = elf_strptr(obj->elf, obj->shstrs_sec_idx, shdr->sh_name);
if (!sec_name) {
err = -errno;
pr_warn_elf("failed to get section #%zu name for %s",
sec_idx, filename);
return err;
sec_idx, obj->filename);
return -EINVAL;
}
data = elf_getdata(scn, 0);
if (!data) {
err = -errno;
pr_warn_elf("failed to get section #%zu (%s) data from %s",
sec_idx, sec_name, filename);
return err;
sec_idx, sec_name, obj->filename);
return -EINVAL;
}
sec = add_src_sec(obj, sec_name);
@@ -656,7 +784,8 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
obj->btf = btf__new(data->d_buf, shdr->sh_size);
err = libbpf_get_error(obj->btf);
if (err) {
pr_warn("failed to parse .BTF from %s: %d\n", filename, err);
pr_warn("failed to parse .BTF from %s: %s\n",
obj->filename, errstr(err));
return err;
}
sec->skipped = true;
@@ -666,7 +795,8 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
obj->btf_ext = btf_ext__new(data->d_buf, shdr->sh_size);
err = libbpf_get_error(obj->btf_ext);
if (err) {
pr_warn("failed to parse .BTF.ext from '%s': %d\n", filename, err);
pr_warn("failed to parse .BTF.ext from '%s': %s\n",
obj->filename, errstr(err));
return err;
}
sec->skipped = true;
@@ -683,7 +813,7 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
break;
default:
pr_warn("unrecognized section #%zu (%s) in %s\n",
sec_idx, sec_name, filename);
sec_idx, sec_name, obj->filename);
err = -EINVAL;
return err;
}
@@ -1109,6 +1239,24 @@ static bool sec_content_is_same(struct dst_sec *dst_sec, struct src_sec *src_sec
return true;
}
static bool is_exec_sec(struct dst_sec *sec)
{
if (!sec || sec->ephemeral)
return false;
return (sec->shdr->sh_type == SHT_PROGBITS) &&
(sec->shdr->sh_flags & SHF_EXECINSTR);
}
static void exec_sec_bswap(void *raw_data, int size)
{
const int insn_cnt = size / sizeof(struct bpf_insn);
struct bpf_insn *insn = raw_data;
int i;
for (i = 0; i < insn_cnt; i++, insn++)
bpf_insn_bswap(insn);
}
static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src_sec *src)
{
void *tmp;
@@ -1168,6 +1316,10 @@ static int extend_sec(struct bpf_linker *linker, struct dst_sec *dst, struct src
memset(dst->raw_data + dst->sec_sz, 0, dst_align_sz - dst->sec_sz);
/* now copy src data at a properly aligned offset */
memcpy(dst->raw_data + dst_align_sz, src->data->d_buf, src->shdr->sh_size);
/* convert added bpf insns to native byte-order */
if (linker->swapped_endian && is_exec_sec(dst))
exec_sec_bswap(dst->raw_data + dst_align_sz, src->shdr->sh_size);
}
dst->sec_sz = dst_final_sz;
@@ -1224,7 +1376,7 @@ static int linker_append_sec_data(struct bpf_linker *linker, struct src_obj *obj
} else {
if (!secs_match(dst_sec, src_sec)) {
pr_warn("ELF sections %s are incompatible\n", src_sec->sec_name);
return -1;
return -EINVAL;
}
/* "license" and "version" sections are deduped */
@@ -1413,7 +1565,7 @@ recur:
return true;
case BTF_KIND_PTR:
/* just validate overall shape of the referenced type, so no
* contents comparison for struct/union, and allowd fwd vs
* contents comparison for struct/union, and allowed fwd vs
* struct/union
*/
exact = false;
@@ -1962,7 +2114,7 @@ static int linker_append_elf_sym(struct bpf_linker *linker, struct src_obj *obj,
/* If existing symbol is a strong resolved symbol, bail out,
* because we lost resolution battle have nothing to
* contribute. We already checked abover that there is no
* contribute. We already checked above that there is no
* strong-strong conflict. We also already tightened binding
* and visibility, so nothing else to contribute at that point.
*/
@@ -2011,7 +2163,7 @@ add_sym:
obj->sym_map[src_sym_idx] = dst_sym_idx;
if (sym_type == STT_SECTION && dst_sym) {
if (sym_type == STT_SECTION && dst_sec) {
dst_sec->sec_sym_idx = dst_sym_idx;
dst_sym->st_value = 0;
}
@@ -2071,7 +2223,7 @@ static int linker_append_elf_relos(struct bpf_linker *linker, struct src_obj *ob
}
} else if (!secs_match(dst_sec, src_sec)) {
pr_warn("sections %s are not compatible\n", src_sec->sec_name);
return -1;
return -EINVAL;
}
/* shdr->sh_link points to SYMTAB */
@@ -2415,6 +2567,10 @@ static int linker_append_btf(struct bpf_linker *linker, struct src_obj *obj)
if (glob_sym && glob_sym->var_idx >= 0) {
__s64 sz;
/* FUNCs don't have size, nothing to update */
if (btf_is_func(t))
continue;
dst_var = &dst_sec->sec_vars[glob_sym->var_idx];
/* Because underlying BTF type might have
* changed, so might its size have changed, so
@@ -2628,27 +2784,32 @@ int bpf_linker__finalize(struct bpf_linker *linker)
if (!sec->scn)
continue;
/* restore sections with bpf insns to target byte-order */
if (linker->swapped_endian && is_exec_sec(sec))
exec_sec_bswap(sec->raw_data, sec->sec_sz);
sec->data->d_buf = sec->raw_data;
}
/* Finalize ELF layout */
if (elf_update(linker->elf, ELF_C_NULL) < 0) {
err = -errno;
err = -EINVAL;
pr_warn_elf("failed to finalize ELF layout");
return libbpf_err(err);
}
/* Write out final ELF contents */
if (elf_update(linker->elf, ELF_C_WRITE) < 0) {
err = -errno;
err = -EINVAL;
pr_warn_elf("failed to write ELF contents");
return libbpf_err(err);
}
elf_end(linker->elf);
close(linker->fd);
linker->elf = NULL;
if (linker->fd_is_owned)
close(linker->fd);
linker->fd = -1;
return 0;
@@ -2696,6 +2857,7 @@ static int emit_elf_data_sec(struct bpf_linker *linker, const char *sec_name,
static int finalize_btf(struct bpf_linker *linker)
{
enum btf_endianness link_endianness;
LIBBPF_OPTS(btf_dedup_opts, opts);
struct btf *btf = linker->btf;
const void *raw_data;
@@ -2729,17 +2891,24 @@ static int finalize_btf(struct bpf_linker *linker)
err = finalize_btf_ext(linker);
if (err) {
pr_warn(".BTF.ext generation failed: %d\n", err);
pr_warn(".BTF.ext generation failed: %s\n", errstr(err));
return err;
}
opts.btf_ext = linker->btf_ext;
err = btf__dedup(linker->btf, &opts);
if (err) {
pr_warn("BTF dedup failed: %d\n", err);
pr_warn("BTF dedup failed: %s\n", errstr(err));
return err;
}
/* Set .BTF and .BTF.ext output byte order */
link_endianness = linker->elf_hdr->e_ident[EI_DATA] == ELFDATA2MSB ?
BTF_BIG_ENDIAN : BTF_LITTLE_ENDIAN;
btf__set_endianness(linker->btf, link_endianness);
if (linker->btf_ext)
btf_ext__set_endianness(linker->btf_ext, link_endianness);
/* Emit .BTF section */
raw_data = btf__raw_data(linker->btf, &raw_sz);
if (!raw_data)
@@ -2747,7 +2916,7 @@ static int finalize_btf(struct bpf_linker *linker)
err = emit_elf_data_sec(linker, BTF_ELF_SEC, 8, raw_data, raw_sz);
if (err) {
pr_warn("failed to write out .BTF ELF section: %d\n", err);
pr_warn("failed to write out .BTF ELF section: %s\n", errstr(err));
return err;
}
@@ -2759,7 +2928,7 @@ static int finalize_btf(struct bpf_linker *linker)
err = emit_elf_data_sec(linker, BTF_EXT_ELF_SEC, 8, raw_data, raw_sz);
if (err) {
pr_warn("failed to write out .BTF.ext ELF section: %d\n", err);
pr_warn("failed to write out .BTF.ext ELF section: %s\n", errstr(err));
return err;
}
}
@@ -2935,7 +3104,7 @@ static int finalize_btf_ext(struct bpf_linker *linker)
err = libbpf_get_error(linker->btf_ext);
if (err) {
linker->btf_ext = NULL;
pr_warn("failed to parse final .BTF.ext data: %d\n", err);
pr_warn("failed to parse final .BTF.ext data: %s\n", errstr(err));
goto out;
}

View File

@@ -63,16 +63,16 @@ static int validate_nla(struct nlattr *nla, int maxtype,
minlen = nla_attr_minlen[pt->type];
if (libbpf_nla_len(nla) < minlen)
return -1;
return -EINVAL;
if (pt->maxlen && libbpf_nla_len(nla) > pt->maxlen)
return -1;
return -EINVAL;
if (pt->type == LIBBPF_NLA_STRING) {
char *data = libbpf_nla_data(nla);
if (data[libbpf_nla_len(nla) - 1] != '\0')
return -1;
return -EINVAL;
}
return 0;
@@ -118,19 +118,18 @@ int libbpf_nla_parse(struct nlattr *tb[], int maxtype, struct nlattr *head,
if (policy) {
err = validate_nla(nla, maxtype, policy);
if (err < 0)
goto errout;
return err;
}
if (tb[type])
if (tb[type]) {
pr_warn("Attribute of type %#x found multiple times in message, "
"previous attribute is being ignored.\n", type);
}
tb[type] = nla;
}
err = 0;
errout:
return err;
return 0;
}
/**

View File

@@ -683,7 +683,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
{
const struct bpf_core_accessor *acc;
const struct btf_type *t;
__u32 byte_off, byte_sz, bit_off, bit_sz, field_type_id;
__u32 byte_off, byte_sz, bit_off, bit_sz, field_type_id, elem_id;
const struct btf_member *m;
const struct btf_type *mt;
bool bitfield;
@@ -706,8 +706,14 @@ static int bpf_core_calc_field_relo(const char *prog_name,
if (!acc->name) {
if (relo->kind == BPF_CORE_FIELD_BYTE_OFFSET) {
*val = spec->bit_offset / 8;
/* remember field size for load/store mem size */
sz = btf__resolve_size(spec->btf, acc->type_id);
/* remember field size for load/store mem size;
* note, for arrays we care about individual element
* sizes, not the overall array size
*/
t = skip_mods_and_typedefs(spec->btf, acc->type_id, &elem_id);
while (btf_is_array(t))
t = skip_mods_and_typedefs(spec->btf, btf_array(t)->type, &elem_id);
sz = btf__resolve_size(spec->btf, elem_id);
if (sz < 0)
return -EINVAL;
*field_sz = sz;
@@ -767,7 +773,17 @@ static int bpf_core_calc_field_relo(const char *prog_name,
case BPF_CORE_FIELD_BYTE_OFFSET:
*val = byte_off;
if (!bitfield) {
*field_sz = byte_sz;
/* remember field size for load/store mem size;
* note, for arrays we care about individual element
* sizes, not the overall array size
*/
t = skip_mods_and_typedefs(spec->btf, field_type_id, &elem_id);
while (btf_is_array(t))
t = skip_mods_and_typedefs(spec->btf, btf_array(t)->type, &elem_id);
sz = btf__resolve_size(spec->btf, elem_id);
if (sz < 0)
return -EINVAL;
*field_sz = sz;
*type_id = field_type_id;
}
break;
@@ -1339,7 +1355,7 @@ int bpf_core_calc_relo_insn(const char *prog_name,
cands->cands[i].id, cand_spec);
if (err < 0) {
bpf_core_format_spec(spec_buf, sizeof(spec_buf), cand_spec);
pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n ",
pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n",
prog_name, relo_idx, i, spec_buf, err);
return err;
}

View File

@@ -21,6 +21,7 @@
#include "libbpf.h"
#include "libbpf_internal.h"
#include "bpf.h"
#include "str_error.h"
struct ring {
ring_buffer_sample_fn sample_cb;
@@ -88,8 +89,8 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
err = bpf_map_get_info_by_fd(map_fd, &info, &len);
if (err) {
err = -errno;
pr_warn("ringbuf: failed to get map info for fd=%d: %d\n",
map_fd, err);
pr_warn("ringbuf: failed to get map info for fd=%d: %s\n",
map_fd, errstr(err));
return libbpf_err(err);
}
@@ -123,8 +124,8 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
tmp = mmap(NULL, rb->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, map_fd, 0);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n",
map_fd, err);
pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %s\n",
map_fd, errstr(err));
goto err_out;
}
r->consumer_pos = tmp;
@@ -142,8 +143,8 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
tmp = mmap(NULL, (size_t)mmap_sz, PROT_READ, MAP_SHARED, map_fd, rb->page_size);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %d\n",
map_fd, err);
pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %s\n",
map_fd, errstr(err));
goto err_out;
}
r->producer_pos = tmp;
@@ -156,8 +157,8 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
e->data.fd = rb->ring_cnt;
if (epoll_ctl(rb->epoll_fd, EPOLL_CTL_ADD, map_fd, e) < 0) {
err = -errno;
pr_warn("ringbuf: failed to epoll add map fd=%d: %d\n",
map_fd, err);
pr_warn("ringbuf: failed to epoll add map fd=%d: %s\n",
map_fd, errstr(err));
goto err_out;
}
@@ -205,7 +206,7 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
rb->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (rb->epoll_fd < 0) {
err = -errno;
pr_warn("ringbuf: failed to create epoll instance: %d\n", err);
pr_warn("ringbuf: failed to create epoll instance: %s\n", errstr(err));
goto err_out;
}
@@ -458,7 +459,8 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
err = bpf_map_get_info_by_fd(map_fd, &info, &len);
if (err) {
err = -errno;
pr_warn("user ringbuf: failed to get map info for fd=%d: %d\n", map_fd, err);
pr_warn("user ringbuf: failed to get map info for fd=%d: %s\n",
map_fd, errstr(err));
return err;
}
@@ -474,8 +476,8 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
tmp = mmap(NULL, rb->page_size, PROT_READ, MAP_SHARED, map_fd, 0);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("user ringbuf: failed to mmap consumer page for map fd=%d: %d\n",
map_fd, err);
pr_warn("user ringbuf: failed to mmap consumer page for map fd=%d: %s\n",
map_fd, errstr(err));
return err;
}
rb->consumer_pos = tmp;
@@ -494,8 +496,8 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
map_fd, rb->page_size);
if (tmp == MAP_FAILED) {
err = -errno;
pr_warn("user ringbuf: failed to mmap data pages for map fd=%d: %d\n",
map_fd, err);
pr_warn("user ringbuf: failed to mmap data pages for map fd=%d: %s\n",
map_fd, errstr(err));
return err;
}
@@ -506,7 +508,7 @@ static int user_ringbuf_map(struct user_ring_buffer *rb, int map_fd)
rb_epoll->events = EPOLLOUT;
if (epoll_ctl(rb->epoll_fd, EPOLL_CTL_ADD, map_fd, rb_epoll) < 0) {
err = -errno;
pr_warn("user ringbuf: failed to epoll add map fd=%d: %d\n", map_fd, err);
pr_warn("user ringbuf: failed to epoll add map fd=%d: %s\n", map_fd, errstr(err));
return err;
}
@@ -531,7 +533,7 @@ user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts *opts)
rb->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (rb->epoll_fd < 0) {
err = -errno;
pr_warn("user ringbuf: failed to create epoll instance: %d\n", err);
pr_warn("user ringbuf: failed to create epoll instance: %s\n", errstr(err));
goto err_out;
}

View File

@@ -107,7 +107,7 @@ static inline void skel_free(const void *p)
* The loader program will perform probe_read_kernel() from maps.rodata.initial_value.
* skel_finalize_map_data() sets skel->rodata to point to actual value in a bpf map and
* does maps.rodata.initial_value = ~0ULL to signal skel_free_map_data() that kvfree
* is not nessary.
* is not necessary.
*
* For user space:
* skel_prep_map_data() mmaps anon memory into skel->rodata that can be accessed directly.
@@ -351,10 +351,11 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
attr.test.ctx_size_in = opts->ctx->sz;
err = skel_sys_bpf(BPF_PROG_RUN, &attr, test_run_attr_sz);
if (err < 0 || (int)attr.test.retval < 0) {
opts->errstr = "failed to execute loader prog";
if (err < 0) {
opts->errstr = "failed to execute loader prog";
set_err;
} else {
opts->errstr = "error returned by loader prog";
err = (int)attr.test.retval;
#ifndef __KERNEL__
errno = -err;

View File

@@ -5,6 +5,10 @@
#include <errno.h>
#include "str_error.h"
#ifndef ENOTSUPP
#define ENOTSUPP 524
#endif
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
@@ -31,3 +35,70 @@ char *libbpf_strerror_r(int err, char *dst, int len)
}
return dst;
}
const char *libbpf_errstr(int err)
{
static __thread char buf[12];
if (err > 0)
err = -err;
switch (err) {
case -E2BIG: return "-E2BIG";
case -EACCES: return "-EACCES";
case -EADDRINUSE: return "-EADDRINUSE";
case -EADDRNOTAVAIL: return "-EADDRNOTAVAIL";
case -EAGAIN: return "-EAGAIN";
case -EALREADY: return "-EALREADY";
case -EBADF: return "-EBADF";
case -EBADFD: return "-EBADFD";
case -EBUSY: return "-EBUSY";
case -ECANCELED: return "-ECANCELED";
case -ECHILD: return "-ECHILD";
case -EDEADLK: return "-EDEADLK";
case -EDOM: return "-EDOM";
case -EEXIST: return "-EEXIST";
case -EFAULT: return "-EFAULT";
case -EFBIG: return "-EFBIG";
case -EILSEQ: return "-EILSEQ";
case -EINPROGRESS: return "-EINPROGRESS";
case -EINTR: return "-EINTR";
case -EINVAL: return "-EINVAL";
case -EIO: return "-EIO";
case -EISDIR: return "-EISDIR";
case -ELOOP: return "-ELOOP";
case -EMFILE: return "-EMFILE";
case -EMLINK: return "-EMLINK";
case -EMSGSIZE: return "-EMSGSIZE";
case -ENAMETOOLONG: return "-ENAMETOOLONG";
case -ENFILE: return "-ENFILE";
case -ENODATA: return "-ENODATA";
case -ENODEV: return "-ENODEV";
case -ENOENT: return "-ENOENT";
case -ENOEXEC: return "-ENOEXEC";
case -ENOLINK: return "-ENOLINK";
case -ENOMEM: return "-ENOMEM";
case -ENOSPC: return "-ENOSPC";
case -ENOTBLK: return "-ENOTBLK";
case -ENOTDIR: return "-ENOTDIR";
case -ENOTSUPP: return "-ENOTSUPP";
case -ENOTTY: return "-ENOTTY";
case -ENXIO: return "-ENXIO";
case -EOPNOTSUPP: return "-EOPNOTSUPP";
case -EOVERFLOW: return "-EOVERFLOW";
case -EPERM: return "-EPERM";
case -EPIPE: return "-EPIPE";
case -EPROTO: return "-EPROTO";
case -EPROTONOSUPPORT: return "-EPROTONOSUPPORT";
case -ERANGE: return "-ERANGE";
case -EROFS: return "-EROFS";
case -ESPIPE: return "-ESPIPE";
case -ESRCH: return "-ESRCH";
case -ETXTBSY: return "-ETXTBSY";
case -EUCLEAN: return "-EUCLEAN";
case -EXDEV: return "-EXDEV";
default:
snprintf(buf, sizeof(buf), "%d", err);
return buf;
}
}

View File

@@ -6,4 +6,14 @@
char *libbpf_strerror_r(int err, char *dst, int len);
/**
* @brief **libbpf_errstr()** returns string corresponding to numeric errno
* @param err negative numeric errno
* @return pointer to string representation of the errno, that is invalidated
* upon the next call.
*/
const char *libbpf_errstr(int err);
#define errstr(err) libbpf_errstr(err)
#endif /* __LIBBPF_STR_ERROR_H */

View File

@@ -39,7 +39,7 @@ enum __bpf_usdt_arg_type {
struct __bpf_usdt_arg_spec {
/* u64 scalar interpreted depending on arg_type, see below */
__u64 val_off;
/* arg location case, see bpf_udst_arg() for details */
/* arg location case, see bpf_usdt_arg() for details */
enum __bpf_usdt_arg_type arg_type;
/* offset of referenced register within struct pt_regs */
short reg_off;
@@ -108,6 +108,38 @@ int bpf_usdt_arg_cnt(struct pt_regs *ctx)
return spec->arg_cnt;
}
/* Returns the size in bytes of the #*arg_num* (zero-indexed) USDT argument.
* Returns negative error if argument is not found or arg_num is invalid.
*/
static __always_inline
int bpf_usdt_arg_size(struct pt_regs *ctx, __u64 arg_num)
{
struct __bpf_usdt_arg_spec *arg_spec;
struct __bpf_usdt_spec *spec;
int spec_id;
spec_id = __bpf_usdt_spec_id(ctx);
if (spec_id < 0)
return -ESRCH;
spec = bpf_map_lookup_elem(&__bpf_usdt_specs, &spec_id);
if (!spec)
return -ESRCH;
if (arg_num >= BPF_USDT_MAX_ARG_CNT)
return -ENOENT;
barrier_var(arg_num);
if (arg_num >= spec->arg_cnt)
return -ENOENT;
arg_spec = &spec->args[arg_num];
/* arg_spec->arg_bitshift = 64 - arg_sz * 8
* so: arg_sz = (64 - arg_spec->arg_bitshift) / 8
*/
return (unsigned int)(64 - arg_spec->arg_bitshift) / 8;
}
/* Fetch USDT argument #*arg_num* (zero-indexed) and put its value into *res.
* Returns 0 on success; negative error, otherwise.
* On error *res is guaranteed to be set to zero.

View File

@@ -20,6 +20,7 @@
#include "libbpf_common.h"
#include "libbpf_internal.h"
#include "hashmap.h"
#include "str_error.h"
/* libbpf's USDT support consists of BPF-side state/code and user-space
* state/code working together in concert. BPF-side parts are defined in
@@ -465,8 +466,8 @@ static int parse_vma_segs(int pid, const char *lib_path, struct elf_seg **segs,
goto proceed;
if (!realpath(lib_path, path)) {
pr_warn("usdt: failed to get absolute path of '%s' (err %d), using path as is...\n",
lib_path, -errno);
pr_warn("usdt: failed to get absolute path of '%s' (err %s), using path as is...\n",
lib_path, errstr(-errno));
libbpf_strlcpy(path, lib_path, sizeof(path));
}
@@ -475,8 +476,8 @@ proceed:
f = fopen(line, "re");
if (!f) {
err = -errno;
pr_warn("usdt: failed to open '%s' to get base addr of '%s': %d\n",
line, lib_path, err);
pr_warn("usdt: failed to open '%s' to get base addr of '%s': %s\n",
line, lib_path, errstr(err));
return err;
}
@@ -606,7 +607,8 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
err = parse_elf_segs(elf, path, &segs, &seg_cnt);
if (err) {
pr_warn("usdt: failed to process ELF program segments for '%s': %d\n", path, err);
pr_warn("usdt: failed to process ELF program segments for '%s': %s\n",
path, errstr(err));
goto err_out;
}
@@ -659,7 +661,7 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
* [0] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
*/
usdt_abs_ip = note.loc_addr;
if (base_addr)
if (base_addr && note.base_addr)
usdt_abs_ip += base_addr - note.base_addr;
/* When attaching uprobes (which is what USDTs basically are)
@@ -708,8 +710,8 @@ static int collect_usdt_targets(struct usdt_manager *man, Elf *elf, const char *
if (vma_seg_cnt == 0) {
err = parse_vma_segs(pid, path, &vma_segs, &vma_seg_cnt);
if (err) {
pr_warn("usdt: failed to get memory segments in PID %d for shared library '%s': %d\n",
pid, path, err);
pr_warn("usdt: failed to get memory segments in PID %d for shared library '%s': %s\n",
pid, path, errstr(err));
goto err_out;
}
}
@@ -1047,8 +1049,8 @@ struct bpf_link *usdt_manager_attach_usdt(struct usdt_manager *man, const struct
if (is_new && bpf_map_update_elem(spec_map_fd, &spec_id, &target->spec, BPF_ANY)) {
err = -errno;
pr_warn("usdt: failed to set USDT spec #%d for '%s:%s' in '%s': %d\n",
spec_id, usdt_provider, usdt_name, path, err);
pr_warn("usdt: failed to set USDT spec #%d for '%s:%s' in '%s': %s\n",
spec_id, usdt_provider, usdt_name, path, errstr(err));
goto err_out;
}
if (!man->has_bpf_cookie &&
@@ -1058,9 +1060,9 @@ struct bpf_link *usdt_manager_attach_usdt(struct usdt_manager *man, const struct
pr_warn("usdt: IP collision detected for spec #%d for '%s:%s' in '%s'\n",
spec_id, usdt_provider, usdt_name, path);
} else {
pr_warn("usdt: failed to map IP 0x%lx to spec #%d for '%s:%s' in '%s': %d\n",
pr_warn("usdt: failed to map IP 0x%lx to spec #%d for '%s:%s' in '%s': %s\n",
target->abs_ip, spec_id, usdt_provider, usdt_name,
path, err);
path, errstr(err));
}
goto err_out;
}
@@ -1076,8 +1078,8 @@ struct bpf_link *usdt_manager_attach_usdt(struct usdt_manager *man, const struct
target->rel_ip, &opts);
err = libbpf_get_error(uprobe_link);
if (err) {
pr_warn("usdt: failed to attach uprobe #%d for '%s:%s' in '%s': %d\n",
i, usdt_provider, usdt_name, path, err);
pr_warn("usdt: failed to attach uprobe #%d for '%s:%s' in '%s': %s\n",
i, usdt_provider, usdt_name, path, errstr(err));
goto err_out;
}
@@ -1099,8 +1101,8 @@ struct bpf_link *usdt_manager_attach_usdt(struct usdt_manager *man, const struct
NULL, &opts_multi);
if (!link->multi_link) {
err = -errno;
pr_warn("usdt: failed to attach uprobe multi for '%s:%s' in '%s': %d\n",
usdt_provider, usdt_name, path, err);
pr_warn("usdt: failed to attach uprobe multi for '%s:%s' in '%s': %s\n",
usdt_provider, usdt_name, path, errstr(err));
goto err_out;
}

View File

@@ -223,7 +223,7 @@ struct zip_archive *zip_archive_open(const char *path)
if (!archive) {
munmap(data, size);
return ERR_PTR(-ENOMEM);
};
}
archive->data = data;
archive->size = size;