libbpf

mirror of https://github.com/netdata/libbpf.git synced 2026-03-13 21:09:07 +08:00

Author	SHA1	Message	Date
Mykyta Yatsenko	02bdeb7a2c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 8e64c387c942229c551d0f23de4d9993d3a2acb6 Checkpoint bpf-next commit: 9325d53fe9adff354b6a93fda5f38c165947da0f Baseline bpf commit: b4432656b36e5cc1d50a1f2dc15357543add530e Checkpoint bpf commit: b4432656b36e5cc1d50a1f2dc15357543add530e Andrii Nakryiko (1): libbpf: Improve BTF dedup handling of "identical" BTF types Anton Protopopov (3): libbpf: Use proper errno value in linker bpf: Fix uninitialized values in BPF_{CORE,PROBE}_READ libbpf: Use proper errno value in nlattr Jiri Olsa (1): bpf: Add support to retrieve ref_ctr_offset for uprobe perf link Mykyta Yatsenko (1): libbpf: Check bpf_map_skeleton link for NULL include/uapi/linux/bpf.h | 1 + src/bpf_core_read.h | 6 ++ src/btf.c | 137 +++++++++++++++++++++++++-------------- src/libbpf.c | 6 ++ src/linker.c | 4 +- src/nlattr.c | 15 ++--- 6 files changed, 111 insertions(+), 58 deletions(-) Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>	2025-05-19 10:07:42 -07:00
Mykyta Yatsenko	453601a65a	libbpf: Check bpf_map_skeleton link for NULL Avoid dereferencing bpf_map_skeleton's link field if it's NULL. If BPF map skeleton is created with the size, that indicates containing link field, but the field was not actually initialized with valid bpf_link pointer, libbpf crashes. This may happen when using libbpf-rs skeleton. Skeleton loading may still progress, but user needs to attach struct_ops map separately. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250514113220.219095-1-mykyta.yatsenko5@gmail.com	2025-05-19 10:07:42 -07:00
Anton Protopopov	8e34ca4e8f	libbpf: Use proper errno value in nlattr Return value of the validate_nla() function can be propagated all the way up to users of libbpf API. In case of error this libbpf version of validate_nla returns -1 which will be seen as -EPERM from user's point of view. Instead, return a more reasonable -EINVAL. Fixes: bbf48c18ee0c ("libbpf: add error reporting in XDP") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250510182011.2246631-1-a.s.protopopov@gmail.com	2025-05-19 10:07:42 -07:00
Jiri Olsa	eda0e4ca46	bpf: Add support to retrieve ref_ctr_offset for uprobe perf link Adding support to retrieve ref_ctr_offset for uprobe perf link, which got somehow omitted from the initial uprobe link info changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/bpf/20250509153539.779599-2-jolsa@kernel.org	2025-05-19 10:07:42 -07:00
Andrii Nakryiko	5ee9fbf7d7	libbpf: Improve BTF dedup handling of "identical" BTF types BTF dedup has a strong assumption that compiler will deduplicate identical types within any given compilation unit (i.e., .c file). This property is used when establishing equivalence of two subgraphs of types. Unfortunately, this property doesn't always hold in practice. We've seen cases of having truly identical structs, unions, array definitions, and, most recently, even pointers to the same type being duplicated within CU. Previously, we mitigated this on a case-by-case basis, adding a few simple heuristics for validating that two BTF types (having two different type IDs) are structurally the same. But this approach scales poorly, and we can have more weird cases come up in the future. So let's take a half-step back, and implement a bit more generic structural equivalence check, recursively. We still limit it to reasonable depth to avoid long reference loops. Depth-wise limiting of potentially cyclical graph isn't great, but as I mentioned below doesn't seem to be detrimental performance-wise. We can always improve this in the future with per-type visited markers, if necessary. Performance-wise this doesn't seem to affect vmlinux BTF dedup, which makes sense because this logic kicks in not so frequently and only if we already established a canonical candidate type match, but suddenly find a different (but probably identical) type. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20250501235231.1339822-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-19 10:07:42 -07:00
Anton Protopopov	d88ca95133	bpf: Fix uninitialized values in BPF_{CORE,PROBE}_READ With the latest LLVM bpf selftests build will fail with the following error message: progs/profiler.inc.h:710:31: error: default initialization of an object of type 'typeof ((parent_task)->real_cred->uid.val)' (aka 'const unsigned int') leaves the object uninitialized and is incompatible with C++ [-Werror,-Wdefault-const-init-unsafe] 710 | proc_exec_data->parent_uid = BPF_CORE_READ(parent_task, real_cred, uid.val); | ^ tools/testing/selftests/bpf/tools/include/bpf/bpf_core_read.h:520:35: note: expanded from macro 'BPF_CORE_READ' 520 | ___type((src), a, ##__VA_ARGS__) __r; \ | ^ This happens because BPF_CORE_READ (and other macros) declare the variable __r using the ___type macro which can inherit const modifier from intermediate types. Fix this by using __typeof_unqual__, when supported. (And when it is not supported, the problem shouldn't appear, as older compilers haven't complained.) Fixes: 792001f4f7aa ("libbpf: Add user-space variants of BPF_CORE_READ() family of macros") Fixes: a4b09a9ef945 ("libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family") Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250502193031.3522715-1-a.s.protopopov@gmail.com	2025-05-19 10:07:42 -07:00
Anton Protopopov	28deee2663	libbpf: Use proper errno value in linker Return values of the linker_append_sec_data() and the linker_append_elf_relos() functions are propagated all the way up to users of libbpf API. In some error cases these functions return -1 which will be seen as -EPERM from user's point of view. Instead, return a more reasonable -EINVAL. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250430120820.2262053-1-a.s.protopopov@gmail.com	2025-05-19 10:07:42 -07:00
Andrii Nakryiko	374f7807e1	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 25601e85441dd91cf7973b002f27af4c5b8691ea Checkpoint bpf-next commit: 8e64c387c942229c551d0f23de4d9993d3a2acb6 Baseline bpf commit: 3f8ad18f8184 Checkpoint bpf commit: b4432656b36e5cc1d50a1f2dc15357543add530e Alan Maguire (1): libbpf: Add identical pointer detection to btf_dedup_is_equiv() Anton Protopopov (2): bpf: Fix a comment describing bpf_attr libbpf: Add likely/unlikely macros and use them in selftests Carlos Llamas (1): libbpf: Fix implicit memfd_create() for bionic Feng Yang (1): libbpf: Fix event name too long error Ihor Solodrai (1): libbpf: Verify section type in btf_find_elf_sections Jonathan Wiepert (1): Use thread-safe function pointer in libbpf_print Mykyta Yatsenko (1): libbpf: Add getters for BTF.ext func and line info Paul Chaignon (2): bpf: Clarify role of BPF_F_RECOMPUTE_CSUM bpf: Clarify the meaning of BPF_F_PSEUDO_HDR Tao Chen (1): libbpf: Remove sample_period init in perf_buffer Viktor Malik (1): libbpf: Fix buffer overflow in bpf_object__init_prog include/uapi/linux/bpf.h | 18 +++++---- src/bpf_helpers.h | 8 ++++ src/btf.c | 22 +++++++++++ src/libbpf.c | 81 +++++++++++++++++++++------------------- src/libbpf.h | 6 +++ src/libbpf.map | 4 ++ src/libbpf_internal.h | 9 +++++ src/linker.c | 2 +- 8 files changed, 103 insertions(+), 47 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-04-29 11:33:37 -07:00
Andrii Nakryiko	27dc274f68	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-04-29 11:33:37 -07:00
Alan Maguire	c8a28812fb	libbpf: Add identical pointer detection to btf_dedup_is_equiv() Recently as a side-effect of commit ac053946f5c4 ("compiler.h: introduce TYPEOF_UNQUAL() macro") issues were observed in deduplication between modules and kernel BTF such that a large number of kernel types were not deduplicated so were found in module BTF (task_struct, bpf_prog etc). The root cause appeared to be a failure to dedup struct types, specifically those with members that were pointers with __percpu annotations. The issue in dedup is at the point that we are deduplicating structures, we have not yet deduplicated reference types like pointers. If multiple copies of a pointer point at the same (deduplicated) integer as in this case, we do not see them as identical. Special handling already exists to deal with structures and arrays, so add pointer handling here too. Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250429161042.2069678-1-alan.maguire@oracle.com	2025-04-29 11:33:37 -07:00
Jonathan Wiepert	88ae865423	Use thread-safe function pointer in libbpf_print This patch fixes a thread safety bug where libbpf_print uses the global variable storing the print function pointer rather than the local variable that had the print function set via __atomic_load_n. Fixes: f1cb927cdb62 ("libbpf: Ensure print callback usage is thread-safe") Signed-off-by: Jonathan Wiepert <jonathan.wiepert@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com> Link: https://lore.kernel.org/bpf/20250424221457.793068-1-jonathan.wiepert@gmail.com	2025-04-29 11:33:37 -07:00
Tao Chen	a2dc135196	libbpf: Remove sample_period init in perf_buffer It seems that sample_period is not used in perf buffer. Actually, only wakeup_events are meaningful to enable events aggregation for wakeup notification. Remove sample_period setting code to avoid confusion. Fixes: fb84b8224655 ("libbpf: add perf buffer API") Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/bpf/20250423163901.2983689-1-chen.dylane@linux.dev	2025-04-29 11:33:37 -07:00
Feng Yang	715808d3e2	libbpf: Fix event name too long error When the binary path is excessively long, the generated probe_name in libbpf exceeds the kernel's MAX_EVENT_NAME_LEN limit (64 bytes). This causes legacy uprobe event attachment to fail with error code -22. The fix reorders the fields to place the unique ID before the name. This ensures that even if truncation occurs via snprintf, the unique ID remains intact, preserving event name uniqueness. Additionally, explicit checks with MAX_EVENT_NAME_LEN are added to enforce length constraints. Before Fix: ./test_progs -t attach_probe/kprobe-long_name ...... libbpf: failed to add legacy kprobe event for 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0': -EINVAL libbpf: prog 'handle_kprobe': failed to create kprobe 'bpf_testmod_looooooooooooooooooooooooooooooong_name+0x0' perf event: -EINVAL test_attach_kprobe_long_event_name:FAIL:attach_kprobe_long_event_name unexpected error: -22 test_attach_probe:PASS:uprobe_ref_ctr_cleanup 0 nsec #13/11 attach_probe/kprobe-long_name:FAIL #13 attach_probe:FAIL ./test_progs -t attach_probe/uprobe-long_name ...... 
libbpf: failed to add legacy uprobe event for /root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9: -EINVAL libbpf: prog 'handle_uprobe': failed to create uprobe '/root/linux-bpf/bpf-next/tools/testing/selftests/bpf/test_progs:0x13efd9' perf event: -EINVAL test_attach_uprobe_long_event_name:FAIL:attach_uprobe_long_event_name unexpected error: -22 #13/10 attach_probe/uprobe-long_name:FAIL #13 attach_probe:FAIL After Fix: ./test_progs -t attach_probe/uprobe-long_name #13/10 attach_probe/uprobe-long_name:OK #13 attach_probe:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED ./test_progs -t attach_probe/kprobe-long_name #13/11 attach_probe/kprobe-long_name:OK #13 attach_probe:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Fixes: 46ed5fc33db9 ("libbpf: Refactor and simplify legacy kprobe code") Fixes: cc10623c6810 ("libbpf: Add legacy uprobe attaching support") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250417014848.59321-2-yangfeng59949@163.com	2025-04-29 11:33:37 -07:00
Ihor Solodrai	02bc656f90	libbpf: Verify section type in btf_find_elf_sections A valid ELF file may contain a SHT_NOBITS .BTF section. This case is not handled correctly in btf_parse_elf, which leads to a segfault. Before attempting to load BTF section data, check that the section type is SHT_PROGBITS, which is the expected type for BTF data. Fail with an error if the type is different. Bug report: https://github.com/libbpf/libbpf/issues/894 v1: https://lore.kernel.org/bpf/20250408184104.3962949-1-ihor.solodrai@linux.dev/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250410182823.1591681-1-ihor.solodrai@linux.dev	2025-04-29 11:33:37 -07:00
Viktor Malik	806b4e0a9f	libbpf: Fix buffer overflow in bpf_object__init_prog As shown in [1], it is possible to corrupt a BPF ELF file such that arbitrary BPF instructions are loaded by libbpf. This can be done by setting a symbol (BPF program) section offset to a large (unsigned) number such that <section start + symbol offset> overflows and points before the section data in the memory. Consider the situation below where: - prog_start = sec_start + symbol_offset <-- size_t overflow here - prog_end = prog_start + prog_size prog_start sec_start prog_end sec_end | | | | v v v v .....................|################################|............ The report in [1] also provides a corrupted BPF ELF which can be used as a reproducer: $ readelf -S crash Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align ... [ 2] uretprobe.mu[...] PROGBITS 0000000000000000 00000040 0000000000000068 0000000000000000 AX 0 0 8 $ readelf -s crash Symbol table '.symtab' contains 8 entries: Num: Value Size Type Bind Vis Ndx Name ... 6: ffffffffffffffb8 104 FUNC GLOBAL DEFAULT 2 handle_tp Here, the handle_tp prog has section offset ffffffffffffffb8, i.e. will point before the actual memory where section 2 is allocated.
This is also reported by AddressSanitizer: ================================================================= ==1232==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7c7302fe0000 at pc 0x7fc3046e4b77 bp 0x7ffe64677cd0 sp 0x7ffe64677490 READ of size 104 at 0x7c7302fe0000 thread T0 #0 0x7fc3046e4b76 in memcpy (/lib64/libasan.so.8+0xe4b76) #1 0x00000040df3e in bpf_object__init_prog /src/libbpf/src/libbpf.c:856 #2 0x00000040df3e in bpf_object__add_programs /src/libbpf/src/libbpf.c:928 #3 0x00000040df3e in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3930 #4 0x00000040df3e in bpf_object_open /src/libbpf/src/libbpf.c:8067 #5 0x00000040f176 in bpf_object__open_file /src/libbpf/src/libbpf.c:8090 #6 0x000000400c16 in main /poc/poc.c:8 #7 0x7fc3043d25b4 in __libc_start_call_main (/lib64/libc.so.6+0x35b4) #8 0x7fc3043d2667 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x3667) #9 0x000000400b34 in _start (/poc/poc+0x400b34) 0x7c7302fe0000 is located 64 bytes before 104-byte region [0x7c7302fe0040,0x7c7302fe00a8) allocated by thread T0 here: #0 0x7fc3046e716b in malloc (/lib64/libasan.so.8+0xe716b) #1 0x7fc3045ee600 in __libelf_set_rawdata_wrlock (/lib64/libelf.so.1+0xb600) #2 0x7fc3045ef018 in __elf_getdata_rdlock (/lib64/libelf.so.1+0xc018) #3 0x00000040642f in elf_sec_data /src/libbpf/src/libbpf.c:3740 The problem here is that currently, libbpf only checks that the program end is within the section bounds. There used to be a check `while (sec_off < sec_sz)` in bpf_object__add_programs, however, it was removed by commit 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions"). Add a check for detecting the overflow of `sec_off + prog_sz` to bpf_object__init_prog to fix this issue. 
[1] https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions") Reported-by: lmarch2 <2524158037@qq.com> Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://github.com/lmarch2/poc/blob/main/libbpf/libbpf.md Link: https://lore.kernel.org/bpf/20250415155014.397603-1-vmalik@redhat.com	2025-04-29 11:33:37 -07:00
Paul Chaignon	0c85f5a154	bpf: Clarify the meaning of BPF_F_PSEUDO_HDR In the bpf_l4_csum_replace helper, the BPF_F_PSEUDO_HDR flag should only be set if the modified header field is part of the pseudo-header. If you modify for example the UDP ports and pass BPF_F_PSEUDO_HDR, inet_proto_csum_replace4 will update skb->csum even though it shouldn't (the port and the UDP checksum updates null each other). Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/5126ef84ba75425b689482cbc98bffe75e5d8ab0.1744102490.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-04-29 11:33:37 -07:00
Paul Chaignon	7de6a44a0f	bpf: Clarify role of BPF_F_RECOMPUTE_CSUM BPF_F_RECOMPUTE_CSUM doesn't update the actual L3 and L4 checksums in the packet, but simply updates skb->csum (according to skb->ip_summed). This patch clarifies that to avoid confusion. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Link: https://lore.kernel.org/r/ff6895d42936f03dbb82334d8bcfd50e00c79086.1744102490.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-04-29 11:33:37 -07:00
Mykyta Yatsenko	abdb15bedd	libbpf: Add getters for BTF.ext func and line info Introducing new libbpf API getters for BTF.ext func and line info, namely: bpf_program__func_info bpf_program__func_info_cnt bpf_program__line_info bpf_program__line_info_cnt This change enables scenarios, when user needs to load bpf_program directly using `bpf_prog_load`, instead of higher-level `bpf_object__load`. Line and func info are required for checking BTF info in verifier; verification may fail without these fields if, for example, program calls `bpf_obj_new`. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250408234417.452565-2-mykyta.yatsenko5@gmail.com	2025-04-29 11:33:37 -07:00
Anton Protopopov	7a1388d55f	libbpf: Add likely/unlikely macros and use them in selftests A few selftests and, more importantly, consequent changes to the bpf_helpers.h file, use likely/unlikely macros, so define them here and remove duplicate definitions from existing selftests. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250331203618.1973691-3-a.s.protopopov@gmail.com	2025-04-29 11:33:37 -07:00
Anton Protopopov	4687560af9	bpf: Fix a comment describing bpf_attr The map_fd field of the bpf_attr union is used in the BPF_MAP_FREEZE syscall. Explicitly mention this in the comments. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250331203618.1973691-2-a.s.protopopov@gmail.com	2025-04-29 11:33:37 -07:00
Carlos Llamas	732d6c011f	libbpf: Fix implicit memfd_create() for bionic Since memfd_create() is not consistently available across different bionic libc implementations, using memfd_create() directly can break some Android builds: tools/lib/bpf/linker.c:576:7: error: implicit declaration of function 'memfd_create' [-Werror,-Wimplicit-function-declaration] 576 | fd = memfd_create(filename, 0); | ^ To fix this, relocate and inline the sys_memfd_create() helper so that it can be used in "linker.c". Similar issues were previously fixed by commit 9fa5e1a180aa ("libbpf: Call memfd_create() syscall directly"). Fixes: 6d5e5e5d7ce1 ("libbpf: Extend linker API to support in-memory ELF files") Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250330211325.530677-1-cmllamas@google.com	2025-04-29 11:33:37 -07:00
Mykyta Yatsenko	4659eaafa4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 488a8544f839048064ea0647596af2aaca7ecc25 Checkpoint bpf-next commit: 25601e85441dd91cf7973b002f27af4c5b8691ea Baseline bpf commit: 46e88299d19695c2b21e245c52a86ed26ed5cfee Checkpoint bpf commit: 0c2623cef4f49e1ef6a908a389eea86130d11057 David Wei (1): netdev: add io_uring memory provider info Ian Rogers (1): libbpf: Add namespace for errstr making it libbpf_errstr Jason Xing (6): bpf: Add networking timestamping support to bpf_get/setsockopt() bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback Joe Damato (1): netdev-genl: Add an XSK attribute to queues Kan Liang (1): perf: Extend per event callchain limit to branch stack Mykyta Yatsenko (2): bpf: BPF token support for BPF_BTF_GET_FD_BY_ID libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID Song Yoong Siang (1): xsk: Add launch time hardware offload support to XDP Tx metadata include/uapi/linux/bpf.h | 31 +++++++++++++++++++++++++++++++ include/uapi/linux/if_xdp.h | 10 ++++++++++ include/uapi/linux/netdev.h | 16 ++++++++++++++++ include/uapi/linux/perf_event.h | 2 ++ src/bpf.c | 3 ++- src/bpf.h | 3 ++- src/btf.c | 15 +++++++++++++-- src/libbpf.c | 10 +++++----- src/libbpf_internal.h | 1 + src/str_error.c | 2 +- src/str_error.h | 7 +++++-- 11 files changed, 88 insertions(+), 12 deletions(-) Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>	2025-04-02 14:24:25 -07:00
Ian Rogers	2a228c7885	libbpf: Add namespace for errstr making it libbpf_errstr When statically linking symbols can be replaced with those from other statically linked libraries depending on the link order and the hoped for "multiple definition" error may not appear. To avoid conflicts it is good practice to namespace symbols, this change renames errstr to libbpf_errstr. To avoid churn a #define is used to turn use of errstr(err) to libbpf_errstr(err). Fixes: 1633a83bf993 ("libbpf: Introduce errstr() for stringifying errno") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250320222439.1350187-1-irogers@google.com	2025-04-02 14:24:25 -07:00
Mykyta Yatsenko	90844e28dc	libbpf: Pass BPF token from find_prog_btf_id to BPF_BTF_GET_FD_BY_ID Pass BPF token from bpf_program__set_attach_target to BPF_BTF_GET_FD_BY_ID bpf command. When freplace program attaches to target program, it needs to look up for BTF of the target, this may require BPF token, if, for example, running from user namespace. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20250317174039.161275-4-mykyta.yatsenko5@gmail.com	2025-04-02 14:24:25 -07:00
Mykyta Yatsenko	009a8cb452	bpf: BPF token support for BPF_BTF_GET_FD_BY_ID Currently BPF_BTF_GET_FD_BY_ID requires CAP_SYS_ADMIN, which does not allow running it from user namespace. This creates a problem when freplace program running from user namespace needs to query target program BTF. This patch relaxes capable check from CAP_SYS_ADMIN to CAP_BPF and adds support for BPF token that can be passed in attributes to syscall. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250317174039.161275-2-mykyta.yatsenko5@gmail.com	2025-04-02 14:24:25 -07:00
Song Yoong Siang	89cad6a160	xsk: Add launch time hardware offload support to XDP Tx metadata Extend the XDP Tx metadata framework so that users can request launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250216093430.957880-2-yoong.siang.song@intel.com	2025-04-02 14:24:25 -07:00
Jason Xing	5cbd13ee02	bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback This patch introduces a new callback in tcp_tx_timestamp() to correlate tcp_sendmsg timestamp with timestamps from other tx timestamping callbacks (e.g., SND/SW/ACK). Without this patch, BPF program wouldn't know which timestamps belong to which flow because of no socket lock protection. This new callback is inserted in tcp_tx_timestamp() to address this issue because tcp_tx_timestamp() still owns the same socket lock with tcp_sendmsg_locked() in the meanwhile tcp_tx_timestamp() initializes the timestamping related fields for the skb, especially tskey. The tskey is the bridge to do the correlation. For TCP, BPF program hooks the beginning of tcp_sendmsg_locked() and then stores the sendmsg timestamp at the bpf_sk_storage, correlating this timestamp with its tskey that are later used in other sending timestamping callbacks. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-11-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Jason Xing	79e19bb62b	bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback Support the ACK case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_ACK_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_ACK. The BPF program can use it to get the same SCM_TSTAMP_ACK timestamp without modifying the user-space application. This patch extends txstamp_ack to two bits: 1 stands for SO_TIMESTAMPING mode, 2 bpf extension. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-10-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Jason Xing	253b5ce758	bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback Support hw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_HW_CB. This callback will occur at the same timestamping point as the user space's hardware SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers from driver side using SKBTX_HW_TSTAMP. The new definition of SKBTX_HW_TSTAMP means the combination tests of socket timestamping and bpf timestamping. After this patch, drivers can work under the bpf timestamping. Considering some drivers don't assign the skb with hardware timestamp, this patch does the assignment and then BPF program can acquire the hwstamp from skb directly. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-9-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Jason Xing	d855493df1	bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback Support sw SCM_TSTAMP_SND case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SND_SW_CB. This callback will occur at the same timestamping point as the user space's software SCM_TSTAMP_SND. The BPF program can use it to get the same SCM_TSTAMP_SND timestamp without modifying the user-space application. Based on this patch, BPF program will get the software timestamp when the driver is ready to send the skb. In the subsequent patch, the hardware timestamp will be supported. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-8-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Jason Xing	7ea10cfba8	bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback Support SCM_TSTAMP_SCHED case for bpf timestamping. Add a new sock_ops callback, BPF_SOCK_OPS_TSTAMP_SCHED_CB. This callback will occur at the same timestamping point as the user space's SCM_TSTAMP_SCHED. The BPF program can use it to get the same SCM_TSTAMP_SCHED timestamp without modifying the user-space application. A new SKBTX_BPF flag is added to mark skb_shinfo(skb)->tx_flags, ensuring that the new BPF timestamping and the current user space's SO_TIMESTAMPING do not interfere with each other. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-7-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Jason Xing	43b6c2cd70	bpf: Add networking timestamping support to bpf_get/setsockopt() The new SK_BPF_CB_FLAGS and new SK_BPF_CB_TX_TIMESTAMPING are added to bpf_get/setsockopt. The later patches will implement the BPF networking timestamping. The BPF program will use bpf_setsockopt(SK_BPF_CB_FLAGS, SK_BPF_CB_TX_TIMESTAMPING) to enable the BPF networking timestamping on a socket. Signed-off-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250220072940.99994-2-kerneljasonxing@gmail.com	2025-04-02 14:24:25 -07:00
Joe Damato	b1cb441916	netdev-genl: Add an XSK attribute to queues Expose a new per-queue nest attribute, xsk, which will be present for queues that are being used for AF_XDP. If the queue is not being used for AF_XDP, the nest will not be present. In the future, this attribute can be extended to include more data about XSK as it is needed. Signed-off-by: Joe Damato <jdamato@fastly.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250214211255.14194-3-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-02 14:24:25 -07:00
David Wei	01500813ad	netdev: add io_uring memory provider info Add a nested attribute for io_uring memory provider info. For now it is empty and its presence indicates that a particular page pool or queue has an io_uring memory provider attached. $ ./cli.py --spec netlink/specs/netdev.yaml --dump page-pool-get [{'id': 80, 'ifindex': 2, 'inflight': 64, 'inflight-mem': 262144, 'napi-id': 525}, {'id': 79, 'ifindex': 2, 'inflight': 320, 'inflight-mem': 1310720, 'io_uring': {}, 'napi-id': 525}, ... $ ./cli.py --spec netlink/specs/netdev.yaml --dump queue-get [{'id': 0, 'ifindex': 1, 'type': 'rx'}, {'id': 0, 'ifindex': 1, 'type': 'tx'}, {'id': 0, 'ifindex': 2, 'napi-id': 513, 'type': 'rx'}, {'id': 1, 'ifindex': 2, 'napi-id': 514, 'type': 'rx'}, ... {'id': 12, 'ifindex': 2, 'io_uring': {}, 'napi-id': 525, 'type': 'rx'}, ... Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-6-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-04-02 14:24:25 -07:00
Kan Liang	59171f49e9	perf: Extend per event callchain limit to branch stack The commit 97c79a38cd45 ("perf core: Per event callchain limit") introduced a per-event term to allow finer tuning of the depth of callchains to save space. It should be applied to the branch stack as well. For example, autoFDO collections require maximum LBR entries. In the meantime, other system-wide LBR users may only be interested in the latest few LBRs. A per-event LBR depth would save space in the perf output buffer. The patch simply drops the branches that are not of interest, but HW still collects the maximum branches. There may be a model-specific optimization that can reduce the HW depth for some cases to reduce the overhead further, but it isn't included in the patch set because it's not useful for all cases. For example, ARCH LBR can utilize the PEBS and XSAVE to collect LBRs. The depth should have less impact on the collecting overhead. The model-specific optimization may be implemented later separately. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250310181536.3645382-1-kan.liang@linux.intel.com	2025-04-02 14:24:25 -07:00
Ihor Solodrai	1b8768339f	ci: add temporary patches for selftests * https://lore.kernel.org/all/20250327185528.1740787-1-song@kernel.org/ * https://lore.kernel.org/bpf/20250328193124.808784-1-song@kernel.org/ * https://lore.kernel.org/bpf/20250331033828.365077-1-yonghong.song@linux.dev/ Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-04-02 10:23:23 -07:00
Ihor Solodrai	374036c9f1	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 239860828f8660e2be487e2fbdae2640cce3fd67 Checkpoint bpf-next commit: 79d93c8ff35855d3283ee7d82dfe0c54f90b9986 Baseline bpf commit: 319fc77f8f45a1b3dba15b0cc1a869778fd222f7 Checkpoint bpf commit: 6ccf6adb05d0fe3dbb1a77ab90bf054da8a2198d Ihor Solodrai (1): libbpf: Implement bpf_usdt_arg_size BPF function Mykyta Yatsenko (3): libbpf: Use map_is_created helper in map setters libbpf: Introduce more granular state for bpf_object libbpf: Split bpf object load into prepare/load Nandakumar Edamana (1): libbpf: Fix out-of-bound read Peilin Ye (1): bpf: Introduce load-acquire and store-release instructions Yonghong Song (1): bpf: Allow pre-ordering for bpf cgroup progs include/uapi/linux/bpf.h \| 4 + src/libbpf.c \| 201 ++++++++++++++++++++++++++------------- src/libbpf.h \| 13 +++ src/libbpf.map \| 1 + src/usdt.bpf.h \| 32 +++++++ 5 files changed, 183 insertions(+), 68 deletions(-) Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
Peilin Ye	bf62e0dcfd	bpf: Introduce load-acquire and store-release instructions Introduce BPF instructions with load-acquire and store-release semantics, as discussed in [1]. Define 2 new flags: #define BPF_LOAD_ACQ 0x100 #define BPF_STORE_REL 0x110 A "load-acquire" is a BPF_STX \| BPF_ATOMIC instruction with the 'imm' field set to BPF_LOAD_ACQ (0x100). Similarly, a "store-release" is a BPF_STX \| BPF_ATOMIC instruction with the 'imm' field set to BPF_STORE_REL (0x110). Unlike existing atomic read-modify-write operations that only support BPF_W (32-bit) and BPF_DW (64-bit) size modifiers, load-acquires and store-releases also support BPF_B (8-bit) and BPF_H (16-bit). As an exception, however, 64-bit load-acquires/store-releases are not supported on 32-bit architectures (to fix a build error reported by the kernel test robot). An 8- or 16-bit load-acquire zero-extends the value before writing it to a 32-bit register, just like ARM64 instruction LDARH and friends. Similar to existing atomic read-modify-write operations, misaligned load-acquires/store-releases are not allowed (even if BPF_F_ANY_ALIGNMENT is set). As an example, consider the following 64-bit load-acquire BPF instruction (assuming little-endian): db 10 00 00 00 01 00 00 r0 = load_acquire((u64 *)(r1 + 0x0)) opcode (0xdb): BPF_ATOMIC \| BPF_DW \| BPF_STX imm (0x00000100): BPF_LOAD_ACQ Similarly, a 16-bit BPF store-release: cb 21 00 00 10 01 00 00 store_release((u16 *)(r1 + 0x0), w2) opcode (0xcb): BPF_ATOMIC \| BPF_H \| BPF_STX imm (0x00000110): BPF_STORE_REL In arch/{arm64,s390,x86}/net/bpf_jit_comp.c, have bpf_jit_supports_insn(..., /*in_arena=*/true) return false for the new instructions, until the corresponding JIT compiler supports them in arena. 
[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/a217f46f0e445fbd573a1a024be5c6bf1d5fe716.1741049567.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
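The two byte sequences quoted in the commit message above can be reproduced with a small encoder. This is a minimal sketch, not kernel or libbpf code: `encode_insn()` and the two wrappers are hypothetical helpers, the layout assumed is the little-endian one the message mentions (opcode, then dst register in the low nibble and src register in the high nibble, then 16-bit offset, then 32-bit immediate), and the opcode constants are redefined locally so the block does not depend on kernel headers.

```c
#include <stdint.h>

/* Opcode field values, redefined locally to keep the sketch header-free. */
#define BPF_STX       0x03
#define BPF_H         0x08
#define BPF_DW        0x18
#define BPF_ATOMIC    0xc0
/* 'imm' values introduced by the commit message above. */
#define BPF_LOAD_ACQ  0x100
#define BPF_STORE_REL 0x110

/* Hypothetical helper: little-endian 8-byte encoding of one BPF instruction. */
static void encode_insn(uint8_t out[8], uint8_t opcode, uint8_t dst_reg,
                        uint8_t src_reg, int16_t off, int32_t imm)
{
    uint16_t uoff = (uint16_t)off;
    uint32_t uimm = (uint32_t)imm;

    out[0] = opcode;
    /* dst register in the low nibble, src register in the high nibble */
    out[1] = (uint8_t)(((src_reg & 0x0f) << 4) | (dst_reg & 0x0f));
    out[2] = (uint8_t)(uoff & 0xff);
    out[3] = (uint8_t)(uoff >> 8);
    out[4] = (uint8_t)(uimm & 0xff);
    out[5] = (uint8_t)((uimm >> 8) & 0xff);
    out[6] = (uint8_t)((uimm >> 16) & 0xff);
    out[7] = (uint8_t)(uimm >> 24);
}

/* r0 = load_acquire((u64 *)(r1 + 0x0)) */
static void encode_load_acquire_u64(uint8_t out[8])
{
    encode_insn(out, BPF_ATOMIC | BPF_DW | BPF_STX, 0, 1, 0, BPF_LOAD_ACQ);
}

/* store_release((u16 *)(r1 + 0x0), w2) */
static void encode_store_release_u16(uint8_t out[8])
{
    encode_insn(out, BPF_ATOMIC | BPF_H | BPF_STX, 1, 2, 0, BPF_STORE_REL);
}
```

Under these assumptions the first wrapper yields `db 10 00 00 00 01 00 00` and the second `cb 21 00 00 10 01 00 00`, matching the examples in the message.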
Mykyta Yatsenko	855a5d7904	libbpf: Split bpf object load into prepare/load Introduce bpf_object__prepare API: additional intermediate preparation step that performs ELF processing, relocations, prepares final state of BPF program instructions (accessible with bpf_program__insns()), creates and (potentially) pins maps, and stops short of loading BPF programs. We anticipate a few use cases for this API, such as: * Use prepare to initialize bpf_token, without loading freplace programs, unlocking the possibility to look up BTF of other programs. * Execute prepare to obtain finalized BPF program instructions without loading programs, enabling tools like veristat to process one program at a time, without incurring the cost of ELF parsing and processing. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-4-mykyta.yatsenko5@gmail.com Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
Mykyta Yatsenko	16c58c33c8	libbpf: Introduce more granular state for bpf_object We are going to split bpf_object loading into 2 stages: preparation and loading. This will increase flexibility when working with bpf_object and unlock some optimizations and use cases. This patch substitutes a boolean flag (loaded) by more finely-grained state for bpf_object. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-3-mykyta.yatsenko5@gmail.com Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
Mykyta Yatsenko	e14bb3629f	libbpf: Use map_is_created helper in map setters Refactoring: use map_is_created helper in map setters that need to check the state of the map. This helps to reduce the number of the places that depend explicitly on the loaded flag, simplifying refactoring in the next patch of this set. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250303135752.158343-2-mykyta.yatsenko5@gmail.com Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
Yonghong Song	fbda5d7d2f	bpf: Allow pre-ordering for bpf cgroup progs Currently for bpf progs in a cgroup hierarchy, the effective prog array is computed from bottom cgroup to upper cgroups (post-ordering). For example, the following cgroup hierarchy root cgroup: p1, p2 subcgroup: p3, p4 have BPF_F_ALLOW_MULTI for both cgroup levels. The effective cgroup array ordering looks like p3 p4 p1 p2 and at run time, progs will execute based on that order. But in some cases, it is desirable to have root prog executes earlier than children progs (pre-ordering). For example, - prog p1 intends to collect original pkt dest addresses. - prog p3 will modify original pkt dest addresses to a proxy address for security reason. The end result is that prog p1 gets proxy address which is not what it wants. Putting p1 to every child cgroup is not desirable either as it will duplicate itself in many child cgroups. And this is exactly a use case we are encountering in Meta. To fix this issue, let us introduce a flag BPF_F_PREORDER. If the flag is specified at attachment time, the prog has higher priority and the ordering with that flag will be from top to bottom (pre-ordering). For example, in the above example, root cgroup: p1, p2 subcgroup: p3, p4 Let us say p2 and p4 are marked with BPF_F_PREORDER. The final effective array ordering will be p2 p4 p3 p1 Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250224230116.283071-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
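The ordering rule described in the commit message above can be modelled in a few lines. This is a simplified sketch rather than the kernel's implementation: `struct prog`, `struct cgroup_level`, and `effective_order()` are hypothetical names, and the model assumes that flagged programs are placed first from the root down, followed by unflagged programs from the leaf up.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; 'preorder' stands in for BPF_F_PREORDER. */
struct prog {
    const char *name;
    bool preorder;
};

struct cgroup_level {
    const struct prog *progs;
    size_t nr;
};

/*
 * levels[0] is the root cgroup, levels[nr_levels - 1] the innermost one.
 * Writes program names into 'out' in effective order and returns the count.
 */
static size_t effective_order(const struct cgroup_level *levels,
                              size_t nr_levels, const char **out)
{
    size_t n = 0;

    /* Pass 1: pre-ordered programs, from top to bottom. */
    for (size_t l = 0; l < nr_levels; l++)
        for (size_t i = 0; i < levels[l].nr; i++)
            if (levels[l].progs[i].preorder)
                out[n++] = levels[l].progs[i].name;

    /* Pass 2: all other programs, from bottom to top (post-ordering). */
    for (size_t l = nr_levels; l-- > 0; )
        for (size_t i = 0; i < levels[l].nr; i++)
            if (!levels[l].progs[i].preorder)
                out[n++] = levels[l].progs[i].name;

    return n;
}
```

With root = {p1, p2 (flagged)} and subcgroup = {p3, p4 (flagged)}, this model produces p2 p4 p3 p1, the ordering given in the message; with no flags set it produces p3 p4 p1 p2.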
Ihor Solodrai	be18fdb16a	libbpf: Implement bpf_usdt_arg_size BPF function Information about USDT argument size is implicitly stored in __bpf_usdt_arg_spec, but currently it's not accessible to BPF programs that use USDT. Implement bpf_usdt_arg_size() that returns the size of a USDT argument in bytes. v1->v2: * do not add __bpf_usdt_arg_spec() helper v1: https://lore.kernel.org/bpf/20250220215904.3362709-1-ihor.solodrai@linux.dev/ Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20250224235756.2612606-1-ihor.solodrai@linux.dev	2025-03-10 15:35:17 -07:00
Nandakumar Edamana	82f60c9b5e	libbpf: Fix out-of-bound read In `set_kcfg_value_str`, an untrusted string is accessed with the assumption that it will be at least two characters long due to the presence of checks for opening and closing quotes. But the check for the closing quote (value[len - 1] != '"') misses the fact that it could be checking the opening quote itself in case of an invalid input that consists of just the opening quote. This commit adds an explicit check to make sure the string is at least two characters long. Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250221210110.3182084-1-nandakumar@nandakumar.co.in Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-03-10 15:35:17 -07:00
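The missing length check described above can be illustrated with a standalone validator. This is a minimal sketch and not the body of `set_kcfg_value_str`; `is_quoted_string()` is a hypothetical helper that only shows why the length must be tested before the two quote checks are trusted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validator: returns true only if 'value' starts and ends
 * with a double quote AND those are two distinct characters.
 */
static bool is_quoted_string(const char *value)
{
    size_t len = strlen(value);

    /*
     * Without this check, the input "\"" (a lone opening quote) passes both
     * quote tests below, because value[0] and value[len - 1] are the same
     * character. Later code that assumes two quotes would then misbehave.
     */
    if (len < 2)
        return false;

    return value[0] == '"' && value[len - 1] == '"';
}
```

A lone `"` and the empty string are rejected, while `""` and `"abc"` are accepted.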
Andrii Nakryiko	4c893341f5	Makefile: detect pkg-config availability Detect whether build system has pkg-config tool, and if not, fallback to manually specifying -lelf -lz as dependency. Closes: https://github.com/libbpf/libbpf/issues/885 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-03-03 18:59:47 -08:00
Ihor Solodrai	42a6ef6316	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 01f3ce5328c405179b2c69ea047c423dad2bfa6d Checkpoint bpf-next commit: 239860828f8660e2be487e2fbdae2640cce3fd67 Baseline bpf commit: c45323b7560ec87c37c729b703c86ee65f136d75 Checkpoint bpf commit: 319fc77f8f45a1b3dba15b0cc1a869778fd222f7 Andrii Nakryiko (2): libbpf: fix LDX/STX/ST CO-RE relocation size adjustment logic libbpf: Fix hypothetical STT_SECTION extern NULL deref case Daniel Borkmann (1): netkit: Allow for configuring needed_{head,tail}room Ihor Solodrai (3): libbpf: Introduce kflag for type_tags and decl_tags in BTF docs/bpf: Document the semantics of BTF tags with kind_flag libbpf: Check the kflag of type tags in btf_dump Tao Chen (1): libbpf: Wrap libbpf API direct err with libbpf_err Tony Ambardar (1): libbpf: Fix accessing BTF.ext core_relo header Yonghong Song (1): bpf: Sync uapi bpf.h header for the tooling infra include/uapi/linux/bpf.h \| 5 +- include/uapi/linux/btf.h \| 3 +- include/uapi/linux/if_link.h \| 2 + src/btf.c \| 90 ++++++++++++++++++++++++++---------- src/btf.h \| 3 ++ src/btf_dump.c \| 5 +- src/libbpf.c \| 26 +++++------ src/libbpf.map \| 2 + src/linker.c \| 2 +- src/relo_core.c \| 24 ++++++++-- 10 files changed, 116 insertions(+), 46 deletions(-) Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Andrii Nakryiko	041d5948f3	libbpf: Fix hypothetical STT_SECTION extern NULL deref case Fix theoretical NULL dereference in linker when resolving extern STT_SECTION symbol against not-yet-existing ELF section. Not sure if it's possible in practice for valid ELF object files (this would require embedded assembly manipulations, at which point BTF will be missing), but fix the s/dst_sym/dst_sec/ typo guarding this condition anyways. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250220002821.834400-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Tao Chen	39a589c74e	libbpf: Wrap libbpf API direct err with libbpf_err Just wrap the direct err with libbpf_err, keep consistency with other APIs. Signed-off-by: Tao Chen <chen.dylane@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20250219153711.29651-1-chen.dylane@linux.dev Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Andrii Nakryiko	d7a4ab1548	libbpf: fix LDX/STX/ST CO-RE relocation size adjustment logic Libbpf has a somewhat obscure feature of automatically adjusting the "size" of LDX/STX/ST instruction (memory store and load instructions), based on originally recorded access size (u8, u16, u32, or u64) and the actual size of the field on target kernel. This is meant to facilitate using BPF CO-RE on 32-bit architectures (pointers are always 64-bit in BPF, but host kernel's BTF will have it as 32-bit type), as well as generally supporting safe type changes (unsigned integer type changes can be transparently "relocated"). One issue that surfaced only now, 5 years after this logic was implemented, is how this all works when dealing with fields that are arrays. This isn't all that easy and straightforward to hit (see selftests that reproduce this condition), but one of the sched_ext BPF programs did hit it with an innocent-looking loop. Long story short, libbpf used to calculate the entire array size, instead of making sure to only calculate the array's element size. But it's the element that is loaded by LDX/STX/ST instructions (1, 2, 4, or 8 bytes), so that's what libbpf should check. This patch adjusts the logic for arrays and fixes the issue. Reported-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250207014809.1573841-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Yonghong Song	4eed43c229	bpf: Sync uapi bpf.h header for the tooling infra Commit 0abff462d802 ("bpf: Add comment about helper freeze") missed the tooling header sync. Fix it. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250213050427.2788837-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Ihor Solodrai	cc278ff7c0	libbpf: Check the kflag of type tags in btf_dump If the kflag is set for a BTF type tag, then the tag represents an arbitrary __attribute__. Change btf_dump accordingly. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250130201239.1429648-4-ihor.solodrai@linux.dev	2025-02-24 15:10:59 -08:00
Ihor Solodrai	2b8b896bca	docs/bpf: Document the semantics of BTF tags with kind_flag Explain the meaning of kind_flag in BTF type_tags and decl_tags. Update uapi btf.h kind_flag comment to reflect the changes. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250130201239.1429648-3-ihor.solodrai@linux.dev	2025-02-24 15:10:59 -08:00
Ihor Solodrai	32bda80136	libbpf: Introduce kflag for type_tags and decl_tags in BTF Add the following functions to libbpf API: * btf__add_type_attr() * btf__add_decl_attr() These functions allow to add to BTF the type tags and decl tags with info->kflag set to 1. The kflag indicates that the tag directly encodes an __attribute__ and not a normal tag. See Documentation/bpf/btf.rst changes in the subsequent patch for details on the semantics. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250130201239.1429648-2-ihor.solodrai@linux.dev	2025-02-24 15:10:59 -08:00
Tony Ambardar	71208c3362	libbpf: Fix accessing BTF.ext core_relo header Update btf_ext_parse_info() to ensure the core_relo header is present before reading its fields. This avoids a potential buffer read overflow reported by the OSS Fuzz project. Fixes: cf579164e9ea ("libbpf: Support BTF.ext loading and output in either endianness") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://issues.oss-fuzz.com/issues/388905046 Link: https://lore.kernel.org/bpf/20250125065236.2603346-1-itugrok@yahoo.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Daniel Borkmann	9544a909f1	netkit: Allow for configuring needed_{head,tail}room Allow the user to configure needed_{head,tail}room for both netkit devices. The idea is similar to 163e529200af ("veth: implement ndo_set_rx_headroom") with the difference that the two parameters can be specified upon device creation. By default the current behavior stays as is which is needed_{head,tail}room is 0. In case of Cilium, for example, the netkit devices are not enslaved into a bridge or openvswitch device (rather, BPF-based redirection is used out of tcx), and as such these parameters are not propagated into the Pod's netns via peer device. Given Cilium can run in vxlan/geneve tunneling mode (needed_headroom) and/or be used in combination with WireGuard (needed_{head,tail}room), allow the Cilium CNI plugin to specify these two upon netkit device creation. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/bpf/20241220234658.490686-1-daniel@iogearbox.net Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-02-24 15:10:59 -08:00
Ihor Solodrai	d4a841a32b	ci: remove dependency on run-on-arch-action run-on-arch-action is simply a wrapper around docker. There is no value in using it in libbpf, as it is not complicated to run non-native arch docker images directly on github-hosted runners. Docker relies on qemu-user-static installed on the system to emulate different architectures. Recently there were various reports about multi-arch docker builds failing with seemingly random issues, and it appears to boil down to qemu [1]. I stumbled on this problem while updating s390x runners [2] for BPF CI, and setting up more recent version of qemu helped. This change addresses recent build failures on s390x and ppc64le. [1] https://github.com/docker/setup-qemu-action/issues/188 [2] https://github.com/kernel-patches/runner/pull/69 [3] https://docs.docker.com/build/buildkit/#getting-started Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev>	2025-01-31 22:30:52 -08:00
Ihor Solodrai	324f3c3846	ci: run coverity scan on push to master Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2025-01-17 13:53:19 -08:00
Ihor Solodrai	63528b7a4d	ci: remove sourcing helpers.sh from coverity workflow Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2025-01-17 13:53:19 -08:00
Andrii Nakryiko	7abfe520df	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f44275e7155dc310d36516fc25be503da099781c Checkpoint bpf-next commit: 01f3ce5328c405179b2c69ea047c423dad2bfa6d Baseline bpf commit: 9d89551994a430b50c4fffcb1e617a057fa76e20 Checkpoint bpf commit: c45323b7560ec87c37c729b703c86ee65f136d75 Andrii Nakryiko (1): libbpf: Work around kernel inconsistently stripping '.llvm.' suffix Pu Lehui (2): libbpf: Fix return zero when elf_begin failed libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED Vishal Chourasia (1): tools: Sync if_xdp.h uapi tooling header Yonghong Song (1): libbpf: Add unique_match option for multi kprobe include/uapi/linux/if_xdp.h \| 4 ++-- src/btf.c \| 1 + src/btf_relocate.c \| 2 +- src/libbpf.c \| 39 +++++++++++++++++++++++++++++++++++-- src/libbpf.h \| 4 +++- 5 files changed, 44 insertions(+), 6 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2025-01-17 12:31:44 -08:00
Vishal Chourasia	d76c770473	tools: Sync if_xdp.h uapi tooling header Sync if_xdp.h uapi header to remove following warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h' Fixes: 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com> Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20250115032248.125742-1-yoong.siang.song@intel.com	2025-01-17 12:31:44 -08:00
Andrii Nakryiko	444f3c0e7a	libbpf: Work around kernel inconsistently stripping '.llvm.' suffix Some versions of kernel were stripping out '.llvm.<hash>' suffix from kernel symbols (produced by Clang LTO compilation) from function names reported in available_filter_functions, while kallsyms reported full original name. This confuses libbpf's multi-kprobe logic of finding all matching kernel functions for specified user glob pattern by joining available_filter_functions and kallsyms contents, because joining by full symbol name won't work for symbols containing '.llvm.<hash>' suffix. This was eventually fixed by [0] in the kernel, but we'd like to not regress multi-kprobe experience and add a workaround for this bug on libbpf side, stripping kallsym's name if it matches user pattern and contains '.llvm.' suffix. [0] fb6a421fb615 ("kallsyms: Match symbols exactly with CONFIG_LTO_CLANG") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20250117003957.179331-1-andrii@kernel.org	2025-01-17 12:31:44 -08:00
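The suffix stripping mentioned above can be sketched as a plain string operation. This is an illustration only, not libbpf's code: `strip_llvm_suffix()` is a hypothetical helper, and it omits the additional condition from the message that the name must also match the user's pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies 'sym' into 'buf', dropping a trailing
 * ".llvm.<hash>" part if one is present. Returns false if 'buf' is too small.
 */
static bool strip_llvm_suffix(const char *sym, char *buf, size_t buf_sz)
{
    const char *suffix = strstr(sym, ".llvm.");
    size_t len = suffix ? (size_t)(suffix - sym) : strlen(sym);

    if (len + 1 > buf_sz)
        return false;

    memcpy(buf, sym, len);
    buf[len] = '\0';
    return true;
}
```

After stripping, a kallsyms entry such as `try_to_wake_up.llvm.<hash>` compares equal to the shortened name that an affected kernel reports in available_filter_functions, so the join by name works again.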
Pu Lehui	719aeb7a6e	libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED When redirecting the split BTF to the vmlinux base BTF, we need to mark the distilled base struct/union members of split BTF structs/unions in id_map with BTF_IS_EMBEDDED. This indicates that these types must match both name and size later. Therefore, we need to traverse the entire split BTF, which involves traversing type IDs from nr_dist_base_types to nr_types. However, the current implementation uses an incorrect traversal end type ID, so let's correct it. Fixes: 19e00c897d50 ("libbpf: Split BTF relocation") Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250115100241.4171581-3-pulehui@huaweicloud.com	2025-01-17 12:31:44 -08:00
Pu Lehui	a7edf4aec8	libbpf: Fix return zero when elf_begin failed The error number of elf_begin is omitted when encapsulating the btf_find_elf_sections function. Fixes: c86f180ffc99 ("libbpf: Make btf_parse_elf process .BTF.base transparently") Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250115100241.4171581-2-pulehui@huaweicloud.com	2025-01-17 12:31:44 -08:00
Yonghong Song	32792ec66c	libbpf: Add unique_match option for multi kprobe Jordan reported an issue in Meta production environment where func try_to_wake_up() is renamed to try_to_wake_up.llvm.<hash>() by clang compiler at lto mode. The original 'kprobe/try_to_wake_up' does not work any more since try_to_wake_up() does not match the actual func name in /proc/kallsyms. There are a couple of ways to resolve this issue. For example, in attach_kprobe(), we could do lookup in /proc/kallsyms so try_to_wake_up() can be replaced by try_to_wake_up.llvm.<hash>(). Or we can force users to use bpf_program__attach_kprobe() where they need to lookup /proc/kallsyms to find out try_to_wake_up.llvm.<hash>(). But these two approaches require extra work by either libbpf or user. Luckily, suggested by Andrii, multi kprobe already supports wildcard ('*') for symbol matching. In the above example, 'try_to_wake_up*' can match to try_to_wake_up() or try_to_wake_up.llvm.<hash>() and this allows the bpf prog to work for different kernels as some kernels may have try_to_wake_up() and some others may have try_to_wake_up.llvm.<hash>(). The original intention is to kprobe try_to_wake_up() only, so an optional field unique_match is added to struct bpf_kprobe_multi_opts. If the field is set to true, the number of matched functions must be one. Otherwise, the attachment will fail. In the above case, multi kprobe with 'try_to_wake_up*' and unique_match preserves user functionality. Reported-by: Jordan Rome <linux@jordanrome.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250109174023.3368432-1-yonghong.song@linux.dev	2025-01-17 12:31:44 -08:00
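The unique_match rule above ("the number of matched functions must be one") can be sketched outside libbpf. This is a minimal model under stated assumptions: `count_matches()` is a hypothetical helper, POSIX `fnmatch()` stands in for libbpf's own glob matcher, and returning `-EINVAL` on a non-unique match is an assumption of this sketch, not something the message states.

```c
#include <errno.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: counts symbols matching 'pattern'.
 * If 'unique_match' is set and the count is not exactly one, the
 * "attachment" is rejected with -EINVAL (error code assumed here).
 */
static int count_matches(const char *pattern, const char *const *syms,
                         size_t nr_syms, bool unique_match)
{
    int cnt = 0;

    for (size_t i = 0; i < nr_syms; i++)
        if (fnmatch(pattern, syms[i], 0) == 0)
            cnt++;

    if (unique_match && cnt != 1)
        return -EINVAL;

    return cnt;
}
```

With the pattern `try_to_wake_up*`, a symbol table containing only `try_to_wake_up.llvm.<hash>` passes the check, while one that also contains another function with the same prefix is rejected when unique_match is requested.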
Ihor Solodrai	c924f8d3dd	ci: sync with libbpf/ci@v3 * vmtest.yml * use v3 of libbpf/ci actions * remove unnecessary selftests preparation steps * ci/vmtest * remove unnecessary scripts and configs * add libbpf-specific run-vmtest.env [1] [1] https://github.com/libbpf/ci/pull/166 Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2025-01-15 19:30:48 -08:00
Daniel Müller	0ff2f8e0ee	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a1087da9d11e5bcacc706002bc0f84b790881f69 Checkpoint bpf-next commit: f44275e7155dc310d36516fc25be503da099781c Baseline bpf commit: fb86c42a2a5d44e849ddfbc98b8d2f4f40d36ee3 Checkpoint bpf commit: 9d89551994a430b50c4fffcb1e617a057fa76e20 Adrian Hunter (1): perf/core: Add aux_pause, aux_resume, aux_start_paused Alastair Robertson (2): libbpf: Pull file-opening logic up to top-level functions libbpf: Extend linker API to support in-memory ELF files Andrii Nakryiko (1): libbpf: don't adjust USDT semaphore address if .stapsdt.base addr is missing Anton Protopopov (2): bpf: Add fd_array_cnt attribute for prog_load libbpf: prog load: Allow to use fd_array_cnt Ben Olson (1): libbpf: Improve debug message when the base BTF cannot be found Daniel Borkmann (1): tools: Sync if_link.h uapi tooling header Daniel Xu (1): libbpf: Set MFD_NOEXEC_SEAL when creating memfd Eric Dumazet (1): net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute Jiri Olsa (1): libbpf: Fix memory leak in bpf_program__attach_uprobe_multi Joe Damato (3): netdev-genl: Dump napi_defer_hard_irqs netdev-genl: Dump gro_flush_timeout netdev-genl: Support setting per-NAPI config values Martin Karsten (1): net: Add napi_struct parameter irq_suspend_timeout Quentin Monnet (1): libbpf: Fix segfault due to libelf functions not setting errno Sidong Yang (1): libbpf: Change hash_combine parameters from long to unsigned long include/uapi/linux/bpf.h \| 10 + include/uapi/linux/if_link.h \| 554 +++++++++++++++++++++++++++++++- include/uapi/linux/netdev.h \| 4 + include/uapi/linux/perf_event.h \| 11 +- src/bpf.c \| 3 +- src/bpf.h \| 5 +- src/btf.c \| 4 +- src/libbpf.c \| 25 +- src/libbpf.h \| 5 + src/libbpf.map \| 4 + src/linker.c \| 248 ++++++++++---- src/usdt.c \| 2 +- 12 files changed, 794 insertions(+), 81 deletions(-) Signed-off-by: Daniel Müller <deso@posteo.net>	2025-01-08 14:58:04 -08:00
Daniel Xu	f468e83c85	libbpf: Set MFD_NOEXEC_SEAL when creating memfd Starting from 105ff5339f49 ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") and until 1717449b4417 ("memfd: drop warning for missing exec-related flags"), the kernel would print a warning if neither MFD_NOEXEC_SEAL nor MFD_EXEC is set in memfd_create(). If libbpf runs on a kernel between these two commits (e.g. on an improperly backported system), it'll trigger this warning. To avoid this warning (and also be more secure), explicitly set MFD_NOEXEC_SEAL. But since libbpf can be run on potentially very old kernels, leave a fallback for kernels without MFD_NOEXEC_SEAL support. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/6e62c2421ad7eb1da49cbf16da95aaaa7f94d394.1735594195.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-08 14:58:04 -08:00
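The "set the flag, but fall back on old kernels" approach described above might look roughly like this on Linux. It is a sketch under assumptions, not the libbpf patch itself: `create_memfd()` is a hypothetical helper, the flag values are defined locally in case the system headers predate them, and the fallback assumes that a kernel without MFD_NOEXEC_SEAL support rejects the unknown flag with EINVAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local fallbacks for headers that do not define these flags yet. */
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef MFD_NOEXEC_SEAL
#define MFD_NOEXEC_SEAL 0x0008U
#endif

/* Hypothetical helper: returns a memfd, or -1 with errno set on failure. */
static int create_memfd(const char *name)
{
    /* First try with MFD_NOEXEC_SEAL, which avoids the kernel warning. */
    int fd = (int)syscall(SYS_memfd_create, name,
                          MFD_CLOEXEC | MFD_NOEXEC_SEAL);

    /* Assumed fallback: older kernels reject the unknown flag with EINVAL. */
    if (fd < 0 && errno == EINVAL)
        fd = (int)syscall(SYS_memfd_create, name, MFD_CLOEXEC);

    return fd;
}
```

The caller owns the returned descriptor and should `close()` it when done.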
Anton Protopopov	48c771c4ce	libbpf: prog load: Allow to use fd_array_cnt Add new fd_array_cnt field to bpf_prog_load_opts and pass it in bpf_attr, if set. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241213130934.1087929-6-aspsk@isovalent.com	2025-01-08 14:58:04 -08:00
Anton Protopopov	266da73237	bpf: Add fd_array_cnt attribute for prog_load The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set of file descriptors: maps or btfs. This field was introduced as a sparse array. Introduce a new attribute, fd_array_cnt, which, if present, indicates that the fd_array is a continuous array of the corresponding length. If fd_array_cnt is non-zero, then every map in the fd_array will be bound to the program, as if it was used by the program. This functionality is similar to the BPF_PROG_BIND_MAP syscall, but such maps can be used by the verifier during the program load. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241213130934.1087929-5-aspsk@isovalent.com	2025-01-08 14:58:04 -08:00
Alastair Robertson	d2f1f4490b	libbpf: Extend linker API to support in-memory ELF files The new_fd and add_fd functions correspond to the original new and add_file functions, but accept an FD instead of a file name. This gives API consumers the option of using anonymous files/memfds to avoid writing ELFs to disk. This new API will be useful for performing linking as part of bpftrace's JIT compilation. The add_buf function is a convenience wrapper that does the work of creating a memfd for the caller. Signed-off-by: Alastair Robertson <ajor@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241211164030.573042-3-ajor@meta.com	2025-01-08 14:58:04 -08:00
Alastair Robertson	f00fad0951	libbpf: Pull file-opening logic up to top-level functions Move the filename arguments and file-descriptor handling from init_output_elf() and linker_load_obj_file() and instead handle them at the top-level in bpf_linker__new() and bpf_linker__add_file(). This will allow the inner functions to be shared with a new, non-filename-based, API in the next commit. Signed-off-by: Alastair Robertson <ajor@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241211164030.573042-2-ajor@meta.com	2025-01-08 14:58:04 -08:00
Quentin Monnet	984dcc97ae	libbpf: Fix segfault due to libelf functions not setting errno Libelf functions do not set errno on failure. Instead, libelf relies on its internal _elf_errno value, which can be retrieved via elf_errno (or the corresponding message via elf_errmsg()). From "man libelf": If a libelf function encounters an error it will set an internal error code that can be retrieved with elf_errno. Each thread maintains its own separate error code. The meaning of each error code can be determined with elf_errmsg, which returns a string describing the error. As a consequence, libbpf should not return -errno when a function from libelf fails, because an empty value will not be interpreted as an error and won't stop the program from continuing. This is visible in bpf_linker__add_file(), for example, where we call a succession of functions that rely on libelf: err = err ?: linker_load_obj_file(linker, filename, opts, &obj); err = err ?: linker_append_sec_data(linker, &obj); err = err ?: linker_append_elf_syms(linker, &obj); err = err ?: linker_append_elf_relos(linker, &obj); err = err ?: linker_append_btf(linker, &obj); err = err ?: linker_append_btf_ext(linker, &obj); If the object file that we try to process is not, in fact, a correct object file, linker_load_obj_file() may fail with errno not being set, and return 0. In this case we attempt to run linker_append_elf_syms() and may segfault. This can happen (and was discovered) with bpftool: $ bpftool gen object output.o sample_ret0.bpf.c libbpf: failed to get ELF header for sample_ret0.bpf.c: invalid `Elf' handle zsh: segmentation fault (core dumped) bpftool gen object output.o sample_ret0.bpf.c Fix the issue by returning a non-zero error code (-EINVAL) when libelf functions fail. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241205135942.65262-1-qmo@kernel.org	2025-01-08 14:58:04 -08:00
Ben Olson	3ed57f68e5	libbpf: Improve debug message when the base BTF cannot be found When running `bpftool` on a kernel module installed in `/lib/modules...`, this error is encountered if the user does not specify `--base-btf` to point to a valid base BTF (e.g. usually in `/sys/kernel/btf/vmlinux`). However, looking at the debug output to determine the cause of the error simply says `Invalid BTF string section`, which does not point to the actual source of the error. This just improves that debug message to tell users what happened. Signed-off-by: Ben Olson <matthew.olson@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/Z0YqzQ5lNz7obQG7@bolson-desk Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-08 14:58:04 -08:00
Andrii Nakryiko	69d85c5fb3	libbpf: don't adjust USDT semaphore address if .stapsdt.base addr is missing USDT ELF note optionally can record an offset of .stapsdt.base, which is used to make adjustments to USDT target attach address. Currently, libbpf will do this address adjustment unconditionally if it finds .stapsdt.base ELF section in target binary. But there is a corner case where .stapsdt.base ELF section is present, but specific USDT note doesn't reference it. In such case, libbpf will basically just add base address and end up with absolutely incorrect USDT target address. This adjustment has to be done only if both .stapsdt.sema section is present and USDT note is recording a reference to it. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20241121224558.796110-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-08 14:58:04 -08:00
Martin Karsten	0d822312fa	net: Add napi_struct parameter irq_suspend_timeout Add a per-NAPI IRQ suspension parameter, which can be get/set with netdev-genl. This patch doesn't change any behavior but prepares the code for other changes in the following commits which use irq_suspend_timeout as a timeout for IRQ suspension. Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Co-developed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://patch.msgid.link/20241109050245.191288-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 14:58:04 -08:00
Daniel Borkmann	ba2ba19f6d	tools: Sync if_link.h uapi tooling header Sync if_link uapi header to the latest version as we need the refresher in tooling for netkit device. Given it's been a while since the last sync and the diff is fairly big, it has been done as its own commit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20241004101335.117711-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2025-01-08 14:58:04 -08:00
Joe Damato	c9a728c329	netdev-genl: Support setting per-NAPI config values Add support to set per-NAPI defer_hard_irqs and gro_flush_timeout. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-7-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 14:58:04 -08:00
Joe Damato	adf7973417	netdev-genl: Dump gro_flush_timeout Support dumping gro_flush_timeout for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-5-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 14:58:04 -08:00
Joe Damato	4f5f2597ce	netdev-genl: Dump napi_defer_hard_irqs Support dumping defer_hard_irqs for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20241011184527.16393-3-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 14:58:04 -08:00
Eric Dumazet	114881acba	net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute Some network devices have the ability to offload EDT (Earliest Departure Time) which is the model used for TCP pacing and FQ packet scheduler. Some of them implement the timing wheel mechanism described in https://saeed.github.io/files/carousel-sigcomm17.pdf with an associated 'timing wheel horizon'. This patch adds dev->max_pacing_offload_horizon expressing this timing wheel horizon in nsec units. This is a read-only attribute. Unless a driver sets it, dev->max_pacing_offload_horizon is zero. v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 ) v3: added yaml doc (also per Jakub feedback) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-01-08 14:58:04 -08:00
Sidong Yang	b1f223b5b8	libbpf: Change hash_combine parameters from long to unsigned long hash_combine() could trap when compiled with a sanitizer, such as "zig cc" or clang with the signed-integer-overflow option. This patch changes the parameters and return type to unsigned long to remove the potential overflow. Signed-off-by: Sidong Yang <sidong.yang@furiosa.ai> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241116081054.65195-1-sidong.yang@furiosa.ai Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-01-08 14:58:04 -08:00
Jiri Olsa	05333afec1	libbpf: Fix memory leak in bpf_program__attach_uprobe_multi Andrii reported memory leak detected by Coverity on error path in bpf_program__attach_uprobe_multi. Fixing that by moving the check earlier before the offsets allocations. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241115115843.694337-1-jolsa@kernel.org	2025-01-08 14:58:04 -08:00
Adrian Hunter	d22e0d8721	perf/core: Add aux_pause, aux_resume, aux_start_paused Hardware traces, such as instruction traces, can produce a vast amount of trace data, so being able to reduce tracing to more specific circumstances can be useful. The ability to pause or resume tracing when another event happens, can do that. Add ability for an event to "pause" or "resume" AUX area tracing. Add aux_pause bit to perf_event_attr to indicate that, if the event happens, the associated AUX area tracing should be paused. Ditto aux_resume. Do not allow aux_pause and aux_resume to be set together. Add aux_start_paused bit to perf_event_attr to indicate to an AUX area event that it should start in a "paused" state. Add aux_paused to struct hw_perf_event for AUX area events to keep track of the "paused" state. aux_paused is initialized to aux_start_paused. Add PERF_EF_PAUSE and PERF_EF_RESUME modes for ->stop() and ->start() callbacks. Call as needed, during __perf_event_output(). Add aux_in_pause_resume to struct perf_buffer to prevent races with the NMI handler. Pause/resume in NMI context will miss out if it coincides with another pause/resume. To use aux_pause or aux_resume, an event must be in a group with the AUX area event as the group leader. 
Example (requires Intel PT and tools patches also): $ perf record --kcore -e intel_pt/aux-action=start-paused/k,syscalls:sys_enter_newuname/aux-action=resume/,syscalls:sys_exit_newuname/aux-action=pause/ uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.043 MB perf.data ] $ perf script --call-trace uname 30805 [000] 24001.058782799: name: 0x7ffc9c1865b0 uname 30805 [000] 24001.058784424: psb offs: 0 uname 30805 [000] 24001.058784424: cbr: 39 freq: 3904 MHz (139%) uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) debug_smp_processor_id uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) __x64_sys_newuname uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) down_read uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) __cond_resched uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) preempt_count_add uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) in_lock_functions uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) preempt_count_sub uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) up_read uname 30805 [000] 24001.058784629: ([kernel.kallsyms]) preempt_count_add uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) in_lock_functions uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) preempt_count_sub uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) _copy_to_user uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) syscall_exit_to_user_mode uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) syscall_exit_work uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) perf_syscall_exit uname 30805 [000] 24001.058784838: ([kernel.kallsyms]) debug_smp_processor_id uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_trace_buf_alloc uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_swevent_get_recursion_context uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) debug_smp_processor_id uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) debug_smp_processor_id 
uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_tp_event uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_trace_buf_update uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) tracing_gen_ctx_irq_test uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_swevent_event uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) __perf_event_account_interrupt uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) __this_cpu_preempt_check uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_event_output_forward uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) perf_event_aux_pause uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) ring_buffer_get uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) __rcu_read_lock uname 30805 [000] 24001.058785046: ([kernel.kallsyms]) __rcu_read_unlock uname 30805 [000] 24001.058785254: ([kernel.kallsyms]) pt_event_stop uname 30805 [000] 24001.058785254: ([kernel.kallsyms]) debug_smp_processor_id uname 30805 [000] 24001.058785254: ([kernel.kallsyms]) debug_smp_processor_id uname 30805 [000] 24001.058785254: ([kernel.kallsyms]) native_write_msr uname 30805 [000] 24001.058785463: ([kernel.kallsyms]) native_write_msr uname 30805 [000] 24001.058785639: 0x0 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20241022155920.17511-3-adrian.hunter@intel.com	2025-01-08 14:58:04 -08:00
Ihor Solodrai	c5f22aca0f	ci: remove llvm-17 variant of the workflow Also try prettifying the job names. Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-20 10:23:35 -08:00
Ihor Solodrai	bfc9770b24	ci: switch to libbpf/ci actions @v2 Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	cd73a17321	ci: configure CI test jobs * Don't run pahole@tmp.master + llvm-17 combination. * Use descriptive names for vmtest jobs * Don't run test_progs_cpuv4 when LLVM_VERSION < 18 (same as on BPF CI) * Add some logging to prepare-selftests-run.sh Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	39e4e86263	ci: cleanup now unused local actions and workflows Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	c1f8925561	ci: bump llvm version to 18 Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	c7bf7b8977	ci: update temporary kernel patches Remove old patches applied to kernel source for CI. They haven't been applied in a while. Add a fix for token/obj_priv_implicit_token_envvar Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	e0687f9f54	ci: add vmtest as a reusable workflow Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	dcf6ad6c70	ci: switch to libbpf/ci/build-selftests action Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	a453ffb7ea	ci: add a vmtest step for setting up selftests run Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Ihor Solodrai	779cb2b65b	ci: use libbpf/ci/run-vmtest action to run selftests Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-11-18 22:00:09 -08:00
Andrii Nakryiko	244485ce72	ci: don't fail test kmodule builds on older kernels We are now getting: WARNING: Module.symvers is missing. Modules may not have dependencies or modversions. You may get many unresolved symbol errors. You can set KBUILD_MODPOST_WARN=1 to turn errors into warning if you want to proceed at your own risk. So let's set KBUILD_MODPOST_WARN=1. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-11-13 19:25:37 -08:00
Andrii Nakryiko	713f5b0779	libbpf: bump version to v1.6.0 for new dev cycle We are now in v1.6.0 dev cycles, reflect that in Makefile. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-11-13 19:25:37 -08:00
Andrii Nakryiko	6c25f7dcb5	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c6fb8030b4baa01c850f99fc6da051b1017edc46 Checkpoint bpf-next commit: a1087da9d11e5bcacc706002bc0f84b790881f69 Baseline bpf commit: d5fb316e2af1d947f0f6c3666e373a54d9f27c6f Checkpoint bpf commit: fb86c42a2a5d44e849ddfbc98b8d2f4f40d36ee3 Andrii Nakryiko (1): libbpf: start v1.6 development cycle Jiri Olsa (2): bpf: Add support for uprobe multi session attach libbpf: Add support for uprobe multi session attach Mykyta Yatsenko (4): libbpf: Introduce errstr() for stringifying errno libbpf: Stringify errno in log messages in libbpf.c libbpf: Stringify errno in log messages in btf*.c libbpf: Stringify errno in log messages in the remaining code include/uapi/linux/bpf.h \| 1 + src/bpf.c \| 1 + src/btf.c \| 26 +-- src/btf_dump.c \| 3 +- src/elf.c \| 4 +- src/features.c \| 15 +- src/gen_loader.c \| 3 +- src/libbpf.c \| 375 ++++++++++++++++++--------------------- src/libbpf.h \| 4 +- src/libbpf.map \| 3 + src/libbpf_version.h \| 2 +- src/linker.c \| 21 ++- src/ringbuf.c \| 34 ++-- src/str_error.c \| 71 ++++++++ src/str_error.h \| 7 + src/usdt.c \| 32 ++-- 16 files changed, 332 insertions(+), 270 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-11-13 19:25:37 -08:00
Andrii Nakryiko	8576598d64	sync: update .mailmap Update .mailmap based on libbpf's list of contributors and on the latest .mailmap version in the upstream repository. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-11-13 19:25:37 -08:00
Mykyta Yatsenko	db73e46709	libbpf: Stringify errno in log messages in the remaining code Convert numeric error codes into the string representations in log messages in the rest of libbpf source files. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-5-mykyta.yatsenko5@gmail.com	2024-11-13 19:25:37 -08:00
Mykyta Yatsenko	da7d63a690	libbpf: Stringify errno in log messages in btf*.c Convert numeric error codes into the string representations in log messages in btf.c and btf_dump.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-4-mykyta.yatsenko5@gmail.com	2024-11-13 19:25:37 -08:00
Mykyta Yatsenko	6d8af6175e	libbpf: Stringify errno in log messages in libbpf.c Convert numeric error codes into the string representations in log messages in libbpf.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-3-mykyta.yatsenko5@gmail.com	2024-11-13 19:25:37 -08:00
Mykyta Yatsenko	8232da46d6	libbpf: Introduce errstr() for stringifying errno Add function errstr(int err) that allows converting numeric error codes into string representations. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-2-mykyta.yatsenko5@gmail.com	2024-11-13 19:25:37 -08:00
Jiri Olsa	c975e02612	libbpf: Add support for uprobe multi session attach Adding support to attach a program in uprobe session mode with the bpf_program__attach_uprobe_multi function. Adding a session bool to the bpf_uprobe_multi_opts struct that allows loading and attaching the BPF program via uprobe session, i.e. creating a uprobe multi session attachment. Also adding a new program loader section that allows: SEC("uprobe.session/bpf_fentry_test*") and loads/attaches the uprobe program as a uprobe session. Adding a sleepable hook (uprobe.session.s) as well. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-6-jolsa@kernel.org	2024-11-13 19:25:37 -08:00
Jiri Olsa	183d84803a	bpf: Add support for uprobe multi session attach Adding support to attach a BPF program to both the entry and return probe of the same function. This is a common use case which at the moment requires creating two uprobe multi links. Adding a new BPF_TRACE_UPROBE_SESSION attach type that instructs the kernel to attach a single link program to both the entry and exit probe. It's possible to control execution of the BPF program on the return probe simply by returning zero or non-zero from the entry BPF program execution, to execute or not execute the BPF program on the return probe, respectively. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-4-jolsa@kernel.org	2024-11-13 19:25:37 -08:00
Andrii Nakryiko	6210515c78	libbpf: start v1.6 development cycle With libbpf v1.5.0 release out, start v1.6 dev cycle. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241029184045.581537-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-11-13 19:25:37 -08:00
Ihor Solodrai	94610d4c27	ci: remove CI jobs for 4.9.0 and 5.5.0 kernels Signed-off-by: Ihor Solodrai <isolodrai@meta.com>	2024-11-13 19:22:53 -08:00
Andrii Nakryiko	09b9e83102	ci: bump uraimo/run-on-arch-action version Bump to latest uraimo/run-on-arch-action@v2.8.1 version, hoping that fixes the CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-24 14:34:52 -07:00
Andrii Nakryiko	891438c086	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 989a29cfed9b5092c3e18be14e9032c51bb1c9f6 Checkpoint bpf-next commit: c6fb8030b4baa01c850f99fc6da051b1017edc46 Baseline bpf commit: b836cbdf3b81a4a22b3452186efa2e5105a77e10 Checkpoint bpf commit: d5fb316e2af1d947f0f6c3666e373a54d9f27c6f Andrii Nakryiko (1): libbpf: move global data mmap()'ing into bpf_object__load() Eder Zulian (1): libbpf: Prevent compiler warnings/errors Hou Tao (1): bpf: Add the missing BPF_LINK_TYPE invocation for sockmap Kui-Feng Lee (1): libbpf: define __uptr. include/uapi/linux/bpf.h \| 3 ++ src/bpf_helpers.h \| 1 + src/btf_dump.c \| 4 +- src/libbpf.c \| 83 +++++++++++++++++++--------------------- 4 files changed, 46 insertions(+), 45 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-24 14:34:52 -07:00
Hou Tao	2d7a79a984	bpf: Add the missing BPF_LINK_TYPE invocation for sockmap There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for sockmap link Also add comments for bpf_link_type to prevent missing updates in the future. Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com	2024-10-24 14:34:52 -07:00
Kui-Feng Lee	ee92f521ab	libbpf: define __uptr. Make __uptr available to BPF programs to enable them to define uptrs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20241023234759.860539-8-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-24 14:34:52 -07:00
Andrii Nakryiko	2dea4b86ee	libbpf: move global data mmap()'ing into bpf_object__load() Since BPF skeleton inception libbpf has been doing mmap()'ing of global data ARRAY maps in bpf_object__load_skeleton() API, which is used by code generated .skel.h files (i.e., by BPF skeletons only). This is wrong because if BPF object is loaded through generic bpf_object__load() API, global data maps won't be re-mmap()'ed after load step, and memory pointers returned from bpf_map__initial_value() would be wrong and won't reflect the actual memory shared between BPF program and user space. bpf_map__initial_value() return result is rarely used after load, so this went unnoticed for a really long time, until bpftrace project attempted to load BPF object through generic bpf_object__load() API and then used BPF subskeleton instantiated from such bpf_object. It turned out that .data/.rodata/.bss data updates through such subskeleton was "blackholed", all because libbpf wouldn't re-mmap() those maps during bpf_object__load() phase. Long story short, this step should be done by libbpf regardless of BPF skeleton usage, right after BPF map is created in the kernel. This patch moves this functionality into bpf_object__populate_internal_map() to achieve this. And bpf_object__load_skeleton() is now simple and almost trivial, only propagating these mmap()'ed pointers into user-supplied skeleton structs. We also do trivial adjustments to error reporting inside bpf_object__populate_internal_map() for consistency with the rest of libbpf's map-handling code. Reported-by: Alastair Robertson <ajor@meta.com> Reported-by: Jonathan Wiepert <jwiepert@meta.com> Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241023043908.3834423-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-24 14:34:52 -07:00
Eder Zulian	fdbdbb6b8a	libbpf: Prevent compiler warnings/errors Initialize 'new_off' and 'pad_bits' to 0 and 'pad_type' to NULL in btf_dump_emit_bit_padding to prevent compiler warnings/errors which are observed when compiling with 'EXTRA_CFLAGS=-g -Og' options, but do not happen when compiling with current default options. For example, when compiling libbpf with $ make "EXTRA_CFLAGS=-g -Og" -C tools/lib/bpf/ clean all Clang version 17.0.6 and GCC 13.3.1 fail to compile btf_dump.c due to following errors: btf_dump.c: In function ‘btf_dump_emit_bit_padding’: btf_dump.c:903:42: error: ‘new_off’ may be used uninitialized [-Werror=maybe-uninitialized] 903 \| if (new_off > cur_off && new_off <= next_off) { \| ~~~~~~~~^~~~~~~~~~~ btf_dump.c:870:13: note: ‘new_off’ was declared here 870 \| int new_off, pad_bits, bits, i; \| ^~~~~~~ btf_dump.c:917:25: error: ‘pad_type’ may be used uninitialized [-Werror=maybe-uninitialized] 917 \| btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 918 \| in_bitfield ? new_off - cur_off : 0); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ btf_dump.c:871:21: note: ‘pad_type’ was declared here 871 \| const char *pad_type; \| ^~~~~~~~ btf_dump.c:930:20: error: ‘pad_bits’ may be used uninitialized [-Werror=maybe-uninitialized] 930 \| if (bits == pad_bits) { \| ^ btf_dump.c:870:22: note: ‘pad_bits’ was declared here 870 \| int new_off, pad_bits, bits, i; \| ^~~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Eder Zulian <ezulian@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20241022172329.3871958-3-ezulian@redhat.com	2024-10-24 14:34:52 -07:00
Andrii Nakryiko	fc064eb41e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b24d7f0da6ef5a23456a301eaf51b170f961d4ae Checkpoint bpf-next commit: 989a29cfed9b5092c3e18be14e9032c51bb1c9f6 Baseline bpf commit: b24d7f0da6ef5a23456a301eaf51b170f961d4ae Checkpoint bpf commit: b836cbdf3b81a4a22b3452186efa2e5105a77e10 Andrii Nakryiko (2): libbpf: fix sym_is_subprog() logic for weak global subprogs libbpf: never interpret subprogs in .text as entry programs Chen Ni (1): libbpf: Remove unneeded semicolon Eduard Zingerman (1): bpf: __bpf_fastcall for bpf_get_smp_processor_id in uapi Eric Long (1): libbpf: Do not resolve size on duplicate FUNCs Ihor Solodrai (1): libbpf: Change log level of BTF loading error message Martin Kelly (1): bpf: Update bpf_override_return() comment Matteo Croce (1): bpf: fix argument type in bpf_loop documentation Namhyung Kim (1): libbpf: Fix possible compiler warnings in hashmap Tao Chen (1): libbpf: Fix expected_attach_type set handling in program load callback Tony Ambardar (7): libbpf: Improve log message formatting libbpf: Fix header comment typos for BTF.ext libbpf: Fix output .symtab byte-order during linking libbpf: Support BTF.ext loading and output in either endianness libbpf: Support opening bpf objects of either endianness libbpf: Support linking bpf objects of either endianness libbpf: Support creating light skeleton of either endianness include/uapi/linux/bpf.h \| 8 +- src/bpf_gen_internal.h \| 1 + src/btf.c \| 280 ++++++++++++++++++++++++++++++--------- src/btf.h \| 3 + src/btf_dump.c \| 2 +- src/btf_relocate.c \| 2 +- src/gen_loader.c \| 187 ++++++++++++++++++-------- src/hashmap.h \| 20 +-- src/libbpf.c \| 81 ++++++++--- src/libbpf.map \| 2 + src/libbpf_internal.h \| 43 +++++- src/linker.c \| 84 +++++++++--- src/relo_core.c \| 2 +- src/skel_internal.h \| 3 +- src/zip.c \| 2 +- 15 files changed, 550 insertions(+), 170 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-11 14:12:43 -07:00
Andrii Nakryiko	db8a210964	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-11 14:12:43 -07:00
Namhyung Kim	f69995d909	libbpf: Fix possible compiler warnings in hashmap The hashmap__for_each_entry[_safe] macros access 'map' as a pointer, but do so without parentheses, so passing a static hash map with an ampersand (like '&slab_hash') will cause compiler warnings due to unmatched types, as the '->' operator has a higher precedence. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241011170021.1490836-1-namhyung@kernel.org	2024-10-11 14:12:43 -07:00
Andrii Nakryiko	ac9ced9eb3	libbpf: never interpret subprogs in .text as entry programs Libbpf pre-1.0 had a legacy logic of allowing singular non-annotated (i.e., not having explicit SEC() annotation) function to be treated as sole entry BPF program (unless there were other explicit entry programs). This behavior was dropped during libbpf 1.0 transition period (unless LIBBPF_STRICT_SEC_NAME flag was unset in libbpf_mode). When 1.0 was released and all the legacy behavior was removed, the bug slipped through leaving this legacy behavior around. Fix this for good, as it actually causes very confusing behavior if BPF object file only has subprograms, but no entry programs. Fixes: bd054102a8c7 ("libbpf: enforce strict libbpf 1.0 behaviors") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241010211731.4121837-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Martin Kelly	ba8bd24bbb	bpf: Update bpf_override_return() comment The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only on x86. This was presumably true at the time of writing, but it's now supported on many other architectures too. Drop this statement, since it's not correct anymore and it fits better in other documentation anyway. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Matteo Croce	8ea6e12372	bpf: fix argument type in bpf_loop documentation The `index` argument to bpf_loop() is treated as a u64. This led to a subtle verifier denial where clang cloned the argument in another register[1]. [1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895 Signed-off-by: Matteo Croce <teknoraver@meta.com> Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Andrii Nakryiko	0e3971339f	libbpf: fix sym_is_subprog() logic for weak global subprogs sym_is_subprog() is incorrectly rejecting relocations against weak global subprogs. Fix that by realizing that STB_WEAK is also a global function. While it seems like verifier doesn't support taking an address of non-static subprog right now, it's still best to fix support for it on libbpf side, otherwise users will get a very confusing error during BPF skeleton generation or static linking due to misinterpreted relocation: libbpf: prog 'handle_tp': bad map relo against 'foo' in section '.text' Error: failed to open BPF object file: Relocation failed It's clearly not a map relocation, but is treated and reported as such without this fix. Fixes: 53eddb5e04ac ("libbpf: Support subprog address relocation") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20241009011554.880168-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Eric Long	ecf998ed8f	libbpf: Do not resolve size on duplicate FUNCs FUNCs do not have sizes, thus currently btf__resolve_size will fail with -EINVAL. Add conditions so that we only update size when the BTF object is not function or function prototype. Signed-off-by: Eric Long <i@hack3r.moe> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241002-libbpf-dup-extern-funcs-v4-1-560eb460ff90@hack3r.moe	2024-10-11 14:12:43 -07:00
Eduard Zingerman	89df6536bf	bpf: __bpf_fastcall for bpf_get_smp_processor_id in uapi Since [1] kernel supports __bpf_fastcall attribute for helper function bpf_get_smp_processor_id(). Update uapi definition for this helper in order to have this attribute in the generated bpf_helper_defs.h [1] commit 91b7fbf3936f ("bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240916091712.2929279-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	8244006267	libbpf: Support creating light skeleton of either endianness Track target endianness in 'struct bpf_gen' and process in-memory data in native byte-order, but on finalization convert the embedded loader BPF insns to target endianness. The light skeleton also includes a target-accessed data blob which is heterogeneous and thus difficult to convert to target byte-order on finalization. Add support functions to convert data to target endianness as it is added to the blob. Also add additional debug logging for data blob structure details and skeleton loading. Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/569562e1d5bf1cce80a1f1a3882461ee2da1ffd5.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	6ac8762ecd	libbpf: Support linking bpf objects of either endianness Allow static linking object files of either endianness, checking that input files have consistent byte-order, and setting output endianness from input. Linking requires in-memory processing of programs, relocations, sections, etc. in native endianness, and output conversion to target byte-order. This is enabled by built-in ELF translation and recent BTF/BTF.ext endianness functions. Further add local functions for swapping byte-order of sections containing BPF insns. Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/b47ca686d02664843fc99b96262fe3259650bc43.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	628b21dbcd	libbpf: Support opening bpf objects of either endianness Allow bpf_object__open() to access files of either endianness, and convert included BPF programs to native byte-order in-memory for introspection. Loading BPF objects of non-native byte-order is still disallowed however. Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/26353c1a1887a54400e1acd6c138fa90c99cdd40.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	5ae8432d15	libbpf: Support BTF.ext loading and output in either endianness Support for handling BTF data of either endianness was added in [1], but did not include BTF.ext data for lack of use cases. Later, support for static linking [2] provided a use case, but this feature and later ones were restricted to native-endian usage. Add support for BTF.ext handling in either endianness. Convert BTF.ext data to native endianness when read into memory for further processing, and support raw data access that restores the original byte-order for output. Add internal header functions for byte-swapping func, line, and core info records. Add new API functions btf_ext__endianness() and btf_ext__set_endianness() for query and setting byte-order, as already exist for BTF data. [1] 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") [2] 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/133407ab20e0dd5c07cab2a6fa7879dee1ffa4bc.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	f2668a0a71	libbpf: Fix output .symtab byte-order during linking Object linking output data uses the default ELF_T_BYTE type for '.symtab' section data, which disables any libelf-based translation. Explicitly set the ELF_T_SYM type for output to restore libelf's byte-order conversion, noting that input '.symtab' data is already correctly translated. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/87868bfeccf3f51aec61260073f8778e9077050a.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	5060f172cc	libbpf: Fix header comment typos for BTF.ext Mention struct btf_ext_info_sec rather than non-existent btf_sec_func_info in BTF.ext struct documentation. Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/cde65e01a5f2945c578485fab265ef711e2daeb6.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tony Ambardar	ceeb7211c9	libbpf: Improve log message formatting Fix missing newlines and extraneous terminal spaces in messages. Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/086884b7cbf87e524d584f9bf87f7a580e378b2b.1726475448.git.tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Chen Ni	3fb92e63e0	libbpf: Remove unneeded semicolon Remove unneeded semicolon in zip_archive_open(). Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240926023823.3632993-1-nichen@iscas.ac.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Tao Chen	ad633fb142	libbpf: Fix expected_attach_type set handling in program load callback Referenced commit broke the logic of resetting expected_attach_type to zero for allowed program types if kernel doesn't yet support such field. We do need to overwrite and preserve expected_attach_type for multi-uprobe though, but that can be done explicitly in libbpf_prepare_prog_load(). Fixes: 5902da6d8a52 ("libbpf: Add uprobe multi link support to bpf_program__attach_usdt") Suggested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240925153012.212866-1-chen.dylane@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Ihor Solodrai	3ea36843b3	libbpf: Change log level of BTF loading error message Reduce log level of BTF loading error to INFO if BTF is not required. Andrii says: Nowadays the expectation is that the BPF program will have a valid .BTF section, so even though .BTF is "optional", I think it's fine to emit a warning for that case (any reasonably recent Clang will produce valid BTF). Ihor's patch is fixing the situation with an outdated host kernel that doesn't understand BTF. libbpf will try to "upload" the program's BTF, but if that fails and the BPF object doesn't use any features that require having BTF uploaded, then it's just an information message to the user, but otherwise can be ignored. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-11 14:12:43 -07:00
Jordan Rome	80b16457cb	ci: add temporary patch for failing upstream BPF uprobe selftest Signed-off-by: Jordan Rome <linux@jordanrome.com>	2024-10-09 14:13:35 -07:00
Jordan Rome	7827ca87d1	ci: regenerate vmlinux.h Regenerate latest vmlinux.h for old kernel CI tests. Signed-off-by: Jordan Rome <linux@jordanrome.com>	2024-10-09 14:13:35 -07:00
Jordan Rome	91ccd57ca9	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b Checkpoint bpf-next commit: b24d7f0da6ef5a23456a301eaf51b170f961d4ae Baseline bpf commit: b408473ea01b2e499d23503e2bf898416da9d7ac Checkpoint bpf commit: b24d7f0da6ef5a23456a301eaf51b170f961d4ae Alan Maguire (1): bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Daniel Borkmann (1): bpf: Sync uapi bpf.h header to tools directory Donald Hunter (1): docs/bpf: Add missing BPF program types to docs Ihor Solodrai (1): libbpf: Add bpf_object__token_fd accessor Jiri Olsa (1): libbpf: Fix uretprobe.multi.s programs auto attachment Lin Yikai (1): libbpf: fix some typos in libbpf Mina Almasry (2): net: netdev netlink api to bind dma-buf to a net device netdev: add dmabuf introspection Pu Lehui (3): libbpf: Access first syscall argument with CO-RE direct read on s390 libbpf: Access first syscall argument with CO-RE direct read on arm64 libbpf: Fix accessing first syscall argument on RV64 Sam James (1): libbpf: Workaround (another) -Wmaybe-uninitialized false positive Shuyi Cheng (1): libbpf: Fixed getting wrong return address on arm64 architecture Yusheng Zheng (1): libbpf: Fix some typos in comments docs/program_types.rst \| 30 ++++++++++++++++++++++++++---- include/uapi/linux/bpf.h \| 25 ++++++++++++------------- include/uapi/linux/netdev.h \| 13 +++++++++++++ src/bpf.h \| 4 ++-- src/bpf_helpers.h \| 2 +- src/bpf_tracing.h \| 25 ++++++++++++++++--------- src/btf.c \| 4 ++-- src/btf.h \| 2 +- src/btf_dump.c \| 2 +- src/libbpf.c \| 13 +++++++++---- src/libbpf.h \| 18 +++++++++++++----- src/libbpf.map \| 1 + src/libbpf_legacy.h \| 4 ++-- src/linker.c \| 4 ++-- src/skel_internal.h \| 2 +- src/usdt.bpf.h \| 2 +- 16 files changed, 103 insertions(+), 48 deletions(-) Signed-off-by: Jordan Rome <linux@jordanrome.com>	2024-10-09 14:13:35 -07:00
Jordan Rome	f0a307f61c	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Jordan Rome <linux@jordanrome.com>	2024-10-09 14:13:35 -07:00
Daniel Borkmann	80b97bd0b8	bpf: Sync uapi bpf.h header to tools directory There is a delta between kernel UAPI bpf.h and tools UAPI bpf.h, thus sync them again. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2024-10-09 14:13:35 -07:00
Ihor Solodrai	7c2f492a88	libbpf: Add bpf_object__token_fd accessor Add a LIBBPF_API function to retrieve the token_fd from a bpf_object. Without this accessor, if user needs a token FD they have to get it manually via bpf_token_create, even though a token might have been already created by bpf_object__load. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240913001858.3345583-1-ihor.solodrai@pm.me	2024-10-09 14:13:35 -07:00
Donald Hunter	114f6ce2fd	docs/bpf: Add missing BPF program types to docs Update the table of program types in the libbpf documentation with the recently added program types. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240912095944.6386-1-donald.hunter@gmail.com	2024-10-09 14:13:35 -07:00
Jiri Olsa	69671302df	libbpf: Fix uretprobe.multi.s programs auto attachment As reported by Andrii we don't currently recognize uretprobe.multi.s programs as return probes due to using (wrong) strcmp function. Using str_has_pfx() instead to match uretprobe.multi prefix. Tests are passing, because the return program was executed as entry program and all counts were incremented properly. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240910125336.3056271-1-jolsa@kernel.org	2024-10-09 14:13:35 -07:00
Yusheng Zheng	e1833cff9c	libbpf: Fix some typos in comments Fix some spelling errors in the code comments of libbpf: betwen -> between paremeters -> parameters knowning -> knowing definiton -> definition compatiblity -> compatibility overriden -> overridden occured -> occurred proccess -> process managment -> management nessary -> necessary Signed-off-by: Yusheng Zheng <yunwei356@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240909225952.30324-1-yunwei356@gmail.com	2024-10-09 14:13:35 -07:00
Shuyi Cheng	81ac790dc8	libbpf: Fixed getting wrong return address on arm64 architecture ARM64 has a separate lr register to store the return address, so here you only need to read the lr register to get the return address, no need to dereference it again. Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1725787433-77262-1-git-send-email-chengshuyi@linux.alibaba.com	2024-10-09 14:13:35 -07:00
Sam James	3b301cf75d	libbpf: Workaround (another) -Wmaybe-uninitialized false positive We get this with GCC 15 -O3 (at least): ``` libbpf.c: In function ‘bpf_map__init_kern_struct_ops’: libbpf.c:1109:18: error: ‘mod_btf’ may be used uninitialized [-Werror=maybe-uninitialized] 1109 \| kern_btf = mod_btf ? mod_btf->btf : obj->btf_vmlinux; \| ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ libbpf.c:1094:28: note: ‘mod_btf’ was declared here 1094 \| struct module_btf mod_btf; \| ^~~~~~~ In function ‘find_struct_ops_kern_types’, inlined from ‘bpf_map__init_kern_struct_ops’ at libbpf.c:1102:8: libbpf.c:982:21: error: ‘btf’ may be used uninitialized [-Werror=maybe-uninitialized] 982 \| kern_type = btf__type_by_id(btf, kern_type_id); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ libbpf.c: In function ‘bpf_map__init_kern_struct_ops’: libbpf.c:967:21: note: ‘btf’ was declared here 967 \| struct btf btf; \| ^~~ ``` This is similar to the other libbpf fix from a few weeks ago for the same modelling-errno issue (fab45b962749184e1a1a57c7c583782b78fad539). Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://bugs.gentoo.org/939106 Link: https://lore.kernel.org/bpf/f6962729197ae7cdf4f6d1512625bd92f2322d31.1725630494.git.sam@gentoo.org	2024-10-09 14:13:35 -07:00
Lin Yikai	6c8dde3554	libbpf: fix some typos in libbpf Hi, fix some spelling errors in libbpf, the details are as follows: -in the code comments: termintaing->terminating architecutre->architecture requring->requiring recored->recoded sanitise->sanities allowd->allowed abover->above see bpf_udst_arg()->see bpf_usdt_arg() Signed-off-by: Lin Yikai <yikai.lin@vivo.com> Link: https://lore.kernel.org/r/20240905110354.3274546-3-yikai.lin@vivo.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-10-09 14:13:35 -07:00
Pu Lehui	9045c3ab53	libbpf: Fix accessing first syscall argument on RV64 On RV64, as Ilya mentioned before [0], the first syscall parameter should be accessed through orig_a0 (see arch/riscv64/include/asm/syscall.h), otherwise it will cause selftests like bpf_syscall_macro, vmlinux, test_lsm, etc. to fail on RV64. Let's fix it by using the struct pt_regs style CO-RE direct access. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-1-iii@linux.ibm.com [0] Link: https://lore.kernel.org/bpf/20240831041934.1629216-5-pulehui@huaweicloud.com	2024-10-09 14:13:35 -07:00
Pu Lehui	53a645402f	libbpf: Access first syscall argument with CO-RE direct read on arm64 Currently PT_REGS_PARM1_SYSCALL(x) is consistent with PT_REGS_PARM1_CORE_SYSCALL(x), which introduces the overhead of BPF_CORE_READ(). Since the pt_regs being read comes directly from the context, let's use CO-RE direct read to access the first system call argument. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/bpf/20240831041934.1629216-3-pulehui@huaweicloud.com	2024-10-09 14:13:35 -07:00
Pu Lehui	6d01681b02	libbpf: Access first syscall argument with CO-RE direct read on s390 Currently PT_REGS_PARM1_SYSCALL(x) is consistent with PT_REGS_PARM1_CORE_SYSCALL(x), which introduces the overhead of BPF_CORE_READ(). Since the pt_regs being read comes directly from the context, let's use CO-RE direct read to access the first system call argument. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240831041934.1629216-2-pulehui@huaweicloud.com	2024-10-09 14:13:35 -07:00
Mina Almasry	9a37057800	netdev: add dmabuf introspection Add dmabuf information to page_pool stats: $ ./cli.py --spec ../netlink/specs/netdev.yaml --dump page-pool-get ... {'dmabuf': 10, 'id': 456, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 455, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 454, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 453, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 452, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 451, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 450, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 449, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, And queue stats: $ ./cli.py --spec ../netlink/specs/netdev.yaml --dump queue-get ... {'dmabuf': 10, 'id': 8, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 9, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 10, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 11, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 12, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 13, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 14, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 15, 'ifindex': 3, 'type': 'rx'}, Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240910171458.219195-14-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 14:13:35 -07:00
Mina Almasry	3578ab89fb	net: netdev netlink api to bind dma-buf to a net device API takes the dma-buf fd as input, and binds it to the netdevice. The user can specify the rx queues to bind the dma-buf to. Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240910171458.219195-3-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-10-09 14:13:35 -07:00
Alan Maguire	178df3d885	bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags Currently the only opportunity to set sock ops flags dictating which callbacks fire for a socket is from within a TCP-BPF sockops program. This is problematic if the connection is already set up as there is no further chance to specify callbacks for that socket. Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt() to allow users to specify callbacks later, either via an iterator over sockets or via a socket-specific program triggered by a setsockopt() on the socket. Previous discussion on this here [1]. [1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/ Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20240808150558.1035626-2-alan.maguire@oracle.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-10-09 14:13:35 -07:00
Ihor Solodrai	1f98105e54	ci: bump actions/upload-artifact to v4 Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>	2024-10-07 15:38:01 -07:00
Andrii Nakryiko	a4161e00f9	ci: get rid of s390x kernel tests Kernel/libbpf code is very well tested on s390x in BPF CI, so get rid of it here as it often is a source of trouble and noise, without really benefiting us much. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-10-07 15:38:01 -07:00
Andrii Nakryiko	caa17bdcbf	ci: regenerate vmlinux.h Regenerated latest vmlinux.h for old kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-08-30 16:29:01 -07:00
Andrii Nakryiko	76c9f50f3e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ec5b8c76ab1c6d163762d60cfbedcd27e7527144 Checkpoint bpf-next commit: 2ad6d23f465a4f851e3bcf6d74c315ce7b2c205b Baseline bpf commit: e1533b6319ab9c3a97dad314dd88b3783bc41b69 Checkpoint bpf commit: b408473ea01b2e499d23503e2bf898416da9d7ac Alan Maguire (1): libbpf: Fix license for btf_relocate.c Andrii Nakryiko (2): libbpf: Fix no-args func prototype BTF dumping syntax libbpf: Fix bpf_object__open_skeleton()'s mishandling of options David Vernet (1): libbpf: Don't take direct pointers into BTF data from st_ops Jordan Rome (1): bpf: Add bpf_copy_from_user_str kfunc Kan Liang (1): perf/x86/intel: Support new data source for Lunar Lake Sam James (1): libbpf: Workaround -Wmaybe-uninitialized false positive Stanislav Fomichev (1): selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test Tony Ambardar (1): libbpf: Ensure new BTF objects inherit input endianness include/uapi/linux/bpf.h \| 9 ++++ include/uapi/linux/if_xdp.h \| 4 ++ include/uapi/linux/perf_event.h \| 6 ++- src/btf.c \| 4 ++ src/btf_dump.c \| 8 ++-- src/btf_relocate.c \| 2 +- src/elf.c \| 3 ++ src/libbpf.c \| 75 ++++++++++++++------------------- 8 files changed, 62 insertions(+), 49 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-08-30 16:29:01 -07:00
Tony Ambardar	fe28fae57a	libbpf: Ensure new BTF objects inherit input endianness New split BTF needs to preserve base's endianness. Similarly, when creating a distilled BTF, we need to preserve original endianness. Fix by updating libbpf's btf__distill_base() and btf_new_empty() to retain the byte order of any source BTF objects when creating new ones. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF") Reported-by: Song Liu <song@kernel.org> Reported-by: Eduard Zingerman <eddyz87@gmail.com> Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/6358db36c5f68b07873a0a5be2d062b1af5ea5f8.camel@gmail.com/ Link: https://lore.kernel.org/bpf/20240830095150.278881-1-tony.ambardar@gmail.com	2024-08-30 16:29:01 -07:00
Andrii Nakryiko	f6f24022d3	libbpf: Fix bpf_object__open_skeleton()'s mishandling of options We do an ugly copying of options in bpf_object__open_skeleton() just to be able to set object name from skeleton's recorded name (while still allowing user to override it through opts->object_name). This is not just ugly, but it also is broken due to memcpy() that doesn't take into account potential size differences between skel_opts and user-provided opts due to backward and forward compatibility. This leads to copying over extra bytes and then failing to validate options properly. It could, technically, lead also to SIGSEGV, if we are unlucky. So just get rid of that memory copy completely and instead pass default object name into bpf_object_open() directly, simplifying all this significantly. The rule now is that obj_name should be non-NULL for bpf_object_open() when called with in-memory buffer, so validate that explicitly as well. We adapt bpf_object__open_mem() to this as well and generate default name (based on buffer memory address and size) outside of bpf_object_open(). Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Reported-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Daniel Müller <deso@posteo.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240827203721.1145494-1-andrii@kernel.org	2024-08-30 16:29:01 -07:00
Jordan Rome	4bd31a1044	bpf: Add bpf_copy_from_user_str kfunc This adds a kfunc wrapper around strncpy_from_user, which can be called from sleepable BPF programs. This matches the non-sleepable 'bpf_probe_read_user_str' helper except it includes an additional 'flags' param, which allows consumers to clear the entire destination buffer on success or failure. Signed-off-by: Jordan Rome <linux@jordanrome.com> Link: https://lore.kernel.org/r/20240823195101.3621028-1-linux@jordanrome.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-08-30 16:29:01 -07:00
Sam James	33b22671c2	libbpf: Workaround -Wmaybe-uninitialized false positive In `elf_close`, we get this with GCC 15 -O3 (at least): ``` In function ‘elf_close’, inlined from ‘elf_close’ at elf.c:53:6, inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2: elf.c:57:9: warning: ‘elf_fd.elf’ may be used uninitialized [-Wmaybe-uninitialized] 57 \| elf_end(elf_fd->elf); \| ^~~~~~~~~~~~~~~~~~~~ elf.c: In function ‘elf_find_func_offset_from_file’: elf.c:377:23: note: ‘elf_fd.elf’ was declared here 377 \| struct elf_fd elf_fd; \| ^~~~~~ In function ‘elf_close’, inlined from ‘elf_close’ at elf.c:53:6, inlined from ‘elf_find_func_offset_from_file’ at elf.c:384:2: elf.c:58:9: warning: ‘elf_fd.fd’ may be used uninitialized [-Wmaybe-uninitialized] 58 \| close(elf_fd->fd); \| ^~~~~~~~~~~~~~~~~ elf.c: In function ‘elf_find_func_offset_from_file’: elf.c:377:23: note: ‘elf_fd.fd’ was declared here 377 \| struct elf_fd elf_fd; \| ^~~~~~ ``` In reality, our use is fine, it's just that GCC doesn't model errno here (see linked GCC bug). Suppress -Wmaybe-uninitialized accordingly by initializing elf_fd.fd to -1 and elf_fd.elf to NULL. I've done this in two other functions as well given it could easily occur there too (same access/use pattern). Signed-off-by: Sam James <sam@gentoo.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://gcc.gnu.org/PR114952 Link: https://lore.kernel.org/bpf/14ec488a1cac02794c2fa2b83ae0cef1bce2cb36.1723578546.git.sam@gentoo.org	2024-08-30 16:29:01 -07:00
Alan Maguire	8b29484790	libbpf: Fix license for btf_relocate.c License should be // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) ...as with other libbpf files. Fixes: 19e00c897d50 ("libbpf: Split BTF relocation") Reported-by: Neill Kapron <nkapron@google.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240810093504.2111134-1-alan.maguire@oracle.com	2024-08-30 16:29:01 -07:00
David Vernet	7b5237996a	libbpf: Don't take direct pointers into BTF data from st_ops In struct bpf_struct_ops, we take a pointer to a BTF type name, and a struct btf_type. This was presumably done for convenience, but can actually result in subtle and confusing bugs given that BTF data can be invalidated before a program is loaded. For example, in sched_ext, we may sometimes resize a data section after a skeleton has been opened, but before the struct_ops scheduler map has been loaded. This may cause the BTF data to be realloc'd, which can then cause a UAF when loading the program because the struct_ops map has pointers directly into the BTF data. We're already storing the BTF type_id in struct bpf_struct_ops. Because type_id is stable, we can therefore just update the places where we were looking at those pointers to instead do the lookups we need from the type_id. Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support") Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240724171459.281234-1-void@manifault.com	2024-08-30 16:29:01 -07:00
Stanislav Fomichev	a89e519b40	selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test This flag is now required to use tx_metadata_len. Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata") Reported-by: Julian Schindel <mail@arctic-alpaca.de> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me	2024-08-30 16:29:01 -07:00
Andrii Nakryiko	205e86de8b	libbpf: Fix no-args func prototype BTF dumping syntax For all these years libbpf's BTF dumper has been emitting not strictly valid syntax for function prototypes that have no input arguments. Instead of `int (blah)()` we should emit `int (blah)(void)`. This is not normally a problem, but it manifests when we get kfuncs in vmlinux.h that have no input arguments. Due to compiler internal specifics, we get no BTF information for such kfuncs, if they are not declared with proper `(void)`. The fix is trivial. We also need to adjust a few ancient tests that happily assumed `()` is correct. Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org	2024-08-30 16:29:01 -07:00
Kan Liang	86fc78bd2b	perf/x86/intel: Support new data source for Lunar Lake A new PEBS data source format is introduced for the p-core of Lunar Lake. The data source field is extended to 8 bits with new encodings. A new layout is introduced into the union intel_x86_pebs_dse. Introduce the lnl_latency_data() to parse the new format. Enlarge the pebs_data_source[] accordingly to include new encodings. Only the mem load and the mem store events can generate the data source. Introduce INTEL_HYBRID_LDLAT_CONSTRAINT and INTEL_HYBRID_STLAT_CONSTRAINT to mark them. Add two new bits for the new cache-related data src, L2_MHB and MSC. The L2_MHB is short for L2 Miss Handling Buffer, which is similar to LFB (Line Fill Buffer), but to track the L2 Cache misses. The MSC stands for the memory-side cache. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lkml.kernel.org/r/20240626143545.480761-6-kan.liang@linux.intel.com	2024-08-30 16:29:01 -07:00
Andrii Nakryiko	20ccbb303a	ci: take into account common local DENYLIST/ALLOWLIST Similar to naming convention in BPF selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-08-30 09:14:00 -07:00
chantra	26443a6d43	ci: fix test job names * use the architecture name in job name instead of `runs_on` labels Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-29 10:58:52 +01:00
Andrii Nakryiko	22ec3eb15d	ci: deny verify_pkcs7_sig as it keeps failing This has nothing to do with libbpf and is probably failing due to environment setup. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-08-27 12:51:55 -07:00
Manu Bretelle	bc24cd126a	ci: run test on Ubuntu 24.04 Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-22 12:59:18 -07:00
Manu Bretelle	92316f5072	ci: Pass llvm-version as an input and enforce passing it to build-selftests action Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-21 16:04:36 -07:00
Manu Bretelle	a73c6f7f80	ci: Use llvm repositories matching the host we are running on As this will change to a Ubuntu 24.04 runner, we want this to automatically detect which ubuntu version it is running on. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-21 16:04:36 -07:00
Manu Bretelle	8e47e755cd	ci: bump default llvm version to 17 Ubuntu 24.04's minimum llvm version is 17. Bumping this now to limit changes later. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-21 16:04:36 -07:00
Manu Bretelle	ec0d0fda8b	ci: lock down s390x CI to Ubuntu 20.04 runners I am working on upgrading to 24.04 runners. In order to make sure that current jobs are scheduled on Ubuntu 20.04, we need to ask for runners with tag `docker-main`, which is currently set by those old runners. Later, we will be able to switch this tag to `docker-noble-main` which are Ubuntu 24.04 runners. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-08-21 16:04:36 -07:00
Ivan Shapovalov	b07dfe3b2a	Makefile: ensure $(OBJDIR) is created before writing to it Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2024-07-29 14:05:05 -07:00
Andrii Nakryiko	686f600bca	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a12978712d9001b060bcc10eaae42ad5102abe2b Checkpoint bpf-next commit: ec5b8c76ab1c6d163762d60cfbedcd27e7527144 Baseline bpf commit: b1c4b4d45263241ec6c2405a8df8265d4b58e707 Checkpoint bpf commit: e1533b6319ab9c3a97dad314dd88b3783bc41b69 Alan Maguire (1): libbpf: Fix error handling in btf__distill_base() Andreas Ziegler (1): libbpf: Add NULL checks to bpf_object__{prev_map,next_map} Andrii Nakryiko (2): libbpf: fix BPF skeleton forward/backward compat handling libbpf: improve old BPF skeleton handling for map auto-attach src/btf.c \| 2 +- src/libbpf.c \| 75 +++++++++++++++++++++++++++++----------------------- 2 files changed, 43 insertions(+), 34 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-10 14:22:00 -07:00
Andrii Nakryiko	726d7f3722	sync: update .mailmap Update .mailmap based on libbpf's list of contributors and on the latest .mailmap version in the upstream repository. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-07-10 14:22:00 -07:00
Andrii Nakryiko	e6f1ae2557	libbpf: improve old BPF skeleton handling for map auto-attach Improve how we handle old BPF skeletons when it comes to BPF map auto-attachment. Emit one warn-level message per each struct_ops map that could have been auto-attached, if user provided recent enough BPF skeleton version. Don't spam log if there are no relevant struct_ops maps, though. This should help users realize that they probably need to regenerate BPF skeleton header with more recent bpftool/libbpf-cargo (or whatever other means of BPF skeleton generation). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240708204540.4188946-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-07-10 14:22:00 -07:00
Andrii Nakryiko	bf7ddbef99	libbpf: fix BPF skeleton forward/backward compat handling BPF skeleton was designed from day one to be extensible. Generated BPF skeleton code specifies actual sizes of map/prog/variable skeletons for that reason and libbpf is supposed to work with newer/older versions correctly. Unfortunately, it was missed that we implicitly embed hard-coded most up-to-date (according to libbpf's version of libbpf.h header used to compile BPF skeleton header) sizes of those structs, which can differ from the actual sizes at runtime when libbpf is used as a shared library. We have a few places where we just index array of maps/progs/vars, which implicitly uses these potentially invalid sizes of structs. This patch aims to fix this problem going forward. Once this lands, we'll backport these changes in Github repo to create patched releases for older libbpfs. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Fixes: 430025e5dca5 ("libbpf: Add subskeleton scaffolding") Fixes: 08ac454e258e ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton") Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240708204540.4188946-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-07-10 14:22:00 -07:00
Andreas Ziegler	1867490d8f	libbpf: Add NULL checks to bpf_object__{prev_map,next_map} In the current state, an erroneous call to bpf_object__find_map_by_name(NULL, ...) leads to a segmentation fault through the following call chain: bpf_object__find_map_by_name(obj = NULL, ...) -> bpf_object__for_each_map(pos, obj = NULL) -> bpf_object__next_map((obj = NULL), NULL) -> return (obj = NULL)->maps While calling bpf_object__find_map_by_name with obj = NULL is obviously incorrect, this should not lead to a segmentation fault but rather be handled gracefully. As __bpf_map__iter already handles this situation correctly, we can delegate the check for the regular case there and only add a check in case the prev or next parameter is NULL. Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com	2024-07-10 14:22:00 -07:00
Alan Maguire	24aca0740b	libbpf: Fix error handling in btf__distill_base() Coverity points out that after calling btf__new_empty_split() the wrong value is checked for error. Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240629100058.2866763-1-alan.maguire@oracle.com	2024-07-10 14:22:00 -07:00
Andrii Nakryiko	c1a6c770c4	libbpf: add btf_iter.o and btf_relocate.o to Makefile Upstream libbpf got two new .c files, make sure they are built with Github Makefile as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	223cd2273e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 531876c80004ecff7bfdbd8ba6c6b48835ef5e22 Checkpoint bpf-next commit: a12978712d9001b060bcc10eaae42ad5102abe2b Baseline bpf commit: 62da3acd28955e7299babebdfcb14243b789e773 Checkpoint bpf commit: b1c4b4d45263241ec6c2405a8df8265d4b58e707 Alan Maguire (6): libbpf: Add btf__distill_base() creating split BTF with distilled base BTF libbpf: Split BTF relocation libbpf: BTF relocation followup fixing naming, loop logic libbpf: Split field iter code into its own file kernel libbpf,bpf: Share BTF relocate-related code with kernel libbpf: Fix clang compilation error in btf_relocate.c Andrii Nakryiko (4): libbpf: Add BTF field iterator libbpf: Make use of BTF field iterator in BPF linker code libbpf: Make use of BTF field iterator in BTF handling code libbpf: Remove callback-based type/string BTF field visitor helpers Antoine Tenart (1): libbpf: Skip base btf sanity checks Donglin Peng (1): libbpf: Checking the btf_type kind when fixing variable offsets Eduard Zingerman (1): libbpf: Make btf_parse_elf process .BTF.base transparently Mykyta Yatsenko (1): libbpf: Auto-attach struct_ops BPF maps in BPF skeleton Vadim Fedorenko (1): bpf: Add CHECKSUM_COMPLETE to bpf test progs include/uapi/linux/bpf.h \| 2 + src/btf.c \| 696 +++++++++++++++++++++++++++------------ src/btf.h \| 36 ++ src/btf_iter.c \| 177 ++++++++++ src/btf_relocate.c \| 519 +++++++++++++++++++++++++++++ src/libbpf.c \| 64 +++- src/libbpf.h \| 18 + src/libbpf.map \| 4 + src/libbpf_internal.h \| 29 +- src/linker.c \| 69 ++-- 10 files changed, 1378 insertions(+), 236 deletions(-) create mode 100644 src/btf_iter.c create mode 100644 src/btf_relocate.c Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	dcd076347c	sync: update .mailmap Update .mailmap based on libbpf's list of contributors and on the latest .mailmap version in the upstream repository. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-27 10:01:42 -07:00
Alan Maguire	e4982342e7	libbpf: Fix clang compilation error in btf_relocate.c When building with clang for ARCH=i386, the following errors are observed: CC kernel/bpf/btf_relocate.o ./tools/lib/bpf/btf_relocate.c:206:23: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] 206 \| info[id].needs_size = true; \| ^ ~ ./tools/lib/bpf/btf_relocate.c:256:25: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] 256 \| base_info.needs_size = true; \| ^ ~ 2 errors generated. The problem is we use 1-bit, 31-bit bitfields in a signed int. Changing to bool needs_size: 1; unsigned int size:31; ...resolves the error and pahole reports that 4 bytes are used for the underlying representation: $ pahole btf_name_info tools/lib/bpf/btf_relocate.o struct btf_name_info { const char * name; /* 0 8 */ unsigned int needs_size:1; /* 8: 0 4 */ unsigned int size:31; /* 8: 1 4 */ __u32 id; /* 12 4 */ /* size: 16, cachelines: 1, members: 4 */ /* last cacheline: 16 bytes */ }; Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240624192903.854261-1-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Antoine Tenart	95c63a08f2	libbpf: Skip base btf sanity checks When upgrading to libbpf 1.3 we noticed a big performance hit while loading programs using CORE on non base-BTF symbols. This was tracked down to the new BTF sanity check logic. The issue is the base BTF definitions are checked first for the base BTF and then again for every module BTF. Loading 5 dummy programs (using libbpf-rs) that are using CORE on a non-base BTF symbol on my system: - Before this fix: 3s. - With this fix: 0.1s. Fix this by only checking the types starting at the BTF start id. This should ensure the base BTF is still checked as expected but only once (btf->start_id == 1 when creating the base BTF), and then only additional types are checked for each module BTF. Fixes: 3903802bb99a ("libbpf: Add basic BTF sanity validation") Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20240624090908.171231-1-atenart@kernel.org	2024-06-27 10:01:42 -07:00
Alan Maguire	27f0169332	libbpf,bpf: Share BTF relocate-related code with kernel Share relocation implementation with the kernel. As part of this, we also need the type/string iteration functions so also share btf_iter.c file. Relocation code in kernel and userspace is identical save for the implementation of the reparenting of split BTF to the relocated base BTF and retrieval of the BTF header from "struct btf"; these small functions need separate user-space and kernel implementations for the separate "struct btf"s they operate upon. One other wrinkle on the kernel side is we have to map .BTF.ids in modules as they were generated with the type ids used at BTF encoding time. btf_relocate() optionally returns an array mapping from old BTF ids to relocated ids, so we use that to fix up these references where needed for kfuncs. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240620091733.1967885-5-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Alan Maguire	4ffb92e204	libbpf: Split field iter code into its own file kernel This will allow it to be shared with the kernel. No functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240620091733.1967885-4-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Alan Maguire	bc021a8b42	libbpf: BTF relocation followup fixing naming, loop logic Use less verbose names in BTF relocation code and fix off-by-one error and typo in btf_relocate.c. Simplify loop over matching distilled types, moving from assigning a _next value in loop body to moving match check conditions into the guard. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240620091733.1967885-2-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Donglin Peng	88a0787335	libbpf: Checking the btf_type kind when fixing variable offsets I encountered an issue when building the test_progs from the repository [1]: $ pwd /work/Qemu/x86_64/linux-6.10-rc2/tools/testing/selftests/bpf/ $ make test_progs V=1 [...] ./tools/sbin/bpftool gen object ./ip_check_defrag.bpf.linked2.o ./ip_check_defrag.bpf.linked1.o libbpf: failed to find symbol for variable 'bpf_dynptr_slice' in section '.ksyms' Error: failed to link './ip_check_defrag.bpf.linked1.o': No such file or directory (2) [...] Upon investigation, I discovered that the btf_types referenced in the '.ksyms' section had a kind of BTF_KIND_FUNC instead of BTF_KIND_VAR: $ bpftool btf dump file ./ip_check_defrag.bpf.linked1.o [...] [2] DATASEC '.ksyms' size=0 vlen=2 type_id=16 offset=0 size=0 (FUNC 'bpf_dynptr_from_skb') type_id=17 offset=0 size=0 (FUNC 'bpf_dynptr_slice') [...] [16] FUNC 'bpf_dynptr_from_skb' type_id=82 linkage=extern [17] FUNC 'bpf_dynptr_slice' type_id=85 linkage=extern [...] For a detailed analysis, please refer to [2]. We can add a kind checking to fix the issue. [1] https://github.com/eddyz87/bpf/tree/binsort-btf-dedup [2] https://lore.kernel.org/all/0c0ef20c-c05e-4db9-bad7-2cbc0d6dfae7@oracle.com/ Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Donglin Peng <dolinux.peng@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240619122355.426405-1-dolinux.peng@gmail.com	2024-06-27 10:01:42 -07:00
Eduard Zingerman	4bc5a64933	libbpf: Make btf_parse_elf process .BTF.base transparently Update btf_parse_elf() to check if .BTF.base section is present. The logic is as follows: if .BTF.base section exists: distilled_base := btf_new(.BTF.base) if distilled_base: btf := btf_new(.BTF, .base_btf=distilled_base) if base_btf: btf_relocate(btf, base_btf) else: btf := btf_new(.BTF) return btf In other words: - if .BTF.base section exists, load BTF from it and use it as a base for .BTF load; - if base_btf is specified and .BTF.base section exist, relocate newly loaded .BTF against base_btf. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240613095014.357981-6-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Alan Maguire	2afe409348	libbpf: Split BTF relocation Map distilled base BTF type ids referenced in split BTF and their references to the base BTF passed in, and if the mapping succeeds, reparent the split BTF to the base BTF. Relocation is done by first verifying that distilled base BTF only consists of named INT, FLOAT, ENUM, FWD, STRUCT and UNION kinds; then we sort these to speed lookups. Once sorted, the base BTF is iterated, and for each relevant kind we check for an equivalent in distilled base BTF. When found, the mapping from distilled -> base BTF id and string offset is recorded. In establishing mappings, we need to ensure we check STRUCT/UNION size when the STRUCT/UNION is embedded in a split BTF STRUCT/UNION, and when duplicate names exist for the same STRUCT/UNION. Otherwise size is ignored in matching STRUCT/UNIONs. Once all mappings are established, we can update type ids and string offsets in split BTF and reparent it to the new base. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240613095014.357981-4-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Alan Maguire	36cb1ad3ae	libbpf: Add btf__distill_base() creating split BTF with distilled base BTF To support more robust split BTF, adding supplemental context for the base BTF type ids that split BTF refers to is required. Without such references, a simple shuffling of base BTF type ids (without any other significant change) invalidates the split BTF. Here the attempt is made to store additional context to make split BTF more robust. This context comes in the form of distilled base BTF providing minimal information (name and - in some cases - size) for base INTs, FLOATs, STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that points at that base and contains any additional types needed (such as TYPEDEF, PTR and anonymous STRUCT/UNION declarations). This information constitutes the minimal BTF representation needed to disambiguate or remove split BTF references to base BTF. The rules are as follows: - INT, FLOAT, FWD are recorded in full. - if a named base BTF STRUCT or UNION is referred to from split BTF, it will be encoded as a zero-member sized STRUCT/UNION (preserving size for later relocation checks). Only base BTF STRUCT/UNIONs that are either embedded in split BTF STRUCT/UNIONs or that have multiple STRUCT/UNION instances of the same name will _need_ size checks at relocation time, but as it is possible a different set of types will be duplicates in the later to-be-resolved base BTF, we preserve size information for all named STRUCT/UNIONs. - if an ENUM[64] is named, an ENUM forward representation (an ENUM with no values) of the same size is used. - in all other cases, the type is added to the new split BTF. Avoiding struct/union/enum/enum64 expansion is important to keep the distilled base BTF representation to a minimum size. When successful, new representations of the distilled base BTF and new split BTF that refers to it are returned. Both need to be freed by the caller. 
So to take a simple example, with split BTF with a type referring to "struct sk_buff", we will generate distilled base BTF with a 0-member STRUCT sk_buff of the appropriate size, and the split BTF will refer to it instead. Tools like pahole can utilize such split BTF to populate the .BTF section (split BTF) and an additional .BTF.base section. Then when the split BTF is loaded, the distilled base BTF can be used to relocate split BTF to reference the current (and possibly changed) base BTF. So for example if "struct sk_buff" was id 502 when the split BTF was originally generated, we can use the distilled base BTF to see that id 502 refers to a "struct sk_buff" and replace instances of id 502 with the current (relocated) base BTF sk_buff type id. Distilled base BTF is small; when building a kernel with all modules using distilled base BTF as a test, overall module size grew by only 5.3Mb total across ~2700 modules. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240613095014.357981-2-alan.maguire@oracle.com	2024-06-27 10:01:42 -07:00
Vadim Fedorenko	0a66859bf1	bpf: Add CHECKSUM_COMPLETE to bpf test progs Add special flag to validate that TC BPF program properly updates checksum information in skb. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240606145851.229116-1-vadfed@meta.com	2024-06-27 10:01:42 -07:00
Mykyta Yatsenko	be998aa3d4	libbpf: Auto-attach struct_ops BPF maps in BPF skeleton Similarly to `bpf_program`, support `bpf_map` automatic attachment in `bpf_object__attach_skeleton`. Currently only struct_ops maps could be attached. On bpftool side, code-generate links in skeleton struct for struct_ops maps. Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`. On libbpf side, extend `bpf_map` with new `autoattach` field to support enabling or disabling autoattach functionality, introducing getter/setter for this field. `bpf_object__(attach\|detach)_skeleton` is extended with attaching/detaching struct_ops maps logic. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	78c78e90cd	libbpf: Remove callback-based type/string BTF field visitor helpers Now that all libbpf/bpftool code switched to btf_field_iter, remove btf_type_visit_type_ids() and btf_type_visit_str_offs() callback-based helpers as not needed anymore. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-6-andrii@kernel.org	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	dd19c7ef77	libbpf: Make use of BTF field iterator in BTF handling code Use new BTF field iterator logic to replace all the callback-based visitor calls. There are still .BTF.ext callback-based visitor APIs that should be converted, which will happen as a follow-up. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-4-andrii@kernel.org	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	13182b94f3	libbpf: Make use of BTF field iterator in BPF linker code Switch all BPF linker code dealing with iterating BTF type ID and string offset fields to new btf_field_iter facilities. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-3-andrii@kernel.org	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	cece3242fb	libbpf: Add BTF field iterator Implement iterator-based type ID and string offset BTF field iterator. This is used extensively in BTF-handling code and BPF linker code for various sanity checks, rewriting IDs/offsets, etc. Currently this is implemented as visitor pattern calling custom callbacks, which makes the logic (especially in simple cases) unnecessarily obscure and harder to follow. Having equivalent functionality using iterator pattern makes for simpler to understand and maintain code. As we add more code for BTF processing logic in libbpf, it's best to switch to iterator pattern before adding more callback-based code. The idea for iterator-based implementation is to record offsets of necessary fields within fixed btf_type parts (which should be iterated just once), and, for kinds that have multiple members (based on vlen field), record where in each member necessary fields are located. Generic iteration code then just keeps track of last offset that was returned and handles N members correctly. Return type is just u32 pointer, where NULL is returned when all relevant fields were already iterated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240605001629.4061937-2-andrii@kernel.org	2024-06-27 10:01:42 -07:00
Andrii Nakryiko	42065ea662	ci: make pahole-staging workflow manually triggerable Allow to manually trigger pahole-staging workflow. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-06 14:39:09 -07:00
Andrii Nakryiko	764d19da07	ci: revert switching to ubuntu-latest for pahole-staging workflow pahole staging workflow is using the same old VM image as BPF selftests stages. It doesn't have recent enough glibc, so we can't yet switch to newer Ubuntu, unfortunately. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-06 14:32:23 -07:00
Andrii Nakryiko	fbcb2871fe	ci: regenerate vmlinux.h Regenerated latest vmlinux.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-03 13:41:26 -07:00
Andrii Nakryiko	61a6e8edd7	github: remove PR template No one is looking at it anyways. It just gets in the way. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-03 13:41:26 -07:00
Andrii Nakryiko	4ab7361e64	libbpf: don't close(-1) in multi-uprobe feature detector Guard close(link_fd) with extra link_fd >= 0 check to prevent close(-1). Detected by Coverity static analysis. Fixes: 04d939a2ab22 ("libbpf: detect broken PID filtering logic for multi-uprobe") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240529231212.768828-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 13:41:26 -07:00
Andrii Nakryiko	ff856238e2	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: eb4e7726279a344c82e3c23be396bcfd0a4d5669 Checkpoint bpf-next commit: 531876c80004ecff7bfdbd8ba6c6b48835ef5e22 Baseline bpf commit: 9dfdb706e164ae869b1d97f83ebf8523b2809714 Checkpoint bpf commit: 62da3acd28955e7299babebdfcb14243b789e773 Andrii Nakryiko (1): libbpf: keep FD_CLOEXEC flag when dup()'ing FD Jakub Kicinski (1): netdev: add qstat for csum complete include/uapi/linux/netdev.h \| 1 + src/libbpf_internal.h \| 10 +++------- 2 files changed, 4 insertions(+), 7 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-06-03 13:41:26 -07:00
Jakub Kicinski	c085e9c364	netdev: add qstat for csum complete Recent commit 0cfe71f45f42 ("netdev: add queue stats") added a lot of useful stats, but only those immediately needed by virtio. Presumably virtio does not support CHECKSUM_COMPLETE, so statistic for that form of checksumming wasn't included. Other drivers will definitely need it, in fact we expect it to be needed in net-next soon (mlx5). So let's add the definition of the counter for CHECKSUM_COMPLETE to uAPI in net already, so that the counters are in a more natural order (all subsequent counters have not been present in any released kernel, yet). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Fixes: 0cfe71f45f42 ("netdev: add queue stats") Link: https://lore.kernel.org/r/20240529163547.3693194-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-06-03 13:41:26 -07:00
Andrii Nakryiko	805b689cd2	libbpf: keep FD_CLOEXEC flag when dup()'ing FD Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs. Use dup3() with O_CLOEXEC flag for that. Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF map/prog FD, which is definitely not the right or expected behavior. Reported-by: Lennart Poettering <lennart@poettering.net> Fixes: bc308d011ab8 ("libbpf: call dup2() syscall directly") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-06-03 13:41:26 -07:00
Andrii Nakryiko	9b789075a9	ci: switch to ubuntu-latest where possible Track ubuntu-latest where relevant and possible. We can't update to ubuntu-latest when building and running BPF selftests, though, because our QEMU image has too old a GLIBC. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-28 22:37:25 -07:00
Andrii Nakryiko	c22d662a95	ci: update vmlinux.h to latest version Re-generate vmlinux.h to add latest kernel types necessary for BPF selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-28 21:15:00 -07:00
Andrii Nakryiko	074445067f	ci: add temporary patch for failing upstream BPF selftest Add fix that landed in bpf tree to fix sk_storage_tracing selftest. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-28 20:39:55 -07:00
Andrii Nakryiko	9a1f1f28c6	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 009367099eb61a4fc2af44d4eb06b6b4de7de6db Checkpoint bpf-next commit: eb4e7726279a344c82e3c23be396bcfd0a4d5669 Baseline bpf commit: 3e9bc0472b910d4115e16e9c2d684c7757cb6c60 Checkpoint bpf commit: 9dfdb706e164ae869b1d97f83ebf8523b2809714 Abhishek Chauhan (1): net: Add additional bit to support clockid_t timestamp type Andrii Nakryiko (2): libbpf: fix feature detectors when using token_fd libbpf: detect broken PID filtering logic for multi-uprobe Arnaldo Carvalho de Melo (1): tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h Daniel Jurgens (1): netdev: Add queue stats for TX stop and wake Mykyta Yatsenko (1): libbpf: Configure log verbosity with env variable Xuan Zhuo (1): netdev: add queue stats docs/libbpf_overview.rst \| 8 +++ include/uapi/linux/bpf.h \| 15 +++-- include/uapi/linux/fcntl.h \| 123 ----------------------------------- include/uapi/linux/netdev.h \| 21 ++++++ include/uapi/linux/openat2.h \| 43 ------------ src/bpf.c \| 2 +- src/features.c \| 33 +++++++++- src/libbpf.c \| 25 ++++++- src/libbpf.h \| 5 +- 9 files changed, 99 insertions(+), 176 deletions(-) delete mode 100644 include/uapi/linux/fcntl.h delete mode 100644 include/uapi/linux/openat2.h Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-28 20:39:55 -07:00
Andrii Nakryiko	0a519f87ee	sync: update .mailmap Update .mailmap based on libbpf's list of contributors and on the latest .mailmap version in the upstream repository. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-28 20:39:55 -07:00
Andrii Nakryiko	d9f9fd5b22	libbpf: detect broken PID filtering logic for multi-uprobe Libbpf is automatically (and transparently to user) detecting multi-uprobe support in the kernel, and, if supported, uses multi-uprobes to improve USDT attachment speed. USDTs can be attached system-wide or for the specific process by PID. In the latter case, we rely on correct kernel logic of not triggering USDT for unrelated processes. As such, on older kernels that do support multi-uprobes, but still have broken PID filtering logic, we need to fall back to singular uprobes. Unfortunately, whether user is using PID filtering or not is known at the attachment time, which happens after relevant BPF programs were loaded into the kernel. Also unfortunately, we need to make a call whether to use multi-uprobes or singular uprobe for SEC("usdt") programs during BPF object load time, at which point we have no information about possible PID filtering. The distinction between single and multi-uprobes is small, but important for the kernel. Multi-uprobes get BPF_TRACE_UPROBE_MULTI attach type, and kernel internally substitutes different implementations of some of BPF helpers (e.g., bpf_get_attach_cookie()) depending on whether uprobe is multi or singular. So, multi-uprobes and singular uprobes cannot be intermixed. All the above implies that we have to make an early and conservative call about the use of multi-uprobes. And so this patch modifies libbpf's existing feature detector for multi-uprobe support to also check correct PID filtering. If PID filtering is not yet fixed, we fall back to singular uprobes for USDTs. This extension to feature detection is simple thanks to kernel's -EINVAL addition for pid < 0. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240521163401.3005045-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-28 20:39:55 -07:00
Mykyta Yatsenko	d4d3e68e8d	libbpf: Configure log verbosity with env variable Configure logging verbosity by setting LIBBPF_LOG_LEVEL environment variable, which is applied only to default logger. Once user set their custom logging callback, it is up to them to handle filtering. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240524131840.114289-1-yatsenko@meta.com	2024-05-28 20:39:55 -07:00
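The environment-driven filtering described in this commit can be sketched as follows. This is a minimal standalone illustration, not libbpf's code: the enum, the helper names, and the accepted strings ("warn", "info", "debug") are assumptions made for the example.

```c
#include <stdlib.h>
#include <strings.h>

/* Hypothetical levels, ordered from least to most verbose. */
enum log_level { LOG_WARN = 0, LOG_INFO = 1, LOG_DEBUG = 2 };

/* Map a textual verbosity to a level; NULL or unknown text keeps the fallback. */
static enum log_level parse_log_level(const char *s, enum log_level fallback)
{
	if (!s)
		return fallback;
	if (strcasecmp(s, "warn") == 0)
		return LOG_WARN;
	if (strcasecmp(s, "info") == 0)
		return LOG_INFO;
	if (strcasecmp(s, "debug") == 0)
		return LOG_DEBUG;
	return fallback;
}

/* Only the default logger consults the variable; a custom callback filters on its own. */
static enum log_level default_logger_level(void)
{
	return parse_log_level(getenv("LIBBPF_LOG_LEVEL"), LOG_INFO);
}

/* A message is printed when it is no more verbose than the configured level. */
static int should_print(enum log_level msg, enum log_level configured)
{
	return msg <= configured;
}
```

With a setup like this, running a tool as `LIBBPF_LOG_LEVEL=debug ./tool` would be expected to enable debug output from the default logger without recompiling.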
Abhishek Chauhan	0babfb126a	net: Add additional bit to support clockid_t timestamp type tstamp_type is now set based on actual clockid_t compressed into 2 bits. To make the design scalable for future needs, this commit extends tstamp_type:1 to tstamp_type:2 to support other clockid_t timestamps. CLOCK_TAI is now supported as part of tstamp_type, in addition to the existing support for CLOCK_MONOTONIC and CLOCK_REALTIME. Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240509211834.3235191-3-quic_abchauha@quicinc.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-28 20:39:55 -07:00
Arnaldo Carvalho de Melo	89ed67d7ab	tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h These were used to build perf to provide defines not available in older distros, but this was back in 2017, nowadays all the distros that are supported and I have build containers for work using just the system headers, so ditch them. Some of these older distros may not have things that are used in 'perf trace', but then they also don't have libtraceevent packages, so don't build 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/20240315204835.748716-5-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-05-28 20:39:55 -07:00
Andrii Nakryiko	8dfa981c53	libbpf: fix feature detectors when using token_fd Adjust `union bpf_attr` size passed to kernel in two feature-detecting functions to take into account prog_token_fd field. Libbpf is avoiding memset()'ing entire `union bpf_attr` by only using minimal set of bpf_attr's fields. Two places have been missed when wiring BPF token support in libbpf's feature detection logic. Fix them trivially. Fixes: f3dcee938f48 ("libbpf: Wire up token_fd into feature probing logic") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240513180804.403775-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-05-28 20:39:55 -07:00
Daniel Jurgens	15b461a608	netdev: Add queue stats for TX stop and wake TX queue stop and wake are counted by some drivers. Support reporting these via netdev-genl queue stats. Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240510201927.1821109-2-danielj@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-28 20:39:55 -07:00
Xuan Zhuo	ec3c369941	netdev: add queue stats These stats are commonly. Support reporting those via netdev-genl queue stats. name: rx-hw-drops name: rx-hw-drop-overruns name: rx-csum-unnecessary name: rx-csum-none name: rx-csum-bad name: rx-hw-gro-packets name: rx-hw-gro-bytes name: rx-hw-gro-wire-packets name: rx-hw-gro-wire-bytes name: rx-hw-drop-ratelimits name: tx-hw-drops name: tx-hw-drop-errors name: tx-csum-none name: tx-needs-csum name: tx-hw-gso-packets name: tx-hw-gso-bytes name: tx-hw-gso-wire-packets name: tx-hw-gso-wire-bytes name: tx-hw-drop-ratelimits Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-05-28 20:39:55 -07:00
Andrii Nakryiko	02724cfd07	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0737df6de94661ae55fd3343ce9abec32c687e62 Checkpoint bpf-next commit: 009367099eb61a4fc2af44d4eb06b6b4de7de6db Baseline bpf commit: 3e9bc0472b910d4115e16e9c2d684c7757cb6c60 Checkpoint bpf commit: 3e9bc0472b910d4115e16e9c2d684c7757cb6c60 Andrii Nakryiko (6): libbpf: fix potential overflow in ring__consume_n() libbpf: fix ring_buffer__consume_n() return result logic libbpf: remove unnecessary struct_ops prog validity check libbpf: handle yet another corner case of nulling out struct_ops program libbpf: fix libbpf_strerror_r() handling unknown errors libbpf: improve early detection of doomed-to-fail BPF program loading Jiri Olsa (2): libbpf: Fix error message in attach_kprobe_session libbpf: Fix error message in attach_kprobe_multi Jose E. Marchesi (3): libbpf: Fix bpf_ksym_exists() in GCC libbpf: Avoid casts from pointers to enums in bpf_tracing.h bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD src/bpf_core_read.h \| 1 + src/bpf_helpers.h \| 17 +++++++++-- src/bpf_tracing.h \| 70 ++++++++++++++++++++++----------------------- src/libbpf.c \| 42 ++++++++++++++++++--------- src/ringbuf.c \| 4 +-- src/str_error.c \| 16 +++++++++-- src/usdt.bpf.h \| 24 ++++++++-------- 7 files changed, 106 insertions(+), 68 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-08 16:04:40 -07:00
Jose E. Marchesi	3827aa514c	bpf: Avoid uninitialized value in BPF_CORE_READ_BITFIELD [Changes from V1: - Use a default branch in the switch statement to initialize `val'.] GCC warns that `val' may be used uninitialized in the BPF_CORE_READ_BITFIELD macro, defined in bpf_core_read.h as: [...] unsigned long long val; \ [...] \ switch (__CORE_RELO(s, field, BYTE_SIZE)) { \ case 1: val = *(const unsigned char *)p; break; \ case 2: val = *(const unsigned short *)p; break; \ case 4: val = *(const unsigned int *)p; break; \ case 8: val = *(const unsigned long long *)p; break; \ } \ [...] val; \ } \ This patch adds a default entry in the switch statement that sets `val' to zero, in order to avoid the warning and to avoid random values being used in case __builtin_preserve_field_info returns unexpected values for BPF_FIELD_BYTE_SIZE. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240508101313.16662-1-jose.marchesi@oracle.com	2024-05-08 16:04:40 -07:00
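The effect of the added default branch can be illustrated outside of BPF with a plain C function. This is a sketch only: `read_sized` is a hypothetical helper that mirrors the shape of the switch quoted above, not the macro itself.

```c
/* Read an unsigned value of byte_size bytes from p; unexpected sizes yield 0. */
static unsigned long long read_sized(const void *p, unsigned int byte_size)
{
	unsigned long long val;

	switch (byte_size) {
	case 1: val = *(const unsigned char *)p; break;
	case 2: val = *(const unsigned short *)p; break;
	case 4: val = *(const unsigned int *)p; break;
	case 8: val = *(const unsigned long long *)p; break;
	/* Without this branch, val would be read uninitialized for other sizes. */
	default: val = 0; break;
	}
	return val;
}
```

Because every path through the switch now assigns `val`, the compiler no longer has grounds for a maybe-uninitialized warning, and an unexpected size produces a deterministic zero.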
Andrii Nakryiko	e5146eff75	libbpf: improve early detection of doomed-to-fail BPF program loading Extend libbpf's pre-load checks for BPF programs, detecting more typical conditions that are destined to cause BPF program failure. This is an opportunity to provide more helpful and actionable error message to users, instead of potentially very confusing BPF verifier log and/or error. In this case, we detect struct_ops BPF program that was not referenced anywhere, but still attempted to be loaded (according to libbpf logic). Suggest that the program might need to be used in some struct_ops variable. User will get a message of the following kind: libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it? Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-6-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
Andrii Nakryiko	ed54f30307	libbpf: fix libbpf_strerror_r() handling unknown errors strerror_r(), used from libbpf-specific libbpf_strerror_r() wrapper is documented to return error in two different ways, depending on glibc version. Take that into account when handling strerror_r()'s own errors, which happens when we pass some non-standard (internal) kernel error to it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is quite confusing. Now for the same situation we'll see a bit less visually scary "unknown error (-524)". At least we won't confuse user with irrelevant EINVAL (22). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-5-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
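The two reporting conventions mentioned in this commit (a positive error code returned directly, or -1 with errno set) can be normalized as in the sketch below. The function name and parameters are hypothetical; to keep the example deterministic, the result of strerror_r() and the errno value observed right after the call are passed in rather than obtained by calling strerror_r() here.

```c
#include <errno.h>
#include <stdio.h>

/*
 * err:       the error code the caller asked to describe (may be negative)
 * ret:       what strerror_r() returned
 * errno_val: errno as observed right after the strerror_r() call
 * Returns 0 if dst already holds a usable message, else rewrites dst.
 */
static int fixup_strerror(int err, int ret, int errno_val, char *dst, size_t len)
{
	if (ret == -1)
		ret = errno_val;	/* older convention: -1 plus errno */
	if (ret == 0)
		return 0;
	if (ret == EINVAL)	/* the code itself is unknown to strerror_r() */
		snprintf(dst, len, "unknown error (%d)", err < 0 ? err : -err);
	else
		snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
	return ret;
}
```

For an internal kernel code such as 524, both conventions end up producing "unknown error (-524)" instead of surfacing an unrelated EINVAL.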
Andrii Nakryiko	fe5fe762b9	libbpf: handle yet another corner case of nulling out struct_ops program There is yet another corner case where user can set STRUCT_OPS program reference in STRUCT_OPS map to NULL, but libbpf will fail to disable autoload for such BPF program. This time it's the case of "new" kernel which has type information about callback field, but user explicitly nulled-out program reference from user-space after opening BPF object. Fix, hopefully, the last remaining unhandled case. Fixes: 0737df6de946 ("libbpf: better fix for handling nulled-out struct_ops program") Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-3-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
Andrii Nakryiko	504369cba4	libbpf: remove unnecessary struct_ops prog validity check libbpf ensures that BPF program references set in map->st_ops->progs[i] during open phase are always valid STRUCT_OPS programs. This is done in bpf_object__collect_st_ops_relos(). So there is no need to double-check that in bpf_map__init_kern_struct_ops(). Simplify the code by removing unnecessary check. Also, we avoid using local prog variable to keep code similar to the upcoming fix, which adds similar logic in another part of bpf_map__init_kern_struct_ops(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
Jose E. Marchesi	ea02e10fc4	libbpf: Avoid casts from pointers to enums in bpf_tracing.h [Differences from V1: - Do not introduce a global typedef, as this is a public header. - Keep the void* casts in BPF_KPROBE_READ_RET_IP and BPF_KRETPROBE_READ_RET_IP, as these are necessary for converting to a const void* argument of bpf_probe_read_kernel.] The BPF_PROG, BPF_KPROBE and BPF_KSYSCALL macros defined in tools/lib/bpf/bpf_tracing.h use a clever hack in order to provide a convenient way to define entry points for BPF programs as if they were normal C functions that get typed actual arguments, instead of as elements in a single "context" array argument. For example, BPF_PROG allows writing: SEC("struct_ops/cwnd_event") void BPF_PROG(cwnd_event, struct sock *sk, enum tcp_ca_event event) { bbr_cwnd_event(sk, event); dctcp_cwnd_event(sk, event); cubictcp_cwnd_event(sk, event); } That expands into a pair of functions: void ____cwnd_event (unsigned long long *ctx, struct sock *sk, enum tcp_ca_event event) { bbr_cwnd_event(sk, event); dctcp_cwnd_event(sk, event); cubictcp_cwnd_event(sk, event); } void cwnd_event (unsigned long long *ctx) { _Pragma("GCC diagnostic push") _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") return ____cwnd_event(ctx, (void*)ctx[0], (void*)ctx[1]); _Pragma("GCC diagnostic pop") } Note how the 64-bit unsigned integers in the incoming CTX get cast to a void pointer, and then implicitly converted to whatever type of the actual argument in the wrapped function. In this case: Arg1: unsigned long long -> void * -> struct sock * Arg2: unsigned long long -> void * -> enum tcp_ca_event The behavior of GCC and clang when facing such conversions differ: pointer -> pointer Allowed by the C standard. GCC: no warning nor error. clang: no warning nor error. pointer -> integer type [C standard says the result of this conversion is implementation defined, and it may lead to unaligned pointer etc.] 
GCC: error: integer from pointer without a cast [-Wint-conversion] clang: error: incompatible pointer to integer conversion [-Wint-conversion] pointer -> enumerated type GCC: error: incompatible types in assignment (*) clang: error: incompatible pointer to integer conversion [-Wint-conversion] These macros work because converting pointers to pointers is allowed, and converting pointers to integers also works provided a suitable integer type even if it is implementation defined, much like casting a pointer to uintptr_t is guaranteed to work by the C standard. The conversion errors emitted by both compilers by default are silenced by the pragmas. However, the GCC error marked with (*) above when assigning a pointer to an enumerated value is not associated with the -Wint-conversion warning, and it is not possible to turn it off. This is preventing building the BPF kernel selftests with GCC. This patch fixes this by avoiding intermediate casts to void*, replaced with casts to `unsigned long long', which is an integer type capable of safely storing a BPF pointer, much like the standard uintptr_t. Testing performed in bpf-next master: - vmtest.sh -- ./test_verifier - vmtest.sh -- ./test_progs - make M=samples/bpf No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240502170925.3194-1-jose.marchesi@oracle.com	2024-05-08 16:04:40 -07:00
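The idea of routing 64-bit context slots to typed arguments through integer casts, rather than through an intermediate `void *`, can be sketched in ordinary C. All names here are hypothetical stand-ins, and unlike the real macros (which rely on implicit conversions silenced by pragmas) this sketch uses explicit casts so that it compiles without diagnostics.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel enum such as enum tcp_ca_event. */
enum ca_event { CA_EVENT_A, CA_EVENT_B, CA_EVENT_C };

/* Records what the typed handler received, so the result can be inspected. */
static const void *seen_sk;
static enum ca_event seen_event;

/* The "typed" function a user would write. */
static void typed_handler(const void *sk, enum ca_event event)
{
	seen_sk = sk;
	seen_event = event;
}

/* The generated entry point: every argument arrives as a 64-bit slot. */
static void raw_entry(unsigned long long *ctx)
{
	/*
	 * Slot 0 is an integer converted to a pointer; slot 1 is an integer
	 * converted to an enum. No pointer-to-enum conversion is involved.
	 */
	typed_handler((const void *)(uintptr_t)ctx[0], (enum ca_event)ctx[1]);
}
```

Integer-to-enum conversion is well defined in C, which is why going through an integer type avoids the pointer-to-enum error that GCC cannot be told to ignore.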
Jose E. Marchesi	4ec5e360ae	libbpf: Fix bpf_ksym_exists() in GCC The macro bpf_ksym_exists is defined in bpf_helpers.h as: #define bpf_ksym_exists(sym) ({ \ _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \ !!sym; \ }) The purpose of the macro is to determine whether a given symbol has been defined, given the address of the object associated with the symbol. It also has a compile-time check to make sure the object whose address is passed to the macro has been declared as weak, which makes the check on `sym' meaningful. As it happens, the check for weak doesn't work in GCC in all cases, because __builtin_constant_p does not always fold at parse time when optimizing. This is because optimizations that happen later in the compilation process, like inlining, may make a previously non-constant expression a constant. This results in errors like the following when building the selftests with GCC: bpf_helpers.h:190:24: error: expression in static assertion is not constant 190 \| _Static_assert(!__builtin_constant_p(!!sym), #sym " should be marked as __weak"); \ \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fortunately recent versions of GCC support a __builtin_has_attribute that can be used to directly check for the __weak__ attribute. This patch changes bpf_helpers.h to use that builtin when building with a recent enough GCC, and to omit the check if GCC is too old to support the builtin. The macro used for GCC becomes: #define bpf_ksym_exists(sym) ({ \ _Static_assert(__builtin_has_attribute (*sym, __weak__), #sym " should be marked as __weak"); \ !!sym; \ }) Note that since bpf_ksym_exists is designed to get the address of the object associated with symbol SYM, we pass *sym to __builtin_has_attribute instead of sym. When an expression is passed to __builtin_has_attribute then it is the type of the passed expression that is checked for the specified attribute. The expression itself is not evaluated. 
This accommodates well with the existing usages of the macro: - For function objects: struct task_struct *bpf_task_acquire(struct task_struct *p) __ksym __weak; [...] bpf_ksym_exists(bpf_task_acquire) - For variable objects: extern const struct rq runqueues __ksym __weak; /* typed */ [...] bpf_ksym_exists(&runqueues) Note also that BPF support was added in GCC 10 and support for __builtin_has_attribute in GCC 9. Locally tested in bpf-next master branch. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240428112559.10518-1-jose.marchesi@oracle.com	2024-05-08 16:04:40 -07:00
Andrii Nakryiko	cb7bfc5e51	libbpf: fix ring_buffer__consume_n() return result logic Add INT_MAX check to ring_buffer__consume_n(). We do a similar check to handle the int return result in the other ring buffer APIs, and ring_buffer__consume_n() is missing one. This patch fixes this omission. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
Andrii Nakryiko	e3e84bd7d0	libbpf: fix potential overflow in ring__consume_n() ringbuf_process_ring() returns int64_t, while ring__consume_n() assigns it to int. It's highly unlikely, but possible for ringbuf_process_ring() to return a value larger than INT_MAX, so use int64_t. ring__consume_n() does check INT_MAX before returning int result to the user. Fixes: 4d22ea94ea33 ("libbpf: Add ring__consume_n / ring_buffer__consume_n") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240430201952.888293-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-08 16:04:40 -07:00
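The pattern behind the two ring buffer fixes above, keeping the count in a 64-bit variable and clamping only when converting to the int return type, can be sketched as follows. `clamp_consumed` is a hypothetical helper name used for illustration.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Convert a 64-bit record count (or negative error code) into an int result.
 * Counts above INT_MAX are capped; errors and small counts pass through.
 */
static int clamp_consumed(int64_t res)
{
	if (res > INT_MAX)
		return INT_MAX;
	return (int)res;
}
```

Storing the intermediate result in an `int` before this check would make the check meaningless, since the value would already have been truncated by the time it is compared.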
Jiri Olsa	f3c4172c61	libbpf: Fix error message in attach_kprobe_multi We just failed to retrieve pattern, so we need to print spec instead. Fixes: ddc6b04989eb ("libbpf: Add bpf_program__attach_kprobe_multi_opts function") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240502075541.1425761-2-jolsa@kernel.org	2024-05-08 16:04:40 -07:00
Jiri Olsa	d045f7682b	libbpf: Fix error message in attach_kprobe_session We just failed to retrieve pattern, so we need to print spec instead. Fixes: 2ca178f02b2f ("libbpf: Add support for kprobe session attach") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240502075541.1425761-1-jolsa@kernel.org	2024-05-08 16:04:40 -07:00
Quentin Monnet	e055420033	sync: Commit .mailmap changes from script when sync-ing repo In commit `4794f18bf4` ("sync: Sync .mailmap entries"), we updated the sync-up script to automatically update libbpf's .mailmap; however, the script would not take care of committing the changes. Let's address this. The code is copied and adapted from the part where we commit changes to src/bpf_helper_defs.h. Signed-off-by: Quentin Monnet <qmo@kernel.org>	2024-05-01 17:38:26 -07:00
Andrii Nakryiko	255b705a16	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 1bba3b3d373dbafae891e7cb06b8c82c8d62aba1 Checkpoint bpf-next commit: 0737df6de94661ae55fd3343ce9abec32c687e62 Baseline bpf commit: b867247555c4181bf84eb10b72b176862c29112d Checkpoint bpf commit: 3e9bc0472b910d4115e16e9c2d684c7757cb6c60 Andrii Nakryiko (1): libbpf: better fix for handling nulled-out struct_ops program Jiri Olsa (3): bpf: Add support for kprobe session attach libbpf: Add support for kprobe session attach libbpf: Add kprobe session attach type name to attach_type_name Viktor Malik (1): libbpf: support "module: Function" syntax for tracing programs include/uapi/linux/bpf.h \| 1 + src/bpf.c \| 1 + src/libbpf.c \| 112 +++++++++++++++++++++++++++++++-------- src/libbpf.h \| 4 +- 4 files changed, 95 insertions(+), 23 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-05-01 15:20:15 -07:00
Andrii Nakryiko	6a41f02ad4	libbpf: better fix for handling nulled-out struct_ops program Previous attempt to fix the handling of nulled-out (from skeleton) struct_ops program is working well only if struct_ops program is defined as non-autoloaded by default (i.e., has SEC("?struct_ops") annotation, with question mark). Unfortunately, that fix is incomplete due to how bpf_object_adjust_struct_ops_autoload() is marking referenced or non-referenced struct_ops program as autoloaded (or not). Because bpf_object_adjust_struct_ops_autoload() is run after bpf_map__init_kern_struct_ops() step, which sets program slot to NULL, such programs won't be considered "referenced", and so its autoload property won't be changed. This all sounds convoluted and it is, but the desire is to have as natural behavior (as far as struct_ops usage is concerned) as possible. This fix is redoing the original fix but makes it work for autoloaded-by-default struct_ops programs as well. We achieve this by forcing prog->autoload to false if prog was declaratively set for some struct_ops map, but then nulled-out from skeleton (programmatically). This achieves desired effect of not autoloading it. If such program is still referenced somewhere else (different struct_ops map or different callback field), it will get its autoload property adjusted by bpf_object_adjust_struct_ops_autoload() later. We also fix selftest, which accidentally used SEC("?struct_ops") annotation. It was meant to use autoload-by-default program from the very beginning. Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly") Cc: Kui-Feng Lee <thinker.li@gmail.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240501041706.3712608-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-05-01 15:20:15 -07:00
Viktor Malik	dd589c3b31	libbpf: support "module: Function" syntax for tracing programs In some situations, it is useful to explicitly specify a kernel module to search for a tracing program target (e.g. when a function of the same name exists in multiple modules or in vmlinux). This patch enables that by allowing the "module:function" syntax for the find_kernel_btf_id function. Thanks to this, the syntax can be used both from a SEC macro (i.e. `SEC("fentry/module:function")`) and via the bpf_program__set_attach_target API call. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/9085a8cb9a552de98e554deb22ff7e977d025440.1714469650.git.vmalik@redhat.com	2024-05-01 15:20:15 -07:00
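Splitting a "module:function" target string can be sketched as below. This is an illustrative parser only; the struct and function names are hypothetical and it does not reproduce how libbpf resolves the target afterwards.

```c
#include <stddef.h>
#include <string.h>

/* Result of splitting an attach target; module is NULL when none was given. */
struct attach_target {
	const char *module;	/* points into the spec, NOT NUL-terminated */
	size_t module_len;
	const char *func;	/* NUL-terminated function name */
};

/* Split "module:function" at the first ':'; a bare "function" has no module. */
static struct attach_target split_target(const char *spec)
{
	struct attach_target t = { NULL, 0, spec };
	const char *colon = strchr(spec, ':');

	if (colon) {
		t.module = spec;
		t.module_len = (size_t)(colon - spec);
		t.func = colon + 1;
	}
	return t;
}
```

A caller would then restrict the BTF lookup to the named module when `module` is non-NULL, and fall back to searching everywhere otherwise.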
Jiri Olsa	045a0372ef	libbpf: Add kprobe session attach type name to attach_type_name Adding kprobe session attach type name to attach_type_name, so libbpf_bpf_attach_type_str returns proper string name for BPF_TRACE_KPROBE_SESSION attach type. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240430112830.1184228-6-jolsa@kernel.org	2024-05-01 15:20:15 -07:00
Jiri Olsa	6c3cf5108e	libbpf: Add support for kprobe session attach Adding support to attach program in kprobe session mode with bpf_program__attach_kprobe_multi_opts function. Adding session bool to bpf_kprobe_multi_opts struct that allows to load and attach the bpf program via kprobe session. Also adding new program loader section that allows: SEC("kprobe.session/bpf_fentry_test*") and loads/attaches kprobe program as kprobe session. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240430112830.1184228-5-jolsa@kernel.org	2024-05-01 15:20:15 -07:00
Jiri Olsa	b63d2945ff	bpf: Add support for kprobe session attach Adding support to attach bpf program for entry and return probe of the same function. This is common use case which at the moment requires to create two kprobe multi links. Adding new BPF_TRACE_KPROBE_SESSION attach type that instructs kernel to attach single link program to both entry and exit probe. It's possible to control execution of the bpf program on return probe simply by returning zero or non zero from the entry bpf program execution to execute or not the bpf program on return probe respectively. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240430112830.1184228-2-jolsa@kernel.org	2024-05-01 15:20:15 -07:00
Andrii Nakryiko	d3e18fceec	ci: remove tcp_rtt test from 5.5 ALLOWLIST It's been updated to expect the very latest kernel, can't succeed on 5.5 anymore. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-04-30 09:09:32 -07:00
Andrii Nakryiko	22bd976613	ci: update vmlinux.h Regenerate vmlinux.h to get all the latest types. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-04-30 09:09:32 -07:00
Andrii Nakryiko	f9f3fbf72d	sync: update .mailmap Update .mailmap generated during sync. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-04-30 09:09:32 -07:00
Andrii Nakryiko	37b8e0eb2d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 82e38a505c9868e784ec31e743fd8a9fa5ca1084 Checkpoint bpf-next commit: 1bba3b3d373dbafae891e7cb06b8c82c8d62aba1 Baseline bpf commit: 5bcf0dcbf9066348058b88a510c57f70f384c92c Checkpoint bpf commit: b867247555c4181bf84eb10b72b176862c29112d Andrii Nakryiko (1): libbpf: handle nulled-out program in struct_ops correctly Jose E. Marchesi (1): bpf_helpers.h: Define bpf_tail_call_static when building with GCC Philo Lu (1): bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args include/uapi/linux/bpf.h \| 2 ++ src/bpf_helpers.h \| 4 +++- src/libbpf.c \| 1 + 3 files changed, 6 insertions(+), 1 deletion(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-04-30 09:09:32 -07:00
Andrii Nakryiko	f28271ab72	libbpf: handle nulled-out program in struct_ops correctly If struct_ops has one of program callbacks set declaratively and host kernel is old and doesn't support this callback, libbpf will allow to load such struct_ops as long as that callback was explicitly nulled-out (presumably through skeleton). This is all working correctly, except we won't reset corresponding program slot to NULL before bailing out, which will lead to libbpf not detecting that BPF program has to be not auto-loaded. Fix this by unconditionally resetting corresponding program slot to NULL. Fixes: c911fc61a7ce ("libbpf: Skip zeroed or null fields if not found in the kernel type.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240428030954.3918764-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-04-30 09:09:32 -07:00
Jose E. Marchesi	b1051d9361	bpf_helpers.h: Define bpf_tail_call_static when building with GCC The definition of bpf_tail_call_static in tools/lib/bpf/bpf_helpers.h is guarded by a preprocessor check to assure that clang is recent enough to support it. This patch updates the guard so the function is compiled when using GCC 13 or later as well. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240426145158.14409-1-jose.marchesi@oracle.com	2024-04-30 09:09:32 -07:00
Philo Lu	43df08cd17	bpf: add mrtt and srtt as BPF_SOCK_OPS_RTT_CB args Two important arguments in RTT estimation, mrtt and srtt, are passed to tcp_bpf_rtt(), so that bpf programs get more information about RTT computation in BPF_SOCK_OPS_RTT_CB. The difference between bpf_sock_ops->srtt_us and the srtt here is: the former is an old rtt before update, while srtt passed by tcp_bpf_rtt() is that after update. Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Link: https://lore.kernel.org/r/20240425161724.73707-2-lulie@linux.alibaba.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-04-30 09:09:32 -07:00
Quentin Monnet	4794f18bf4	sync: Sync .mailmap entries The kernel repository has a .mailmap file to remap author names and email addresses to their desired format in Git logs (for details, see gitmailmap documentation [0]). Alas, this is only visible for author information when looking at the logs locally, as GitHub does not support mailmaps at the moment [1]. This commit adds a .mailmap file for libbpf, automatically generated from the kernel's version. The script to generate the .mailmap is added, too: it works by grepping email addresses from authors in the repository, and collecting all lines ending with this address in the kernel's .mailmap - in other words, all lines where this address is used as a pattern for a remapping. To keep the .mailmap up-to-date, add a call to the script to sync-kernel.sh. [0] https://git-scm.com/docs/gitmailmap [1] https://github.com/orgs/community/discussions/22518 Signed-off-by: Quentin Monnet <qmo@kernel.org>	2024-04-25 22:42:05 -07:00
Yonghong Song	2fdcc365a0	ci: regenerate latest vmlinux.h Update vmlinux.h to make BPF selftests compile. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2024-04-24 15:16:35 -07:00
Yonghong Song	52c37177cc	Makefile: Ensure github libbpf version the same as the kernel one The kernel libbpf version is 1.5 now. So change github libbpf version to be 1.5 as well. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2024-04-24 15:16:35 -07:00
Yonghong Song	7cbfddfdf2	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5 Checkpoint bpf-next commit: 82e38a505c9868e784ec31e743fd8a9fa5ca1084 Baseline bpf commit: 443574b033876c85a35de4c65c14f7fe092222b2 Checkpoint bpf commit: 5bcf0dcbf9066348058b88a510c57f70f384c92c Andrea Righi (3): libbpf: Start v1.5 development cycle libbpf: ringbuf: Allow to consume up to a certain amount of items libbpf: Add ring__consume_n / ring_buffer__consume_n Anton Protopopov (2): bpf: Add support for passing mark with bpf_fib_lookup bpf: Pack struct bpf_fib_lookup Benjamin Tissoires (1): tools: sync include/uapi/linux/bpf.h David Lechner (1): bpf: Fix typo in uapi doc comments Mykyta Yatsenko (1): bpf: improve error message for unsupported helper Quentin Deslandes (2): libbpf: Fix misaligned array closing bracket libbpf: Fix dump of subsequent char arrays Tobias Böhm (1): libbpf: Use local bpf_helpers.h include Yonghong Song (4): libbpf: Mark libbpf_kallsyms_parse static function libbpf: Handle <orig_name>.llvm.<hash> symbol properly bpf: Add bpf_link support for sk_msg and sk_skb progs libbpf: Add bpf_link support for BPF_PROG_TYPE_SOCKMAP include/uapi/linux/bpf.h \| 35 +++++++++++++++++++++---- src/bpf_core_read.h \| 2 +- src/btf_dump.c \| 5 ++++ src/libbpf.c \| 33 ++++++++++++++++++++++-- src/libbpf.h \| 14 ++++++++++ src/libbpf.map \| 7 +++++ src/libbpf_internal.h \| 5 ---- src/libbpf_probes.c \| 6 +++-- src/libbpf_version.h \| 2 +- src/ringbuf.c \| 55 +++++++++++++++++++++++++++++++++------- 10 files changed, 139 insertions(+), 25 deletions(-) Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2024-04-24 15:16:35 -07:00
Yonghong Song	f2fe16ec95	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2024-04-24 15:16:35 -07:00
Benjamin Tissoires	a911ca1e3e	tools: sync include/uapi/linux/bpf.h cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h Signed-off-by: Benjamin Tissoires <bentiss@kernel.org> Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-6-6c986a5a741f@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Quentin Deslandes	24924003c6	libbpf: Fix dump of subsequent char arrays When dumping a character array, libbpf will watch for a '\0' and set is_array_terminated=true if found. This prevents libbpf from printing the remaining characters of the array, treating it as a nul-terminated string. However, once this flag is set, it's never reset, leading to subsequent character arrays not being printed properly: .str_multi = (__u8[2][16])[ [ 'H', 'e', 'l', ], ], This patch saves the is_array_terminated flag and restores its default (false) value before looping over the elements of an array, then restores it afterward. This way, libbpf's behavior is unchanged when dumping the characters of an array, but subsequent arrays are printed properly: .str_multi = (__u8[2][16])[ [ 'H', 'e', 'l', ], [ 'l', 'o', ], ], Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413211258.134421-3-qde@naccy.de	2024-04-24 15:16:35 -07:00
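The save, reset, and restore handling of the termination flag described in this commit can be sketched with a simplified dumper. The struct and function names are hypothetical, and the output goes to a small buffer instead of libbpf's real printing path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified dump state: a termination flag plus a small output buffer. */
struct dump_state {
	bool is_array_terminated;	/* set once '\0' is seen in the current array */
	char out[64];
	size_t len;
};

/* Emit one character unless the current array was already terminated. */
static void dump_char(struct dump_state *d, char c)
{
	if (d->is_array_terminated)
		return;
	if (c == '\0') {
		d->is_array_terminated = true;
		return;
	}
	if (d->len + 1 < sizeof(d->out))
		d->out[d->len++] = c;
}

/* Dump one char array with a fresh flag, then put the caller's flag back. */
static void dump_char_array(struct dump_state *d, const char *elems, size_t n)
{
	bool saved = d->is_array_terminated;
	size_t i;

	d->is_array_terminated = false;
	for (i = 0; i < n; i++)
		dump_char(d, elems[i]);
	d->is_array_terminated = saved;
}
```

Without the reset at the start of `dump_char_array`, the second row of a two-row array would be skipped entirely, which corresponds to the truncated output shown in the commit message.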
Quentin Deslandes	2c6f445a8e	libbpf: Fix misaligned array closing bracket In btf_dump_array_data(), libbpf will call btf_dump_dump_type_data() for each element. For an array of characters, each element will be processed the following way: - btf_dump_dump_type_data() is called to print the character - btf_dump_data_pfx() prefixes the current line with the proper number of indentations - btf_dump_int_data() is called to print the character - After the last character is printed, btf_dump_dump_type_data() calls btf_dump_data_pfx() before writing the closing bracket However, for an array containing characters, btf_dump_int_data() won't print any '\0' and subsequent characters. This leads to situations where the line prefix is written, no character is added, then the prefix is written again before adding the closing bracket: (struct sk_metadata){ .str_array = (__u8[14])[ 'H', 'e', 'l', 'l', 'o', ], This change solves this issue by printing the '\0' character, which has two benefits: - The bracket closing the array is properly aligned - It's clear from a user point of view that libbpf uses '\0' as a terminator for arrays of characters. Signed-off-by: Quentin Deslandes <qde@naccy.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240413211258.134421-2-qde@naccy.de	2024-04-24 15:16:35 -07:00
Yonghong Song	09397e309a	libbpf: Add bpf_link support for BPF_PROG_TYPE_SOCKMAP Introduce a libbpf API function bpf_program__attach_sockmap() which allow user to get a bpf_link for their corresponding programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043532.3737722-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Yonghong Song	62217fb32a	bpf: Add bpf_link support for sk_msg and sk_skb progs Add bpf_link support for sk_msg and sk_skb programs. We have an internal request to support bpf_link for sk_msg programs so user space can have a uniform handling with bpf_link based libbpf APIs. Using bpf_link based libbpf API also has a benefit which makes system robust by decoupling prog life cycle and attachment life cycle. Reviewed-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240410043527.3737160-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Andrea Righi	b521a722b9	libbpf: Add ring__consume_n / ring_buffer__consume_n Introduce a new API to consume items from a ring buffer, limited to a specified amount, and return to the caller the actual number of items consumed. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/lkml/20240310154726.734289-1-andrea.righi@canonical.com/T Link: https://lore.kernel.org/bpf/20240406092005.92399-4-andrea.righi@canonical.com	2024-04-24 15:16:35 -07:00
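The bounded-consumption idea in the entry above can be illustrated with a toy queue. This sketch does not use libbpf: `struct toy_ring`, `toy_ring_consume_n()` and `sum_cb()` are hypothetical names, and the choice to count an item as consumed even when the callback fails is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical in-memory queue standing in for a BPF ring buffer. */
struct toy_ring {
	const int *items;
	size_t len;
	size_t pos; /* consumer position */
};

typedef int (*toy_sample_fn)(void *ctx, int item);

/*
 * Consume at most n items. Returns the number of items consumed,
 * or the callback's negative error code.
 */
static int toy_ring_consume_n(struct toy_ring *r, size_t n,
			      toy_sample_fn cb, void *ctx)
{
	int cnt = 0;

	while (r->pos < r->len && (size_t)cnt < n) {
		int err = cb(ctx, r->items[r->pos]);

		r->pos++; /* assumption: the item is consumed even on error */
		if (err < 0)
			return err;
		cnt++;
	}
	return cnt;
}

/* Example callback: accumulate items into a long pointed to by ctx. */
static int sum_cb(void *ctx, int item)
{
	*(long *)ctx += item;
	return 0;
}
```

Returning the consumed count lets a caller with a fixed-size destination buffer call again later for the remainder instead of draining everything at once.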
Andrea Righi	98de9ace4d	libbpf: ringbuf: Allow to consume up to a certain amount of items In some cases, instead of always consuming all items from ring buffers in a greedy way, we may want to consume up to a certain amount of items, for example when we need to copy items from the BPF ring buffer to a limited user buffer. This change allows to set an upper limit to the amount of items consumed from one or more ring buffers. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240406092005.92399-3-andrea.righi@canonical.com	2024-04-24 15:16:35 -07:00
Andrea Righi	26d9ab5f78	libbpf: Start v1.5 development cycle Bump libbpf.map to v1.5.0 to start a new libbpf version cycle. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240406092005.92399-2-andrea.righi@canonical.com	2024-04-24 15:16:35 -07:00
Anton Protopopov	c5219d1b3d	bpf: Pack struct bpf_fib_lookup The struct bpf_fib_lookup is supposed to be of size 64. A recent commit 59b418c7063d ("bpf: Add a check for struct bpf_fib_lookup size") added a static assertion to check this property so that future changes to the structure will not accidentally break this assumption. As it immediately turned out, on some 32-bit arm systems, when AEABI=n, the total size of the structure was equal to 68, see [1]. This happened because the bpf_fib_lookup structure contains a union of two 16-bit fields: union { __u16 tot_len; __u16 mtu_result; }; which was supposed to compile to a 16-bit-aligned 16-bit field. On the aforementioned setups it was instead both aligned and padded to 32-bits. Declare this inner union as __attribute__((packed, aligned(2))) such that it always is of size 2 and is aligned to 16 bits. [1] https://lore.kernel.org/all/CA+G9fYtsoP51f-oP_Sp5MOq-Ffv8La2RztNpwvE6+R1VtFiLrw@mail.gmail.com/#t Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: e1850ea9bd9e ("bpf: bpf_fib_lookup return MTU value as output when looked up") Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240403123303.1452184-1-aspsk@isovalent.com	2024-04-24 15:16:35 -07:00
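The effect of the attribute named in the entry above can be checked at compile time. The following is a sketch for GCC or Clang in C11 mode; `union len_or_mtu` and `struct layout_probe` are hypothetical stand-ins, not the real struct bpf_fib_lookup layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two-member union in struct bpf_fib_lookup. */
union len_or_mtu {
	uint16_t tot_len;
	uint16_t mtu_result;
} __attribute__((packed, aligned(2)));

/* Hypothetical container used only to probe padding around the union. */
struct layout_probe {
	uint16_t before;
	union len_or_mtu u;
	uint32_t after;
};

/* Compilation fails if an ABI pads or over-aligns the union. */
static_assert(sizeof(union len_or_mtu) == 2, "union must stay 2 bytes");
static_assert(_Alignof(union len_or_mtu) == 2, "union must be 16-bit aligned");
static_assert(offsetof(struct layout_probe, u) == 2, "no padding before the union");
static_assert(sizeof(struct layout_probe) == 8, "no padding after the union");
```

A static assertion of this kind is the same style of guard the commit message attributes to commit 59b418c7063d for the full 64-byte structure.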
Tobias Böhm	8d3a3e138b	libbpf: Use local bpf_helpers.h include Commit 20d59ee55172fdf6 ("libbpf: add bpf_core_cast() macro") added a bpf_helpers include in bpf_core_read.h as a system include. Usually, the includes are local, though, like in bpf_tracing.h. This commit adjusts the include to be local as well. Signed-off-by: Tobias Böhm <tobias@aibor.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/q5d5bgc6vty2fmaazd5e73efd6f5bhiru2le6fxn43vkw45bls@fhlw2s5ootdb	2024-04-24 15:16:35 -07:00
David Lechner	9f2853a352	bpf: Fix typo in uapi doc comments In a few places in the bpf uapi headers, EOPNOTSUPP is missing a "P" in the doc comments. This adds the missing "P". Signed-off-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240329152900.398260-2-dlechner@baylibre.com	2024-04-24 15:16:35 -07:00
Yonghong Song	d2f83fb976	libbpf: Handle <orig_name>.llvm.<hash> symbol properly With CONFIG_LTO_CLANG_THIN enabled, with some of previous version of kernel code base ([1]), I hit the following error: test_ksyms:PASS:kallsyms_fopen 0 nsec test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found #118 ksyms:FAIL The reason is that 'bpf_link_fops' is renamed to bpf_link_fops.llvm.8325593422554671469 Due to cross-file inlining, the static variable 'bpf_link_fops' in syscall.c is used by a function in another file. To avoid potential duplicated names, the llvm added suffix '.llvm.<hash>' ([2]) to 'bpf_link_fops' variable. Such renaming caused a problem in libbpf if 'bpf_link_fops' is used in bpf prog as a ksym but 'bpf_link_fops' does not match any symbol in /proc/kallsyms. To fix this issue, libbpf needs to understand that suffix '.llvm.<hash>' is caused by clang lto kernel and to process such symbols properly. With latest bpf-next code base built with CONFIG_LTO_CLANG_THIN, I cannot reproduce the above failure any more. But such an issue could happen with other symbols or in the future for bpf_link_fops symbol. For example, with my current kernel, I got the following from /proc/kallsyms: ffffffff84782154 d __func__.net_ratelimit.llvm.6135436931166841955 ffffffff85f0a500 d tk_core.llvm.726630847145216431 ffffffff85fdb960 d __fs_reclaim_map.llvm.10487989720912350772 ffffffff864c7300 d fake_dst_ops.llvm.54750082607048300 I could not easily create a selftest to test newly-added libbpf functionality with a static C test since I do not know which symbol is cross-file inlined. But based on my particular kernel, the following test change can run successfully. 
> diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c > index 6a86d1f07800..904a103f7b1d 100644 > --- a/tools/testing/selftests/bpf/prog_tests/ksyms.c > +++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c > @@ -42,6 +42,7 @@ void test_ksyms(void) > ASSERT_EQ(data->out__bpf_link_fops, link_fops_addr, "bpf_link_fops"); > ASSERT_EQ(data->out__bpf_link_fops1, 0, "bpf_link_fops1"); > ASSERT_EQ(data->out__btf_size, btf_size, "btf_size"); > + ASSERT_NEQ(data->out__fake_dst_ops, 0, "fake_dst_ops"); > ASSERT_EQ(data->out__per_cpu_start, per_cpu_start_addr, "__per_cpu_start"); > > cleanup: > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms.c b/tools/testing/selftests/bpf/progs/test_ksyms.c > index 6c9cbb5a3bdf..fe91eef54b66 100644 > --- a/tools/testing/selftests/bpf/progs/test_ksyms.c > +++ b/tools/testing/selftests/bpf/progs/test_ksyms.c > @@ -9,11 +9,13 @@ __u64 out__bpf_link_fops = -1; > __u64 out__bpf_link_fops1 = -1; > __u64 out__btf_size = -1; > __u64 out__per_cpu_start = -1; > +__u64 out__fake_dst_ops = -1; > > extern const void bpf_link_fops __ksym; > extern const void __start_BTF __ksym; > extern const void __stop_BTF __ksym; > extern const void __per_cpu_start __ksym; > +extern const void fake_dst_ops __ksym; > /* non-existing symbol, weak, default to zero */ > extern const void bpf_link_fops1 __ksym __weak; > > @@ -23,6 +25,7 @@ int handler(const void *ctx) > out__bpf_link_fops = (__u64)&bpf_link_fops; > out__btf_size = (__u64)(&__stop_BTF - &__start_BTF); > out__per_cpu_start = (__u64)&__per_cpu_start; > + out__fake_dst_ops = (__u64)&fake_dst_ops; > > out__bpf_link_fops1 = (__u64)&bpf_link_fops1; This patch fixed the issue in libbpf such that the suffix '.llvm.<hash>' will be ignored during comparison of bpf prog ksym vs. symbols in /proc/kallsyms, this resolved the issue. 
Currently, only static variables in /proc/kallsyms are checked with '.llvm.<hash>' suffix since in bpf programs function ksyms with '.llvm.<hash>' suffix are most likely kfunc's and unlikely to be cross-file inlined. Note that currently kernel does not support gcc build with lto. [1] https://lore.kernel.org/bpf/20240302165017.1627295-1-yonghong.song@linux.dev/ [2] https://github.com/llvm/llvm-project/blob/release/18.x/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1714-L1719 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240326041458.1198161-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
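The suffix-insensitive comparison described in the entry above can be sketched as a standalone function. `ksym_matches()` is a hypothetical helper, not libbpf's implementation, and restricting the rule to kallsyms type 'd' is an assumption drawn from the sample /proc/kallsyms lines quoted in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if a /proc/kallsyms entry (type character plus name)
 * matches the ksym name a BPF program asked for. For data symbols
 * of type 'd', a trailing ".llvm.<hash>" is ignored.
 */
static bool ksym_matches(const char *wanted, char sym_type, const char *sym_name)
{
	size_t len = strlen(sym_name);
	const char *suffix = strstr(sym_name, ".llvm.");

	/* Compare only the part before the LTO suffix. */
	if (sym_type == 'd' && suffix)
		len = (size_t)(suffix - sym_name);

	return strlen(wanted) == len && strncmp(wanted, sym_name, len) == 0;
}
```

Comparing lengths before calling `strncmp()` keeps a short name such as `tk` from matching `tk_core.llvm.<hash>` by prefix alone.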
Yonghong Song	8b9cb7d479	libbpf: Mark libbpf_kallsyms_parse static function Currently libbpf_kallsyms_parse() function is declared as a global function but actually it is not a API and there is no external users in bpftool/bpf-selftests. So let us mark the function as static. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240326041453.1197949-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Mykyta Yatsenko	b062410166	bpf: improve error message for unsupported helper BPF verifier emits "unknown func" message when given BPF program type does not support BPF helper. This message may be confusing for users, as important context that helper is unknown only to current program type is not provided. This patch changes message to "program of this type cannot use helper " and aligns dependent code in libbpf and tests. Any suggestions on improving/changing this message are welcome. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20240325152210.377548-1-yatsenko@meta.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Anton Protopopov	89d8cdf741	bpf: Add support for passing mark with bpf_fib_lookup Extend the bpf_fib_lookup() helper by making it to utilize mark if the BPF_FIB_LOOKUP_MARK flag is set. In order to pass the mark the four bytes of struct bpf_fib_lookup are used, shared with the output-only smac/dmac fields. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: David Ahern <dsahern@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240326101742.17421-2-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-04-24 15:16:35 -07:00
Song Liu	8a2054f417	ci/diffs: Add temporary fix for mitigation config Upstream is discussing the exact config to ship. In the meanwhile, which would unblock CI. More discussions here: https://lore.kernel.org/lkml/20240423045548.1324969-1-song@kernel.org/T/#u Signed-off-by: Song Liu <song@kernel.org>	2024-04-24 10:42:01 -07:00
Geyslan Gregório	46eafba62e	ci: bump run-on-arch action to v2.7.1 More info: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/ Signed-off-by: Geyslan Gregório <geyslan@gmail.com>	2024-04-02 21:51:25 -07:00
Geyslan Gregório	6d3595d215	ci: bump checkout action to v4 Due to the transition from Node 16 to Node 20, the checkout action needs to be updated to v4. More info: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/ Signed-off-by: Geyslan Gregório <geyslan@gmail.com>	2024-04-02 10:25:11 -07:00
Andrii Nakryiko	20ea95b450	ci: sync DENYLISTs with BPF CI Keep all the denylisted tests in sync. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	902af6913a	ci: clean up temporary patch It's already applied upstream. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	25a9cc27d7	ci: regenerate latest vmlinux.h Update vmlinux.h to make BPF selftests compile. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	8db4a2feeb	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e63985ecd22681c7f5975f2e8637187a326b6791 Checkpoint bpf-next commit: 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5 Baseline bpf commit: 2487007aa3b9fafbd2cb14068f49791ce1d7ede5 Checkpoint bpf commit: 443574b033876c85a35de4c65c14f7fe092222b2 Alexei Starovoitov (6): libbpf: Allow specifying 64-bit integers in map BTF. bpf: Introduce bpf_arena. bpf: Disasm support for addr_space_cast instruction. libbpf: Add __arg_arena to bpf_helpers.h libbpf: Add support for bpf_arena. libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM Andrii Nakryiko (4): libbpf: Recognize __arena global variables. bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs libbpf: add support for BPF cookie for raw_tp/tp_btf programs libbpf: fix u64-to-pointer cast on 32-bit arches Arnaldo Carvalho de Melo (1): libbpf: Define MFD_CLOEXEC if not available Jakub Kicinski (2): netdev: add per-queue statistics netdev: add queue stat for alloc failures Kui-Feng Lee (1): libbpf: Skip zeroed or null fields if not found in the kernel type. Mykyta Yatsenko (1): libbpbpf: Check bpf_map/bpf_program fd validity Quentin Monnet (1): libbpf: Prevent null-pointer dereference when prog to load has no BTF Yonghong Song (2): libbpf: Add new sec_def "sk_skb/verdict" bpf: Sync uapi bpf.h to tools directory include/uapi/linux/bpf.h \| 20 ++- include/uapi/linux/netdev.h \| 20 +++ src/bpf.c \| 16 +- src/bpf.h \| 9 + src/bpf_helpers.h \| 2 + src/libbpf.c \| 322 +++++++++++++++++++++++++++++++----- src/libbpf.h \| 13 +- src/libbpf.map \| 2 + src/libbpf_probes.c \| 7 + 9 files changed, 366 insertions(+), 45 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-25 21:58:26 -07:00
Arnaldo Carvalho de Melo	7fee466676	libbpf: Define MFD_CLOEXEC if not available Since its going directly to the syscall to avoid not having memfd_create() available in some systems, do the same for its MFD_CLOEXEC flags, defining it if not available. This fixes the build in those systems, noticed while building perf on a set of build containers. Fixes: 9fa5e1a180aa639f ("libbpf: Call memfd_create() syscall directly") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/ZfxZ9nCyKvwmpKkE@x1	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	ddf722fb5c	libbpf: fix u64-to-pointer cast on 32-bit arches It's been reported that (void *)map->map_extra is causing compilation warnings on 32-bit architectures. It's easy enough to fix this by casting to long first. Fixes: 79ff13e99169 ("libbpf: Add support for bpf_arena.") Reported-by: Ryan Eatmon <reatmon@ti.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319215143.1279312-1-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-03-25 21:58:26 -07:00
Alexei Starovoitov	137193b655	libbpf, selftests/bpf: Adjust libbpf, bpftool, selftests to match LLVM The selftests use to tell LLVM about special pointers. For LLVM there is nothing "arena" about them. They are simply pointers in a different address space. Hence LLVM diff https://github.com/llvm/llvm-project/pull/85161 renamed: . macro __BPF_FEATURE_ARENA_CAST -> __BPF_FEATURE_ADDR_SPACE_CAST . global variables in __attribute__((address_space(N))) are now placed in section named ".addr_space.N" instead of ".arena.N". Adjust libbpf, bpftool, and selftests to match LLVM. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20240315021834.62988-3-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
Yonghong Song	d2676a58de	bpf: Sync uapi bpf.h to tools directory There is a difference between kernel uapi bpf.h and tools uapi bpf.h. There is no functionality difference, but let us sync properly to make it easy for later bpf.h update. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240325033842.1693553-1-yonghong.song@linux.dev	2024-03-25 21:58:26 -07:00
Yonghong Song	4d95d8b7f0	libbpf: Add new sec_def "sk_skb/verdict" The new sec_def specifies sk_skb program type with BPF_SK_SKB_VERDICT attachment type. This way, libbpf will set expected_attach_type properly for the program. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240319175412.2941149-1-yonghong.song@linux.dev	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	f5828cc352	libbpf: add support for BPF cookie for raw_tp/tp_btf programs Wire up BPF cookie passing for raw_tp and tp_btf programs, both in low-level and high-level APIs. Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-5-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	cbd6e3596c	bpf: support BPF cookie in raw tracepoint (raw_tp, tp_btf) programs Wire up BPF cookie for raw tracepoint programs (both BTF and non-BTF aware variants). This brings them up to par w.r.t. BPF cookie usage with classic tracepoint and fentry/fexit programs. Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Message-ID: <20240319233852.1977493-4-andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-03-25 21:58:26 -07:00
Mykyta Yatsenko	7cfc365995	libbpbpf: Check bpf_map/bpf_program fd validity libbpf creates bpf_program/bpf_map structs for each program/map that user defines, but it allows to disable creating/loading those objects in kernel, in that case they won't have associated file descriptor (fd < 0). Such functionality is used for backward compatibility with some older kernels. Nothing prevents users from passing these maps or programs with no kernel counterpart to libbpf APIs. This change introduces explicit checks for kernel objects existence, aiming to improve visibility of those edge cases and provide meaningful warnings to users. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240318131808.95959-1-yatsenko@meta.com	2024-03-25 21:58:26 -07:00
Kui-Feng Lee	a5459eac49	libbpf: Skip zeroed or null fields if not found in the kernel type. Accept additional fields of a struct_ops type with all zero values even if these fields are not in the corresponding type in the kernel. This provides a way to be backward compatible. User space programs can use the same map on a machine running an old kernel by clearing fields that do not exist in the kernel. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240313214139.685112-2-thinker.li@gmail.com	2024-03-25 21:58:26 -07:00
Quentin Monnet	f84ee80801	libbpf: Prevent null-pointer dereference when prog to load has no BTF In bpf_objec_load_prog(), there's no guarantee that obj->btf is non-NULL when passing it to btf__fd(), and this function does not perform any check before dereferencing its argument (as bpf_object__btf_fd() used to do). As a consequence, we get segmentation fault errors in bpftool (for example) when trying to load programs that come without BTF information. v2: Keep btf__fd() in the fix instead of reverting to bpf_object__btf_fd(). Fixes: df7c3f7d3a3d ("libbpf: make uniform use of btf__fd() accessor inside libbpf") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <qmo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240314150438.232462-1-qmo@kernel.org	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	2d042d22a7	libbpf: Recognize __arena global variables. LLVM automatically places __arena variables into ".arena.1" ELF section. In order to use such global variables bpf program must include definition of arena map in ".maps" section, like: struct { __uint(type, BPF_MAP_TYPE_ARENA); __uint(map_flags, BPF_F_MMAPABLE); __uint(max_entries, 1000); /* number of pages */ __ulong(map_extra, 2ull << 44); /* start of mmap() region */ } arena SEC(".maps"); libbpf recognizes both uses of arena and creates single `struct bpf_map *` instance in libbpf APIs. ".arena.1" ELF section data is used as initial data image, which is exposed through skeleton and bpf_map__initial_value() to the user, if they need to tune it before the load phase. During load phase, this initial image is copied over into mmap()'ed region corresponding to arena, and discarded. Few small checks here and there had to be added to make sure this approach works with bpf_map__initial_value(), mostly due to hard-coded assumption that map->mmaped is set up with mmap() syscall and should be munmap()'ed. For arena, .arena.1 can be (much) smaller than maximum arena size, so this smaller data size has to be tracked separately. Given it is enforced that there is only one arena for entire bpf_object instance, we just keep it in a separate field. This can be generalized if necessary later. All global variables from ".arena.1" section are accessible from user space via skel->arena->name_of_var. For bss/data/rodata the skeleton/libbpf perform the following sequence: 1. addr = mmap(MAP_ANONYMOUS) 2. user space optionally modifies global vars 3. map_fd = bpf_create_map() 4. bpf_update_map_elem(map_fd, addr) // to store values into the kernel 5. mmap(addr, MAP_FIXED, map_fd) after step 5 user spaces see the values it wrote at step 2 at the same addresses arena doesn't support update_map_elem. Hence skeleton/libbpf do: 1. addr = malloc(sizeof SEC ".arena.1") 2. user space optionally modifies global vars 3. 
map_fd = bpf_create_map(MAP_TYPE_ARENA) 4. real_addr = mmap(map->map_extra, MAP_SHARED \| MAP_FIXED, map_fd) 5. memcpy(real_addr, addr) // this will fault-in and allocate pages At the end look and feel of global data vs __arena global data is the same from bpf prog pov. Another complication is: struct { __uint(type, BPF_MAP_TYPE_ARENA); } arena SEC(".maps"); int __arena foo; int bar; ptr1 = &foo; // relocation against ".arena.1" section ptr2 = &arena; // relocation against ".maps" section ptr3 = &bar; // relocation against ".bss" section For the kernel, ptr1 and ptr2 point to the same arena's map_fd while ptr3 points to a different global array's map_fd. For the verifier: ptr1->type == unknown_scalar ptr2->type == const_ptr_to_map ptr3->type == ptr_to_map_value After verification, from JIT pov all 3 ptr-s are normal ld_imm64 insns. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-11-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
Alexei Starovoitov	4524a45a2a	libbpf: Add support for bpf_arena. mmap() bpf_arena right after creation, since the kernel needs to remember the address returned from mmap. This is user_vm_start. LLVM will generate bpf_arena_cast_user() instructions where necessary and JIT will add upper 32-bit of user_vm_start to such pointers. Fix up bpf_map_mmap_sz() to compute mmap size as map->value_size * map->max_entries for arrays and PAGE_SIZE * map->max_entries for arena. Don't set BTF at arena creation time, since it doesn't support it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240308010812.89848-9-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
Alexei Starovoitov	086825355f	libbpf: Add __arg_arena to bpf_helpers.h Add __arg_arena to bpf_helpers.h Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240308010812.89848-8-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
Alexei Starovoitov	6de941bc1e	bpf: Disasm support for addr_space_cast instruction. LLVM generates rX = addr_space_cast(rY, dst_addr_space, src_addr_space) instruction when pointers in non-zero address space are used by the bpf program. Recognize this insn in uapi and in bpf disassembler. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-3-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
Alexei Starovoitov	1675c13fae	bpf: Introduce bpf_arena. Introduce bpf_arena, which is a sparse shared memory region between the bpf program and user space. Use cases: 1. User space mmap-s bpf_arena and uses it as a traditional mmap-ed anonymous region, like memcached or any key/value storage. The bpf program implements an in-kernel accelerator. XDP prog can search for a key in bpf_arena and return a value without going to user space. 2. The bpf program builds arbitrary data structures in bpf_arena (hash tables, rb-trees, sparse arrays), while user space consumes it. 3. bpf_arena is a "heap" of memory from the bpf program's point of view. The user space may mmap it, but bpf program will not convert pointers to user base at run-time to improve bpf program speed. Initially, the kernel vm_area and user vma are not populated. User space can fault in pages within the range. While servicing a page fault, bpf_arena logic will insert a new page into the kernel and user vmas. The bpf program can allocate pages from that region via bpf_arena_alloc_pages(). This kernel function will insert pages into the kernel vm_area. The subsequent fault-in from user space will populate that page into the user vma. The BPF_F_SEGV_ON_FAULT flag at arena creation time can be used to prevent fault-in from user space. In such a case, if a page is not allocated by the bpf program and not present in the kernel vm_area, the user process will segfault. This is useful for use cases 2 and 3 above. bpf_arena_alloc_pages() is similar to user space mmap(). It allocates pages either at a specific address within the arena or allocates a range with the maple tree. bpf_arena_free_pages() is analogous to munmap(), which frees pages and removes the range from the kernel vm_area and from user process vmas. bpf_arena can be used as a bpf program "heap" of up to 4GB. The speed of bpf program is more important than ease of sharing with user space. This is use case 3. 
In such a case, the BPF_F_NO_USER_CONV flag is recommended. It will tell the verifier to treat the rX = bpf_arena_cast_user(rY) instruction as a 32-bit move wX = wY, which will improve bpf prog performance. Otherwise, bpf_arena_cast_user is translated by JIT to conditionally add the upper 32 bits of user vm_start (if the pointer is not NULL) to arena pointers before they are stored into memory. This way, user space sees them as valid 64-bit pointers. Diff https://github.com/llvm/llvm-project/pull/84410 enables LLVM BPF backend generate the bpf_addr_space_cast() instruction to cast pointers between address_space(1) which is reserved for bpf_arena pointers and default address space zero. All arena pointers in a bpf program written in C language are tagged as __attribute__((address_space(1))). Hence, clang provides helpful diagnostics when pointers cross address space. Libbpf and the kernel support only address_space == 1. All other address space identifiers are reserved. rX = bpf_addr_space_cast(rY, /* dst_as */ 1, /* src_as */ 0) tells the verifier that rX->type = PTR_TO_ARENA. Any further operations on PTR_TO_ARENA register have to be in the 32-bit domain. The verifier will mark load/store through PTR_TO_ARENA with PROBE_MEM32. JIT will generate them as kern_vm_start + 32bit_addr memory accesses. The behavior is similar to copy_from_kernel_nofault() except that no address checks are necessary. The address is guaranteed to be in the 4GB range. If the page is not present, the destination register is zeroed on read, and the operation is ignored on write. rX = bpf_addr_space_cast(rY, 0, 1) tells the verifier that rX->type = unknown scalar. If arena->map_flags has BPF_F_NO_USER_CONV set, then the verifier converts such cast instructions to mov32. 
Otherwise, JIT will emit native code equivalent to: rX = (u32)rY; if (rY) rX \|= clear_lo32_bits(arena->user_vm_start); /* replace hi32 bits in rX */ After such conversion, the pointer becomes a valid user pointer within bpf_arena range. The user process can access data structures created in bpf_arena without any additional computations. For example, a linked list built by a bpf program can be walked natively by user space. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Barret Rhoden <brho@google.com> Link: https://lore.kernel.org/bpf/20240308010812.89848-2-alexei.starovoitov@gmail.com	2024-03-25 21:58:26 -07:00
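The pointer conversion quoted in the entry above can be written out as plain C arithmetic. This is a sketch of the stated formula only, not JIT output: `arena_cast_user()` is a hypothetical name, and the body of `clear_lo32_bits()` is an assumption inferred from its name.

```c
#include <stdint.h>

/* Keep the upper 32 bits of v and zero the lower 32 bits. */
static uint64_t clear_lo32_bits(uint64_t v)
{
	return v & ~(uint64_t)0xffffffff;
}

/*
 * Turn an arena pointer into a user-space pointer: keep the low
 * 32 bits and, for non-NULL pointers, take the high 32 bits from
 * the arena's user_vm_start.
 */
static uint64_t arena_cast_user(uint64_t ry, uint64_t user_vm_start)
{
	uint64_t rx = (uint32_t)ry;

	if (ry)
		rx |= clear_lo32_bits(user_vm_start); /* replace hi32 bits in rX */
	return rx;
}
```

With user_vm_start set to `2ull << 44`, the map_extra value used in the arena map example elsewhere in this log, an arena offset of 0x1000 becomes 0x200000001000, while a NULL pointer stays NULL.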
Alexei Starovoitov	385d344839	libbpf: Allow specifying 64-bit integers in map BTF. __uint() macro that is used to specify map attributes like: __uint(type, BPF_MAP_TYPE_ARRAY); __uint(map_flags, BPF_F_MMAPABLE); It is limited to 32-bit, since BTF_KIND_ARRAY has u32 "number of elements" field in "struct btf_array". Introduce __ulong() macro that allows specifying values bigger than 32-bit. In map definition "map_extra" is the only u64 field, so far. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20240307031228.42896-5-alexei.starovoitov@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-03-25 21:58:26 -07:00
Jakub Kicinski	d71c0ed2ef	netdev: add queue stat for alloc failures Rx alloc failures are commonly counted by drivers. Support reporting those via netdev-genl queue stats. Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-25 21:58:26 -07:00
Jakub Kicinski	5e80833e50	netdev: add per-queue statistics The ethtool-nl family does a good job exposing various protocol related and IEEE/IETF statistics which used to get dumped under ethtool -S, with creative names. Queue stats don't have a netlink API, yet, and remain a lion's share of ethtool -S output for new drivers. Not only is that bad because the names differ driver to driver but it's also bug-prone. Intuitively drivers try to report only the stats for active queues, but querying ethtool stats involves multiple system calls, and the number of stats is read separately from the stats themselves. Worse still when user space asks for values of the stats, it doesn't inform the kernel how big the buffer is. If number of stats increases in the meantime kernel will overflow user buffer. Add a netlink API for dumping queue stats. Queue information is exposed via the netdev-genl family, so add the stats there. Support per-queue and sum-for-device dumps. Latter will be useful when subsequent patches add more interesting common stats than just bytes and packets. The API does not currently distinguish between HW and SW stats. The expectation is that the source of the stats will either not matter much (good packets) or be obvious (skb alloc errors). Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-25 21:58:26 -07:00
Andrii Nakryiko	2778cbce60	ci: add xdp_bonding fixes from bpf/master bpf tree has fixes for xdp_bonding selftests which are not yet in bpf-next, so add them as temporary CI-only patches. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-06 15:22:13 -08:00
Andrii Nakryiko	4f875865b7	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2ab256e93249f5ac1da665861aa0f03fb4208d9c Checkpoint bpf-next commit: 7d763bc4a44a51e48dde406d6c6c8a26a60ec647 Baseline bpf commit: dced881ead78e4d6add3735d02a9186ba2415630 Checkpoint bpf commit: 2487007aa3b9fafbd2cb14068f49791ce1d7ede5 Aahil Awatramani (1): bonding: Add independent control state machine Alexei Starovoitov (1): bpf: Introduce may_goto instruction Chen Shen (1): libbpf: Correct debug message in btf__load_vmlinux_btf Eduard Zingerman (7): libbpf: Allow version suffixes (___smth) for struct_ops types libbpf: Tie struct_ops programs to kernel BTF ids, not to local ids libbpf: Honor autocreate flag for struct_ops maps libbpf: Sync progs autoload with maps autocreate for struct_ops maps libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_type libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link") libbpf: Rewrite btf datasec names starting from '?' Kees Cook (1): bpf: Replace bpf_lpm_trie_key 0-length array with flexible array Kui-Feng Lee (2): libbpf: Set btf_value_type_id of struct bpf_map for struct_ops. libbpf: Convert st_ops->data to shadow type. include/uapi/linux/bpf.h \| 24 +++- include/uapi/linux/if_link.h \| 1 + src/btf.c \| 2 +- src/features.c \| 22 +++ src/libbpf.c \| 252 +++++++++++++++++++++++++++-------- src/libbpf_internal.h \| 2 + 6 files changed, 242 insertions(+), 61 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-06 15:22:13 -08:00
Eduard Zingerman	438adf417d	libbpf: Rewrite btf datasec names starting from '?' Optional struct_ops maps are defined using question mark at the start of the section name, e.g.: SEC("?.struct_ops") struct test_ops optional_map = { ... }; This commit teaches libbpf to detect if kernel allows '?' prefix in datasec names, and if it doesn't then to rewrite such names by replacing '?' with '_', e.g.: DATASEC ?.struct_ops -> DATASEC _.struct_ops Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-13-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
Eduard Zingerman	fc8b86bda2	libbpf: Struct_ops in SEC("?.struct_ops") / SEC("?.struct_ops.link") Allow using two new section names for struct_ops maps: - SEC("?.struct_ops") - SEC("?.struct_ops.link") To specify maps that have bpf_map->autocreate == false after open. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-12-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
Eduard Zingerman	d5d0b6e920	libbpf: Replace elf_state->st_ops_* fields with SEC_ST_OPS sec_type The next patch would add two new section names for struct_ops maps. To make working with multiple struct_ops sections more convenient: - remove fields like elf_state->st_ops_{shndx,link_shndx}; - mark section descriptions hosting struct_ops as elf_sec_desc->sec_type == SEC_ST_OPS; After these changes struct_ops sections could be processed uniformly by iterating bpf_object->efile.secs entries. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-11-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
Eduard Zingerman	060c604db8	libbpf: Sync progs autoload with maps autocreate for struct_ops maps Automatically select which struct_ops programs to load depending on which struct_ops maps are selected for automatic creation. E.g. for the BPF code below: SEC("struct_ops/test_1") int BPF_PROG(foo) { ... } SEC("struct_ops/test_2") int BPF_PROG(bar) { ... } SEC(".struct_ops.link") struct test_ops___v1 A = { .foo = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 B = { .foo = (void *)foo, .bar = (void *)bar, }; And the following libbpf API calls: bpf_map__set_autocreate(skel->maps.A, true); bpf_map__set_autocreate(skel->maps.B, false); The autoload would be enabled for program 'foo' and disabled for program 'bar'. During load, for each struct_ops program P, referenced from some struct_ops map M: - set P.autoload = true if M.autocreate is true for some M; - set P.autoload = false if M.autocreate is false for all M; - don't change P.autoload, if P is not referenced from any map. Do this after bpf_object__init_kern_struct_ops_maps() to make sure that shadow vars assignment is done. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-9-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
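The three autoload rules in the commit above can be sketched as a small standalone C function. The `sketch_prog` and `sketch_map` types and `sync_prog_autoload()` are hypothetical stand-ins invented for this illustration; they are not libbpf's internal structures or its actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a struct_ops program and map. */
struct sketch_prog {
	bool autoload;
};

struct sketch_map {
	bool autocreate;
	struct sketch_prog **progs;	/* programs referenced by this map */
	size_t nr_progs;
};

/* Apply the rule to one program:
 *  - referenced by at least one autocreated map -> autoload = true
 *  - referenced only by non-autocreated maps    -> autoload = false
 *  - not referenced by any map                  -> left unchanged */
static void sync_prog_autoload(struct sketch_prog *prog,
			       const struct sketch_map *maps, size_t nr_maps)
{
	bool referenced = false, wanted = false;
	size_t i, j;

	for (i = 0; i < nr_maps; i++) {
		for (j = 0; j < maps[i].nr_progs; j++) {
			if (maps[i].progs[j] != prog)
				continue;
			referenced = true;
			if (maps[i].autocreate)
				wanted = true;
		}
	}
	if (referenced)
		prog->autoload = wanted;
}
```

With maps A (autocreate on, referencing foo) and B (autocreate off, referencing foo and bar), this model enables foo and disables bar, matching the example in the commit message.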
Eduard Zingerman	cb426140d0	libbpf: Honor autocreate flag for struct_ops maps Skip load steps for struct_ops maps not marked for automatic creation. This should allow loading a bpf object in situations like below: SEC("struct_ops/foo") int BPF_PROG(foo) { ... } SEC("struct_ops/bar") int BPF_PROG(bar) { ... } struct test_ops___v1 { int (*foo)(void); }; struct test_ops___v2 { int (*foo)(void); int (*does_not_exist)(void); }; SEC(".struct_ops.link") struct test_ops___v1 map_for_old = { .foo = (void *)foo }; SEC(".struct_ops.link") struct test_ops___v2 map_for_new = { .foo = (void *)foo, .does_not_exist = (void *)bar }; Suppose the program is loaded on an old kernel that does not have a definition for the 'does_not_exist' struct_ops member. After this commit it would be possible to load such an object file after the following tweaks: bpf_program__set_autoload(skel->progs.bar, false); bpf_map__set_autocreate(skel->maps.map_for_new, false); Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240306104529.6453-4-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
Eduard Zingerman	aded62b120	libbpf: Tie struct_ops programs to kernel BTF ids, not to local ids Enforce the following existing limitation on struct_ops programs based on kernel BTF id instead of program-local BTF id: struct_ops BPF prog can be re-used between multiple .struct_ops & .struct_ops.link as long as it's the same struct_ops struct definition and the same function pointer field This allows reusing same BPF program for versioned struct_ops map definitions, e.g.: SEC("struct_ops/test") int BPF_PROG(foo) { ... } struct some_ops___v1 { int (*test)(void); }; struct some_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct some_ops___v1 a = { .test = foo } SEC(".struct_ops.link") struct some_ops___v2 b = { .test = foo } Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-3-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
Eduard Zingerman	f7fd5dbc07	libbpf: Allow version suffixes (___smth) for struct_ops types E.g. allow the following struct_ops definitions: struct bpf_testmod_ops___v1 { int (*test)(void); }; struct bpf_testmod_ops___v2 { int (*test)(void); }; SEC(".struct_ops.link") struct bpf_testmod_ops___v1 a = { .test = ... } SEC(".struct_ops.link") struct bpf_testmod_ops___v2 b = { .test = ... } Where both bpf_testmod_ops___v1 and bpf_testmod_ops___v2 would be resolved as 'struct bpf_testmod_ops' from kernel BTF. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240306104529.6453-2-eddyz87@gmail.com	2024-03-06 13:58:27 -08:00
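To illustrate how a `___smth` version suffix might be ignored when matching a local type name against kernel BTF, here is a minimal standalone sketch. `is_flavor_sep()` and `essential_name_len()` are hypothetical names chosen for this example, loosely modelled on the idea of an "essential name"; this is not presented as libbpf's own code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if s points at an "X___Y" pattern where X and Y are not underscores.
 * The caller guarantees that s[0..4] are valid characters of the name. */
static bool is_flavor_sep(const char *s)
{
	return s[0] != '_' &&
	       s[1] == '_' && s[2] == '_' && s[3] == '_' &&
	       s[4] != '_';
}

/* Length of the name with a trailing "___suffix" dropped, or the full
 * length if no such suffix is present. Scans from the end of the name. */
static size_t essential_name_len(const char *name)
{
	size_t n = strlen(name);
	size_t i;

	if (n < 5)
		return n;
	for (i = n - 4; i-- > 0; ) {
		if (is_flavor_sep(name + i))
			return i + 1;
	}
	return n;
}
```

Under this sketch both `bpf_testmod_ops___v1` and `bpf_testmod_ops___v2` reduce to the 15-character prefix `bpf_testmod_ops`, so they would compare equal to the kernel type name.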
Alexei Starovoitov	00b08dceea	bpf: Introduce may_goto instruction Introduce may_goto instruction that from the verifier pov is similar to open coded iterators bpf_for()/bpf_repeat() and bpf_loop() helper, but it doesn't iterate any objects. In assembly 'may_goto' is a nop most of the time until bpf runtime has to terminate the program for whatever reason. In the current implementation may_goto has a hidden counter, but other mechanisms can be used. For programs written in C a later patch introduces the 'cond_break' macro that combines 'may_goto' with a 'break' statement and has similar semantics: cond_break is a nop until bpf runtime has to break out of this loop. It can be used in any normal "for" or "while" loop, like for (i = zero; i < cnt; cond_break, i++) { The verifier recognizes that may_goto is used in the program, reserves additional 8 bytes of stack, initializes them in subprog prologue, and replaces may_goto instruction with: aux_reg = *(u64 *)(fp - 40) if aux_reg == 0 goto pc+off aux_reg -= 1 *(u64 *)(fp - 40) = aux_reg may_goto instruction can be used by LLVM to implement __builtin_memcpy, __builtin_strcmp. may_goto is not a full substitute for bpf_for() macro. bpf_for() doesn't have an induction variable that the verifier sees, so 'i' in bpf_for(i, 0, 100) is seen as imprecise and bounded. But when the code is written as: for (i = 0; i < 100; cond_break, i++) the verifier sees 'i' as precise constant zero, hence cond_break (aka may_goto) doesn't help to converge the loop. A static or global variable can be used as a workaround: static int zero = 0; for (i = zero; i < 100; cond_break, i++) // works! may_goto works well with arena pointers that don't need to be bounds checked on access. Load/store from arena returns imprecise unbounded scalar and loops with may_goto pass the verifier. Reserve new opcode BPF_JMP \| BPF_JCOND for may_goto insn. JCOND stands for conditional pseudo jump. Since goto_or_nop insn was proposed, it may use the same opcode. 
may_goto vs goto_or_nop can be distinguished by src_reg: code = BPF_JMP \| BPF_JCOND src_reg = 0 - may_goto src_reg = 1 - goto_or_nop Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240306031929.42666-2-alexei.starovoitov@gmail.com	2024-03-06 13:58:27 -08:00
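The hidden-counter lowering described in the may_goto commit above can be emulated in ordinary C to show its effect on a loop. Everything here is a hypothetical model: `may_goto_state` stands in for the reserved 8-byte stack slot, and `run_loop()` places the check where `cond_break` sits in the commit's `for` loop (after the body, before the increment).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hidden 8-byte stack slot. */
struct may_goto_state {
	uint64_t budget;
};

/* Mirrors the replacement sequence: if the counter is zero the jump is
 * taken; otherwise the counter is decremented and execution falls through. */
static bool may_goto(struct may_goto_state *st)
{
	if (st->budget == 0)
		return true;
	st->budget -= 1;
	return false;
}

/* Counts how many loop bodies run for a loop of the shape
 * for (i = 0; i < cnt; cond_break, i++) { ... } */
static uint64_t run_loop(uint64_t budget, uint64_t cnt)
{
	struct may_goto_state st = { .budget = budget };
	uint64_t i, iters = 0;

	for (i = 0; i < cnt; i++) {
		iters++;		/* loop body */
		if (may_goto(&st))	/* cond_break position */
			break;
	}
	return iters;
}
```

In this model the check is a no-op while budget remains, so a loop with a small trip count is unaffected, while an unbounded loop is cut off once the counter runs out.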
Chen Shen	bf52494e2b	libbpf: Correct debug message in btf__load_vmlinux_btf In the function btf__load_vmlinux_btf, the debug message incorrectly refers to 'path' instead of 'sysfs_btf_path'. Signed-off-by: Chen Shen <peterchenshen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240302062218.3587-1-peterchenshen@gmail.com	2024-03-06 13:58:27 -08:00
Kui-Feng Lee	acfaeffeaa	libbpf: Convert st_ops->data to shadow type. Convert st_ops->data to the shadow type of the struct_ops map. The shadow type of a struct_ops type is a variant of the original struct type providing a way to access/change the values in the maps of the struct_ops type. bpf_map__initial_value() will return st_ops->data for struct_ops types. The skeleton is going to use it as the pointer to the shadow type of the original struct type. One of the main differences between the original struct type and the shadow type is that all function pointers of the shadow type are converted to pointers of struct bpf_program. Users can replace these bpf_program pointers with other BPF programs. The st_ops->progs[] will be updated before updating the value of a map to reflect the changes made by users. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240229064523.2091270-3-thinker.li@gmail.com	2024-03-06 13:58:27 -08:00
Kui-Feng Lee	0758d8b0f2	libbpf: Set btf_value_type_id of struct bpf_map for struct_ops. For a struct_ops map, btf_value_type_id is the type ID of its struct type. This value is required by bpftool to generate skeleton including pointers of shadow types. The code generator gets the type ID from bpf_map__btf_value_type_id() in order to get the type information of the struct type of a map. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240229064523.2091270-2-thinker.li@gmail.com	2024-03-06 13:58:27 -08:00
Kees Cook	fa4d00254d	bpf: Replace bpf_lpm_trie_key 0-length array with flexible array Replace deprecated 0-length array in struct bpf_lpm_trie_key with flexible array. Found with GCC 13: ../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=] 207 \| *(__be16 *)&key->data[i]); \| ^~~~~~~~~~~~~ ../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16' 102 \| #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) \| ^ ../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu' 97 \| #define be16_to_cpu __be16_to_cpu \| ^~~~~~~~~~~~~ ../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu' 206 \| u16 diff = be16_to_cpu(*(__be16 *)&node->data[i] ^ \| ^~~~~~~~~~~ In file included from ../include/linux/bpf.h:7: ../include/uapi/linux/bpf.h:82:17: note: while referencing 'data' 82 \| __u8 data[0]; /* Arbitrary size */ \| ^~~~ And found at run-time under CONFIG_FORTIFY_SOURCE: UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49 index 0 is out of range for type '__u8 []' Changing struct bpf_lpm_trie_key is difficult since it has been used by userspace. For example, in Cilium: struct egress_gw_policy_key { struct bpf_lpm_trie_key lpm_key; __u32 saddr; __u32 daddr; }; While direct references to the "data" member haven't been found, there are static initializers that include the final member. For example, the "{}" here: struct egress_gw_policy_key in_key = { .lpm_key = { 32 + 24, {} }, .saddr = CLIENT_IP, .daddr = EXTERNAL_SVC_IP & 0Xffffff, }; To avoid the build time and run time warnings seen with a 0-sized trailing array for struct bpf_lpm_trie_key, introduce a new struct that correctly uses a flexible array for the trailing bytes, struct bpf_lpm_trie_key_u8. 
As part of this, include the "header" portion (which is just the "prefixlen" member), so it can be used by anything building a bpf_lpm_trie_key that has trailing members that aren't a u8 flexible array (like the self-test[1]), which is named struct bpf_lpm_trie_key_hdr. Unfortunately, C++ refuses to parse the __struct_group() helper, so it is not possible to define struct bpf_lpm_trie_key_hdr directly in struct bpf_lpm_trie_key_u8, so we must open-code the union directly. Adjust the kernel code to use struct bpf_lpm_trie_key_u8 throughout, and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment to the UAPI header directing folks to the two new options. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Closes: https://paste.debian.net/hidden/ca500597/ Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1] Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org	2024-03-06 13:58:27 -08:00
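The two layouts described in the commit above (a header holding only `prefixlen`, and a variant with a trailing u8 flexible array built around an open-coded union) can be illustrated with local definitions. The `sketch_` types and `make_policy_key()` below are hypothetical and use `<stdint.h>` types; they mirror the description in the message rather than reproduce the UAPI header.

```c
#include <stddef.h>
#include <stdint.h>

/* Header-only part: just the prefix length. */
struct sketch_lpm_key_hdr {
	uint32_t prefixlen;
};

/* Variant with trailing bytes as a flexible array; the anonymous union
 * lets callers reach prefixlen directly or through the header. */
struct sketch_lpm_key_u8 {
	union {
		struct sketch_lpm_key_hdr hdr;
		uint32_t prefixlen;
	};
	uint8_t data[];		/* arbitrary size */
};

/* A key with typed trailing members embeds the header, not the u8 variant,
 * so no flexible array ends up in the middle of the struct. */
struct sketch_policy_key {
	struct sketch_lpm_key_hdr lpm_key;
	uint32_t saddr;
	uint32_t daddr;
};

static struct sketch_policy_key make_policy_key(uint32_t prefixlen,
						uint32_t saddr, uint32_t daddr)
{
	struct sketch_policy_key key = {
		.lpm_key = { .prefixlen = prefixlen },
		.saddr = saddr,
		.daddr = daddr,
	};
	return key;
}
```

The design point is that user code with its own trailing fields only needs the header, which avoids both the zero-length array warning and the need for a `{}` initializer for the array member.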
Aahil Awatramani	f749be80b7	bonding: Add independent control state machine Add support for the independent control state machine per IEEE 802.1AX-2008 5.4.15 in addition to the existing implementation of the coupled control state machine. Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in the LACP MUX state machine for separated handling of an initial Collecting state before the Collecting and Distributing state. This enables a port to be in a state where it can receive incoming packets while not still distributing. This is useful for reducing packet loss when a port begins distributing before its partner is able to collect. Added new functions such as bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to precisely manage the port's collecting and distributing states. Previously, there was no dedicated method to disable TX while keeping RX enabled, which this patch addresses. Note that the regular flow process in the kernel's bonding driver remains unaffected by this patch. The extension requires explicit opt-in by the user (in order to ensure no disruptions for existing setups) via netlink support using the new bonding parameter coupled_control. The default value for coupled_control is set to 1 so as to preserve existing behaviour. Signed-off-by: Aahil Awatramani <aahila@google.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240202175858.1573852-1-aahila@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-03-06 13:58:27 -08:00
Andrii Nakryiko	fb98d4bd25	include: fix BPF_CALL_REL definition Fix our Github-specific definition of BPF_CALL_REL macro. It was missing the code part. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-03-01 15:39:45 -08:00
Kui-Feng Lee	f4e9b606f4	ci: clean up bpf_test_no_cfi.ko for v5.5.0 and v4.9.0. bpf_test_no_cfi.ko is not available for v5.5.0 and v4.9.0. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>	2024-02-27 10:14:31 -08:00
Kui-Feng Lee	ff95bd6238	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 92a871ab9fa59a74d013bc04f321026a057618e7 Checkpoint bpf-next commit: 2ab256e93249f5ac1da665861aa0f03fb4208d9c Baseline bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Checkpoint bpf commit: dced881ead78e4d6add3735d02a9186ba2415630 Arnaldo Carvalho de Melo (1): tools headers UAPI: Sync linux/fcntl.h with the kernel sources Cupertino Miranda (1): libbpf: Add support to GCC in CORE macro definitions Martin Kelly (1): bpf: Clarify batch lookup/lookup_and_delete semantics Matt Bobrowski (1): libbpf: Make remark about zero-initializing bpf_*_info structs include/uapi/linux/bpf.h \| 6 ++++- include/uapi/linux/fcntl.h \| 3 +++ src/bpf.h \| 39 ++++++++++++++++++++++++--------- src/bpf_core_read.h \| 45 ++++++++++++++++++++++++++++++++------ 4 files changed, 75 insertions(+), 18 deletions(-) Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>	2024-02-27 10:14:31 -08:00
Arnaldo Carvalho de Melo	a894b0cb9b	tools headers UAPI: Sync linux/fcntl.h with the kernel sources To get the changes in: 8a924db2d7b5eb69 ("fs: Pass AT_GETATTR_NOSEC flag to getattr interface function") That don't add anything that is handled by existing hard coded tables or table generation scripts. This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stefan Berger <stefanb@linux.ibm.com> Link: https://lore.kernel.org/lkml/ZbJv9fGF_k2xXEdr@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2024-02-27 10:14:31 -08:00
Martin Kelly	afa81fb1cb	bpf: Clarify batch lookup/lookup_and_delete semantics The batch lookup and lookup_and_delete APIs have two parameters, in_batch and out_batch, to facilitate iterative lookup/lookup_and_deletion operations for supported maps. Except NULL for in_batch at the start of these two batch operations, both parameters need to point to memory equal or larger than the respective map key size, except for various hashmaps (hash, percpu_hash, lru_hash, lru_percpu_hash) where the in_batch/out_batch memory size should be at least 4 bytes. Document these semantics to clarify the API. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240221211838.1241578-1-martin.kelly@crowdstrike.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-02-27 10:14:31 -08:00
Matt Bobrowski	16e68ab13c	libbpf: Make remark about zero-initializing bpf_*_info structs In some situations, if you fail to zero-initialize the bpf_{prog,map,btf,link}_info structs supplied to the set of LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd(), you can expect the helper to return an error. This can possibly leave people in a situation where they're scratching their heads for an unnecessary amount of time. Make an explicit remark about the requirement of zero-initializing the supplied bpf_{prog,map,btf,link}_info structs for the respective LIBBPF helpers. Internally, LIBBPF helpers bpf_{prog,map,btf,link}_get_info_by_fd() call into bpf_obj_get_info_by_fd() where the bpf(2) BPF_OBJ_GET_INFO_BY_FD command is used. This specific command is effectively backed by restrictions enforced by the bpf_check_uarg_tail_zero() helper. This function ensures that if the size of the supplied bpf_{prog,map,btf,link}_info struct is larger than what the kernel can handle, the trailing bytes are all zero. This can be a problem when compiling against UAPI headers that don't necessarily match the sizes of the same underlying types known to the kernel. Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/ZcyEb8x4VbhieWsL@google.com	2024-02-27 10:14:31 -08:00
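The tail-zero restriction mentioned in the commit above can be modelled with a few lines of user-space C. `tail_is_zero()` is a hypothetical simplification written for this note: it only captures the idea that bytes beyond the size the receiver understands must be zero, and it is not the kernel helper itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every byte in buf[expected_size, actual_size) is zero,
 * or if the caller's struct is not larger than expected (no tail). */
static bool tail_is_zero(const void *buf, size_t expected_size,
			 size_t actual_size)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = expected_size; i < actual_size; i++) {
		if (p[i] != 0)
			return false;
	}
	return true;
}
```

A struct left uninitialized on the stack may carry garbage in fields the kernel does not know about, which is why zero-initializing it (for example with `= {}` or `memset()`) before the call avoids the error.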
Cupertino Miranda	b19fdbf1be	libbpf: Add support to GCC in CORE macro definitions Due to internal differences between LLVM and GCC the current implementation for the CO-RE macros does not fit GCC parser, as it will optimize those expressions even before those would be accessible by the BPF backend. As examples, the following would be optimized out with the original definitions: - As enums are converted to their integer representation during parsing, the IR would not know how to distinguish an integer constant from an actual enum value. - Types need to be kept as temporary variables, as the existing type casts of the 0 address (as expanded for LLVM), are optimized away by the GCC C parser, never really reaching GCCs IR. Although, the macros appear to add extra complexity, the expanded code is removed from the compilation flow very early in the compilation process, not really affecting the quality of the generated assembly. Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240213173543.1397708-1-cupertino.miranda@oracle.com	2024-02-27 10:14:31 -08:00
Manu Bretelle	445486dcbf	ci: Pass arch parameter to setup-build-env Since `1bc40aecb3` arch parameter needs to be passed to `setup-build-env` Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2024-02-15 10:44:45 -08:00
Andrii Nakryiko	820bca2cb6	ci: verifier_global_subprogs can't be run on 5.5 We get: libbpf: struct_ops init_kern: struct bpf_dummy_ops is not found in kernel BTF So even though it's irrelevant to the subtests we do want to test, entire test has to be skipped, unfortunately. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-02-06 11:52:00 -08:00
Andrii Nakryiko	8a8feae5f4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 943b043aeecce9accb6d367af47791c633e95e4d Checkpoint bpf-next commit: 92a871ab9fa59a74d013bc04f321026a057618e7 Baseline bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Checkpoint bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Andrii Nakryiko (1): libbpf: fix return value for PERF_EVENT __arg_ctx type fix up check Toke Høiland-Jørgensen (1): libbpf: Use OPTS_SET() macro in bpf_xdp_query() src/libbpf.c \| 6 +++--- src/netlink.c \| 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-02-06 11:52:00 -08:00
Toke Høiland-Jørgensen	a20b60f971	libbpf: Use OPTS_SET() macro in bpf_xdp_query() When the feature_flags and xdp_zc_max_segs fields were added to the libbpf bpf_xdp_query_opts, the code writing them did not use the OPTS_SET() macro. This causes libbpf to write to those fields unconditionally, which means that programs compiled against an older version of libbpf (with a smaller size of the bpf_xdp_query_opts struct) will have its stack corrupted by libbpf writing out of bounds. The patch adding the feature_flags field has an early bail out if the feature_flags field is not part of the opts struct (via the OPTS_HAS) macro, but the patch adding xdp_zc_max_segs does not. For consistency, this fix just changes the assignments to both fields to use the OPTS_SET() macro. Fixes: 13ce2daa259a ("xsk: add new netlink attribute dedicated for ZC max frags") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240206125922.1992815-1-toke@redhat.com	2024-02-06 11:52:00 -08:00
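The out-of-bounds write fixed in the commit above, and the size-guarded assignment that prevents it, can be sketched with two versions of a size-prefixed options struct. All names here (`query_opts_v1`, `query_opts_v2`, `SKETCH_OPTS_HAS`, `SKETCH_OPTS_SET`, `fill_query`) are hypothetical and only imitate the pattern; they are not libbpf's definitions. The macros rely on the GCC/Clang `__typeof__` extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout an older caller was compiled against. */
struct query_opts_v1 {
	size_t sz;		/* caller sets this to sizeof() of its own struct */
	uint64_t flags;
};

/* Newer layout known to the library, with one extra trailing field. */
struct query_opts_v2 {
	size_t sz;
	uint64_t flags;
	uint64_t max_segs;
};

/* A field is usable only if the caller's declared size covers it fully. */
#define SKETCH_OPTS_HAS(opts, field) \
	((opts) && (opts)->sz >= offsetof(__typeof__(*(opts)), field) + \
				 sizeof((opts)->field))

/* Assign only when the field exists in the caller's version of the struct. */
#define SKETCH_OPTS_SET(opts, field, value)		\
	do {						\
		if (SKETCH_OPTS_HAS(opts, field))	\
			(opts)->field = (value);	\
	} while (0)

/* Library-side writer: never touches memory past opts->sz. */
static void fill_query(struct query_opts_v2 *opts)
{
	SKETCH_OPTS_SET(opts, flags, 7);
	SKETCH_OPTS_SET(opts, max_segs, 17);
}
```

An unconditional `opts->max_segs = 17` would write past the end of a caller's `query_opts_v1`, which is the stack corruption the commit describes; the guarded form skips the field instead.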
Andrii Nakryiko	b24a6277cc	libbpf: fix return value for PERF_EVENT __arg_ctx type fix up check If PERF_EVENT program has __arg_ctx argument with matching architecture-specific pt_regs/user_pt_regs/user_regs_struct pointer type, libbpf should still perform type rewrite for old kernels, but not emit the warning. Fix copy/paste from kernel code where 0 is meant to signify "no error" condition. For libbpf we need to return "true" to proceed with type rewrite (which for PERF_EVENT program will be a canonical `struct bpf_perf_event_data *` type). Fixes: 9eea8fafe33e ("libbpf: fix __arg_ctx type enforcement for perf_event programs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240206002243.1439450-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-02-06 11:52:00 -08:00
Andrii Nakryiko	25fe467af4	ci: allowlist tests validating libbpf's __arg_ctx type rewrite logic Allowlist test_global_funcs/arg_tag_ctx* and a few of verifier_global_subprogs subtests that validate libbpf's logic for rewriting __arg_ctx global subprog argument types on kernels that don't natively support __arg_ctx. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-02-06 10:17:28 -08:00
Andrii Nakryiko	f11758a780	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ced33f2cfa21a14a292a00e31dc9f85c1bfbda1c Checkpoint bpf-next commit: 943b043aeecce9accb6d367af47791c633e95e4d Baseline bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Checkpoint bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Andrii Nakryiko (8): libbpf: integrate __arg_ctx feature detector into kernel_supports() libbpf: fix __arg_ctx type enforcement for perf_event programs libbpf: add __arg_trusted and __arg_nullable tag macros libbpf: add bpf_core_cast() macro libbpf: Call memfd_create() syscall directly libbpf: Add missing LIBBPF_API annotation to libbpf_set_memlock_rlim API libbpf: Add btf__new_split() API that was declared but not implemented libbpf: Add missed btf_ext__raw_data() API Eduard Zingerman (1): libbpf: Remove unnecessary null check in kernel_supports() Ian Rogers (1): libbpf: Add some details for BTF parsing failures src/bpf.h \| 2 +- src/bpf_core_read.h \| 13 ++++++ src/bpf_helpers.h \| 2 + src/btf.c \| 33 ++++++++++++--- src/features.c \| 58 +++++++++++++++++++++++++ src/libbpf.c \| 99 ++++++++++++++----------------------------- src/libbpf.map \| 5 ++- src/libbpf_internal.h \| 2 + src/linker.c \| 2 +- 9 files changed, 140 insertions(+), 76 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	cbb8ba352d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	95b4beb502	libbpf: Add missed btf_ext__raw_data() API Another API that was declared in libbpf.map but actual implementation was missing. btf_ext__get_raw_data() was intended as a discouraged alias to consistently-named btf_ext__raw_data(), so make this an actuality. Fixes: 20eccf29e297 ("libbpf: hide and discourage inconsistently named getters") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-5-andrii@kernel.org	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	5b7613e50f	libbpf: Add btf__new_split() API that was declared but not implemented Seems like original commit adding split BTF support intended to add btf__new_split() API, and even declared it in libbpf.map, but never added (trivial) implementation. Fix this. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-4-andrii@kernel.org	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	245394fb36	libbpf: Add missing LIBBPF_API annotation to libbpf_set_memlock_rlim API LIBBPF_API annotation seems missing on libbpf_set_memlock_rlim API, so add it to make this API callable from libbpf's shared library version. Fixes: e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF") Fixes: ab9a5a05dc48 ("libbpf: fix up few libbpf.map problems") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-3-andrii@kernel.org	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	3b19b1bb55	libbpf: Call memfd_create() syscall directly Some versions of Android do not implement memfd_create() wrapper in their libc implementation, leading to build failures ([0]). On the other hand, memfd_create() is available as a syscall on quite old kernels (3.17+, while bpf() syscall itself is available since 3.18+), so it is ok to assume that syscall availability and call into it with syscall() helper to avoid Android-specific workarounds. Validated in libbpf-bootstrap's CI ([1]). [0] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7701003207/job/20986080319#step:5:83 [1] https://github.com/libbpf/libbpf-bootstrap/actions/runs/7715988887/job/21031767212?pr=253 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240201172027.604869-2-andrii@kernel.org	2024-02-01 15:10:17 -08:00
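Calling memfd_create() through the generic syscall() helper, as the commit above describes, might look like the following on Linux. The wrapper name `sys_memfd_create()` is chosen for this sketch, and it assumes the system headers define `__NR_memfd_create` (kernel headers 3.17 or newer).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <unistd.h>

/* Invoke memfd_create(2) directly via syscall(2), so the code builds even
 * when the C library does not provide a memfd_create() wrapper.
 * Returns a file descriptor, or -1 with errno set on failure. */
static int sys_memfd_create(const char *name, unsigned int flags)
{
	return (int)syscall(__NR_memfd_create, name, flags);
}
```

A caller would typically pass a flag such as MFD_CLOEXEC; the returned descriptor behaves like any other file descriptor and must be closed with close().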
Eduard Zingerman	7529e0c4c7	libbpf: Remove unnecessary null check in kernel_supports() After recent changes, Coverity complained about inconsistent null checks in kernel_supports() function: kernel_supports(const struct bpf_object *obj, ...) [...] // var_compare_op: Comparing obj to null implies that obj might be null if (obj && obj->gen_loader) return true; // var_deref_op: Dereferencing null pointer obj if (obj->token_fd) return feat_supported(obj->feat_cache, feat_id); [...] - The original null check was introduced by commit [0], which introduced a call `kernel_supports(NULL, ...)` in function bump_rlimit_memlock(); - This call was refactored to use `feat_supported(NULL, ...)` in commit [1]. Looking at all places where kernel_supports() is called: - There is either `obj->...` access before the call; - Or `obj` comes from `prog->obj` expression, where `prog` comes from enumeration of programs in `obj`; - Or `obj` comes from `prog->obj`, where `prog` is a parameter to one of the API functions: - bpf_program__attach_kprobe_opts; - bpf_program__attach_kprobe; - bpf_program__attach_ksyscall. Assuming correct API usage, it appears that `obj` can never be null when passed to kernel_supports(). Silence the Coverity warning by removing redundant null check. [0] e542f2c4cd16 ("libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF") [1] d6dd1d49367a ("libbpf: Further decouple feature checking logic from bpf_object") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240131212615.20112-1-eddyz87@gmail.com	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	688879fb01	libbpf: add bpf_core_cast() macro Add bpf_core_cast() macro that wraps bpf_rdonly_cast() kfunc. It's more ergonomic than kfunc, as it automatically extracts btf_id with bpf_core_type_id_kernel(), and works with type names. It also casts result to (T *) pointer. See the definition of the macro, it's self-explanatory. libbpf declares bpf_rdonly_cast() extern as __weak __ksym and should be safe to not conflict with other possible declarations in user code. But we do have a conflict with current BPF selftests that declare their externs with first argument as `void *obj`, while libbpf opts into more permissive `const void *obj`. This causes conflict, so we fix up BPF selftests uses in the same patch. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240130212023.183765-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	0303e25be3	libbpf: add __arg_trusted and __arg_nullable tag macros Add __arg_trusted to annotate global func args that accept trusted PTR_TO_BTF_ID arguments. Also add __arg_nullable to combine with __arg_trusted (and maybe other tags in the future) to force global subprog itself (i.e., callee) to do NULL checks, as opposed to default non-NULL semantics (and thus caller's responsibility to ensure non-NULL values). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240130000648.2144827-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-02-01 15:10:17 -08:00
Ian Rogers	9b306ac9be	libbpf: Add some details for BTF parsing failures As CONFIG_DEBUG_INFO_BTF is default off the existing "failed to find valid kernel BTF" message makes diagnosing the kernel build issue somewhat cryptic. Add a little more detail with the hope of helping users. Before: ``` libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -3 ``` After not accessible: ``` libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled? libbpf: failed to find valid kernel BTF libbpf: Error loading vmlinux BTF: -3 ``` After not readable: ``` libbpf: failed to read kernel BTF from (/sys/kernel/btf/vmlinux): -1 ``` Closes: https://lore.kernel.org/bpf/CAP-5=fU+DN_+Y=Y4gtELUsJxKNDDCOvJzPHvjUVaUoeFAzNnig@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240125231840.1647951-1-irogers@google.com	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	c57fb75864	libbpf: fix __arg_ctx type enforcement for perf_event programs Adjust PERF_EVENT type enforcement around __arg_ctx to match exactly what kernel is doing. Fixes: 76ec90a996e3 ("libbpf: warn on unexpected __arg_ctx type when rewriting BTF") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240125205510.3642094-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	0b412d1918	libbpf: integrate __arg_ctx feature detector into kernel_supports() Now that feature detection code is in bpf-next tree, integrate __arg_ctx kernel-side support into kernel_supports() framework. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240125205510.3642094-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-02-01 15:10:17 -08:00
Andrii Nakryiko	3b09738928	sync: remove NETDEV_XSK_FLAGS_MASK which is not in bpf/bpf-next anymore This part of code is not present in either bpf or bpf-next trees anymore, so manually remove it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-29 10:48:12 -08:00
Andrii Nakryiko	5139f12ef1	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c8632acf193beac64bbdaebef013368c480bf74f Checkpoint bpf-next commit: ced33f2cfa21a14a292a00e31dc9f85c1bfbda1c Baseline bpf commit: 0a5bd0ffe790511d802e7f40898429a89e2487df Checkpoint bpf commit: 577e4432f3ac810049cb7e6b71f4d96ec7c6e894 Andrii Nakryiko (1): libbpf: Fix faccessat() usage on Android src/libbpf_internal.h \| 14 ++++++++++++++ 1 file changed, 14 insertions(+) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-29 10:48:12 -08:00
Andrii Nakryiko	830e0d017b	libbpf: Fix faccessat() usage on Android Android implementation of libc errors out with -EINVAL in faccessat() if passed AT_EACCESS ([0]), this leads to ridiculous issue with libbpf refusing to load /sys/kernel/btf/vmlinux on Androids ([1]). Fix by detecting Android and redefining AT_EACCESS to 0, it's equivalent on Android. [0] https://android.googlesource.com/platform/bionic/+/refs/heads/android13-release/libc/bionic/faccessat.cpp#50 [1] https://github.com/libbpf/libbpf-bootstrap/issues/250#issuecomment-1911324250 Fixes: 6a4ab8869d0b ("libbpf: Fix the case of running as non-root with capabilities") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240126220944.2497665-1-andrii@kernel.org	2024-01-29 10:48:12 -08:00
Andrii Nakryiko	fad5d91381	libbpf: make sure linux/kernel.h includes linux/compiler.h This replicates kernel upstream setup and brings READ_ONCE() and WRITE_ONCE() macros anywhere where linux/kernel.h is included, which is assumption libbpf code makes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	8ca30626cc	Makefile: add features.o to Makefile Libbpf got new source code file, features.c, we need to add it to Makefile here on Github version as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	274d6037f8	libbpf: add BPF_CALL_REL() macro implementation Add BPF_CALL_REL() macro implementation into include/linux/filter.h header, which is now used by libbpf code for feature detection. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	0f84f3bef6	ci: regenerate vmlinux.h Update vmlinux.h for old kernel CI workflows. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	3dea2db84b	ci: drop custom patches for fixing upstream kernel issues All the issues should be fixed upstream already. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	2f81310ec0	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 98e20e5e13d2811898921f999288be7151a11954 Checkpoint bpf-next commit: c8632acf193beac64bbdaebef013368c480bf74f Baseline bpf commit: 7c5e046bdcb2513f9decb3765d8bf92d604279cf Checkpoint bpf commit: 0a5bd0ffe790511d802e7f40898429a89e2487df Andrey Grafin (1): libbpf: Apply map_set_def_max_entries() for inner_maps on creation Andrii Nakryiko (17): libbpf: feature-detect arg:ctx tag support in kernel libbpf: warn on unexpected __arg_ctx type when rewriting BTF libbpf: call dup2() syscall directly bpf: Introduce BPF token object bpf: Add BPF token support to BPF_MAP_CREATE command bpf: Add BPF token support to BPF_BTF_LOAD command bpf: Add BPF token support to BPF_PROG_LOAD command libbpf: Add bpf_token_create() API libbpf: Add BPF token support to bpf_map_create() API libbpf: Add BPF token support to bpf_btf_load() API libbpf: Add BPF token support to bpf_prog_load() API libbpf: Split feature detectors definitions from cached results libbpf: Further decouple feature checking logic from bpf_object libbpf: Move feature detection code into its own file libbpf: Wire up token_fd into feature probing logic libbpf: Wire up BPF token support at BPF object level libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar Daniel Borkmann (1): bpf: Sync uapi bpf.h header for the tooling infra Dima Tisnek (1): libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo struct Jiri Olsa (2): bpf: Add cookie to perf_event bpf_link_info records bpf: Store cookies in kprobe_multi bpf_link_info data Kan Liang (2): perf: Add branch stack counters perf/x86/intel: Support branch counters logging Kui-Feng Lee (3): bpf: pass btf object id in bpf_map_info. bpf: pass attached BTF to the bpf_struct_ops subsystem libbpf: Find correct module BTFs for struct_ops maps and progs. 
Martin KaFai Lau (1): libbpf: Ensure undefined bpf_attr field stays 0 include/uapi/linux/bpf.h \| 79 +++- include/uapi/linux/perf_event.h \| 13 + src/bpf.c \| 42 +- src/bpf.h \| 38 +- src/bpf_core_read.h \| 2 +- src/btf.c \| 10 +- src/elf.c \| 2 - src/features.c \| 503 +++++++++++++++++++++ src/libbpf.c \| 744 ++++++++++++-------------------- src/libbpf.h \| 21 +- src/libbpf.map \| 1 + src/libbpf_internal.h \| 50 ++- src/libbpf_probes.c \| 12 +- src/str_error.h \| 3 + 14 files changed, 1019 insertions(+), 501 deletions(-) create mode 100644 src/features.c Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	0e57fade4e	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	a36646e2b3	libbpf: Support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar To allow external admin authority to override default BPF FS location (/sys/fs/bpf) for implicit BPF token creation, teach libbpf to recognize LIBBPF_BPF_TOKEN_PATH envvar. If it is specified and user application didn't explicitly specify bpf_token_path option, it will be treated exactly like bpf_token_path option, overriding default /sys/fs/bpf location and making BPF token mandatory. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-29-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	e1a43809a9	libbpf: Wire up BPF token support at BPF object level Add BPF token support to BPF object-level functionality. BPF token is supported by BPF object logic either as an explicitly provided BPF token from outside (through BPF FS path), or implicitly (unless prevented through bpf_object_open_opts). Implicit mode is assumed to be the most common one for user namespaced unprivileged workloads. The assumption is that privileged container manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token delegation options (delegate_{cmds,maps,progs,attachs} mount options). BPF object during loading will attempt to create BPF token from /sys/fs/bpf location, and pass it for all relevant operations (currently, map creation, BTF load, and program load). In this implicit mode, if BPF token creation fails due to whatever reason (BPF FS is not mounted, or kernel doesn't support BPF token, etc), this is not considered an error. BPF object loading sequence will proceed with no BPF token. In explicit BPF token mode, user provides explicitly custom BPF FS mount point path. In such case, BPF object will attempt to create BPF token from provided BPF FS location. If BPF token creation fails, that is considered a critical error and BPF object load fails with an error. Libbpf provides a way to disable implicit BPF token creation, if it causes any troubles (BPF token is designed to be completely optional and shouldn't cause any problems even if provided, but in the world of BPF LSM, custom security logic can be installed that might change outcome depending on the presence of BPF token). To disable libbpf's default BPF token creation behavior user should provide either invalid BPF token FD (negative), or empty bpf_token_path option. BPF token presence can influence libbpf's feature probing, so if BPF object has associated BPF token, feature probing is instructed to use BPF object-specific feature detection cache and token FD. 
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-26-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	a3b317a9c0	libbpf: Wire up token_fd into feature probing logic Adjust feature probing callbacks to take into account optional token_fd. In unprivileged contexts, some feature detectors would fail to detect kernel support just because BPF program, BPF map, or BTF object can't be loaded due to privileged nature of those operations. So when BPF object is loaded with BPF token, this token should be used for feature probing. This patch is setting support for this scenario, but we don't yet pass non-zero token FD. This will be added in the next patch. We also switched BPF cookie detector from using kprobe program to tracepoint one, as tracepoint is somewhat less dangerous BPF program type and has higher likelihood of being allowed through BPF token in the future. This change has no effect on detection behavior. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-25-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	d42f0b8943	libbpf: Move feature detection code into its own file It's quite a lot of well isolated code, so it seems like a good candidate to move it out of libbpf.c to reduce its size. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-24-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	9bf95048b7	libbpf: Further decouple feature checking logic from bpf_object Add feat_supported() helper that accepts feature cache instead of bpf_object. This allows low-level code in bpf.c to not know or care about higher-level concept of bpf_object, yet it will be able to utilize custom feature checking in cases where BPF token might influence the outcome. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-23-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	9454419946	libbpf: Split feature detectors definitions from cached results Split a list of supported feature detectors with their corresponding callbacks from actual cached supported/missing values. This will allow to have more flexible per-token or per-object feature detectors in subsequent refactorings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20240124022127.2379740-22-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	8082a311d3	libbpf: Add BPF token support to bpf_prog_load() API Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-16-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	ac4a66ea12	libbpf: Add BPF token support to bpf_btf_load() API Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Wire through new btf_flags as well, so that user can provide BPF_F_TOKEN_FD flag, if necessary. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-15-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	8002c052f3	libbpf: Add BPF token support to bpf_map_create() API Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-14-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	5cc8482fe2	libbpf: Add bpf_token_create() API Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-13-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	21fb08cb35	bpf: Add BPF token support to BPF_PROG_LOAD command Add basic support of BPF token to BPF_PROG_LOAD. BPF_F_TOKEN_FD flag should be set in prog_flags field when providing prog_token_fd. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-7-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	eb9d10835c	bpf: Add BPF token support to BPF_BTF_LOAD command Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BPF_F_TOKEN_FD flag has to be specified when passing BPF token FD. Given BPF_BTF_LOAD command didn't have flags field before, we also add btf_flags field. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-6-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	1386c15b7b	bpf: Add BPF token support to BPF_MAP_CREATE command Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. New BPF_F_TOKEN_FD flag is added to specify together with BPF token FD for BPF_MAP_CREATE command. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-5-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	5cd6a8493b	bpf: Introduce BPF token object Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a trusted unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to also have necessary combination of capabilities inside that user namespace. 
So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Also creating BPF token in init user namespace is currently not supported, given BPF token doesn't have any effect in init user namespace anyways. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/bpf/20240124022127.2379740-4-andrii@kernel.org	2024-01-26 18:12:29 -05:00
Martin KaFai Lau	385ae492fa	libbpf: Ensure undefined bpf_attr field stays 0 The commit 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.") sets a newly added field (value_type_btf_obj_fd) to -1 in libbpf when the caller of the libbpf's bpf_map_create did not define this field by passing a NULL "opts" or passing in a "opts" that does not cover this new field. OPTS_HAS(opts, field) is used to decide if the field is defined or not: ((opts) && opts->sz >= offsetofend(typeof(*(opts)), field)) Once OPTS_HAS decided the field is not defined, that field should be set to 0. For this particular new field (value_type_btf_obj_fd), its corresponding map_flags "BPF_F_VTYPE_BTF_OBJ_FD" is not set. Thus, the kernel does not treat it as an fd field. Fixes: 9e926acda0c2 ("libbpf: Find correct module BTFs for struct_ops maps and progs.") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240124224418.2905133-1-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Dima Tisnek	517ff8d823	libbpf: Correct bpf_core_read.h comment wrt bpf_core_relo struct Past commit ([0]) removed the last vestiges of struct bpf_field_reloc, it's called struct bpf_core_relo now. [0] 28b93c64499a ("libbpf: Clean up and improve CO-RE reloc logging") Signed-off-by: Dima Tisnek <dimaqq@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240121060126.15650-1-dimaqq@gmail.com	2024-01-26 18:12:29 -05:00
Kui-Feng Lee	0b0dfaf1be	libbpf: Find correct module BTFs for struct_ops maps and progs. Locate the module BTFs for struct_ops maps and progs and pass them to the kernel. This ensures that the kernel correctly resolves type IDs from the appropriate module BTFs. For the map of a struct_ops object, the FD of the module BTF is set to bpf_map to keep a reference to the module BTF. The FD is passed to the kernel as value_type_btf_obj_fd when the struct_ops object is loaded. For a bpf_struct_ops prog, attach_btf_obj_fd of bpf_prog is the FD of a module BTF in the kernel. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240119225005.668602-13-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-01-26 18:12:29 -05:00
Kui-Feng Lee	d767ead51f	bpf: pass attached BTF to the bpf_struct_ops subsystem Pass the fd of a btf from the userspace to the bpf() syscall, and then convert the fd into a btf. The btf is generated from the module that defines the target BPF struct_ops type. In order to inform the kernel about the module that defines the target struct_ops type, the userspace program needs to provide a btf fd for the respective module's btf. This btf contains essential information on the types defined within the module, including the target struct_ops type. A btf fd must be provided to the kernel for struct_ops maps and for the bpf programs attached to those maps. In the case of the bpf programs, the attach_btf_obj_fd parameter is passed as part of the bpf_attr and is converted into a btf. This btf is then stored in the prog->aux->attach_btf field. Here, it just let the verifier access attach_btf directly. In the case of struct_ops maps, a btf fd is passed as value_type_btf_obj_fd of bpf_attr. The bpf_struct_ops_map_alloc() function converts the fd to a btf and stores it as st_map->btf. A flag BPF_F_VTYPE_BTF_OBJ_FD is added for map_flags to indicate that the value of value_type_btf_obj_fd is set. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240119225005.668602-9-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-01-26 18:12:29 -05:00
Kui-Feng Lee	d70e2071e2	bpf: pass btf object id in bpf_map_info. Include btf object id (btf_obj_id) in bpf_map_info so that tools (ex: bpftools struct_ops dump) know the correct btf from the kernel to look up type information of struct_ops types. Since struct_ops types can be defined and registered in a module. The type information of a struct_ops type are defined in the btf of the module defining it. The userspace tools need to know which btf is for the module defining a struct_ops type. Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com> Link: https://lore.kernel.org/r/20240119225005.668602-7-thinker.li@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-01-26 18:12:29 -05:00
Jiri Olsa	fe508381b4	bpf: Store cookies in kprobe_multi bpf_link_info data Storing cookies in kprobe_multi bpf_link_info data. The cookies field is optional and if provided it needs to be an array of __u64 with kprobe_multi.count length. Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240119110505.400573-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Jiri Olsa	de2f366450	bpf: Add cookie to perf_event bpf_link_info records At the moment we don't store cookie for perf_event probes, while we do that for the rest of the probes. Adding cookie fields to struct bpf_link_info perf event probe records: perf_event.uprobe perf_event.kprobe perf_event.tracepoint perf_event.perf_event And the code to store that in bpf_link_info struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20240119110505.400573-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	390c6234eb	libbpf: call dup2() syscall directly We've ran into issues with using dup2() API in production setting, where libbpf is linked into large production environment and ends up calling unintended custom implementations of dup2(). These custom implementations don't provide atomic FD replacement guarantees of dup2() syscall, leading to subtle and hard to debug issues. To prevent this in the future and guarantee that no libc implementation will do their own custom non-atomic dup2() implementation, call dup2() syscall directly with syscall(SYS_dup2). Note that some architectures don't seem to provide dup2 and have dup3 instead. Try to detect and pick best syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240119210201.1295511-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Andrey Grafin	89ca11a79b	libbpf: Apply map_set_def_max_entries() for inner_maps on creation This patch allows to auto create BPF_MAP_TYPE_ARRAY_OF_MAPS and BPF_MAP_TYPE_HASH_OF_MAPS with values of BPF_MAP_TYPE_PERF_EVENT_ARRAY by bpf_object__load(). Previous behaviour created a zero filled btf_map_def for inner maps and tried to use it for a map creation but the linux kernel forbids to create a BPF_MAP_TYPE_PERF_EVENT_ARRAY map with max_entries=0. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrey Grafin <conquistador@yandex-team.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20240117130619.9403-1-conquistador@yandex-team.ru Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Daniel Borkmann	4c3742d9c1	bpf: Sync uapi bpf.h header for the tooling infra Both commit 91051f003948 ("tcp: Dump bound-only sockets in inet_diag.") and commit 985b8ea9ec7e ("bpf, docs: Fix bpf_redirect_peer header doc") missed the tooling header sync. Fix it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	f4211a704f	libbpf: warn on unexpected __arg_ctx type when rewriting BTF On kernel that don't support arg:ctx tag, before adjusting global subprog BTF information to match kernel's expected canonical type names, make sure that types used by user are meaningful, and if not, warn and don't do BTF adjustments. This is similar to checks that kernel performs, but narrower in scope, as only a small subset of BPF program types can be accommodated by libbpf using canonical type names. Libbpf unconditionally allows `struct pt_regs *` for perf_event program types, unlike kernel, which supports that conditionally on architecture. This is done to keep things simple and not cause unnecessary false positives. This seems like a minor and harmless deviation, which in real-world programs will be caught by kernels with arg:ctx tag support anyways. So KISS principle. This logic is hard to test (especially on latest kernels), so manual testing was performed instead. Libbpf emitted the following warning for perf_event program with wrong context argument type: libbpf: prog 'arg_tag_ctx_perf': subprog 'subprog_ctx_tag' arg#0 is expected to be of `struct bpf_perf_event_data *` type Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240118033143.3384355-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	939ab641b8	libbpf: feature-detect arg:ctx tag support in kernel Add feature detector of kernel-side arg:ctx (__arg_ctx) tag support. If this is detected, libbpf will avoid doing any __arg_ctx-related BTF rewriting and checks in favor of letting kernel handle this completely. test_global_funcs/ctx_arg_rewrite subtest is adjusted to do the same feature detection (albeit in much simpler, though round-about and inefficient, way), and skip the tests. This is done to still be able to execute this test on older kernels (like in libbpf CI). Note, BPF token series ([0]) does a major refactor and code moving of libbpf-internal feature detection "framework", so to avoid unnecessary conflicts we keep newly added feature detection stand-alone with ad-hoc result caching. Once things settle, there will be a small follow up to re-integrate everything back and move code into its final place in newly-added (by BPF token series) features.c file. [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=814209&state=* Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240118033143.3384355-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-26 18:12:29 -05:00
Kan Liang	82ebbd97c8	perf/x86/intel: Support branch counters logging The branch counters logging (A.K.A LBR event logging) introduces a per-counter indication of precise event occurrences in LBRs. It can provide a means to attribute exposed retirement latency to combinations of events across a block of instructions. It also provides a means of attributing Timed LBR latencies to events. The feature is first introduced on SRF/GRR. It is an enhancement of the ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences of events on the GP counters. The information is displayed by the order of counters. The design proposed in this patch requires that the events which are logged must be in a group with the event that has LBR. If there are more than one LBR group, the counters logging information only from the current group (overflowed) are stored for the perf tool, otherwise the perf tool cannot know which and when other groups are scheduled especially when multiplexing is triggered. The user can ensure it uses the maximum number of counters that support LBR info (4 by now) by making the group large enough. The HW only logs events by the order of counters. The order may be different from the order of enabling which the perf tool can understand. When parsing the information of each branch entry, convert the counter order to the enabled order, and store the enabled order in the extension space. Unconditionally reset LBRs for an LBR event group when it's deleted. The logged counter information is only valid for the current LBR group. If another LBR group is scheduled later, the information from the stale LBRs would be otherwise wrongly interpreted. Add a sanity check in intel_pmu_hw_config(). Disable the feature if other counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode is enabled. (For the LBR call stack mode, we cannot simply flush the LBR, since it will break the call stack. 
Also, there is no obvious usage with the call stack mode for now.) Only applying the PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch stack setup. Expose the maximum number of supported counters and the width of the counters into the sysfs. The perf tool can use the information to parse the logged counters in each branch. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231025201626.3000228-5-kan.liang@linux.intel.com	2024-01-26 18:12:29 -05:00
Kan Liang	9705c4c622	perf: Add branch stack counters Currently, the additional information of a branch entry is stored in a u64 space. With more and more information added, the space is running out. For example, the information of occurrences of events will be added for each branch. Two places were suggested to append the counters. https://lore.kernel.org/lkml/20230802215814.GH231007@hirez.programming.kicks-ass.net/ One place is right after the flags of each branch entry. It changes the existing struct perf_branch_entry. The later ARCH specific implementation has to be really careful to consistently pick the right struct. The other place is right after the entire struct perf_branch_stack. The disadvantage is that the pointer of the extra space has to be recorded. The common interface perf_sample_save_brstack() has to be updated. The latter is much more straightforward, and should be easily understood and maintained. It is implemented in the patch. Add a new branch sample type, PERF_SAMPLE_BRANCH_COUNTERS, to indicate the event which is recorded in the branch info. The "u64 counters" may store the occurrences of several events. The information regarding the number of events/counters and the width of each counter should be exposed via sysfs as a reference for the perf tool. Define the branch_counter_nr and branch_counter_width ABI here. The support will be implemented later in the Intel-specific patch. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231025201626.3000228-1-kan.liang@linux.intel.com	2024-01-26 18:12:29 -05:00
Andrii Nakryiko	528cb9d3e9	README: update Ubuntu link Do what [0] proposed to do, but with properly formatted commit message and Signed-off-by. [0] https://github.com/libbpf/libbpf/pull/742 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-25 16:47:44 -08:00
Andrii Nakryiko	f81eef23b3	ci: skip two tests failing due to kernel bug Add lwt_reroute and tc_links_ingress to DENYLIST, as they are currently broken due to kernel bug. Fix is under review and should make it into bpf-next soon. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	feabd96e00	ci: regenerate vmlinux.h Need bpf_xfrm_state_opts and others. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	1570d568a0	Makefile: bump to v1.4.0 dev version Bump Github-only Makefile to match 1.4 development version. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	e2203b3057	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 750011e239a50873251c16207b0fe78eabf8577e Checkpoint bpf-next commit: 98e20e5e13d2811898921f999288be7151a11954 Baseline bpf commit: bc4fbf022c68967cb49b2b820b465cf90de974b8 Checkpoint bpf commit: 7c5e046bdcb2513f9decb3765d8bf92d604279cf Alyssa Ross (1): libbpf: Skip DWARF sections in linker sanity check Amritha Nambiar (4): netdev-genl: spec: Extend netdev netlink spec in YAML for queue netdev-genl: spec: Extend netdev netlink spec in YAML for NAPI netdev-genl: spec: Add irq in netdev netlink YAML spec netdev-genl: spec: Add PID in netdev netlink YAML spec Andrii Nakryiko (24): bpf: introduce BPF token object bpf: add BPF token support to BPF_MAP_CREATE command bpf: add BPF token support to BPF_BTF_LOAD command bpf: add BPF token support to BPF_PROG_LOAD command libbpf: add bpf_token_create() API libbpf: add BPF token support to bpf_map_create() API libbpf: add BPF token support to bpf_btf_load() API libbpf: add BPF token support to bpf_prog_load() API bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistency libbpf: split feature detectors definitions from cached results libbpf: further decouple feature checking logic from bpf_object libbpf: move feature detection code into its own file libbpf: wire up token_fd into feature probing logic libbpf: wire up BPF token support at BPF object level libbpf: support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar Revert BPF token-related functionality libbpf: add __arg_xxx macros for annotating global func args libbpf: make uniform use of btf__fd() accessor inside libbpf libbpf: use explicit map reuse flag to skip map creation steps libbpf: don't rely on map->fd as an indicator of map being created libbpf: use stable map placeholder FDs libbpf: move exception callbacks assignment logic into relocation step libbpf: move BTF loading step after relocation 
step libbpf: implement __arg_ctx fallback logic Daniel Xu (1): libbpf: Add BPF_CORE_WRITE_BITFIELD() macro David Vernet (1): bpf: Load vmlinux btf for any struct_ops map Eduard Zingerman (1): libbpf: Start v1.4 development cycle Jakub Kicinski (1): tools: ynl: add sample for getting page-pool information Jamal Hadi Salim (5): net/sched: Remove uapi support for rsvp classifier net/sched: Remove uapi support for tcindex classifier net/sched: Remove uapi support for dsmark qdisc net/sched: Remove uapi support for ATM qdisc net/sched: Remove uapi support for CBQ qdisc Jiri Olsa (2): libbpf: Add st_type argument to elf_resolve_syms_offsets function bpf: Add link_info support for uprobe multi link Larysa Zaremba (1): xdp: Add VLAN tag hint Mingyi Zhang (1): libbpf: Fix NULL pointer dereference in bpf_object__collect_prog_relos Sergei Trofimovich (1): libbpf: Add pr_warn() for EINVAL cases in linker_sanity_check_elf Stanislav Fomichev (3): xsk: Support tx_metadata_len xsk: Add TX timestamp and TX checksum offload support xsk: Add option to calculate TX checksum in SW include/uapi/linux/bpf.h \| 14 +- include/uapi/linux/if_xdp.h \| 61 +++- include/uapi/linux/netdev.h \| 81 ++++- include/uapi/linux/pkt_cls.h \| 47 --- include/uapi/linux/pkt_sched.h \| 109 ------ src/bpf_core_read.h \| 32 ++ src/bpf_helpers.h \| 3 + src/elf.c \| 5 +- src/libbpf.c \| 585 +++++++++++++++++++++++++-------- src/libbpf.map \| 3 + src/libbpf_internal.h \| 17 +- src/libbpf_version.h \| 2 +- src/linker.c \| 27 +- 13 files changed, 673 insertions(+), 313 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	3102067b4e	libbpf: implement __arg_ctx fallback logic Out of all special global func arg tag annotations, __arg_ctx is practically the most immediately useful and most critical to have working across a multitude of kernel versions, if possible. This would allow end users to write much simpler code if __arg_ctx semantics worked for older kernels that don't natively understand btf_decl_tag("arg:ctx") in verifier logic. Luckily, it is possible to ensure __arg_ctx works on old kernels through a bit of extra work done by libbpf, at least in a lot of common cases. To explain the overall idea, we need to go back to how context argument was supported in global funcs before __arg_ctx support was added. This was done based on special struct name checks in kernel. E.g., for BPF_PROG_TYPE_PERF_EVENT the expectation is that argument type `struct bpf_perf_event_data *` marks that argument as PTR_TO_CTX. This is all good as long as global function is used from the same BPF program types only, which is often not the case. If the same subprog has to be called from, say, kprobe and perf_event program types, there is no single definition that would satisfy BPF verifier. Subprog will have context argument either for kprobe (if using bpf_user_pt_regs_t struct name) or perf_event (with bpf_perf_event_data struct name), but not both. This limitation was the reason to add btf_decl_tag("arg:ctx"), making the actual argument type not important, so that user can just define "generic" signature: __noinline int global_subprog(void *ctx __arg_ctx) { ... } I won't belabor how libbpf is implementing subprograms, see a huge comment next to bpf_object_relocate_calls() function. The idea is that each main/entry BPF program gets its own copy of global_subprog's code appended. 
This per-program copy of global subprog code and associated func_info .BTF.ext information, pointing to FUNC -> FUNC_PROTO BTF type chain allows libbpf to simulate __arg_ctx behavior transparently, even if the kernel doesn't yet support __arg_ctx annotation natively. The idea is straightforward: each time we append global subprog's code and func_info information, we adjust its FUNC -> FUNC_PROTO type information, if necessary (that is, libbpf can detect the presence of btf_decl_tag("arg:ctx") just like BPF verifier would do it). The rest is just mechanical and somewhat painful BTF manipulation code. It's painful because we need to clone FUNC -> FUNC_PROTO, instead of reusing it, as same FUNC -> FUNC_PROTO chain might be used by another main BPF program within the same BPF object, so we can't just modify it in-place (and cloning BTF types within the same struct btf object is painful due to constant memory invalidation, see comments in code). Uploaded BPF object's BTF information has to work for all BPF programs at the same time. Once we have FUNC -> FUNC_PROTO clones, we make sure that instead of using some `void *ctx` parameter definition, we have an expected `struct bpf_perf_event_data *ctx` definition (as far as BPF verifier and kernel are concerned), which will mark it as context for BPF verifier. Same global subprog relocated and copied into another main BPF program will get different type information according to main program's type. It all works out in the end in a completely transparent way for end user. Libbpf maintains program type -> expected context struct name mapping internally. Note, not all BPF program types have named context struct, so this approach won't work for such programs (just like it didn't before __arg_ctx). So native __arg_ctx is still important to have in kernel to have generic context support across all BPF program types. 
Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	a4f0740b3d	libbpf: move BTF loading step after relocation step With all the preparations in previous patches done we are ready to postpone BTF loading and sanitization step until after all the relocations are performed. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	94470256c1	libbpf: move exception callbacks assignment logic into relocation step Move the logic of finding and assigning exception callback indices from BTF sanitization step to program relocations step, which seems more logical and will unblock moving BTF loading to after relocation step. Exception callbacks discovery and assignment has no dependency on BTF being loaded into the kernel, it only uses BTF information. It does need to happen before subprogram relocations happen, though. Which is why the split. No functional changes. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	4d68ea90c2	libbpf: use stable map placeholder FDs Move map creation to later during BPF object loading by pre-creating stable placeholder FDs (utilizing memfd_create()). Use dup2() syscall to then atomically make those placeholder FDs point to real kernel BPF map objects. This change allows to delay BPF map creation to after all the BPF program relocations. That, in turn, allows to delay BTF finalization and loading into kernel to after all the relocations as well. We'll take advantage of the latter in subsequent patches to allow libbpf to adjust BTF in a way that helps with BPF global function usage. Clean up a few places where we close map->fd, which now shouldn't happen, because map->fd should be a valid FD regardless of whether map was created or not. Surprisingly and nicely it simplifies a bunch of error handling code. If this change doesn't backfire, I'm tempted to pre-create such stable FDs for other entities (progs, maybe even BTF). We previously did some manipulations to make gen_loader work with fake map FDs, with stable map FDs this hack is not necessary for maps (we still have it for BTF, but I left it as is for now). Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
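The placeholder-FD idea described in the commit above (reserve a descriptor number with memfd_create(), then point it at the real object with dup2()) can be sketched in plain C. This is a minimal illustration under stated assumptions, not libbpf's actual code: the helper names `make_placeholder_fd()` and `bind_placeholder_fd()` are hypothetical, and `real_fd` stands in for a BPF map FD that would be created later in the load sequence.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mman.h>   /* memfd_create(), MFD_CLOEXEC */
#include <unistd.h>     /* dup2(), close() */

/* Hypothetical helper: reserve an FD number early by opening an
 * anonymous in-memory file. The file contents are never used; only
 * the descriptor number matters. Returns the FD or a negative errno. */
static int make_placeholder_fd(void)
{
	int fd = memfd_create("placeholder-fd", MFD_CLOEXEC);

	return fd < 0 ? -errno : fd;
}

/* Hypothetical helper: make placeholder_fd refer to the same open file
 * description as real_fd. dup2() closes the old target and installs the
 * new one in a single step, so the FD number handed out earlier stays
 * valid throughout. real_fd is closed afterwards because the
 * placeholder now owns the reference. */
static int bind_placeholder_fd(int placeholder_fd, int real_fd)
{
	if (placeholder_fd == real_fd)
		return 0; /* nothing to do, and closing would drop the only FD */
	if (dup2(real_fd, placeholder_fd) < 0)
		return -errno;
	close(real_fd);
	return 0;
}
```

One caveat of this sketch: dup2() leaves the close-on-exec flag cleared on the target descriptor, so code that needs the flag preserved would use dup3() with O_CLOEXEC instead.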
Andrii Nakryiko	2ea3d8042f	libbpf: don't rely on map->fd as an indicator of map being created With the upcoming switch to preallocated placeholder FDs for maps, switch various getters/setters away from checking map->fd. Use map_is_created() helper that detects whether BPF map can be modified based on map->obj->loaded state, with special provision for maps set up with bpf_map__reuse_fd(). For backwards compatibility, we take map_is_created() into account in bpf_map__fd() getter as well. This way before bpf_object__load() phase bpf_map__fd() will always return -1, just as before the changes in subsequent patches adding stable map->fd placeholders. We also get rid of all internal uses of bpf_map__fd() getter, as it's more oriented for uses external to libbpf. The above map_is_created() check actually interferes with some of the internal uses, if map FD is fetched through bpf_map__fd(). Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	e9ce55197b	libbpf: use explicit map reuse flag to skip map creation steps Instead of inferring whether map already points to previously created/pinned BPF map (which user can specify with bpf_map__reuse_fd() API), use explicit map->reused flag that is set in such case. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	3fb45d3761	libbpf: make uniform use of btf__fd() accessor inside libbpf It makes future grepping and code analysis a bit easier. Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240104013847.3875810-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Jamal Hadi Salim	2e49eb8bf6	net/sched: Remove uapi support for CBQ qdisc Commit 051d44209842 ("net/sched: Retire CBQ qdisc") retired the CBQ qdisc. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 19:15:17 -05:00
Jamal Hadi Salim	5473fe6aef	net/sched: Remove uapi support for ATM qdisc Commit fb38306ceb9e ("net/sched: Retire ATM qdisc") retired the ATM qdisc. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 19:15:17 -05:00
Jamal Hadi Salim	c04d1b669d	net/sched: Remove uapi support for dsmark qdisc Commit bbe77c14ee61 ("net/sched: Retire dsmark qdisc") retired the dsmark classifier. Remove UAPI support for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 19:15:17 -05:00
Jamal Hadi Salim	717798e2f9	net/sched: Remove uapi support for tcindex classifier commit 8c710f75256b ("net/sched: Retire tcindex classifier") retired the TC tcindex classifier. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 19:15:17 -05:00
Jamal Hadi Salim	f2c790ca1a	net/sched: Remove uapi support for rsvp classifier commit 265b4da82dbf ("net/sched: Retire rsvp classifier") retired the TC RSVP classifier. Remove UAPI for it. Iproute2 will sync by equally removing it from user space. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 19:15:17 -05:00
Mingyi Zhang	c008eb921e	libbpf: Fix NULL pointer dereference in bpf_object__collect_prog_relos An issue occurred while reading an ELF file in libbpf.c during fuzzing: Program received signal SIGSEGV, Segmentation fault. 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206 4206 in libbpf.c (gdb) bt #0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206 #1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706 #2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437 #3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497 #4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16 #5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one () #6 0x000000000087ad92 in tracing::span::Span::in_scope () #7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir () #8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} () #9 0x00000000005f2601 in main () (gdb) scn_data was null at this code(tools/lib/bpf/src/libbpf.c): if (rel->r_offset % BPF_INSN_SZ \|\| rel->r_offset >= scn_data->d_size) { The scn_data is derived from the code above: scn = elf_sec_by_idx(obj, sec_idx); scn_data = elf_sec_data(obj, scn); relo_sec_name = elf_sec_str(obj, shdr->sh_name); sec_name = elf_sec_name(obj, scn); if (!relo_sec_name \|\| !sec_name)// don't check whether scn_data is NULL return -EINVAL; In certain special scenarios, such as reading a malformed ELF file, it is possible that scn_data may be a null pointer Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com> Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Changye Wu <wuchangye@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com	2024-01-04 19:15:17 -05:00
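The kind of guard this fix calls for can be sketched as a standalone check. This is a simplified illustration, not the actual libbpf patch: `struct sec_data` is a hypothetical stand-in for libelf's section data (only the size field is modelled), and `check_relo_offset()` is a hypothetical helper name.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of one BPF instruction (struct bpf_insn). */
#define BPF_INSN_SZ 8

/* Hypothetical stand-in for the ELF section data descriptor;
 * only the field used by the check is modelled. */
struct sec_data {
	size_t d_size;
};

/* Validate a relocation offset against the section it applies to.
 * A malformed ELF file may yield no section data at all, so the
 * pointer is checked before it is dereferenced. */
static int check_relo_offset(const struct sec_data *scn_data, uint64_t r_offset)
{
	if (!scn_data)
		return -EINVAL;
	/* Offset must be instruction-aligned and inside the section. */
	if (r_offset % BPF_INSN_SZ || r_offset >= scn_data->d_size)
		return -EINVAL;
	return 0;
}
```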
Alyssa Ross	6252a2fdcc	libbpf: Skip DWARF sections in linker sanity check clang can generate (with -g -Wa,--compress-debug-sections) 4-byte aligned DWARF sections that declare themselves to be 8-byte aligned in the section header. Since DWARF sections are dropped during linking anyway, just skip running the sanity checks on them. Reported-by: Sergei Trofimovich <slyich@gmail.com> Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Alyssa Ross <hi@alyssa.is> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://lore.kernel.org/bpf/ZXcFRJVKbKxtEL5t@nz.home/ Link: https://lore.kernel.org/bpf/20231219110324.8989-1-hi@alyssa.is	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	c378eff58c	libbpf: add __arg_xxx macros for annotating global func args Add a set of __arg_xxx macros which can be used to augment BPF global subprogs/functions with extra information for use by BPF verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231215011334.2307144-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	c65b319c04	Revert BPF token-related functionality This patch includes the following revert (one conflicting BPF FS patch and three token patch sets, represented by merge commits): - revert 0f5d5454c723 "Merge branch 'bpf-fs-mount-options-parsing-follow-ups'"; - revert 750e785796bb "bpf: Support uid and gid when mounting bpffs"; - revert 733763285acf "Merge branch 'bpf-token-support-in-libbpf-s-bpf-object'"; - revert c35919dcce28 "Merge branch 'bpf-token-and-bpf-fs-based-delegation'". Link: https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2024-01-04 19:15:17 -05:00
Larysa Zaremba	43e7309228	xdp: Add VLAN tag hint Implement functionality that enables drivers to expose VLAN tag to XDP code. VLAN tag is represented by 2 variables: - protocol ID, which is passed to bpf code in BE - VLAN TCI, in host byte order Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://lore.kernel.org/r/20231205210847.28460-10-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	b166b99eed	libbpf: support BPF token path setting through LIBBPF_BPF_TOKEN_PATH envvar To allow external admin authority to override default BPF FS location (/sys/fs/bpf) for implicit BPF token creation, teach libbpf to recognize LIBBPF_BPF_TOKEN_PATH envvar. If it is specified and user application didn't explicitly specify either bpf_token_path or bpf_token_fd option, it will be treated exactly like bpf_token_path option, overriding default /sys/fs/bpf location and making BPF token mandatory. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	5df9eba06a	libbpf: wire up BPF token support at BPF object level Add BPF token support to BPF object-level functionality. BPF token is supported by BPF object logic either as an explicitly provided BPF token from outside (through BPF FS path or explicit BPF token FD), or implicitly (unless prevented through bpf_object_open_opts). Implicit mode is assumed to be the most common one for user namespaced unprivileged workloads. The assumption is that privileged container manager sets up default BPF FS mount point at /sys/fs/bpf with BPF token delegation options (delegate_{cmds,maps,progs,attachs} mount options). BPF object during loading will attempt to create BPF token from /sys/fs/bpf location, and pass it for all relevant operations (currently, map creation, BTF load, and program load). In this implicit mode, if BPF token creation fails due to whatever reason (BPF FS is not mounted, or kernel doesn't support BPF token, etc), this is not considered an error. BPF object loading sequence will proceed with no BPF token. In explicit BPF token mode, user provides explicitly either custom BPF FS mount point path or creates BPF token on their own and just passes token FD directly. In such case, BPF object will either dup() token FD (to not require caller to hold onto it for entire duration of BPF object lifetime) or will attempt to create BPF token from provided BPF FS location. If BPF token creation fails, that is considered a critical error and BPF object load fails with an error. Libbpf provides a way to disable implicit BPF token creation, if it causes any troubles (BPF token is designed to be completely optional and shouldn't cause any problems even if provided, but in the world of BPF LSM, custom security logic can be installed that might change outcome depending on the presence of BPF token). To disable libbpf's default BPF token creation behavior user should provide either invalid BPF token FD (negative), or empty bpf_token_path option. 
BPF token presence can influence libbpf's feature probing, so if BPF object has associated BPF token, feature probing is instructed to use BPF object-specific feature detection cache and token FD. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	b14daa8b9b	libbpf: wire up token_fd into feature probing logic Adjust feature probing callbacks to take into account optional token_fd. In unprivileged contexts, some feature detectors would fail to detect kernel support just because BPF program, BPF map, or BTF object can't be loaded due to privileged nature of those operations. So when BPF object is loaded with BPF token, this token should be used for feature probing. This patch is setting support for this scenario, but we don't yet pass non-zero token FD. This will be added in the next patch. We also switched BPF cookie detector from using kprobe program to tracepoint one, as tracepoint is somewhat less dangerous BPF program type and has higher likelihood of being allowed through BPF token in the future. This change has no effect on detection behavior. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	fab327c888	libbpf: move feature detection code into its own file It's quite a lot of well isolated code, so it seems like a good candidate to move it out of libbpf.c to reduce its size. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	feda0728e0	libbpf: further decouple feature checking logic from bpf_object Add feat_supported() helper that accepts feature cache instead of bpf_object. This allows low-level code in bpf.c to not know or care about higher-level concept of bpf_object, yet it will be able to utilize custom feature checking in cases where BPF token might influence the outcome. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	11c977ffaf	libbpf: split feature detectors definitions from cached results Split a list of supported feature detectors with their corresponding callbacks from actual cached supported/missing values. This will allow to have more flexible per-token or per-object feature detectors in subsequent refactorings. Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231213190842.3844987-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Daniel Xu	9d2f8aaf21	libbpf: Add BPF_CORE_WRITE_BITFIELD() macro === Motivation === Similar to reading from CO-RE bitfields, we need a CO-RE aware bitfield writing wrapper to make the verifier happy. Two alternatives to this approach are: 1. Use the upcoming `preserve_static_offset` [0] attribute to disable CO-RE on specific structs. 2. Use broader byte-sized writes to write to bitfields. (1) is a bit hard to use. It requires specific and not-very-obvious annotations to bpftool generated vmlinux.h. It's also not generally available in released LLVM versions yet. (2) makes the code quite hard to read and write. And especially if BPF_CORE_READ_BITFIELD() is already being used, it makes more sense to have an inverse helper for writing. === Implementation details === Since the logic is a bit non-obvious, I thought it would be helpful to explain exactly what's going on. To start, it helps by explaining what LSHIFT_U64 (lshift) and RSHIFT_U64 (rshift) are designed to mean. Consider the core of the BPF_CORE_READ_BITFIELD() algorithm: val <<= __CORE_RELO(s, field, LSHIFT_U64); val = val >> __CORE_RELO(s, field, RSHIFT_U64); Basically what happens is we lshift to clear the non-relevant (blank) higher order bits. Then we rshift to bring the relevant bits (bitfield) down to LSB position (while also clearing blank lower order bits). To illustrate: Start: ........XXX...... Lshift: XXX......00000000 Rshift: 00000000000000XXX where `.` means blank bit, `0` means 0 bit, and `X` means bitfield bit. After the two operations, the bitfield is ready to be interpreted as a regular integer. Next, we want to build an alternative (but more helpful) mental model on lshift and rshift. That is, to consider: * rshift as the total number of blank bits in the u64 * lshift as number of blank bits left of the bitfield in the u64 Take a moment to consider why that is true by consulting the above diagram. 
With this insight, we can now define the following relationship on the layout 0.....00XXX0...00 (where XXX is the bitfield): the blank bits to the left of XXX number lshift, and the blank bits to the right of XXX number (rshift - lshift). That is, we know the number of higher order blank bits is just lshift. And the number of lower order blank bits is (rshift - lshift). Finally, we can examine the core of the write side algorithm: mask = (~0ULL << rshift) >> lshift; // 1 val = (val & ~mask) \| ((nval << rpad) & mask); // 2 1. Compute a mask where the set bits are the bitfield bits. The first left shift zeros out exactly the number of blank bits, leaving a bitfield sized set of 1s. The subsequent right shift inserts the correct amount of higher order blank bits. 2. On the left of the `\|`, mask out the bitfield bits. This creates 0s where the new bitfield bits will go. On the right of the `\|`, bring nval into the correct bit position and mask out any bits that fall outside of the bitfield. Finally, by bor'ing the two halves, we get the final set of bits to write back. [0]: https://reviews.llvm.org/D133361 Co-developed-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Co-developed-by: Jonathan Lemon <jlemon@aviatrix.com> Signed-off-by: Jonathan Lemon <jlemon@aviatrix.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/r/4d3dd215a4fd57d980733886f9c11a45e1a9adf3.1702325874.git.dxu@dxuuu.xyz Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2024-01-04 19:15:17 -05:00
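The shift arithmetic described in the BPF_CORE_WRITE_BITFIELD() commit above can be tried out in plain user-space C. The following is only a sketch under stated assumptions: `bitfield_read()` and `bitfield_write()` are hypothetical helper names, the shift amounts are passed in as ordinary integers rather than obtained from CO-RE relocations, `rpad` is taken to be `rshift - lshift` (the number of lower order blank bits), and only unsigned fields of at least one bit are handled. It is not the libbpf macro itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers mirroring the arithmetic from the commit message.
 *   rshift: total number of blank (non-bitfield) bits in the u64
 *   lshift: number of blank bits to the left of the bitfield
 * Both must satisfy lshift <= rshift < 64 (field width >= 1 bit).
 */
static uint64_t bitfield_read(uint64_t val, unsigned lshift, unsigned rshift)
{
	assert(lshift <= rshift && rshift < 64);
	val <<= lshift;       /* drop the higher order blank bits */
	return val >> rshift; /* bring the field down to LSB position */
}

static uint64_t bitfield_write(uint64_t val, uint64_t nval,
			       unsigned lshift, unsigned rshift)
{
	assert(lshift <= rshift && rshift < 64);
	/* assumption: rpad is the number of lower order blank bits */
	unsigned rpad = rshift - lshift;
	/* step 1: ones exactly where the bitfield bits live */
	uint64_t mask = (~0ULL << rshift) >> lshift;
	/* step 2: clear the old field, then OR in the repositioned new value */
	return (val & ~mask) | ((nval << rpad) & mask);
}
```

For a 3-bit field occupying bits 6..8 of the u64, there are 55 blank bits above it and 6 below, so lshift is 55 and rshift is 61. Writing 2 into an all-ones word then changes only those three bits, and reading the field back returns 2; bits of the new value that do not fit in the field are masked off.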
Sergei Trofimovich	5f68c571c8	libbpf: Add pr_warn() for EINVAL cases in linker_sanity_check_elf Before the change on `i686-linux` `systemd` build failed as: $ bpftool gen object src/core/bpf/socket_bind/socket-bind.bpf.o src/core/bpf/socket_bind/socket-bind.bpf.unstripped.o Error: failed to link 'src/core/bpf/socket_bind/socket-bind.bpf.unstripped.o': Invalid argument (22) After the change it fails as: $ bpftool gen object src/core/bpf/socket_bind/socket-bind.bpf.o src/core/bpf/socket_bind/socket-bind.bpf.unstripped.o libbpf: ELF section #9 has inconsistent alignment addr=8 != d=4 in src/core/bpf/socket_bind/socket-bind.bpf.unstripped.o Error: failed to link 'src/core/bpf/socket_bind/socket-bind.bpf.unstripped.o': Invalid argument (22) Now it's slightly easier to figure out what is wrong with an ELF file. Signed-off-by: Sergei Trofimovich <slyich@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20231208215100.435876-1-slyich@gmail.com	2024-01-04 19:15:17 -05:00
David Vernet	235ea85487	bpf: Load vmlinux btf for any struct_ops map In libbpf, when determining whether we need to load vmlinux btf, we're currently (among other things) checking whether there is any struct_ops program present in the object. This works for most realistic struct_ops maps, as a struct_ops map is of course typically composed of one or more struct_ops programs. However, that technically need not be the case. A struct_ops interface could be defined which allows a map to be specified with one or more non-prog fields, and which provides default behavior if no struct_ops prog is actually provided otherwise. For sched_ext, for example, you technically only need to specify the name of the scheduler in the struct_ops map, with the core scheduler logic providing default behavior if no prog is actually specified. If we were to define and try to load such a struct_ops map, we would crash in libbpf when initializing it as obj->btf_vmlinux will be NULL: Reading symbols from minimal... (gdb) r Starting program: minimal_example [Thread debugging using libthread_db enabled] Using host libthread_db library "/usr/lib/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x000055555558308c in btf__type_cnt (btf=0x0) at btf.c:612 612 return btf->start_id + btf->nr_types; (gdb) bt type_name=0x5555555d99e3 "sched_ext_ops", kind=4) at btf.c:914 kind=4) at btf.c:942 type=0x7fffffffe558, type_id=0x7fffffffe548, ... data_member=0x7fffffffe568) at libbpf.c:948 kern_btf=0x0) at libbpf.c:1017 at libbpf.c:8059 So as to account for such bare-bones struct_ops maps, let's update obj_needs_vmlinux_btf() to also iterate over an obj's maps and check whether any of them are struct_ops maps. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20231208061704.400463-1-void@manifault.com	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	400cbd6148	bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistency To stay consistent with the naming pattern used for similar cases in BPF UAPI (__MAX_BPF_ATTACH_TYPE, etc), rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE. Also similar to MAX_BPF_ATTACH_TYPE and MAX_BPF_REG, add: #define MAX_BPF_LINK_TYPE __MAX_BPF_LINK_TYPE Not all __MAX_xxx enums have such #define, so I'm not sure if we should add it or not, but I figured I'll start with a completely backwards compatible way, and we can drop that, if necessary. Also adjust a selftest that used MAX_BPF_LINK_TYPE enum. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206190920.1651226-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	ec1cab73a7	libbpf: add BPF token support to bpf_prog_load() API Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-16-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	207b6ebb60	libbpf: add BPF token support to bpf_btf_load() API Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	a23b8ffcf6	libbpf: add BPF token support to bpf_map_create() API Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-14-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	f8954ca692	libbpf: add bpf_token_create() API Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	1ebea57322	bpf: add BPF token support to BPF_PROG_LOAD command Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	544acb9af6	bpf: add BPF token support to BPF_BTF_LOAD command Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	9abcc5efc8	bpf: add BPF token support to BPF_MAP_CREATE command Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Andrii Nakryiko	33de35fd83	bpf: introduce BPF token object Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a trusted unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having a BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to also have necessary combination of capabilities inside that user namespace. 
So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Amritha Nambiar	ac9cd25de9	netdev-genl: spec: Add PID in netdev netlink YAML spec Add support in netlink spec(netdev.yaml) for PID of the NAPI thread. Add code generated from the spec. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147335301.5260.11872351477120434501.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 19:15:17 -05:00
Amritha Nambiar	cfa6e420f4	netdev-genl: spec: Add irq in netdev netlink YAML spec Add support in netlink spec(netdev.yaml) for interrupt number among the NAPI attributes. Add code generated from the spec. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147334210.5260.18178387869057516983.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 19:15:17 -05:00
Amritha Nambiar	36f30e4c30	netdev-genl: spec: Extend netdev netlink spec in YAML for NAPI Add support in netlink spec(netdev.yaml) for napi related information. Add code generated from the spec. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147333119.5260.7050639053080529108.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 19:15:17 -05:00
Amritha Nambiar	e4fcfe7db7	netdev-genl: spec: Extend netdev netlink spec in YAML for queue Add support in netlink spec(netdev.yaml) for queue information. Add code generated from the spec. Note: The "queue-type" attribute takes values 0 and 1 for rx and tx queue type respectively. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147330963.5260.2576294626647300472.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 19:15:17 -05:00
Stanislav Fomichev	419eab9ec7	xsk: Add option to calculate TX checksum in SW For XDP_COPY mode, add a UMEM option XDP_UMEM_TX_SW_CSUM to call skb_checksum_help in transmit path. Might be useful for debugging issues with real hardware. I also use this mode in the selftests. Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-9-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Stanislav Fomichev	95134be22e	xsk: Add TX timestamp and TX checksum offload support This change actually defines the (initial) metadata layout that should be used by AF_XDP userspace (xsk_tx_metadata). The first field is flags which requests appropriate offloads, followed by the offload-specific fields. The supported per-device offloads are exported via netlink (new xsk-flags). The offloads themselves are still implemented in a bit of a framework-y fashion that's left from my initial kfunc attempt. I'm introducing new xsk_tx_metadata_ops which drivers are supposed to implement. The drivers are also supposed to call xsk_tx_metadata_request/xsk_tx_metadata_complete in the right places. Since xsk_tx_metadata_{request,complete} are static inline, we don't incur any extra overhead doing indirect calls. The benefit of this scheme is as follows: - keeps all metadata layout parsing away from driver code - makes it easy to grep and see which drivers implement what - don't need any extra flags to maintain to keep track of what offloads are implemented; if the callback is implemented - the offload is supported (used by netlink reporting code) Two offloads are defined right now: 1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset 2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata area upon completion (tx_timestamp field) XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass metadata pointer). The struct is forward-compatible and can be extended in the future by appending more fields. Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20231127190319.1190813-3-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Stanislav Fomichev	2f95d28664	xsk: Support tx_metadata_len For zerocopy mode, tx_desc->addr can point to an arbitrary offset and carry some TX metadata in the headroom. For copy mode, there is no way currently to populate skb metadata. Introduce new tx_metadata_len umem config option that indicates how many bytes to treat as metadata. Metadata bytes come prior to tx_desc address (same as in RX case). The size of the metadata has mostly the same constraints as XDP: - less than 256 bytes - 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte timestamp in the completion) - non-zero This data is not interpreted in any way right now. Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20231127190319.1190813-2-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Jiri Olsa	afb384f685	bpf: Add link_info support for uprobe multi link Adding support to get uprobe_link details through bpf_link_info interface. Adding new struct uprobe_multi to struct bpf_link_info to carry the uprobe_multi link details. The uprobe_multi.count is passed from user space to denote size of array fields (offsets/ref_ctr_offsets/cookies). The actual array size is stored back to uprobe_multi.count (allowing user to find out the actual array size) and array fields are populated up to the user passed size. All the non-array fields (path/count/flags/pid) are always set. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20231125193130.834322-4-jolsa@kernel.org	2024-01-04 19:15:17 -05:00
Jiri Olsa	467dd7bda5	libbpf: Add st_type argument to elf_resolve_syms_offsets function We need to get offsets for static variables in following changes, so making elf_resolve_syms_offsets to take st_type value as argument and passing it to elf_sym_iter_new. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231125193130.834322-2-jolsa@kernel.org	2024-01-04 19:15:17 -05:00
Eduard Zingerman	9c794e5ab4	libbpf: Start v1.4 development cycle Bump libbpf.map to v1.4.0 to start a new libbpf version cycle. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231123000439.12025-1-eddyz87@gmail.com	2024-01-04 19:15:17 -05:00
Jakub Kicinski	eb40a93a10	tools: ynl: add sample for getting page-pool information Regenerate the tools/ code after netdev spec changes. Add sample to query page-pool info in a concise fashion: $ ./page-pool eth0[2] page pools: 10 (zombies: 0) refs: 41984 bytes: 171966464 (refs: 0 bytes: 0) recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201) Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-04 19:15:17 -05:00
Eduard Zingerman	1baa3e2355	ci: move /dev/kvm permissions setup from test workflow to actions/vmtest.yml The vmtest action is used by several workflows: test, pahole, ondemand. At the same time, vmtest action requires valid access rights to /dev/kvm and is the only action that uses it. This commit moves /dev/kvm permissions setup from test workflow to vmtest action, in order to make sure that setup logic is shared by all workflows that run vmtest. Should fix CI failures like [1]. [1] https://github.com/libbpf/libbpf/actions/runs/7104762048/job/19340484589 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-12-13 15:50:08 -05:00
Andrii Nakryiko	1b2ae67c1d	ci: custom patch to patch out BPF_F_TEST_REG_INVARIANTS flag Without needing to modify tons of BPF selftest files, make sure we don't pass BPF_F_TEST_REG_INVARIANTS to kernel, to make BPF selftests work on 4.9 and 5.5 kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-12-05 12:51:08 -05:00
Andrii Nakryiko	20c0a9e3d7	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 155addf0814a92d08fce26a11b27e3315cdba977 Checkpoint bpf-next commit: 750011e239a50873251c16207b0fe78eabf8577e Baseline bpf commit: 83b9dda8afa4e968d9cce253f390b01c0612a2a5 Checkpoint bpf commit: bc4fbf022c68967cb49b2b820b465cf90de974b8 Andrii Nakryiko (2): bpf: add register bounds sanity checks and sanitization bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS Jordan Rome (1): bpf: Add crosstask check to __bpf_get_stack include/uapi/linux/bpf.h \| 6 ++++++ 1 file changed, 6 insertions(+) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	b88b3ac09d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	96ed1c508f	bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS Rename verifier internal flag BPF_F_TEST_SANITY_STRICT to more neutral BPF_F_TEST_REG_INVARIANTS. This is a follow up to [0]. A few selftests and veristat need to be adjusted in the same patch as well. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231112010609.848406-5-andrii@kernel.org/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231117171404.225508-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	7ccc41c138	bpf: add register bounds sanity checks and sanitization Add simple sanity checks that validate well-formed ranges (min <= max) across u64, s64, u32, and s32 ranges. Also for cases when the value is constant (either 64-bit or 32-bit), we validate that ranges and tnums are in agreement. These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 operations, on conditional jumps, and for LDX instructions (where subreg zero/sign extension is probably the most important to check). This covers most of the interesting cases. Also, we validate the sanity of the return register when manually adjusting it for some special helpers. By default, sanity violation will trigger a warning in verifier log and resetting register bounds to "unbounded" ones. But to aid development and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will trigger hard failure of verification with -EFAULT on register bounds violations. This allows selftests to catch such issues. veristat will also gain a CLI option to enable this behavior. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20231112010609.848406-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-22 16:20:56 -05:00
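The policy described in the register bounds commit above (check that every range is well-formed, then either reset to unbounded or fail hard in strict mode) can be sketched in a few lines of stand-alone C. This is an illustration only: `struct reg_bounds`, `bounds_well_formed()`, `mark_unbounded()` and `sanity_check_bounds()` are hypothetical names, the struct is a heavily simplified stand-in rather than the verifier's real register state, the tnum agreement check is omitted, and the `strict` argument stands in for the test flag the commit introduces.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register bounds; not the kernel's data structure. */
struct reg_bounds {
	uint64_t umin, umax;       /* unsigned 64-bit range */
	int64_t  smin, smax;       /* signed 64-bit range */
	uint32_t u32_min, u32_max; /* unsigned 32-bit subregister range */
	int32_t  s32_min, s32_max; /* signed 32-bit subregister range */
};

/* A range is well-formed when min <= max in all four domains. */
static bool bounds_well_formed(const struct reg_bounds *r)
{
	return r->umin <= r->umax && r->smin <= r->smax &&
	       r->u32_min <= r->u32_max && r->s32_min <= r->s32_max;
}

/* Reset every range to the widest value its type can represent. */
static void mark_unbounded(struct reg_bounds *r)
{
	r->umin = 0;            r->umax = UINT64_MAX;
	r->smin = INT64_MIN;    r->smax = INT64_MAX;
	r->u32_min = 0;         r->u32_max = UINT32_MAX;
	r->s32_min = INT32_MIN; r->s32_max = INT32_MAX;
}

/*
 * Default mode: a violation resets the bounds and verification continues.
 * Strict mode: a violation is reported as -EFAULT.
 */
static int sanity_check_bounds(struct reg_bounds *r, bool strict)
{
	if (bounds_well_formed(r))
		return 0;
	if (strict)
		return -EFAULT;
	mark_unbounded(r);
	return 0;
}
```

Keeping the strict behaviour behind a separate switch matches the intent stated in the commit: ordinary loads degrade gracefully, while selftests can opt into a hard failure to surface bounds bugs.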
Jordan Rome	785a079966	bpf: Add crosstask check to __bpf_get_stack Currently get_perf_callchain only supports user stack walking for the current task. Passing the correct crosstask param will return 0 frames if the task passed to __bpf_get_stack isn't the current one instead of a single incorrect frame/address. This change passes the correct crosstask param but also does a preemptive check in __bpf_get_stack if the task is current and returns -EOPNOTSUPP if it is not. This issue was found using bpf_get_task_stack inside a BPF iterator ("iter/task"), which iterates over all tasks. bpf_get_task_stack works fine for fetching kernel stacks but because get_perf_callchain relies on the caller to know if the requested task is the current one (via crosstask) it was failing in a confusing way. It might be possible to get user stacks for all tasks utilizing something like access_process_vm but that requires the bpf program calling bpf_get_task_stack to be sleepable and would therefore be a breaking change. Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Jordan Rome <jordalgo@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231108112334.3433136-1-jordalgo@meta.com	2023-11-22 16:20:56 -05:00
Eduard Zingerman	a6b990991c	ci: disable sockopt selftest for 5.5 kernel The following 'sockopt' selftests fail on libbpf CI for kernel 5.5: - sockopt/getsockopt: read ctx->optlen:FAIL - sockopt/getsockopt: support smaller ctx->optlen:FAIL - sockopt/setsockopt: read ctx->level:FAIL - sockopt/setsockopt: read ctx->optname:FAIL - sockopt/setsockopt: read ctx->optlen:FAIL - sockopt/setsockopt: ctx->optlen == -1 is ok:FAIL Examples of failing CI runs: - https://github.com/libbpf/libbpf/actions/runs/6961182067 - https://github.com/libbpf/libbpf/actions/runs/6961088131 The failures are strange as all tests were added quite a while ago (Jun 27 2019) by commit: 9ec8a4c9489d ("selftests/bpf: add sockopt test") But seem to be unrelated to libbpf. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-22 16:20:43 -05:00
Eduard Zingerman	4161e1f41d	ci: disable a number of selftests causing CI failures for LATEST kernel All tests disabled in this commit pass on main kernel CI and fail or flip/flop on libbpf CI. Failures do not seem to be related to libbpf. It appears that common theme for all failing tests is that hardware perf events are not delivered as expected on github CI worker machines. Examples of failed CI runs: - https://github.com/libbpf/libbpf/actions/runs/6961182067 - https://github.com/libbpf/libbpf/actions/runs/6961088131 Fails with the following log: test_send_signal_common:FAIL:incorrect result \ unexpected incorrect result: actual 48 != expected 50 Test mode of operation: - fork - child: - install handler for SIGUSR1; - send ready message to parent; - wait for SIGUSR1 in busy loop; - send message '2' (50) to parent if SIGUSR1 occurred; - send message '0' (48) to parent if no SIGUSR1 occurred. - parent: - wait for ready message from child; - install perf_event or tracepoint bpf program that uses bpf_send_signal() to send SIGUSR1; - wait for message '0' or '2' from child, '2' is expected for test success. It appears that perf event that should be triggered by parent never happens, thus message 48 is received by parent and test fails. Fails with the following log: test_and_reset_skel:FAIL:found_vm_exec \ unexpected found_vm_exec: actual 0 != expected 1 Such log is printed if variables set from BPF program are not set after some timeout. The program that should set the variable is SEC("perf_event") int handle_pe(void), it appears that it is never run. Fails with the following log: pe_subtest:FAIL:pe_res1 unexpected pe_res1: actual 0 != expected 1048576 Variable pe_res1 should be triggered by program SEC("perf_event") int handle_pe(struct pt_regs *ctx), it appears that it is never run. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-22 16:20:43 -05:00
Eduard Zingerman	93f360cf4b	ci: don't set /dev/kvm permissions when CI user is root s390 tests are executed on selfhosted runner using root user, avoid setting /dev/kvm permissions in such case. This should fix CI failures like [0]. (Still necessary for x86 tests executed on standard github runners). [0] https://github.com/libbpf/libbpf/actions/runs/6898545987/job/18768732980?pr=752 Fixes: `168630f852` ("ci: give /dev/kvm 0666 permissions inside CI runner") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-17 15:36:52 -05:00
Eduard Zingerman	5ff0102329	ci: use config.vm for kernel config when present Recent kernel commit [0] changed selftests config snippets structure by extracting VM specific options to the file 'config.vm'. This file has to be used in .github/actions/vmtest/action.yml at step 'Prepare to build BPF selftests', otherwise drivers necessary for e.g. root file system access are not compiled into the kernel, leading to CI failures like [1]. [0] b0cf0dcde8ca ("selftests/bpf: Consolidate VIRTIO/9P configs in config.vm file") [1] https://github.com/libbpf/libbpf/actions/runs/6830439839/job/18578379328?pr=747 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-16 20:25:07 -05:00
Andrii Nakryiko	0c54691bae	ci: apply temporary patch to make bpf-next build Apply fe69a1b1b6ed ("selftests: bpf: xskxceiver: ksft_print_msg: fix format type error") to make bpf-next build. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-13 21:51:02 -05:00
Eduard Zingerman	168630f852	ci: give /dev/kvm 0666 permissions inside CI runner Starting recently libbpf CI runs started failing with the following error: ##[group]vm_init - Starting virtual machine... Starting VM with 4 CPUs... INFO: /dev/kvm exists KVM acceleration can be used Could not access KVM kernel module: Permission denied qemu-system-x86_64: failed to initialize KVM: Permission denied ##[error]Process completed with exit code 2. E.g. see here [0]. The error happens because CI user has not enough rights to access /dev/kvm. On a regular machine the solution would be to add user to group 'kvm', however that would require a re-login, which is cumbersome to achieve in CI setting. Instead, use a recipe described in [1] to make udev set 0666 access permissions for /dev/kvm. [0] https://github.com/libbpf/libbpf/actions/runs/6819530119/job/18547589967?pr=746 [1] https://stackoverflow.com/questions/37300811/android-studio-dev-kvm-device-permission-denied/61984745#61984745 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Eduard Zingerman	5d4237d52d	ci: regenerate vmlinux.h Regenerate latest vmlinux.h for old kernel CI tests. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Eduard Zingerman	fa0e866373	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0e133a13370389d3894891eafe54fec2c44ad735 Checkpoint bpf-next commit: e80742d917492f10926b46b0caca050c6c9231d6 Baseline bpf commit: 8f8abb863fa5a4cc18955c6a0e17af0ded3e4a76 Checkpoint bpf commit: 83b9dda8afa4e968d9cce253f390b01c0612a2a5 Daniel Borkmann (3): netkit, bpf: Add bpf programmable net device tools: Sync if_link uapi header libbpf: Add link-based API for netkit Yonghong Song (2): libbpf: Fix potential uninitialized tail padding with LIBBPF_OPTS_RESET bpf: Use named fields for certain bpf uapi structs include/uapi/linux/bpf.h \| 37 +++++---- include/uapi/linux/if_link.h \| 141 +++++++++++++++++++++++++++++++++++ src/bpf.c \| 16 ++++ src/bpf.h \| 5 ++ src/libbpf.c \| 39 ++++++++++ src/libbpf.h \| 15 ++++ src/libbpf.map \| 1 + src/libbpf_common.h \| 13 ++-- 8 files changed, 246 insertions(+), 21 deletions(-) Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Yonghong Song	0fa5ff4f54	bpf: Use named fields for certain bpf uapi structs Martin and Vadim reported a verifier failure with bpf_dynptr usage. The issue is mentioned but Vadim worked around the issue with source change ([1]). The below describes what the issue is and why there is a verification failure. int BPF_PROG(skb_crypto_setup) { struct bpf_dynptr algo, key; ... bpf_dynptr_from_mem(..., ..., 0, &algo); ... } The bpf program is using vmlinux.h, so we have the following definition in vmlinux.h: struct bpf_dynptr { long: 64; long: 64; }; Note that in uapi header bpf.h, we have struct bpf_dynptr { long: 64; long: 64; } __attribute__((aligned(8))); So we lost alignment information for struct bpf_dynptr by using vmlinux.h. Let us take a look at a simple program below: $ cat align.c typedef unsigned long long __u64; struct bpf_dynptr_no_align { __u64 :64; __u64 :64; }; struct bpf_dynptr_yes_align { __u64 :64; __u64 :64; } __attribute__((aligned(8))); void bar(void *, void *); int foo() { struct bpf_dynptr_no_align a; struct bpf_dynptr_yes_align b; bar(&a, &b); return 0; } $ clang --target=bpf -O2 -S -emit-llvm align.c Look at the generated IR file align.ll: ... %a = alloca %struct.bpf_dynptr_no_align, align 1 %b = alloca %struct.bpf_dynptr_yes_align, align 8 ... The compiler dictates the alignment for struct bpf_dynptr_no_align is 1 and the alignment for struct bpf_dynptr_yes_align is 8. So theoretically compiler could allocate variable %a with alignment 1 although in reality the compiler may choose a different alignment by considering other local variables. In [1], the verification failure happens because variable 'algo' is allocated on the stack with alignment 4 (fp-28). But the verifier wants its alignment to be 8. To fix the issue, the RFC patch ([1]) tried to add '__attribute__((aligned(8)))' to struct bpf_dynptr plus other similar structs. 
Andrii suggested that we could directly modify uapi struct with named fields like struct 'bpf_iter_num': struct bpf_iter_num { /* opaque iterator state; having __u64 here allows to preserve correct * alignment requirements in vmlinux.h, generated from BTF */ __u64 __opaque[1]; } __attribute__((aligned(8))); Indeed, adding named fields for those affected structs in this patch can preserve alignment when bpf program references them in vmlinux.h. With this patch, the verification failure in [1] can also be resolved. [1] https://lore.kernel.org/bpf/1b100f73-7625-4c1f-3ae5-50ecf84d3ff0@linux.dev/ [2] https://lore.kernel.org/bpf/20231103055218.2395034-1-yonghong.song@linux.dev/ Cc: Vadim Fedorenko <vadfed@meta.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231104024900.1539182-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-10 13:27:01 -08:00
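The alignment argument in the commit above can be checked on a host compiler. The following is a minimal sketch, not code from the patch: the `demo_*` names are hypothetical, and the exact alignment of the bit-field-only struct depends on the target ABI (the commit reports 1 for the BPF target). The point is only that a named `__u64`-like member carries its alignment in the type itself, without relying on an attribute that BTF does not record.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long demo_u64; /* stand-in for __u64 */

/* Only unnamed bit-fields: this is the shape a vmlinux.h generated from BTF
 * ends up with once the aligned attribute is lost. Accepted by GCC and clang. */
struct demo_dynptr_bitfields {
	demo_u64 : 64;
	demo_u64 : 64;
};

/* Named opaque member, in the style of struct bpf_iter_num quoted above:
 * the member type itself dictates the struct alignment. */
struct demo_dynptr_named {
	demo_u64 __opaque[2];
};

/* Same as the bit-field variant, but with the attribute still present. */
struct demo_dynptr_attr {
	demo_u64 : 64;
	demo_u64 : 64;
} __attribute__((aligned(8)));

size_t demo_align_bitfields(void) { return _Alignof(struct demo_dynptr_bitfields); }
size_t demo_align_named(void)     { return _Alignof(struct demo_dynptr_named); }
size_t demo_align_attr(void)      { return _Alignof(struct demo_dynptr_attr); }
```

Printing the three values on the target of interest shows whether the bit-field-only layout falls below the alignment the verifier expects.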
Yonghong Song	2d5df9f626	libbpf: Fix potential uninitialized tail padding with LIBBPF_OPTS_RESET Martin reported that there is a libbpf complaining of non-zero-value tail padding with LIBBPF_OPTS_RESET macro if struct bpf_netkit_opts is modified to have a 4-byte tail padding. This only happens to clang compiler. The command line is: ./test_progs -t tc_netkit_multi_links Martin and I did some investigation and found this is indeed the case and the following are the investigation details. Clang: clang version 18.0.0 <I tried clang15/16/17 and they all have similar results> tools/lib/bpf/libbpf_common.h: #define LIBBPF_OPTS_RESET(NAME, ...) \ do { \ memset(&NAME, 0, sizeof(NAME)); \ NAME = (typeof(NAME)) { \ .sz = sizeof(NAME), \ __VA_ARGS__ \ }; \ } while (0) #endif tools/lib/bpf/libbpf.h: struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; size_t :0; }; #define bpf_netkit_opts__last_field expected_revision In the above struct bpf_netkit_opts, there is no tail padding. prog_tests/tc_netkit.c: static void serial_test_tc_netkit_multi_links_target(int mode, int target) { ... LIBBPF_OPTS(bpf_netkit_opts, optl); ... LIBBPF_OPTS_RESET(optl, .flags = BPF_F_BEFORE, .relative_fd = bpf_program__fd(skel->progs.tc1), ); ... } Let us make the following source change, note that we have a 4-byte tail padding now. 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 6cd9c501624f..0dd83910ae9a 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -803,13 +803,13 @@ bpf_program__attach_tcx(const struct bpf_program prog, int ifindex, struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; - __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; + __u32 flags; size_t :0; }; -#define bpf_netkit_opts__last_field expected_revision +#define bpf_netkit_opts__last_field flags The clang 18 generated asm code looks like below: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d 7d 98 leaq -0x68(%rbp), %rdi 55e7: 31 f6 xorl %esi, %esi 55e9: ba 20 00 00 00 movl $0x20, %edx 55ee: e8 00 00 00 00 callq 0x55f3 <serial_test_tc_netkit_multi_links_target+0x18d3> 55f3: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 55fe: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5605: 48 8b 78 18 movq 0x18(%rax), %rdi 5609: e8 00 00 00 00 callq 0x560e <serial_test_tc_netkit_multi_links_target+0x18ee> 560e: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5614: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 561e: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5633: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563a: 48 89 45 98 movq %rax, -0x68(%rbp) 563e: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5645: 48 89 45 a0 movq %rax, -0x60(%rbp) 5649: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5650: 48 89 45 a8 movq %rax, -0x58(%rbp) 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); At -O0 level, the clang compiler creates an intermediate copy. 
We have below to store 'flags' with 4-byte store and leave another 4 byte in the same 8-byte-aligned storage undefined, 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) and later we store 8-byte to the original zero'ed buffer 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) This caused a problem as the 4-byte value at [%rbp-0x2dc, %rbp-0x2e0) may be garbage. gcc (gcc 11.4) does not have this issue as it does zeroing struct first before doing assignments: ; LIBBPF_OPTS_RESET(optl, 50fd: 48 8d 85 40 fc ff ff leaq -0x3c0(%rbp), %rax 5104: ba 20 00 00 00 movl $0x20, %edx 5109: be 00 00 00 00 movl $0x0, %esi 510e: 48 89 c7 movq %rax, %rdi 5111: e8 00 00 00 00 callq 0x5116 <serial_test_tc_netkit_multi_links_target+0x1522> 5116: 48 8b 45 f0 movq -0x10(%rbp), %rax 511a: 48 8b 40 18 movq 0x18(%rax), %rax 511e: 48 89 c7 movq %rax, %rdi 5121: e8 00 00 00 00 callq 0x5126 <serial_test_tc_netkit_multi_links_target+0x1532> 5126: 48 c7 85 40 fc ff ff 00 00 00 00 movq $0x0, -0x3c0(%rbp) 5131: 48 c7 85 48 fc ff ff 00 00 00 00 movq $0x0, -0x3b8(%rbp) 513c: 48 c7 85 50 fc ff ff 00 00 00 00 movq $0x0, -0x3b0(%rbp) 5147: 48 c7 85 58 fc ff ff 00 00 00 00 movq $0x0, -0x3a8(%rbp) 5152: 48 c7 85 40 fc ff ff 20 00 00 00 movq $0x20, -0x3c0(%rbp) 515d: 89 85 48 fc ff ff movl %eax, -0x3b8(%rbp) 5163: c7 85 58 fc ff ff 08 00 00 00 movl $0x8, -0x3a8(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); It is not clear how to resolve the compiler code generation as the compiler generates correct code w.r.t. how to handle unnamed padding in C standard. So this patch changed LIBBPF_OPTS_RESET macro to avoid uninitialized tail padding. We already knows LIBBPF_OPTS macro works on both gcc and clang, even with tail padding. So LIBBPF_OPTS_RESET is changed to be a LIBBPF_OPTS followed by a memcpy(), thus avoiding uninitialized tail padding. 
The below is asm code generated with this patch and with clang compiler: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d bd 10 fd ff ff leaq -0x2f0(%rbp), %rdi 55ea: 31 f6 xorl %esi, %esi 55ec: ba 20 00 00 00 movl $0x20, %edx 55f1: e8 00 00 00 00 callq 0x55f6 <serial_test_tc_netkit_multi_links_target+0x18d6> 55f6: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 5601: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5608: 48 8b 78 18 movq 0x18(%rax), %rdi 560c: e8 00 00 00 00 callq 0x5611 <serial_test_tc_netkit_multi_links_target+0x18f1> 5611: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5617: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 5621: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 562c: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5636: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563d: 48 89 45 98 movq %rax, -0x68(%rbp) 5641: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5648: 48 89 45 a0 movq %rax, -0x60(%rbp) 564c: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5653: 48 89 45 a8 movq %rax, -0x58(%rbp) 5657: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565e: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); In the above code, a temporary buffer is zeroed and then has proper value assigned. Finally, values in temporary buffer are copied to the original variable buffer, hence tail padding is guaranteed to be 0. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20231107201511.2548645-1-yonghong.song@linux.dev	2023-11-10 13:27:01 -08:00
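The idea behind the fix (fill a fully zeroed temporary, then copy all of its bytes over the target) can be sketched without libbpf. This is an illustration under stated assumptions, not the LIBBPF_OPTS_RESET macro itself: `struct demo_opts`, `DEMO_OPTS_END`, `demo_opts_reset()` and `demo_opts_tail_is_zero()` are hypothetical names, the layout mirrors the modified bpf_netkit_opts from the message, and 4 bytes of tail padding are assumed (true on LP64 targets such as x86-64).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct with a 4-byte member last, so that LP64
 * targets pad the struct with 4 trailing bytes. */
struct demo_opts {
	size_t sz;
	unsigned int relative_fd;
	unsigned int relative_id;
	unsigned long long expected_revision;
	unsigned int flags;
};

/* Offset of the first byte after the last named member. */
#define DEMO_OPTS_END (offsetof(struct demo_opts, flags) + sizeof(unsigned int))

/* Return true when every byte in [DEMO_OPTS_END, opts->sz) is zero, which is
 * the kind of check that rejects garbage in tail padding. */
bool demo_opts_tail_is_zero(const struct demo_opts *opts)
{
	const unsigned char *p = (const unsigned char *)opts;
	size_t i;

	for (i = DEMO_OPTS_END; i < opts->sz; i++)
		if (p[i])
			return false;
	return true;
}

/* Reset through a zeroed temporary, then copy every byte (padding included)
 * with memcpy(), so stale padding in *opts cannot survive. */
void demo_opts_reset(struct demo_opts *opts, unsigned int flags)
{
	struct demo_opts tmp;

	memset(&tmp, 0, sizeof(tmp));
	tmp.sz = sizeof(tmp);
	tmp.flags = flags;
	memcpy(opts, &tmp, sizeof(tmp));
}
```

The byte-wise memcpy() is the important step: a struct assignment is free to leave padding bytes unspecified, while memcpy() transfers the zeroed padding as-is.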
Daniel Borkmann	2cb0236318	libbpf: Add link-based API for netkit This adds bpf_program__attach_netkit() API to libbpf. Overall it is very similar to tcx. The API looks as following: LIBBPF_API struct bpf_link * bpf_program__attach_netkit(const struct bpf_program *prog, int ifindex, const struct bpf_netkit_opts *opts); The struct bpf_netkit_opts is done in similar way as struct bpf_tcx_opts for supporting bpf_mprog control parameters. The attach location for the primary and peer device is derived from the program section "netkit/primary" and "netkit/peer", respectively. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Daniel Borkmann	cc7f085286	tools: Sync if_link uapi header Sync if_link uapi header to the latest version as we need the refresher in tooling for netkit device. Given it's been a while since the last sync and the diff is fairly big, it has been done as its own commit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-3-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Daniel Borkmann	62b1e4905b	netkit, bpf: Add bpf programmable net device This work adds a new, minimal BPF-programmable device called "netkit" (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The core idea is that BPF programs are executed within the drivers xmit routine and therefore e.g. in case of containers/Pods moving BPF processing closer to the source. One of the goals was that in case of Pod egress traffic, this allows to move BPF programs from hostns tcx ingress into the device itself, providing earlier drop or forward mechanisms, for example, if the BPF program determines that the skb must be sent out of the node, then a redirect to the physical device can take place directly without going through per-CPU backlog queue. This helps to shift processing for such traffic from softirq to process context, leading to better scheduling decisions/performance (see measurements in the slides). In this initial version, the netkit device ships as a pair, but we plan to extend this further so it can also operate in single device mode. The pair comes with a primary and a peer device. Only the primary device, typically residing in hostns, can manage BPF programs for itself and its peer. The peer device is designated for containers/Pods and cannot attach/detach BPF programs. Upon the device creation, the user can set the default policy to 'pass' or 'drop' for the case when no BPF program is attached. Additionally, the device can be operated in L3 (default) or L2 mode. The management of BPF programs is done via bpf_mprog, so that multi-attach is supported right from the beginning with similar API and dependency controls as tcx. For details on the latter see commit 053c8e1f235d ("bpf: Add generic attach/detach/query API for multi-progs"). tc BPF compatibility is provided, so that existing programs can be easily migrated. Going forward, we plan to use netkit devices in Cilium as the main device type for connecting Pods. 
They will be operated in L3 mode in order to simplify a Pod's neighbor management and the peer will operate in default drop mode, so that no traffic is leaving between the time when a Pod is brought up by the CNI plugin and programs attached by the agent. Additionally, the programs we attach via tcx on the physical devices are using bpf_redirect_peer() for inbound traffic into netkit device, hence the latter is also supporting the ndo_get_peer_dev callback. Similarly, we use bpf_redirect_neigh() for the way out, pushing from netkit peer to phys device directly. Also, BIG TCP is supported on netkit device. For the follow-up work in single device mode, we plan to convert Cilium's cilium_host/_net devices into a single one. An extensive test suite for checking device operations and the BPF program and link management API comes as BPF selftests in this series. Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://github.com/borkmann/iproute2/tree/pr/netkit Link: http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf (24ff.) Link: https://lore.kernel.org/r/20231024214904.29825-2-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Andrii Nakryiko	3189f70538	docs: attempt to fix .readthedocs.yaml Seems like we need to update the config ([0],[1]). [0] https://blog.readthedocs.com/migrate-configuration-v2/ [1] https://blog.readthedocs.com/use-build-os-config/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-10-27 14:07:51 -07:00
Yonghong Song	6a5776066c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2147c8d07e1abc8dfc3433ca18eed5295e230ede Checkpoint bpf-next commit: 0e133a13370389d3894891eafe54fec2c44ad735 Baseline bpf commit: 9ff8d2717fc8f63e5cb226ddbda20649eefa2728 Checkpoint bpf commit: 9ff8d2717fc8f63e5cb226ddbda20649eefa2728 Alexandre Ghiti (1): libbpf: Fix syscall access arguments on riscv Andrii Nakryiko (1): libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym section Daan De Meyer (3): bpf: Implement cgroup sockaddr hooks for unix sockets libbpf: Add support for cgroup unix socket address hooks documentation/bpf: Document cgroup unix socket address hooks David Vernet (1): bpf: Add ability to pin bpf timer to calling CPU Martynas Pumputis (1): bpf: Derive source IP addr via bpf_*_fib_lookup() docs/program_types.rst \| 10 ++++++++++ include/uapi/linux/bpf.h \| 27 +++++++++++++++++++++++---- src/bpf_tracing.h \| 2 -- src/elf.c \| 16 ++++++++++------ src/libbpf.c \| 10 ++++++++++ 5 files changed, 53 insertions(+), 12 deletions(-) Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-26 09:00:01 -07:00
Yonghong Song	acecaf855d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Andrii Nakryiko	365cefa149	libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym section Fix too eager assumption that SHT_GNU_verdef ELF section is going to be present whenever binary has SHT_GNU_versym section. It seems like either SHT_GNU_verdef or SHT_GNU_verneed can be used, so failing on missing SHT_GNU_verdef actually breaks use cases in production. One specific reported issue, which was used to manually test this fix, was trying to attach to `readline` function in BASH binary. Fixes: bb7fa09399b9 ("libbpf: Support symbol versioning for uprobe") Reported-by: Liam Wisehart <liamwisehart@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Manu Bretelle <chantr4@gmail.com> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20231016182840.4033346-1-andrii@kernel.org	2023-10-19 11:36:22 -07:00
Daan De Meyer	f4b6dcfca1	documentation/bpf: Document cgroup unix socket address hooks Update the documentation to mention the new cgroup unix sockaddr hooks. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-8-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
Daan De Meyer	748787456b	libbpf: Add support for cgroup unix socket address hooks Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
Daan De Meyer	8a08d63f29	bpf: Implement cgroup sockaddr hooks for unix sockets These hooks allow intercepting connect(), getsockname(), getpeername(), sendmsg() and recvmsg() for unix sockets. The unix socket hooks get write access to the address length because the address length is not fixed when dealing with unix sockets and needs to be modified when a unix socket address is modified by the hook. Because abstract socket unix addresses start with a NUL byte, we cannot recalculate the socket address in kernelspace after running the hook by calculating the length of the unix socket path using strlen(). These hooks can be used when users want to multiplex syscall to a single unix socket to multiple different processes behind the scenes by redirecting the connect() and other syscalls to process specific sockets. We do not implement support for intercepting bind() because when using bind() with unix sockets with a pathname address, this creates an inode in the filesystem which must be cleaned up. If we rewrite the address, the user might try to clean up the wrong file, leaking the socket in the filesystem where it is never cleaned up. Until we figure out a solution for this (and a use case for intercepting bind()), we opt to not allow rewriting the sockaddr in bind() calls. We also implement recvmsg() support for connected streams so that after a connect() that is modified by a sockaddr hook, any corresponding recvmsg() on the connected socket can also be modified to make the connected program think it is connected to the "intended" remote. Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-5-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
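The strlen() remark in the commit above is easy to reproduce in user space. The sketch below is an illustration only (the `demo_fill_abstract()` helper is hypothetical and assumes Linux abstract-namespace semantics): because `sun_path[0]` is a NUL byte, strlen() on the path yields 0, so the address length has to be carried separately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill addr with an abstract-namespace name and return the address length
 * that must accompany it in connect()/bind()/sendmsg().
 * Returns 0 if the name does not fit. */
socklen_t demo_fill_abstract(struct sockaddr_un *addr, const char *name)
{
	size_t n = strlen(name);

	if (n + 1 > sizeof(addr->sun_path))
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	/* sun_path[0] stays '\0': that is what marks an abstract address. */
	memcpy(addr->sun_path + 1, name, n);

	/* Family header + leading NUL + name bytes; no terminator is counted. */
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

A hook that rewrites such an address therefore needs write access to the length as well, which is what the commit message describes.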
Martynas Pumputis	c9f8eb5310	bpf: Derive source IP addr via bpf_*_fib_lookup() Extend the bpf_fib_lookup() helper by making it to return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SRC \| BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when a public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
David Vernet	1c0358823c	bpf: Add ability to pin bpf timer to calling CPU BPF supports creating high resolution timers using bpf_timer_* helper functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which specifies that the timeout should be interpreted as absolute time. It would also be useful to be able to pin that timer to a core. For example, if you wanted to make a subset of cores run without timer interrupts, and only have the timer be invoked on a single core. This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag. When specified, the HRTIMER_MODE_PINNED flag is passed to hrtimer_start(). A subsequent patch will update selftests to validate. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com	2023-10-19 11:36:22 -07:00
Alexandre Ghiti	20c1170ea4	libbpf: Fix syscall access arguments on riscv Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org	2023-10-19 11:36:22 -07:00
Yonghong Song	b44eb3a8fa	libbpf: fix bpf-checkpoint-commit The previous sync bpf-checkpoint-commit becomes invalid due to upstream bpf tree force-push. This patch picked a new valid commit as the bpf-checkpoint-commit so the sync script can work with newer changes. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Yonghong Song	14648264b1	ci: Regenerate latest vmlinux.h for old kernel CI tests Without the change, we will have failures like below: Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h' progs/getsockname_unix_prog.c:27:15: error: no member named 'uaddrlen' in 'struct bpf_sock_addr_kern' if (sa_kern->uaddrlen != unaddrlen) ~~~~~~~ ^ 1 error generated. make: *** [Makefile:605: /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/getsockname_unix_prog.bpf.o] Error 1 make: *** Waiting for unfinished jobs.... Error: Process completed with exit code 2. in Kernel 5.5.0 on ubuntu-20.04 + selftests Manu Bretelle kindly helped regenerate the vmlinux.h from latest bpf-next kernel for me. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Song Liu	e26b84dc33	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 45ee73a0722b9e1d0b7a524d06756291b13b5912 Checkpoint bpf-next commit: 2147c8d07e1abc8dfc3433ca18eed5295e230ede Baseline bpf commit: 57eb5e1c5c57972c95e8efab6bc81b87161b0b07 Checkpoint bpf commit: 4cb893e89221be9c791e43cab6a8e937cd57e17f Hengqi Chen (3): libbpf: Resolve symbol conflicts at the same offset for uprobe libbpf: Support symbol versioning for uprobe libbpf: Allow Golang symbols in uprobe secdef Jiri Olsa (2): bpf: Add missed value to kprobe_multi link info bpf: Add missed value to kprobe perf link info Kumar Kartikeya Dwivedi (2): libbpf: Refactor bpf_object__reloc_code libbpf: Add support for custom exception callbacks Martin Kelly (8): libbpf: Refactor cleanup in ring_buffer__add libbpf: Switch rings to array of pointers libbpf: Add ring_buffer__ring libbpf: Add ring__producer_pos, ring__consumer_pos libbpf: Add ring__avail_data_size libbpf: Add ring__size libbpf: Add ring__map_fd libbpf: Add ring__consume include/uapi/linux/bpf.h \| 2 + src/elf.c \| 139 ++++++++++++++++++++++++++--- src/libbpf.c \| 188 ++++++++++++++++++++++++++++++++------- src/libbpf.h \| 73 +++++++++++++++ src/libbpf.map \| 7 ++ src/ringbuf.c \| 85 +++++++++++++++--- 6 files changed, 439 insertions(+), 55 deletions(-) Signed-off-by: Song Liu <song@kernel.org>	2023-10-02 11:17:48 -07:00
Hengqi Chen	9a3a2e9303	libbpf: Allow Golang symbols in uprobe secdef Golang symbols in ELF files are different from C/C++ which contains special characters like '*', '(' and ')'. With generics, things get more complicated, there are symbols like: github.com/cilium/ebpf/internal.(*Deque[go.shape.interface { Format(fmt.State, int32); TypeName() string;github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type}]).Grow Matching such symbols using `%m[^\n]` in sscanf, this excludes newline which typically does not appear in ELF symbols. This should work in most use-cases and also work for unicode letters in identifiers. If newlines do show up in ELF symbols, users can still attach to such symbol by specifying bpf_uprobe_opts::func_name. A working example can be found at this repo ([0]). [0]: https://github.com/chenhengqi/libbpf-go-symbols Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929155954.92448-1-hengqi.chen@gmail.com	2023-10-02 11:17:48 -07:00
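How a `%m[^\n]` conversion picks up a Go-style symbol can be tried in isolation. The sketch below is an assumption-laden illustration: `demo_parse_uprobe_sec()` is a hypothetical helper, the full format string is mine rather than a quote from libbpf, and the `m` allocation modifier is a POSIX.1-2008 feature (available in glibc), not ISO C.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split "<type>/<binary path>:<function>" into three malloc'ed strings.
 * Returns the number of fields converted (3 on full success); the caller
 * frees whichever pointers are non-NULL. */
int demo_parse_uprobe_sec(const char *sec, char **type, char **path, char **func)
{
	*type = *path = *func = NULL;
	/* %m[^\n] accepts every byte up to a newline, so '*', '(', ')',
	 * '[' and ']' in Go symbols are all kept. */
	return sscanf(sec, "%m[^/]/%m[^:]:%m[^\n]", type, path, func);
}
```

For the input `uprobe//proc/self/exe:main.(*Deque[int]).Grow` the three fields come out as `uprobe`, `/proc/self/exe` and `main.(*Deque[int]).Grow`.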
Jiri Olsa	96d70a52ad	bpf: Add missed value to kprobe perf link info Add missed value to kprobe attached through perf link info to hold the stats of missed kprobe handler execution. The kprobe's missed counter gets incremented when kprobe handler is not executed due to another kprobe running on the same cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org	2023-10-02 11:17:48 -07:00
Jiri Olsa	de02cb1697	bpf: Add missed value to kprobe_multi link info Add missed value to kprobe_multi link info to hold the stats of missed kprobe_multi probe. The missed counter gets incremented when fprobe fails the recursion check or there's no rethook available for return probe. In either case the attached bpf program is not executed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-3-jolsa@kernel.org	2023-10-02 11:17:48 -07:00
Martin Kelly	b520bcd7d8	libbpf: Add ring__consume Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	6413c2d063	libbpf: Add ring__map_fd Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	cd3fe56c75	libbpf: Add ring__size Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	3e675ed6ab	libbpf: Add ring__avail_data_size Add ring__avail_data_size for querying the currently available data in the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in bpf_ringbuf_query. This is racy during ongoing operations but is still useful for overall information on how a ringbuffer is behaving. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
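The quantity described above is the distance between the producer and consumer positions. A minimal sketch, assuming a hypothetical `struct demo_ring_pos` that stands in for the positions a real ring shares with the kernel (this is not libbpf's `struct ring`):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the two monotonically increasing positions. */
struct demo_ring_pos {
	_Atomic unsigned long consumer_pos;
	_Atomic unsigned long producer_pos;
};

/* Bytes produced but not yet consumed. The two loads are not one atomic
 * snapshot, so the result can be stale while the ring is in use, which is
 * the raciness the commit message mentions. */
size_t demo_ring_avail(struct demo_ring_pos *r)
{
	unsigned long cons = atomic_load_explicit(&r->consumer_pos, memory_order_acquire);
	unsigned long prod = atomic_load_explicit(&r->producer_pos, memory_order_acquire);

	/* Unsigned subtraction stays correct even after the counters wrap. */
	return (size_t)(prod - cons);
}
```

Reading the consumer position first keeps the result from going "negative" in the common case where only the producer advances between the two loads.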
Martin Kelly	2ad16b970a	libbpf: Add ring__producer_pos, ring__consumer_pos Add APIs to get the producer and consumer position for a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	a20576f5f2	libbpf: Add ring_buffer__ring Add a new function ring_buffer__ring, which exposes struct ring * to the user, representing a single ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	bfa471bc85	libbpf: Switch rings to array of pointers Switch rb->rings to be an array of pointers instead of a contiguous block. This allows for each ring pointer to be stable after ring_buffer__add is called, which allows us to expose struct ring * to the user without gotchas. Without this change, the realloc in ring_buffer__add could invalidate a struct ring *, making it unsafe to give to the user. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
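Why an array of pointers keeps handles stable can be shown with a small standalone model. The `demo_*` types and functions below are hypothetical and much simpler than libbpf's ring buffer manager; they only illustrate that realloc() moves the pointer array while each separately allocated element stays where it is.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_ring_item {
	int map_fd;
};

struct demo_ring_set {
	struct demo_ring_item **rings; /* array of pointers, not of structs */
	size_t cnt;
};

/* Append one item and return a pointer that stays valid across later adds.
 * Returns NULL on allocation failure, leaving the set unchanged in size. */
struct demo_ring_item *demo_ring_set_add(struct demo_ring_set *rs, int map_fd)
{
	struct demo_ring_item **tmp, *r;

	/* Only the pointer array may move here; existing items do not. */
	tmp = realloc(rs->rings, (rs->cnt + 1) * sizeof(*tmp));
	if (!tmp)
		return NULL;
	rs->rings = tmp;

	r = calloc(1, sizeof(*r));
	if (!r)
		return NULL;
	r->map_fd = map_fd;
	rs->rings[rs->cnt++] = r;
	return r;
}

void demo_ring_set_free(struct demo_ring_set *rs)
{
	size_t i;

	for (i = 0; i < rs->cnt; i++)
		free(rs->rings[i]);
	free(rs->rings);
	rs->rings = NULL;
	rs->cnt = 0;
}
```

With a contiguous `struct demo_ring_item *` array instead, the first realloc() that relocates the block would leave every previously returned element pointer dangling.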
Martin Kelly	64f2b4ab49	libbpf: Refactor cleanup in ring_buffer__add Refactor the cleanup code in ring_buffer__add to use a unified err_out label. This reduces code duplication, as well as plugging a potential leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss unmapping tmp because ringbuf_unmap_ring isn't called). Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Hengqi Chen	cd91ca8f99	libbpf: Support symbol versioning for uprobe In current implementation, we assume that symbol found in .dynsym section would have a version suffix and use it to compare with symbol user supplied. According to the spec ([0]), this assumption is incorrect, the version info of dynamic symbols are stored in .gnu.version and .gnu.version_d sections of ELF objects. For example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 \| grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 \| grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 In this case, specify pthread_rwlock_wrlock@@GLIBC_2.34 or pthread_rwlock_wrlock@GLIBC_2.2.5 in bpf_uprobe_opts::func_name won't work. Because the qualified name does NOT match `pthread_rwlock_wrlock` (without version suffix) in .dynsym sections. This commit implements the symbol versioning for dynsym and allows user to specify symbol in the following forms: - func - func@LIB_VERSION - func@@LIB_VERSION In case of symbol conflicts, error out and users should resolve it by specifying a qualified name. [0]: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-3-hengqi.chen@gmail.com	2023-10-02 11:17:48 -07:00
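The three accepted forms (`func`, `func@LIB_VERSION`, `func@@LIB_VERSION`) can be told apart with a few lines of string handling. This is a hypothetical helper written for illustration, not the parsing code in libbpf; it only splits the user-supplied string and does not consult any ELF version section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_sym_spec {
	size_t name_len;     /* length of the bare function name */
	const char *version; /* points into the input string, or NULL */
	bool is_default;     /* true for the "@@" (default version) form */
};

/* Split "func", "func@VER" or "func@@VER" at the first '@'. */
void demo_parse_sym(const char *spec, struct demo_sym_spec *out)
{
	const char *at = strchr(spec, '@');

	if (!at) {
		out->name_len = strlen(spec);
		out->version = NULL;
		out->is_default = false;
		return;
	}
	out->name_len = (size_t)(at - spec);
	out->is_default = (at[1] == '@');
	out->version = at + (out->is_default ? 2 : 1);
}
```

A resolver built on top of this would compare `name_len` bytes against the unversioned .dynsym name and, when `version` is non-NULL, additionally match the version string recovered from the version sections.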
Hengqi Chen	df9cd9f69c	libbpf: Resolve symbol conflicts at the same offset for uprobe Dynamic symbols in shared library may have the same name, for example: $ nm -D /lib/x86_64-linux-gnu/libc.so.6 \| grep rwlock_wrlock 000000000009b1a0 T __pthread_rwlock_wrlock@GLIBC_2.2.5 000000000009b1a0 T pthread_rwlock_wrlock@@GLIBC_2.34 000000000009b1a0 T pthread_rwlock_wrlock@GLIBC_2.2.5 $ readelf -W --dyn-syms /lib/x86_64-linux-gnu/libc.so.6 \| grep rwlock_wrlock 706: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 __pthread_rwlock_wrlock@GLIBC_2.2.5 2568: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@@GLIBC_2.34 2571: 000000000009b1a0 878 FUNC GLOBAL DEFAULT 15 pthread_rwlock_wrlock@GLIBC_2.2.5 Currently, users can't attach a uprobe to pthread_rwlock_wrlock because there are two symbols named pthread_rwlock_wrlock and both are global bind. And libbpf considers it as a conflict. Since both of them are at the same offset we could accept one of them harmlessly. Note that we already do this in elf_resolve_syms_offsets. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230918024813.237475-2-hengqi.chen@gmail.com	2023-10-02 11:17:48 -07:00
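The rule "duplicates at the same offset are harmless" can be modelled on a plain symbol table. The sketch below is a simplified illustration: `struct demo_sym`, `demo_resolve()` and its return codes are hypothetical, and real resolution also considers symbol binding and type, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_sym {
	const char *name;     /* name without version suffix */
	unsigned long offset;
};

/* Look up name in syms. Returns 0 and stores the offset on success,
 * -1 when the name is absent, and -2 when two matches disagree on the
 * offset (a genuine conflict). Matches at the same offset are accepted. */
int demo_resolve(const struct demo_sym *syms, size_t n, const char *name,
		 unsigned long *offset)
{
	int found = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(syms[i].name, name) != 0)
			continue;
		if (found && syms[i].offset != *offset)
			return -2;
		*offset = syms[i].offset;
		found = 1;
	}
	return found ? 0 : -1;
}
```

Fed with the three libc entries listed in the commit message, the lookup for `pthread_rwlock_wrlock` succeeds with offset 0x9b1a0 instead of reporting a conflict.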
Kumar Kartikeya Dwivedi	713d1f5a83	libbpf: Add support for custom exception callbacks Add support to libbpf to append exception callbacks when loading a program. The exception callback is found by discovering the declaration tag 'exception_callback:<value>' and finding the callback in the value of the tag. The process is done in two steps. First, for each main program, the bpf_object__sanitize_and_load_btf function finds and marks its corresponding exception callback as defined by the declaration tag on it. Second, bpf_object__reloc_code is modified to append the indicated exception callback at the end of the instruction iteration (since exception callback will never be appended in that loop, as it is not directly referenced). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-16-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-10-02 11:17:48 -07:00
Kumar Kartikeya Dwivedi	998213a1e3	libbpf: Refactor bpf_object__reloc_code Refactor bpf_object__append_subprog_code out of bpf_object__reloc_code to be able to reuse it to append subprog related code for the exception callback to the main program. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230912233214.1518551-15-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-10-02 11:17:48 -07:00
Andrii Nakryiko	56069cda78	ci: denylist empty_skb temporary The fix is in bpf tree. Needs to be merged to bpf-next, on which libbpf CI is tested. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-15 15:57:14 -07:00
Andrii Nakryiko	aadf88d4f6	ci: remove outdated temporary patches Remove patches, they don't apply and are not needed anymore. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-15 15:57:14 -07:00
Andrii Nakryiko	10da3d2384	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9e3b47abeb8f76c39c570ffc924ac0b35f132274 Checkpoint bpf-next commit: 45ee73a0722b9e1d0b7a524d06756291b13b5912 Baseline bpf commit: 23d775f12dcd23d052a4927195f15e970e27ab26 Checkpoint bpf commit: 57eb5e1c5c57972c95e8efab6bc81b87161b0b07 Andrii Nakryiko (1): libbpf: Add basic BTF sanity validation Ravi Bangoria (1): perf/mem: Introduce PERF_MEM_LVLNUM_UNC Stanislav Fomichev (2): bpf: expose information about supported xdp metadata kfunc bpf: Clarify error expectations from bpf_clone_redirect Yonghong Song (2): libbpf: Add __percpu_kptr macro definition bpf: Mark BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE deprecated include/uapi/linux/bpf.h \| 13 ++- include/uapi/linux/netdev.h \| 16 ++++ include/uapi/linux/perf_event.h \| 3 +- src/bpf_helpers.h \| 1 + src/btf.c \| 160 ++++++++++++++++++++++++++++++++ 5 files changed, 190 insertions(+), 3 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-15 15:57:14 -07:00
Andrii Nakryiko	d2838b2be3	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-15 15:57:14 -07:00
Stanislav Fomichev	aa44abfdd2	bpf: Clarify error expectations from bpf_clone_redirect Commit 151e887d8ff9 ("veth: Fixing transmit return status for dropped packets") exposed the fact that bpf_clone_redirect is capable of returning raw NET_XMIT_XXX return codes. This is in the conflict with its UAPI doc which says the following: "0 on success, or a negative error in case of failure." Update the UAPI to reflect the fact that bpf_clone_redirect can return positive error numbers, but don't explicitly define their meaning. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230911194731.286342-1-sdf@google.com	2023-09-15 15:57:14 -07:00
Stanislav Fomichev	6070b1bdcf	bpf: expose information about supported xdp metadata kfunc Add new xdp-rx-metadata-features member to netdev netlink which exports a bitmask of supported kfuncs. Most of the patch is autogenerated (headers), the only relevant part is netdev.yaml and the changes in netdev-genl.c to marshal into netlink. Example output on veth: $ ip link add veth0 type veth peer name veth1 # ifindex == 12 $ ./tools/net/ynl/samples/netdev 12 Select ifc ($ifindex; or 0 = dump; or -2 ntf check): 12 veth1[12] xdp-features (23): basic redirect rx-sg xdp-rx-metadata-features (3): timestamp hash xdp-zc-max-segs=0 Cc: netdev@vger.kernel.org Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230913171350.369987-3-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-09-15 15:57:14 -07:00
Yonghong Song	6f30f1a00a	bpf: Mark BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE deprecated Now 'BPF_MAP_TYPE_CGRP_STORAGE + local percpu ptr' can cover all BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE functionality and more. So mark BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE deprecated. Also make changes in selftests/bpf/test_bpftool_synctypes.py and selftest libbpf_str to fix otherwise test errors. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152837.2003563-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-15 15:57:14 -07:00
Yonghong Song	332198af03	libbpf: Add __percpu_kptr macro definition Add __percpu_kptr macro definition in bpf_helpers.h. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230827152800.1998492-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-09-15 15:57:14 -07:00
Andrii Nakryiko	2dbdd3b564	libbpf: Add basic BTF sanity validation Implement a simple and straightforward BTF sanity check when parsing BTF data. Right now it's very basic and just validates that all the string offsets and type IDs are within valid range. For FUNC we also check that it points to FUNC_PROTO kinds. Even with such simple checks it fixes a bunch of crashes found by OSS fuzzer ([0]-[5]) and will allow fuzzer to make further progress. Some other invariants will be checked in follow up patches (like ensuring there is no infinite type loops), but this seems like a good start already. Adding FUNC -> FUNC_PROTO check revealed that one of selftests has a problem with FUNC pointing to VAR instead, so fix it up in the same commit. [0] https://github.com/libbpf/libbpf/issues/482 [1] https://github.com/libbpf/libbpf/issues/483 [2] https://github.com/libbpf/libbpf/issues/485 [3] https://github.com/libbpf/libbpf/issues/613 [4] https://github.com/libbpf/libbpf/issues/618 [5] https://github.com/libbpf/libbpf/issues/619 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Song Liu <song@kernel.org> Closes: https://github.com/libbpf/libbpf/issues/617 Link: https://lore.kernel.org/bpf/20230825202152.1813394-1-andrii@kernel.org	2023-09-15 15:57:14 -07:00
Ravi Bangoria	d8a4b198da	perf/mem: Introduce PERF_MEM_LVLNUM_UNC Older API PERF_MEM_LVL_UNC can be replaced by PERF_MEM_LVLNUM_UNC. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230725150206.184-2-ravi.bangoria@amd.com	2023-09-15 15:57:14 -07:00
Andrii Nakryiko	5fc0677111	ci: update list of tests/subtests for 5.5 kernel Some tests can't succeed on 5.5, which is very old. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-09-07 09:11:51 -07:00
Daniel Müller	295b5726f0	Introduce pull request template This change introduces a pull request template that hopefully helps prevent more libbpf-specific pull requests that should really be submitted to the BPF mailing list from being opened against this repository. Recent examples include [0] [1]. [0] https://github.com/libbpf/libbpf/pull/712 [1] https://github.com/libbpf/libbpf/pull/723 Signed-off-by: Daniel Müller <deso@posteo.net>	2023-09-05 11:08:57 -07:00
Andrii Nakryiko	5a46421ad8	ci: deny newly added tc_bpf/tc_bpf_non_root for 5.5 It doesn't work on 5.5 and was just recently introduced as a new subtest to already existing test. Add subtest to denylist. Also clean up old denylist, leaving only "exception" relative to ALLOWLIST. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-25 11:51:28 -07:00
Andrii Nakryiko	942a0b8056	Makefile: silence GCC's bogus complaint about possible NULL in printf GCC started complaining that some of libbpf pr_warn() statements might be passing NULL for map name. Map name is never NULL for non-NULL map pointer, so this is a false positive which triggers build failures. Silence format-overflow warning altogether to avoid this in the future as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-25 11:51:28 -07:00
Andrii Nakryiko	fcc940e6b2	Makefile: add elf.c to a list of built files Libbpf now has one more .c file, make sure Github Makefile builds it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-25 11:51:28 -07:00
Andrii Nakryiko	2e6b54e5ea	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0a55264cf966fb95ebf9d03d9f81fa992f069312 Checkpoint bpf-next commit: 9e3b47abeb8f76c39c570ffc924ac0b35f132274 Baseline bpf commit: 23d775f12dcd23d052a4927195f15e970e27ab26 Checkpoint bpf commit: 23d775f12dcd23d052a4927195f15e970e27ab26 Andrii Nakryiko (1): libbpf: fix signedness determination in CO-RE relo handling logic Daniel Xu (1): libbpf: Add bpf_object__unpin() Hao Luo (1): libbpf: Free btf_vmlinux when closing bpf_object Jiri Olsa (15): bpf: Switch BPF_F_KPROBE_MULTI_RETURN macro to enum bpf: Add multi uprobe link bpf: Add cookies support for uprobe_multi link bpf: Add pid filter support for uprobe_multi link libbpf: Add uprobe_multi attach type and link names libbpf: Move elf_find_func_offset* functions to elf object libbpf: Add elf_open/elf_close functions libbpf: Add elf symbol iterator libbpf: Add elf_resolve_syms_offsets function libbpf: Add elf_resolve_pattern_offsets function libbpf: Add bpf_link_create support for multi uprobes libbpf: Add bpf_program__attach_uprobe_multi function libbpf: Add support for u[ret]probe.multi[.s] program sections libbpf: Add uprobe multi link detection libbpf: Add uprobe multi link support to bpf_program__attach_usdt include/uapi/linux/bpf.h \| 22 +- src/bpf.c \| 11 + src/bpf.h \| 11 +- src/elf.c \| 440 +++++++++++++++++++++++++++++++++++++++ src/libbpf.c \| 404 ++++++++++++++++++----------------- src/libbpf.h \| 52 +++++ src/libbpf.map \| 2 + src/libbpf_internal.h \| 21 ++ src/relo_core.c \| 2 +- src/usdt.c \| 116 +++++++---- 10 files changed, 853 insertions(+), 228 deletions(-) create mode 100644 src/elf.c Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-25 11:51:28 -07:00
Andrii Nakryiko	b4c8def45f	libbpf: fix signedness determination in CO-RE relo handling logic Extracting btf_int_encoding() is only meaningful for BTF_KIND_INT, so we need to check that first before inferring signedness. Closes: https://github.com/libbpf/libbpf/issues/704 Reported-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230824000016.2658017-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-08-25 11:51:28 -07:00
Daniel Xu	62a186ea68	libbpf: Add bpf_object__unpin() For bpf_object__pin_programs() there is bpf_object__unpin_programs(). Likewise bpf_object__unpin_maps() for bpf_object__pin_maps(). But no bpf_object__unpin() for bpf_object__pin(). Adding the former adds symmetry to the API. It's also convenient for cleanup in application code. It's an API I would've used if it was available for a repro I was writing earlier. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/b2f9d41da4a350281a0b53a804d11b68327e14e5.1692832478.git.dxu@dxuuu.xyz	2023-08-25 11:51:28 -07:00
Hao Luo	a687461867	libbpf: Free btf_vmlinux when closing bpf_object I hit a memory leak when testing bpf_program__set_attach_target(). Basically, set_attach_target() may allocate btf_vmlinux, for example, when setting attach target for bpf_iter programs. But btf_vmlinux is freed only in bpf_object_load(), which means if we only open bpf object but not load it, setting attach target may leak btf_vmlinux. So let's free btf_vmlinux in bpf_object__close() anyway. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com	2023-08-25 11:51:28 -07:00
Jiri Olsa	74188c1740	libbpf: Add uprobe multi link support to bpf_program__attach_usdt Adding support for usdt_manager_attach_usdt to use uprobe_multi link to attach to usdt probes. The uprobe_multi support is detected before the usdt program is loaded and its expected_attach_type is set accordingly. If uprobe_multi support is detected the usdt_manager_attach_usdt gathers uprobes info and calls bpf_program__attach_uprobe to create all needed uprobes. If uprobe_multi support is not detected the old behaviour stays. Also adding usdt.s program section for sleepable usdt probes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-18-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	60cf42249b	libbpf: Add uprobe multi link detection Adding uprobe-multi link detection. It will be used later in bpf_program__attach_usdt function to check and use uprobe_multi link over standard uprobe links. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-17-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	bc829bac06	libbpf: Add support for u[ret]probe.multi[.s] program sections Adding support for several uprobe_multi program sections to allow auto attach of multi_uprobe programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-16-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	9f76dd6dd0	libbpf: Add bpf_program__attach_uprobe_multi function Adding bpf_program__attach_uprobe_multi function that allows to attach multiple uprobes with uprobe_multi link. The user can specify uprobes with direct arguments: binary_path/func_pattern/pid or with struct bpf_uprobe_multi_opts opts argument fields: const char **syms; const unsigned long *offsets; const unsigned long *ref_ctr_offsets; const __u64 *cookies; User can specify 2 mutually exclusive set of inputs: 1) use only path/func_pattern/pid arguments 2) use path/pid with allowed combinations of: syms/offsets/ref_ctr_offsets/cookies/cnt - syms and offsets are mutually exclusive - ref_ctr_offsets and cookies are optional Any other usage results in error. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-15-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	cd21cc08cc	libbpf: Add bpf_link_create support for multi uprobes Adding new uprobe_multi struct to bpf_link_create_opts object to pass multiple uprobe data to link_create attr uapi. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	c7ef3a169e	libbpf: Add elf_resolve_pattern_offsets function Adding elf_resolve_pattern_offsets function that looks up offsets for symbols specified by pattern argument. The 'pattern' argument allows wildcards ('*' and '?' supported). Offsets are returned in allocated array together with its size and need to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	91fd655644	libbpf: Add elf_resolve_syms_offsets function Adding elf_resolve_syms_offsets function that looks up offsets for symbols specified in syms array argument. Offsets are returned in allocated array with the 'cnt' size, that needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	b7ec9d9669	libbpf: Add elf symbol iterator Adding elf symbol iterator object (and some functions) that follow open-coded iterator pattern and some functions to ease up iterating elf object symbols. The idea is to iterate single symbol section with: struct elf_sym_iter iter; struct elf_sym *sym; if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM)) goto error; while ((sym = elf_sym_iter_next(&iter))) { ... } I considered opening the elf inside the iterator and iterate all symbol sections, but then it gets more complicated wrt user checks for when the next section is processed. The plus side is that we don't need 'exit' function, because caller/user is in charge of that. The returned iterated symbol object from elf_sym_iter_next function is placed inside the struct elf_sym_iter, so no extra allocation or argument is needed. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	1f8929293e	libbpf: Add elf_open/elf_close functions Adding elf_open/elf_close functions and using it in elf_find_func_offset_from_file function. It will be used in following changes to save some common code. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	0cd5b05f53	libbpf: Move elf_find_func_offset* functions to elf object Adding new elf object that will contain elf related functions. There's no functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	a1c2e05c4f	libbpf: Add uprobe_multi attach type and link names Adding new uprobe_multi attach type and link names, so the functions can resolve the new values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	c1a12134bd	bpf: Add pid filter support for uprobe_multi link Adding support to specify pid for uprobe_multi link and the uprobes are created only for task with given pid value. Using the consumer.filter filter callback for that, so the task gets filtered during the uprobe installation. We still need to check the task during runtime in the uprobe handler, because the handler could get executed if there's another system wide consumer on the same uprobe (thanks Oleg for the insight). Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	12466f75db	bpf: Add cookies support for uprobe_multi link Adding support to specify cookies array for uprobe_multi link. The cookies array shares indexes and length with other uprobe_multi arrays (offsets/ref_ctr_offsets). The cookies[i] value defines cookie for i-th uprobe and will be returned by bpf_get_attach_cookie helper when called from ebpf program hooked to that specific uprobe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	ba4a10d764	bpf: Add multi uprobe link Adding new multi uprobe link that allows to attach bpf program to multiple uprobes. Uprobes to attach are specified via new link_create uprobe_multi union: struct { __aligned_u64 path; __aligned_u64 offsets; __aligned_u64 ref_ctr_offsets; __u32 cnt; __u32 flags; } uprobe_multi; Uprobes are defined for single binary specified in path and multiple calling sites specified in offsets array with optional reference counters specified in ref_ctr_offsets array. All specified arrays have length of 'cnt'. The 'flags' supports single bit for now that marks the uprobe as return probe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Jiri Olsa	8765ef8276	bpf: Switch BPF_F_KPROBE_MULTI_RETURN macro to enum Switching BPF_F_KPROBE_MULTI_RETURN macro to anonymous enum, so it'd show up in vmlinux.h. There's no functional change compared to having this as macro. Acked-by: Yafang Shao <laoar.shao@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-25 11:51:28 -07:00
Andrii Nakryiko	6a91da19fe	fuzz: use https-based URL for elfutils For environments behind proxies, having https:// URL for pulling GIT is more convenient. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-24 14:14:18 -07:00
Andrii Nakryiko	383198dc49	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a3e7e6b17946f48badce98d7ac360678a0ea7393 Checkpoint bpf-next commit: 0a55264cf966fb95ebf9d03d9f81fa992f069312 Baseline bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Checkpoint bpf commit: 23d775f12dcd23d052a4927195f15e970e27ab26 Alan Maguire (1): bpf: sync tools/ uapi header with Arnaldo Carvalho de Melo (1): tools headers uapi: Sync linux/fcntl.h with the kernel sources Daniel Borkmann (5): bpf: Add generic attach/detach/query API for multi-progs bpf: Add fd-based tcx multi-prog infra with link support libbpf: Add opts-based attach/detach/query API for tcx libbpf: Add link-based API for tcx libbpf: Add helper macro to clear opts structs Daniel Xu (1): netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link Dave Marchevsky (1): libbpf: Support triple-underscore flavors for kfunc relocation Jiri Olsa (1): bpf: Add support for bpf_get_func_ip helper for uprobe program Lorenz Bauer (1): bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign Maciej Fijalkowski (1): xsk: add new netlink attribute dedicated for ZC max frags Magnus Karlsson (2): selftests/xsk: transmit and receive multi-buffer packets selftests/xsk: add basic multi-buffer test Marco Vedovati (1): libbpf: Set close-on-exec flag on gzopen Sergey Kacheev (1): libbpf: Use local includes inside the library Stanislav Fomichev (1): ynl: regenerate all headers Yafang Shao (2): bpf: Support ->fill_link_info for kprobe_multi bpf: Support ->fill_link_info for perf_event Yonghong Song (1): bpf: Support new sign-extension load insns include/uapi/linux/bpf.h \| 128 +++++++++++++++++++++++++++++++----- include/uapi/linux/fcntl.h \| 5 ++ include/uapi/linux/if_xdp.h \| 9 +++ include/uapi/linux/netdev.h \| 4 +- src/bpf.c \| 127 ++++++++++++++++++++++++----------- src/bpf.h \| 97 +++++++++++++++++++++++---- src/bpf_tracing.h \| 2 +- src/libbpf.c \| 94 +++++++++++++++++++++----- src/libbpf.h \| 18 ++++- src/libbpf.map \| 2 + src/libbpf_common.h \| 16 +++++ src/netlink.c \| 5 ++ src/usdt.bpf.h \| 4 +- 13 files changed, 423 insertions(+), 88 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-21 13:27:45 -07:00
Andrii Nakryiko	839c08a6d8	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-08-21 13:27:45 -07:00
Dave Marchevsky	6d704c7ffd	libbpf: Support triple-underscore flavors for kfunc relocation The function signature of kfuncs can change at any time due to their intentional lack of stability guarantees. As kfuncs become more widely used, BPF program writers will need facilities to support calling different versions of a kfunc from a single BPF object. Consider this simplified example based on a real scenario we ran into at Meta: /* initial kfunc signature */ int some_kfunc(void *ptr) /* Oops, we need to add some flag to modify behavior. No problem, change the kfunc. flags = 0 retains original behavior */ int some_kfunc(void *ptr, long flags) If the initial version of the kfunc is deployed on some portion of the fleet and the new version on the rest, a fleetwide service that uses some_kfunc will currently need to load different BPF programs depending on which some_kfunc is available. Luckily CO-RE provides a facility to solve a very similar problem, struct definition changes, by allowing program writers to declare my_struct___old and my_struct___new, with ___suffix being considered a 'flavor' of the non-suffixed name and being ignored by bpf_core_type_exists and similar calls. This patch extends the 'flavor' facility to the kfunc extern relocation process. BPF program writers can now declare extern int some_kfunc___old(void *ptr) extern int some_kfunc___new(void *ptr, int flags) then test which version of the kfunc exists with bpf_ksym_exists. Relocation and verifier's dead code elimination will work in concert as expected, allowing this pattern: if (bpf_ksym_exists(some_kfunc___old)) some_kfunc___old(ptr); else some_kfunc___new(ptr, 0); Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com	2023-08-21 13:27:45 -07:00
Marco Vedovati	20699ecf61	libbpf: Set close-on-exec flag on gzopen Enable the close-on-exec flag when using gzopen. This is especially important for multithreaded programs making use of libbpf, where a fork + exec could race with libbpf library calls, potentially resulting in a file descriptor leaked to the new process. This got missed in 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC"). Fixes: 59842c5451fe ("libbpf: Ensure libbpf always opens files with O_CLOEXEC") Signed-off-by: Marco Vedovati <marco.vedovati@crowdstrike.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230810214350.106301-1-martin.kelly@crowdstrike.com	2023-08-21 13:27:45 -07:00
Jiri Olsa	cd85f34103	bpf: Add support for bpf_get_func_ip helper for uprobe program Adding support for bpf_get_func_ip helper for uprobe program to return probed address for both uprobe and return uprobe. We discussed this in [1] and agreed that uprobe can have special use of bpf_get_func_ip helper that differs from kprobe. The kprobe bpf_get_func_ip returns: - address of the function if probe is attach on function entry for both kprobe and return kprobe - 0 if the probe is not attach on function entry The uprobe bpf_get_func_ip returns: - address of the probe for both uprobe and return uprobe The reason for this semantic change is that kernel can't really tell if the probe user space address is function entry. The uprobe program is actually kprobe type program attached as uprobe. One of the consequences of this design is that uprobes do not have its own set of helpers, but share them with kprobes. As we need different functionality for bpf_get_func_ip helper for uprobe, I'm adding the bool value to the bpf_trace_run_ctx, so the helper can detect that it's executed in uprobe context and call specific code. The is_uprobe bool is set as true in bpf_prog_run_array_sleepable, which is currently used only for executing bpf programs in uprobe. Renaming bpf_prog_run_array_sleepable to bpf_prog_run_array_uprobe to address that it's only used for uprobes and that it sets the run_ctx.is_uprobe as suggested by Yafang Shao. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> [1] https://lore.kernel.org/bpf/CAEf4BzZ=xLVkG5eurEuvLU79wAMtwho7ReR+XJAgwhFF4M-7Cg@mail.gmail.com/ Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Viktor Malik <vmalik@redhat.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230807085956.2344866-2-jolsa@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-08-21 13:27:45 -07:00
Sergey Kacheev	26e32f542b	libbpf: Use local includes inside the library In our monorepo, we try to minimize special processing when importing (aka vendor) third-party source code. Ideally, we try to import directly from the repositories with the code without changing it, we try to stick to the source code dependency instead of the artifact dependency. In the current situation, a patch has to be made for libbpf to fix the includes in bpf headers so that they work directly from libbpf/src. Signed-off-by: Sergey Kacheev <s.kacheev@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/CAJVhQqUg6OKq6CpVJP5ng04Dg+z=igevPpmuxTqhsR3dKvd9+Q@mail.gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-08-21 13:27:45 -07:00
Daniel Xu	c5f64030de	netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link This commit adds support for enabling IP defrag using pre-existing netfilter defrag support. Basically all the flag does is bump a refcnt while the link is active. Checks are also added to ensure the prog requesting defrag support is run _after_ netfilter defrag hooks. We also take care to avoid any issues w.r.t. module unloading -- while defrag is active on a link, the module is prevented from unloading. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/5cff26f97e55161b7d56b09ddcf5f8888a5add1d.1689970773.git.dxu@dxuuu.xyz Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Yonghong Song	3d0e1c5a3a	bpf: Support new sign-extension load insns Add interpreter/jit support for new sign-extension load insns which adds a new mode (BPF_MEMSX). Also add verifier support to recognize these insns and to do proper verification with new insns. In verifier, besides to deduce proper bounds for the dst_reg, probed memory access is also properly handled. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011156.3711870-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Lorenz Bauer	36cabf8a4a	bpf, net: Support SO_REUSEPORT sockets with bpf_sk_assign Currently the bpf_sk_assign helper in tc BPF context refuses SO_REUSEPORT sockets. This means we can't use the helper to steer traffic to Envoy, which configures SO_REUSEPORT on its sockets. In turn, we're blocked from removing TPROXY from our setup. The reason that bpf_sk_assign refuses such sockets is that the bpf_sk_lookup helpers don't execute SK_REUSEPORT programs. Instead, one of the reuseport sockets is selected by hash. This could cause dispatch to the "wrong" socket: sk = bpf_sk_lookup_tcp(...) // select SO_REUSEPORT by hash bpf_sk_assign(skb, sk) // SK_REUSEPORT wasn't executed Fixing this isn't as simple as invoking SK_REUSEPORT from the lookup helpers unfortunately. In the tc context, L2 headers are at the start of the skb, while SK_REUSEPORT expects L3 headers instead. Instead, we execute the SK_REUSEPORT program when the assigned socket is pulled out of the skb, further up the stack. This creates some trickiness with regards to refcounting as bpf_sk_assign will put both refcounted and RCU freed sockets in skb->sk. reuseport sockets are RCU freed. We can infer that the sk_assigned socket is RCU freed if the reuseport lookup succeeds, but convincing yourself of this fact isn't straight forward. Therefore we defensively check refcounting on the sk_assign sock even though it's probably not required in practice. Fixes: 8e368dc72e86 ("bpf: Fix use of sk->sk_reuseport from sk_assign") Fixes: cf7fbe660f2d ("bpf: Add socket assign support") Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Joe Stringer <joe@cilium.io> Link: https://lore.kernel.org/bpf/CACAyw98+qycmpQzKupquhkxbvWK4OFyDuuLMBNROnfWMZxUWeA@mail.gmail.com/ Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/r/20230720-so-reuseport-v6-7-7021b683cdae@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-08-21 13:27:45 -07:00
Stanislav Fomichev	1180ab4066	ynl: regenerate all headers Also add support to pass topdir to ynl-regen.sh (Jakub) and call it from the makefile to update the UAPI headers. Signed-off-by: Stanislav Fomichev <sdf@google.com> Co-developed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230727163001.3952878-4-sdf@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-08-21 13:27:45 -07:00
Arnaldo Carvalho de Melo	e6ab647970	tools headers uapi: Sync linux/fcntl.h with the kernel sources To get the changes in: 96b2b072ee62be8a ("exportfs: allow exporting non-decodeable file handles to userspace") That don't add anything that is handled by existing hard coded tables or table generation scripts. This silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZK11P5AwRBUxxutI@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-08-21 13:27:45 -07:00
Alan Maguire	71d8eadb90	bpf: sync tools/ uapi header with Seeing the following: Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h' ...so sync tools version missing some list_node/rb_tree fields. Fixes: c3c510ce431c ("bpf: Add 'owner' field to bpf_{list,rb}_node") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20230719162257.20818-1-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Daniel Borkmann	488031955d	libbpf: Add helper macro to clear opts structs Add a small and generic LIBBPF_OPTS_RESET() helper macros which clears an opts structure and reinitializes its .sz member to place the structure size. Additionally, the user can pass option-specific data to reinitialize via varargs. I found this very useful when developing selftests, but it is also generic enough as a macro next to the existing LIBBPF_OPTS() which hides the .sz initialization, too. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-6-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
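The reset pattern described here (clear the struct, restore `.sz`, then apply optional per-call fields) can be sketched without libbpf. `struct demo_opts` and `DEMO_OPTS_RESET()` below are hypothetical stand-ins written for this note; they are not libbpf's `LIBBPF_OPTS_RESET()` definition, which is generic over the opts type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opts struct; like libbpf opts structs it leads with .sz. */
struct demo_opts {
	size_t sz;        /* size of this struct, set by the helper */
	int flags;
	int relative_fd;
};

/* Simplified reset helper: zero the whole struct, then reinitialize .sz
 * and any designated initializers the caller passes via varargs.
 * Assumption: the caller passes at least one initializer; an empty
 * argument list relies on compiler support for empty __VA_ARGS__. */
#define DEMO_OPTS_RESET(NAME, ...)                              \
	do {                                                    \
		memset(&(NAME), 0, sizeof(NAME));               \
		(NAME) = (struct demo_opts){                    \
			.sz = sizeof(struct demo_opts),         \
			__VA_ARGS__                             \
		};                                              \
	} while (0)
```

Reusing one opts variable across several calls then reads as `DEMO_OPTS_RESET(opts, .flags = 1);`, with every field not named in the call returning to zero.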
Daniel Borkmann	0fadd4ba39	libbpf: Add link-based API for tcx Implement tcx BPF link support for libbpf. The bpf_program__attach_fd() API has been refactored slightly in order to pass bpf_link_create_opts pointer as input. A new bpf_program__attach_tcx() has been added on top of this which allows for passing all relevant data via extensible struct bpf_tcx_opts. The program sections tcx/ingress and tcx/egress correspond to the hook locations for tc ingress and egress, respectively. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-5-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Daniel Borkmann	bb5d7c1be8	libbpf: Add opts-based attach/detach/query API for tcx Extend libbpf attach opts and add a new detach opts API so this can be used to add/remove fd-based tcx BPF programs. The old-style bpf_prog_detach() and bpf_prog_detach2() APIs are refactored to reuse the new bpf_prog_detach_opts() internally. The bpf_prog_query_opts() API got extended to be able to handle the new link_ids, link_attach_flags and revision fields. For concrete usage examples, see the extensive selftests that have been developed as part of this series. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-4-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Daniel Borkmann	b064c40d94	bpf: Add fd-based tcx multi-prog infra with link support This work refactors and adds a lightweight extension ("tcx") to the tc BPF ingress and egress data path side for allowing BPF program management based on fds via bpf() syscall through the newly added generic multi-prog API. The main goal behind this work which we also presented at LPC [0] last year and a recent update at LSF/MM/BPF this year [3] is to support long-awaited BPF link functionality for tc BPF programs, which allows for a model of safe ownership and program detachment. Given the rise in tc BPF users in cloud native environments, this becomes necessary to avoid hard to debug incidents either through stale leftover programs or 3rd party applications accidentally stepping on each others toes. As a recap, a BPF link represents the attachment of a BPF program to a BPF hook point. The BPF link holds a single reference to keep BPF program alive. Moreover, hook points do not reference a BPF link, only the application's fd or pinning does. A BPF link holds meta-data specific to attachment and implements operations for link creation, (atomic) BPF program update, detachment and introspection. The motivation for BPF links for tc BPF programs is multi-fold, for example: - From Meta: "It's especially important for applications that are deployed fleet-wide and that don't "control" hosts they are deployed to. If such application crashes and no one notices and does anything about that, BPF program will keep running draining resources or even just, say, dropping packets. We at FB had outages due to such permanent BPF attachment semantics. With fd-based BPF link we are getting a framework, which allows safe, auto-detachable behavior by default, unless application explicitly opts in by pinning the BPF link." 
[1] - From Cilium-side the tc BPF programs we attach to host-facing veth devices and phys devices build the core datapath for Kubernetes Pods, and they implement forwarding, load-balancing, policy, EDT-management, etc, within BPF. Currently there is no concept of 'safe' ownership, e.g. we've recently experienced hard-to-debug issues in a user's staging environment where another Kubernetes application using tc BPF attached to the same prio/handle of cls_bpf, accidentally wiping all Cilium-based BPF programs from underneath it. The goal is to establish a clear/safe ownership model via links which cannot accidentally be overridden. [0,2] BPF links for tc can co-exist with non-link attachments, and the semantics are in line also with XDP links: BPF links cannot replace other BPF links, BPF links cannot replace non-BPF links, non-BPF links cannot replace BPF links and lastly only non-BPF links can replace non-BPF links. In case of Cilium, this would solve mentioned issue of safe ownership model as 3rd party applications would not be able to accidentally wipe Cilium programs, even if they are not BPF link aware. Earlier attempts [4] have tried to integrate BPF links into core tc machinery to solve cls_bpf, which has been intrusive to the generic tc kernel API with extensions only specific to cls_bpf and suboptimal/complex since cls_bpf could be wiped from the qdisc also. Locking a tc BPF program in place this way, is getting into layering hacks given the two object models are vastly different. We instead implemented the tcx (tc 'express') layer which is an fd-based tc BPF attach API, so that the BPF link implementation blends in naturally similar to other link types which are fd-based and without the need for changing core tc internal APIs. BPF programs for tc can then be successively migrated from classic cls_bpf to the new tc BPF link without needing to change the program's source code, just the BPF loader mechanics for attaching is sufficient. 
For the current tc framework, there is no change in behavior with this change and neither does this change touch on tc core kernel APIs. The gist of this patch is that the ingress and egress hook have a lightweight, qdisc-less extension for BPF to attach its tc BPF programs, in other words, a minimal entry point for tc BPF. The name tcx has been suggested from discussion of earlier revisions of this work as a good fit, and to more easily differ between the classic cls_bpf attachment and the fd-based one. For the ingress and egress tcx points, the device holds a cache-friendly array with program pointers which is separated from control plane (slow-path) data. Earlier versions of this work used priority to determine ordering and expression of dependencies similar as with classic tc, but it was challenged that for something more future-proof a better user experience is required. Hence this resulted in the design and development of the generic attach/detach/query API for multi-progs. See prior patch with its discussion on the API design. tcx is the first user and later we plan to integrate also others, for example, one candidate is multi-prog support for XDP which would benefit and have the same 'look and feel' from API perspective. The goal with tcx is to have maximum compatibility to existing tc BPF programs, so they don't need to be rewritten specifically. Compatibility to call into classic tcf_classify() is also provided in order to allow successive migration or both to cleanly co-exist where needed given its all one logical tc layer and the tcx plus classic tc cls/act build one logical overall processing pipeline. tcx supports the simplified return codes TCX_NEXT which is non-terminating (go to next program) and terminating ones with TCX_PASS, TCX_DROP, TCX_REDIRECT. The fd-based API is behind a static key, so that when unused the code is also not entered. 
The struct tcx_entry's program array is currently static, but could be made dynamic if necessary at a point in future. The a/b pair swap design has been chosen so that for detachment there are no allocations which otherwise could fail. The work has been tested with tc-testing selftest suite which all passes, as well as the tc BPF tests from the BPF CI, and also with Cilium's L4LB. Thanks also to Nikolay Aleksandrov and Martin Lau for in-depth early reviews of this work. [0] https://lpc.events/event/16/contributions/1353/ [1] https://lore.kernel.org/bpf/CAEf4BzbokCJN33Nw_kg82sO=xppXnKWEncGTWCTB9vGCmLB6pw@mail.gmail.com [2] https://colocatedeventseu2023.sched.com/event/1Jo6O/tales-from-an-ebpf-programs-murder-mystery-hemanth-malla-guillaume-fournier-datadog [3] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf [4] https://lore.kernel.org/bpf/20210604063116.234316-1-memxor@gmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230719140858.13224-3-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
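The split between the non-terminating TCX_NEXT and the terminating verdicts can be pictured as a loop over a program array that stops at the first terminating result. The sketch below is user-space C with hypothetical names (`demo_verdict`, `demo_run()`, the three sample programs); it is not the kernel's tcx run loop, and what happens when every program returns "next" is left to the caller here.

```c
#include <stddef.h>

/* Hypothetical verdicts for this sketch; the names echo the commit text. */
enum demo_verdict {
	DEMO_NEXT = -1,   /* non-terminating: continue with the next program */
	DEMO_PASS = 0,
	DEMO_DROP = 2,
	DEMO_REDIRECT = 7,
};

typedef enum demo_verdict (*demo_prog_fn)(const void *pkt);

/* Run programs in array order until one returns a terminating verdict.
 * Returns DEMO_NEXT if the array is empty or no program terminated. */
static enum demo_verdict demo_run(const demo_prog_fn *progs, size_t cnt,
				  const void *pkt)
{
	enum demo_verdict ret = DEMO_NEXT;

	for (size_t i = 0; i < cnt && ret == DEMO_NEXT; i++)
		ret = progs[i](pkt);
	return ret;
}

/* Sample programs used only to exercise the loop. */
static enum demo_verdict demo_prog_next(const void *pkt)
{
	(void)pkt;
	return DEMO_NEXT;
}

static enum demo_verdict demo_prog_drop(const void *pkt)
{
	(void)pkt;
	return DEMO_DROP;
}

static enum demo_verdict demo_prog_pass(const void *pkt)
{
	(void)pkt;
	return DEMO_PASS;
}
```

In a chain of next, drop, pass, the third program never runs because the second one already produced a terminating verdict.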
Daniel Borkmann	d7e583a6ea	bpf: Add generic attach/detach/query API for multi-progs This adds a generic layer called bpf_mprog which can be reused by different attachment layers to enable multi-program attachment and dependency resolution. In-kernel users of the bpf_mprog don't need to care about the dependency resolution internals, they can just consume it with few API calls. The initial idea of having a generic API sparked out of discussion [0] from an earlier revision of this work where tc's priority was reused and exposed via BPF uapi as a way to coordinate dependencies among tc BPF programs, similar as-is for classic tc BPF. The feedback was that priority provides a bad user experience and is hard to use [1], e.g.: I cannot help but feel that priority logic copy-paste from old tc, netfilter and friends is done because "that's how things were done in the past". [...] Priority gets exposed everywhere in uapi all the way to bpftool when it's right there for users to understand. And that's the main problem with it. The user don't want to and don't need to be aware of it, but uapi forces them to pick the priority. [...] Your cover letter [0] example proves that in real life different service pick the same priority. They simply don't know any better. Priority is an unnecessary magic that apps _have_ to pick, so they just copy-paste and everyone ends up using the same. The course of the discussion showed more and more the need for a generic, reusable API where the "same look and feel" can be applied for various other program types beyond just tc BPF, for example XDP today does not have multi- program support in kernel, but also there was interest around this API for improving management of cgroup program types. Such common multi-program management concept is useful for BPF management daemons or user space BPF applications coordinating internally about their attachments. 
Both from Cilium and Meta side [2], we've collected the following requirements for a generic attach/detach/query API for multi-progs which has been implemented as part of this work: - Support prog-based attach/detach and link API - Dependency directives (can also be combined): - BPF_F_{BEFORE,AFTER} with relative_{fd,id} which can be {prog,link,none} - BPF_F_ID flag as {fd,id} toggle; the rationale for id is so that user space application does not need CAP_SYS_ADMIN to retrieve foreign fds via bpf_*_get_fd_by_id() - BPF_F_LINK flag as {prog,link} toggle - If relative_{fd,id} is none, then BPF_F_BEFORE will just prepend, and BPF_F_AFTER will just append for attaching - Enforced only at attach time - BPF_F_REPLACE with replace_bpf_fd which can be prog, links have their own infra for replacing their internal prog - If no flags are set, then it's default append behavior for attaching - Internal revision counter and optionally being able to pass expected_revision - User space application can query current state with revision, and pass it along for attachment to assert current state before doing updates - Query also gets extension for link_ids array and link_attach_flags: - prog_ids are always filled with program IDs - link_ids are filled with link IDs when link was used, otherwise 0 - {prog,link}_attach_flags for holding {prog,link}-specific flags - Must be easy to integrate/reuse for in-kernel users The uapi-side changes needed for supporting bpf_mprog are rather minimal, consisting of the additions of the attachment flags, revision counter, and expanding existing union with relative_{fd,id} member. The bpf_mprog framework consists of an bpf_mprog_entry object which holds an array of bpf_mprog_fp (fast-path structure). The bpf_mprog_cp (control-path structure) is part of bpf_mprog_bundle. Both have been separated, so that fast-path gets efficient packing of bpf_prog pointers for maximum cache efficiency. 
Also, array has been chosen instead of linked list or other structures to remove unnecessary indirections for a fast point-to-entry in tc for BPF. The bpf_mprog_entry comes as a pair via bpf_mprog_bundle so that in case of updates the peer bpf_mprog_entry is populated and then just swapped which avoids additional allocations that could otherwise fail, for example, in detach case. bpf_mprog_{fp,cp} arrays are currently static, but they could be converted to dynamic allocation if necessary at a point in future. Locking is deferred to the in-kernel user of bpf_mprog, for example, in case of tcx which uses this API in the next patch, it piggybacks on rtnl. An extensive test suite for checking all aspects of this API for prog-based attach/detach and link API comes as BPF selftests in this series. Thanks also to Andrii Nakryiko for early API discussions wrt Meta's BPF prog management. [0] https://lore.kernel.org/bpf/20221004231143.19190-1-daniel@iogearbox.net [1] https://lore.kernel.org/bpf/CAADnVQ+gEY3FjCR=+DmjDR4gp5bOYZUFJQXj4agKFHT9CQPZBw@mail.gmail.com [2] http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-2-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
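The attach rules listed above (default append, before/after with or without a relative program, and the expected-revision check) can be modelled on a small static array. This is a user-space sketch under stated simplifications: `struct demo_mprog`, `demo_attach()` and the `DEMO_F_*` flags are hypothetical, programs are identified by plain integer IDs, the array is updated in place rather than through the a/b pair swap, combining both directives is rejected, and the error codes are chosen for illustration only.

```c
#include <errno.h>
#include <string.h>

#define DEMO_MAX_PROGS 8
#define DEMO_F_BEFORE  (1U << 0)
#define DEMO_F_AFTER   (1U << 1)

struct demo_mprog {
	int ids[DEMO_MAX_PROGS];  /* attached program IDs in execution order */
	int cnt;
	unsigned long revision;   /* bumped on every successful update */
};

static void demo_mprog_init(struct demo_mprog *m)
{
	memset(m, 0, sizeof(*m));
	m->revision = 1;
}

static int demo_find(const struct demo_mprog *m, int id)
{
	for (int i = 0; i < m->cnt; i++)
		if (m->ids[i] == id)
			return i;
	return -1;
}

/* relative_id == 0 means "no relative program";
 * expected_revision == 0 means "do not check the revision". */
static int demo_attach(struct demo_mprog *m, int id, unsigned int flags,
		       int relative_id, unsigned long expected_revision)
{
	int idx;

	if (expected_revision && expected_revision != m->revision)
		return -ESTALE;
	if ((flags & DEMO_F_BEFORE) && (flags & DEMO_F_AFTER))
		return -EINVAL;  /* simplification: one directive at a time */
	if (id <= 0)
		return -EINVAL;
	if (demo_find(m, id) >= 0)
		return -EEXIST;
	if (m->cnt == DEMO_MAX_PROGS)
		return -ERANGE;

	if (relative_id) {
		if (!(flags & (DEMO_F_BEFORE | DEMO_F_AFTER)))
			return -EINVAL;
		idx = demo_find(m, relative_id);
		if (idx < 0)
			return -ENOENT;
		if (flags & DEMO_F_AFTER)
			idx++;
	} else {
		/* BEFORE without a relative prepends; everything else appends. */
		idx = (flags & DEMO_F_BEFORE) ? 0 : m->cnt;
	}

	memmove(&m->ids[idx + 1], &m->ids[idx],
		(size_t)(m->cnt - idx) * sizeof(m->ids[0]));
	m->ids[idx] = id;
	m->cnt++;
	m->revision++;
	return 0;
}
```

A caller that queried the state at revision 3 and passes `expected_revision = 3` succeeds only if nothing else changed the array in between; otherwise the attach is refused and the array is left untouched.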
Magnus Karlsson	071630384b	selftests/xsk: add basic multi-buffer test Add the first basic multi-buffer test that sends a stream of 9K packets and validates that they are received at the other end. In order to enable sending and receiving multi-buffer packets, code that sets the MTU is introduced as well as modifications to the XDP programs so that they signal that they are multi-buffer enabled. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-20-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Magnus Karlsson	658b107d4d	selftests/xsk: transmit and receive multi-buffer packets Add the ability to send and receive packets that are larger than the size of a umem frame, using the AF_XDP /XDP multi-buffer support. There are three pieces of code that need to be changed to achieve this: the Rx path, the Tx path, and the validation logic. Both the Rx path and Tx could only deal with a single fragment per packet. The Tx path is extended with a new function called pkt_nb_frags() that can be used to retrieve the number of fragments a packet will consume. We then create this many fragments in a loop and fill the N-1 first ones to the max size limit to use the buffer space efficiently, and the Nth one with whatever data that is left. This goes on until we have filled in at the most BATCH_SIZE worth of descriptors and fragments. If we detect that the next packet would lead to BATCH_SIZE number of fragments sent being exceeded, we do not send this packet and finish the batch. This packet is instead sent in the next iteration of BATCH_SIZE fragments. For Rx, we loop over all fragments we receive as usual, but for every descriptor that we receive we call a new validation function called is_frag_valid() to validate the consistency of this fragment. The code then checks if the packet continues in the next frame. If so, it loops over the next packet and performs the same validation. Once we have received the last fragment of the packet we also call the function is_pkt_valid() to validate the packet as a whole. If we get to the end of the batch and we are not at the end of the current packet, we back out the partial packet and end the loop. Once we get into the receive loop next time, we start over from the beginning of that packet. This keeps the code simpler at the cost of some performance. The validation function is_frag_valid() checks that the sequence and packet numbers are correct at the start and end of each fragment.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-19-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
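The fragment arithmetic on the Tx side (N-1 fragments filled to the limit, the Nth carrying the remainder) reduces to a ceiling division. The helpers below are a sketch with hypothetical names and parameters, not the selftest's pkt_nb_frags(); treating a zero-length packet as one descriptor is an assumption of this sketch.

```c
#include <stdint.h>

/* Number of fragments needed for a packet of pkt_len bytes when each
 * fragment holds at most max_frag_len bytes. Returns 0 for an invalid
 * limit; a zero-length packet is assumed to use one descriptor. */
static uint32_t demo_nb_frags(uint32_t pkt_len, uint32_t max_frag_len)
{
	if (max_frag_len == 0)
		return 0;
	if (pkt_len == 0)
		return 1;
	return (pkt_len - 1) / max_frag_len + 1;
}

/* Length of fragment number idx (0-based): the first N-1 fragments are
 * full, the last one carries whatever is left. Returns 0 if idx is out
 * of range. */
static uint32_t demo_frag_len(uint32_t pkt_len, uint32_t max_frag_len,
			      uint32_t idx)
{
	uint32_t nb = demo_nb_frags(pkt_len, max_frag_len);

	if (idx >= nb)
		return 0;
	if (idx < nb - 1)
		return max_frag_len;
	return pkt_len - (nb - 1) * max_frag_len;
}
```

For a 9000-byte packet and an assumed 4096-byte limit this gives three fragments of 4096, 4096 and 808 bytes.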
Maciej Fijalkowski	8ae70bcbdf	xsk: add new netlink attribute dedicated for ZC max frags Introduce new netlink attribute NETDEV_A_DEV_XDP_ZC_MAX_SEGS that will carry maximum fragments that underlying ZC driver is able to handle on TX side. It is going to be included in netlink response only when driver supports ZC. Any value higher than 1 implies multi-buffer ZC support on underlying device. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-11-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Yafang Shao	4cd8e50d37	bpf: Support ->fill_link_info for perf_event By introducing support for ->fill_link_info to the perf_event link, users gain the ability to inspect it using `bpftool link show`. While the current approach involves accessing this information via `bpftool perf show`, consolidating link information for all link types in one place offers greater convenience. Additionally, this patch extends support to the generic perf event, which is not currently accommodated by `bpftool perf show`. While only the perf type and config are exposed to userspace, other attributes such as sample_period and sample_freq are ignored. It's important to note that if kptr_restrict is not permitted, the probed address will not be exposed, maintaining security measures. A new enum bpf_perf_event_type is introduced to help the user understand which struct is relevant. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230709025630.3735-9-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Yafang Shao	b89ede420b	bpf: Support ->fill_link_info for kprobe_multi With the addition of support for fill_link_info to the kprobe_multi link, users will gain the ability to inspect it conveniently using the `bpftool link show`. This enhancement provides valuable information to the user, including the count of probed functions and their respective addresses. It's important to note that if the kptr_restrict setting is not permitted, the probed address will not be exposed, ensuring security. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230709025630.3735-2-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-08-21 13:27:45 -07:00
Andrii Nakryiko	05f94ddbb8	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c628747cc8800cf6d33d09f7f42c8b6f91e64dc7 Checkpoint bpf-next commit: a3e7e6b17946f48badce98d7ac360678a0ea7393 Baseline bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Checkpoint bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Andrii Nakryiko (1): libbpf: Fix realloc API handling in zero-sized edge cases John Sanpe (1): libbpf: Remove HASHMAP_INIT static initialization helper src/hashmap.h \| 10 ---------- src/libbpf.c \| 15 ++++++++++++--- src/usdt.c \| 5 ++++- 3 files changed, 16 insertions(+), 14 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-07-11 10:03:25 -07:00
John Sanpe	bf88aaa6fe	libbpf: Remove HASHMAP_INIT static initialization helper Remove the wrong HASHMAP_INIT. It's not used anywhere in libbpf. Signed-off-by: John Sanpe <sanpeqf@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230711070712.2064144-1-sanpeqf@gmail.com	2023-07-11 10:03:25 -07:00
Andrii Nakryiko	f117080307	libbpf: Fix realloc API handling in zero-sized edge cases realloc() and reallocarray() can either return NULL or a special non-NULL pointer, if their size argument is zero. This requires a bit more care to handle NULL-as-valid-result situation differently from NULL-as-error case. This has caused real issues before ([0]), and just recently bit again in production when performing bpf_program__attach_usdt(). This patch fixes 4 places that do or potentially could suffer from this mishandling of NULL, including the reported USDT-related one. There are many other places where realloc()/reallocarray() is used and NULL is always treated as an error value, but all those have guarantees that their size is always non-zero, so those spots don't need any extra handling. [0] d08ab82f59d5 ("libbpf: Fix double-free when linker processes empty sections") Fixes: 999783c8bbda ("libbpf: Wire up spec management and other arch-independent USDT logic") Fixes: b63b3c490eee ("libbpf: Add bpf_program__set_insns function") Fixes: 697f104db8a6 ("libbpf: Support custom SEC() handlers") Fixes: b12688267280 ("libbpf: Change the order of data and text relocations.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230711024150.1566433-1-andrii@kernel.org	2023-07-11 10:03:25 -07:00
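One way to keep the NULL-as-valid-result case apart from the NULL-as-error case is to never hand a zero size to realloc() at all. The helper below is a sketch of that alternative; `demo_resize()` is a hypothetical name, and this is not the code the patch applies inside libbpf.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Resize *arr to hold new_cnt ints. A zero count frees the array and
 * leaves *arr == NULL, which is reported as success, so a NULL pointer
 * signals an error only when memory was actually requested.
 * Returns 0 on success, -ENOMEM on failure (*arr is left unchanged). */
static int demo_resize(int **arr, size_t new_cnt)
{
	int *tmp;

	if (new_cnt == 0) {
		free(*arr);
		*arr = NULL;
		return 0;
	}
	if (new_cnt > SIZE_MAX / sizeof(**arr))
		return -ENOMEM;  /* byte count would overflow */

	tmp = realloc(*arr, new_cnt * sizeof(**arr));
	if (!tmp)
		return -ENOMEM;
	*arr = tmp;
	return 0;
}
```

Callers can then treat a negative return value as the only error signal and never need to interpret a NULL pointer on its own.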
Andrii Nakryiko	6c020e6c47	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 856fe03d929205b4c8c8fa51296342cd85592e3f Checkpoint bpf-next commit: c628747cc8800cf6d33d09f7f42c8b6f91e64dc7 Baseline bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Checkpoint bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Andrii Nakryiko (1): libbpf: only reset sec_def handler when necessary src/libbpf.c \| 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-07-10 14:24:42 -07:00
Andrii Nakryiko	1743bd1e40	libbpf: only reset sec_def handler when necessary Don't reset recorded sec_def handler unconditionally on bpf_program__set_type(). There are two situations where this is wrong. First, if the program type didn't actually change. In that case original SEC handler should work just fine. Second, catch-all custom SEC handler is supposed to work with any BPF program type and SEC() annotation, so it also doesn't make sense to reset that. This patch fixes both issues. This was reported recently in the context of breaking perf tool, which uses custom catch-all handler for fancy BPF prologue generation logic. This patch should fix the issue. [0] https://lore.kernel.org/linux-perf-users/ab865e6d-06c5-078e-e404-7f90686db50d@amd.com/ Fixes: d6e6286a12e7 ("libbpf: disassociate section handler on explicit bpf_program__set_type() call") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230707231156.1711948-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-07-10 14:24:42 -07:00
Andrii Nakryiko	a2258003f2	ci: install headers before building selftests Ensure latest kernel headers are available. Similar to [0]. [0] https://github.com/libbpf/ci/pull/102 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-07-07 18:55:44 -07:00
Andrii Nakryiko	add1aac281	ci: add kprobe_multi_bench_attach to DENYLIST It is suspected to be causing kernel crashes in libbpf CI, which we don't see in kernel-patches CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-07-07 18:55:44 -07:00
Andrii Nakryiko	ea27ebcffd	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 25085b4e9251c77758964a8e8651338972353642 Checkpoint bpf-next commit: 856fe03d929205b4c8c8fa51296342cd85592e3f Baseline bpf commit: ad96f1c9138e0897bee7f7c5e54b3e24f8b62f57 Checkpoint bpf commit: 496720b7cfb6574a8f6f4d434f23e3d1e6cfaeb9 Andrea Terzolo (1): libbpf: Skip modules BTF loading when CAP_SYS_ADMIN is missing Florian Westphal (1): libbpf: Add netfilter link attach helper Jackie Liu (2): libbpf: Cross-join available_filter_functions and kallsyms for multi-kprobes libbpf: Use available_filter_functions_addrs with multi-kprobes src/bpf.c \| 8 ++ src/bpf.h \| 6 ++ src/libbpf.c \| 216 ++++++++++++++++++++++++++++++++++++++++++++++--- src/libbpf.h \| 15 ++++ src/libbpf.map \| 1 + 5 files changed, 233 insertions(+), 13 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-07-07 18:55:44 -07:00
Jackie Liu	b9c4ad5468	libbpf: Use available_filter_functions_addrs with multi-kprobes Now that kernel provides a new available_filter_functions_addrs file which can help us avoid the need to cross-validate available_filter_functions and kallsyms, we can improve efficiency of multi-attach kprobes. For example, on my device, the start time of the sample program [1], run as `sudo ./funccount "tcp_*"`, goes from 1.2s before this change to 1.0s after it. [1]: https://github.com/JackieLiu1/ketones/tree/master/src/funccount Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230705091209.3803873-2-liu.yun@linux.dev	2023-07-07 18:55:44 -07:00
Jackie Liu	732c4c6df2	libbpf: Cross-join available_filter_functions and kallsyms for multi-kprobes When using regular expression matching with "kprobe multi", it scans all the functions under "/proc/kallsyms" that can be matched. However, not all of them can be traced by kprobe.multi. If any one of the functions fails to be traced, it will result in the failure of all functions. The best approach is to filter out the functions that cannot be traced to ensure proper tracking of the functions. Closes: https://lore.kernel.org/oe-kbuild-all/202307030355.TdXOHklM-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Jiri Olsa <jolsa@kernel.org> Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230705091209.3803873-1-liu.yun@linux.dev	2023-07-07 18:55:44 -07:00
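The filtering idea (keep only pattern matches that are also known to be traceable) amounts to intersecting two name lists. The sketch below works on in-memory string arrays with a hypothetical helper, `demo_filter_traceable()`; it does not parse /proc/kallsyms or available_filter_functions, and it is not libbpf's implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator over arrays of C strings, usable with qsort() and bsearch(). */
static int demo_cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Keep only the candidate names that also appear in `traceable`.
 * `traceable` is sorted in place so each lookup is a binary search;
 * surviving candidates are compacted to the front of `cands` in their
 * original order. Returns the number of names kept. */
static size_t demo_filter_traceable(const char **cands, size_t n_cands,
				    const char **traceable, size_t n_traceable)
{
	size_t kept = 0;

	qsort(traceable, n_traceable, sizeof(*traceable), demo_cmp_str);
	for (size_t i = 0; i < n_cands; i++) {
		if (bsearch(&cands[i], traceable, n_traceable,
			    sizeof(*traceable), demo_cmp_str))
			cands[kept++] = cands[i];
	}
	return kept;
}
```

Dropping untraceable names up front matters because, as the commit notes, one failing function would otherwise fail the whole multi-attach.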
Florian Westphal	6bec18258c	libbpf: Add netfilter link attach helper Add new API function: bpf_program__attach_netfilter. It takes a bpf program (netfilter type), and a pointer to an option struct that contains the desired attachment (protocol family, priority, hook location, ...). It returns a pointer to a 'bpf_link' structure or NULL on error. Next patch adds new netfilter_basic test that uses this function to attach a program to a few pf/hook/priority combinations. v2: change name and use bpf_link_create. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/CAEf4BzZrmUv27AJp0dDxBDMY_B8e55-wLs8DUKK69vCWsCG_pQ@mail.gmail.com/ Link: https://lore.kernel.org/bpf/CAEf4BzZ69YgrQW7DHCJUT_X+GqMq_ZQQPBwopaJJVGFD5=d5Vg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230628152738.22765-2-fw@strlen.de	2023-07-07 18:55:44 -07:00
Andrea Terzolo	3f33f9a6b8	libbpf: Skip modules BTF loading when CAP_SYS_ADMIN is missing If during CO-RE relocations libbpf is not able to find the target type in the running kernel BTF, it searches for it in modules' BTF. The downside of this approach is that loading modules' BTF requires CAP_SYS_ADMIN and this prevents BPF applications from running with more granular capabilities (e.g. CAP_BPF) when they don't need to search types into modules' BTF. This patch skips by default modules' BTF loading phase when CAP_SYS_ADMIN is missing. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Co-developed-by: Federico Di Pierro <nierro92@gmail.com> Signed-off-by: Federico Di Pierro <nierro92@gmail.com> Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/CAGQdkDvYU_e=_NX+6DRkL_-TeH3p+QtsdZwHkmH0w3Fuzw0C4w@mail.gmail.com Link: https://lore.kernel.org/bpf/20230626093614.21270-1-andreaterzolo3@gmail.com	2023-07-07 18:55:44 -07:00
Manu Bretelle	ec6f716eda	ci: Add bpf_nf/{xdp,tc-bpf}-ct to denylist for x86 This test is consistently failing on x86 for unknown reasons. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2023-06-17 00:07:28 +00:00
Manu Bretelle	3c7fcfe0ce	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: fcf1fa29c8ea75bf104c35ce29b65ce2ba6a6a9d Checkpoint bpf-next commit: 25085b4e9251c77758964a8e8651338972353642 Baseline bpf commit: f726e03564ef4e754dd93beb54303e2e1671049e Checkpoint bpf commit: ad96f1c9138e0897bee7f7c5e54b3e24f8b62f57 Andrii Nakryiko (2): libbpf: Ensure libbpf always opens files with O_CLOEXEC libbpf: Ensure FD >= 3 during bpf_map__reuse_fd() Florian Westphal (1): bpf: netfilter: Add BPF_NETFILTER bpf_attach_type JP Kobryn (1): libbpf: Change var type in datasec resize func Louis DeLosSantos (1): bpf: Add table ID to bpf_fib_lookup BPF helper include/uapi/linux/bpf.h \| 22 +++++++++++++++++++--- src/btf.c \| 2 +- src/libbpf.c \| 26 +++++++++++++------------- src/libbpf_probes.c \| 4 +++- src/usdt.c \| 5 ++--- 5 files changed, 38 insertions(+), 21 deletions(-) Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2023-06-17 00:07:28 +00:00
Manu Bretelle	ef3e2ef82a	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2023-06-17 00:07:28 +00:00
Florian Westphal	45188d0d01	bpf: netfilter: Add BPF_NETFILTER bpf_attach_type Andrii Nakryiko writes: And we currently don't have an attach type for NETLINK BPF link. Thankfully it's not too late to add it. I see that link_create() in kernel/bpf/syscall.c just bypasses attach_type check. We shouldn't have done that. Instead we need to add BPF_NETLINK attach type to enum bpf_attach_type. And wire all that properly throughout the kernel and libbpf itself. This adds BPF_NETFILTER and uses it. This breaks uabi but this wasn't in any non-rc release yet, so it should be fine. v2: check link_attach prog type in link_create too Fixes: 84601d6ee68a ("bpf: add bpf_link support for BPF_NETFILTER programs") Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/CAEf4BzZ69YgrQW7DHCJUT_X+GqMq_ZQQPBwopaJJVGFD5=d5Vg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230605131445.32016-1-fw@strlen.de	2023-06-17 00:07:28 +00:00
Louis DeLosSantos	f02ec78083	bpf: Add table ID to bpf_fib_lookup BPF helper Add ability to specify routing table ID to the `bpf_fib_lookup` BPF helper. A new field `tbid` is added to `struct bpf_fib_lookup` used as parameters to the `bpf_fib_lookup` BPF helper. When the helper is called with the `BPF_FIB_LOOKUP_DIRECT` and `BPF_FIB_LOOKUP_TBID` flags the `tbid` field in `struct bpf_fib_lookup` will be used as the table ID for the fib lookup. If the `tbid` does not exist the fib lookup will fail with `BPF_FIB_LKUP_RET_NOT_FWDED`. The `tbid` field becomes a union over the vlan related output fields in `struct bpf_fib_lookup` and will be zeroed immediately after usage. This functionality is useful in containerized environments. For instance, if a CNI wants to dictate the next-hop for traffic leaving a container it can create a container-specific routing table and perform a fib lookup against this table in a "host-net-namespace-side" TC program. This functionality also allows `ip rule` like functionality at the TC layer, allowing an eBPF program to pick a routing table based on some aspect of the sk_buff. As a concrete use case, this feature will be used in Cilium's SRv6 L3VPN datapath. When egress traffic leaves a Pod an eBPF program attached by Cilium will determine which VRF the egress traffic should target, and then perform a FIB lookup in a specific table representing this VRF's FIB. Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230505-bpf-add-tbid-fib-lookup-v2-1-0a31c22c748c@gmail.com	2023-06-17 00:07:28 +00:00
Andrii Nakryiko	fa1a18d38b	libbpf: Ensure FD >= 3 during bpf_map__reuse_fd() Improve bpf_map__reuse_fd() logic and ensure that dup'ed map FD is "good" (>= 3) and has O_CLOEXEC flags. Use fcntl(F_DUPFD_CLOEXEC) for that, similarly to ensure_good_fd() helper we already use in low-level APIs that work with bpf() syscall. Suggested-by: Lennart Poettering <lennart@poettering.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230525221311.2136408-2-andrii@kernel.org	2023-06-17 00:07:28 +00:00
Andrii Nakryiko	ba7a44da68	libbpf: Ensure libbpf always opens files with O_CLOEXEC Make sure that libbpf code always gets FD with O_CLOEXEC flag set, regardless if file is open through open() or fopen(). For the latter this means to add "e" to mode string, which is supported since pretty ancient glibc v2.7. Also drop the outdated TODO comment in usdt.c, which was already completed. Suggested-by: Lennart Poettering <lennart@poettering.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230525221311.2136408-1-andrii@kernel.org	2023-06-17 00:07:28 +00:00
Manu Bretelle	cb23f981c3	ci: Dump kconfig before running tests This helps troubleshooting by validating what the Kconfig of the testing environment is. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2023-06-15 14:04:53 -07:00
Daniel Müller	f7eb43b90f	ci: add fix for sockopt sub-tests Sockopt sub-tests currently don't honor denylisting properly. Fix them. Upstream fix was found at [0]. [0] https://lore.kernel.org/bpf/20230525232248.640465-1-deso@posteo.net/T/#u Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	9710829e78	ci: Gracefully handle test names with spaces inside Cherry pick of pieces of f909f8bf110d ("ci: temporarily disable test_btf_dump_case") from vmtest to handle spaces in test names properly. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
JP Kobryn	e021ccbd7d	libbpf: Change var type in datasec resize func This changes a local variable type that stores a new array id to match the return type of btf__add_array(). Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230525001323.8554-1-inwardvessel@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	0755b497cf	ci: add fix for multi-kprobe as temporary patch This fixes 39d954200bf6 ("fprobe: Skip exit_handler if entry_handler returns !0"), which causes multiple multi-kprobe tests to fail. Upstream fix was found at [0]. [0] https://lore.kernel.org/all/168100731160.79534.374827110083836722.stgit@devnote2/#r Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	c4ffdf1e72	ci: Adjust allow/deny lists for most recent sync Adjust the allow & deny lists for use after the most recent sync with upstream. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	c850306199	ci: Regenerate latest vmlinux.h for old kernel CI tests. CI will fail without it. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	fb6998382d	libbpf: Bump version to v1.3 in Makefile Bump LIBBPF_MINOR_VERSION to 3 for v1.3 dev cycle. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Daniel Müller	9aea1da2bb	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2ddade322925641ee2a75f13665c51f2e74d7791 Checkpoint bpf-next commit: fcf1fa29c8ea75bf104c35ce29b65ce2ba6a6a9d Baseline bpf commit: 71b547f561247897a0a14f3082730156c0533fed Checkpoint bpf commit: f726e03564ef4e754dd93beb54303e2e1671049e Alexey Dobriyan (1): ELF: fix all "Elf" typos Andrii Nakryiko (4): libbpf: fix offsetof() and container_of() to work with CO-RE libbpf: Start v1.3 development cycle bpf: Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands libbpf: Add opts-based bpf_obj_pin() API and add support for path_fd Florian Westphal (1): tools: bpftool: print netfilter link info JP Kobryn (1): libbpf: Add capability for resizing datasec maps Jiri Olsa (1): libbpf: Store zero fd to fd_array for loader kfunc relocation Kenjiro Nakayama (1): libbpf: Fix comment about arc and riscv arch in bpf_tracing.h Martin KaFai Lau (1): libbpf: btf_dump_type_data_check_overflow needs to consider BTF_MEMBER_BITFIELD_SIZE include/uapi/linux/bpf.h \| 24 +++++++ src/bpf.c \| 17 ++++- src/bpf.h \| 18 ++++- src/bpf_helpers.h \| 15 +++-- src/bpf_tracing.h \| 3 +- src/btf_dump.c \| 22 +++++- src/gen_loader.c \| 14 ++-- src/libbpf.c \| 140 ++++++++++++++++++++++++++++++++++++--- src/libbpf.h \| 18 ++++- src/libbpf.map \| 5 ++ src/libbpf_probes.c \| 1 + src/libbpf_version.h \| 2 +- src/usdt.c \| 2 +- 13 files changed, 246 insertions(+), 35 deletions(-) Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
JP Kobryn	8b4e1b39a4	libbpf: Add capability for resizing datasec maps This patch updates bpf_map__set_value_size() so that if the given map is memory mapped, it will attempt to resize the mapped region. Initial contents of the mapped region are preserved. BTF is not required, but after the mapping is resized an attempt is made to adjust the associated BTF information if the following criteria are met: - BTF info is present - the map is a datasec - the final variable in the datasec is an array ... the resulting BTF info will be updated so that the final array variable is associated with a new BTF array type sized to cover the requested size. Note that the initial resizing of the memory mapped region can succeed while the subsequent BTF adjustment can fail. In this case, BTF info is dropped from the map by clearing the key and value type. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230524004537.18614-2-inwardvessel@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Andrii Nakryiko	a50544ef45	libbpf: Add opts-based bpf_obj_pin() API and add support for path_fd Add path_fd support for bpf_obj_pin() and bpf_obj_get() operations (through their opts-based variants). This allows to take advantage of new kernel-side support for O_PATH-based pin/get location specification. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230523170013.728457-4-andrii@kernel.org Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Andrii Nakryiko	bfb0454244	bpf: Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands Current UAPI of BPF_OBJ_PIN and BPF_OBJ_GET commands of bpf() syscall forces users to specify pinning location as a string-based absolute or relative (to current working directory) path. This has various implications related to security (e.g., symlink-based attacks), forces BPF FS to be exposed in the file system, which can cause races with other applications. One of the feedbacks we got from folks working with containers heavily was that inability to use purely FD-based location specification was an unfortunate limitation and hindrance for BPF_OBJ_PIN and BPF_OBJ_GET commands. This patch closes this oversight, adding path_fd field to BPF_OBJ_PIN and BPF_OBJ_GET UAPI, following conventions established by *at() syscalls for dirfd + pathname combinations. This now allows interesting possibilities like working with detached BPF FS mount (e.g., to perform multiple pinnings without running a risk of someone interfering with them), and generally making pinning/getting more secure and not prone to any races and/or security attacks. This is demonstrated by a selftest added in subsequent patch that takes advantage of new mount APIs (fsopen, fsconfig, fsmount) to demonstrate creating detached BPF FS mount, pinning, and then getting BPF map out of it, all while never exposing this private instance of BPF FS to outside worlds. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/bpf/20230523170013.728457-4-andrii@kernel.org Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Andrii Nakryiko	79811cad50	libbpf: Start v1.3 development cycle Bump libbpf.map to v1.3.0 to start a new libbpf version cycle. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230523170013.728457-3-andrii@kernel.org Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Jiri Olsa	4bb0b0ca09	libbpf: Store zero fd to fd_array for loader kfunc relocation When moving some of the test kfuncs to bpf_testmod I hit an issue when some of the kfuncs that object uses are in module and some in vmlinux. The problem is that both vmlinux and module kfuncs get allocated btf_fd_idx index into fd_array, but we store to it the BTF fd value only for module's kfunc, not vmlinux's one because it's zero. Then after the program is loaded we check if fd_array[btf_fd_idx] != 0 and close the fd. When the object has kfuncs from both vmlinux and module, the fd from fd_array[btf_fd_idx] from previous load will be stored in there for vmlinux's kfunc, so we close unrelated fd (of the program we just loaded in my case). Fixing this by storing zero to fd_array[btf_fd_idx] for vmlinux kfuncs, so that we won't close a stale fd. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230515133756.1658301-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Andrii Nakryiko	ac42790129	libbpf: fix offsetof() and container_of() to work with CO-RE It seems like __builtin_offsetof() doesn't preserve CO-RE field relocations properly. So if offsetof() macro is defined through __builtin_offsetof(), CO-RE-enabled BPF code using container_of() will be subtly and silently broken. To avoid this problem, redefine offsetof() and container_of() in the form that works with CO-RE relocations more reliably. Fixes: 5fbc220862fc ("tools/libpf: Add offsetof/container_of macro in bpf_helpers.h") Reported-by: Lennart Poettering <lennart@poettering.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230509065502.2306180-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Kenjiro Nakayama	6a6cf6dcdc	libbpf: Fix comment about arc and riscv arch in bpf_tracing.h To make comments about arc and riscv arch in bpf_tracing.h accurate, this patch fixes the comment about arc and adds the comment for riscv. Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230504035443.427927-1-nakayamakenjiro@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Martin KaFai Lau	b9711e7015	libbpf: btf_dump_type_data_check_overflow needs to consider BTF_MEMBER_BITFIELD_SIZE The btf_dump/struct_data selftest is failing with: [...] test_btf_dump_struct_data:FAIL:unexpected return value dumping fs_context unexpected unexpected return value dumping fs_context: actual -7 != expected 264 [...] The reason is in btf_dump_type_data_check_overflow(). It does not use BTF_MEMBER_BITFIELD_SIZE from the struct's member (btf_member). Instead, it is using the enum size which is 4. It had been working till the recent commit 4e04143c869c ("fs_context: drop the unused lsm_flags member") removed an integer member which also removed the 4 bytes padding at the end of the fs_context. Missing this 4 bytes padding exposed this bug. In particular, when btf_dump_type_data_check_overflow() reaches the member 'phase', -E2BIG is returned. The fix is to pass bit_sz to btf_dump_type_data_check_overflow(). In btf_dump_type_data_check_overflow(), it does a different size check when bit_sz is not zero. 
The current fs_context: [3600] ENUM 'fs_context_purpose' encoding=UNSIGNED size=4 vlen=3 'FS_CONTEXT_FOR_MOUNT' val=0 'FS_CONTEXT_FOR_SUBMOUNT' val=1 'FS_CONTEXT_FOR_RECONFIGURE' val=2 [3601] ENUM 'fs_context_phase' encoding=UNSIGNED size=4 vlen=7 'FS_CONTEXT_CREATE_PARAMS' val=0 'FS_CONTEXT_CREATING' val=1 'FS_CONTEXT_AWAITING_MOUNT' val=2 'FS_CONTEXT_AWAITING_RECONF' val=3 'FS_CONTEXT_RECONF_PARAMS' val=4 'FS_CONTEXT_RECONFIGURING' val=5 'FS_CONTEXT_FAILED' val=6 [3602] STRUCT 'fs_context' size=264 vlen=21 'ops' type_id=3603 bits_offset=0 'uapi_mutex' type_id=235 bits_offset=64 'fs_type' type_id=872 bits_offset=1216 'fs_private' type_id=21 bits_offset=1280 'sget_key' type_id=21 bits_offset=1344 'root' type_id=781 bits_offset=1408 'user_ns' type_id=251 bits_offset=1472 'net_ns' type_id=984 bits_offset=1536 'cred' type_id=1785 bits_offset=1600 'log' type_id=3621 bits_offset=1664 'source' type_id=42 bits_offset=1792 'security' type_id=21 bits_offset=1856 's_fs_info' type_id=21 bits_offset=1920 'sb_flags' type_id=20 bits_offset=1984 'sb_flags_mask' type_id=20 bits_offset=2016 's_iflags' type_id=20 bits_offset=2048 'purpose' type_id=3600 bits_offset=2080 bitfield_size=8 'phase' type_id=3601 bits_offset=2088 bitfield_size=8 'need_free' type_id=67 bits_offset=2096 bitfield_size=1 'global' type_id=67 bits_offset=2097 bitfield_size=1 'oldapi' type_id=67 bits_offset=2098 bitfield_size=1 Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data") Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230428013638.1581263-1-martin.lau@linux.dev Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Alexey Dobriyan	4c484d662c	ELF: fix all "Elf" typos ELF is acronym and therefore should be spelled in all caps. I left one exception at Documentation/arm/nwfpe/nwfpe.rst which looks like being written in the first person. Link: https://lkml.kernel.org/r/Y/3wGWQviIOkyLJW@p183 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Florian Westphal	1c9aa4791a	tools: bpftool: print netfilter link info Dump protocol family, hook and priority value: $ bpftool link 2: netfilter prog 14 ip input prio -128 pids install(3264) 5: netfilter prog 14 ip6 forward prio 21 pids a.out(3387) 9: netfilter prog 14 ip prerouting prio 123 pids a.out(5700) 10: netfilter prog 14 ip input prio 21 pids test2(5701) v2: Quentin Monnet suggested to also add 'bpftool net' support: $ bpftool net xdp: tc: flow_dissector: netfilter: ip prerouting prio 21 prog_id 14 ip input prio -128 prog_id 14 ip input prio 21 prog_id 14 ip forward prio 21 prog_id 14 ip output prio 21 prog_id 14 ip postrouting prio 21 prog_id 14 'bpftool net' only dumps netfilter link type, links are sorted by protocol family, hook and priority. v5: fix bpf ci failure: libbpf needs small update to prog_type_name[] and probe_prog_load helper. v4: don't fail with -EOPNOTSUPP in libbpf probe_prog_load, update prog_type_name[] with "netfilter" entry (bpf ci) v3: fix bpf.h copy, 'reserved' member was removed (Alexei) use p_err, not fprintf (Quentin) Suggested-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/eeeaac99-9053-90c2-aa33-cc1ecb1ae9ca@isovalent.com/ Reviewed-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20230421170300.24115-6-fw@strlen.de Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-05-25 16:44:19 -07:00
Andrii Nakryiko	3f591a6610	git: make .gitattributes compatible with git-archive-all action As reported by Quentin, using Github Action to archive all submodules (e.g., for retsnoop release packaging) is impacted by it not supporting "<glob>/" pattern in .gitattributes. Use "<glob>/**" instead. [0] https://github.com/anakryiko/retsnoop/pull/42#issuecomment-1560797837 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-05-25 13:14:58 -07:00
Evgeny Vereshchagin	532293bdf4	fuzz: bump elfutils to 0.189 The elfutils project has fixed several issues found by fuzz targets so it should help to prevent the libbpf fuzz target from running into them. Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2023-05-12 14:29:41 -07:00
Song Liu	fbd60dbff5	ci: Fix test_progs failure Fix test_progs failure xdp_bonding/xdp_bonding_redirect_multi with a missing commit (in bpf, but not in bpf-next yet). Signed-off-by: Song Liu <song@kernel.org>	2023-04-20 12:01:06 -07:00
Song Liu	44b0bc9ad7	ci: Regenerate latest vmlinux.h for old kernel CI tests. CI fails without it. Signed-off-by: Song Liu <song@kernel.org>	2023-04-19 16:15:07 -07:00
Song Liu	f0e39b4946	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 4ca13d1002f37c10038ff4ed3cfdc70dbe049d60 Checkpoint bpf-next commit: 2ddade322925641ee2a75f13665c51f2e74d7791 Baseline bpf commit: a6f6a95f25803500079513780d11a911ce551d76 Checkpoint bpf commit: 71b547f561247897a0a14f3082730156c0533fed Andrii Nakryiko (9): libbpf: Don't enforce unnecessary verifier log restrictions on libbpf side bpf: Add log_true_size output field to return necessary log buffer size libbpf: Wire through log_true_size returned from kernel for BPF_PROG_LOAD libbpf: Wire through log_true_size for bpf_btf_load() API libbpf: misc internal libbpf clean ups around log fixup libbpf: report vmlinux vs module name when dealing with ksyms libbpf: improve handling of unresolved kfuncs libbpf: move bpf_for(), bpf_for_each(), and bpf_repeat() into bpf_helpers.h libbpf: mark bpf_iter_num_{new,next,destroy} as __weak Arnaldo Carvalho de Melo (1): tools include UAPI: Synchronize linux/fcntl.h with the kernel sources Dave Marchevsky (1): bpf: Introduce opaque bpf_refcount struct and add btf_record plumbing Herbert Xu (1): macvlan: Add netlink attribute for broadcast cutoff Lorenzo Bianconi (1): xdp: add xdp_set_features_flag utility routine include/uapi/linux/bpf.h \| 16 +++++- include/uapi/linux/fcntl.h \| 1 + include/uapi/linux/if_link.h \| 1 + include/uapi/linux/netdev.h \| 2 + src/bpf.c \| 17 +++--- src/bpf.h \| 22 +++++-- src/bpf_helpers.h \| 103 +++++++++++++++++++++++++++++++++ src/libbpf.c \| 107 ++++++++++++++++++++++++++++------- 8 files changed, 237 insertions(+), 32 deletions(-) Signed-off-by: Song Liu <song@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	294c85e9b3	libbpf: mark bpf_iter_num_{new,next,destroy} as __weak Mark bpf_iter_num_{new,next,destroy}() kfuncs declared for bpf_for()/bpf_repeat() macros as __weak to allow users to feature-detect their presence and guard bpf_for()/bpf_repeat() loops accordingly for backwards compatibility with old kernels. Now that libbpf supports kfunc calls poisoning and better reporting of unresolved (but called) kfuncs, declaring number iterator kfuncs in bpf_helpers.h won't degrade user experience and won't cause unnecessary kernel feature dependencies. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	2293c20f82	libbpf: move bpf_for(), bpf_for_each(), and bpf_repeat() into bpf_helpers.h To make it easier for bleeding-edge BPF applications, such as sched_ext, to utilize open-coded iterators, move bpf_for(), bpf_for_each(), and bpf_repeat() macros from selftests/bpf-internal bpf_misc.h helper, to libbpf-provided bpf_helpers.h header. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	e6cc30f445	libbpf: improve handling of unresolved kfuncs Currently, libbpf leaves `call #0` instruction for __weak unresolved kfuncs, which might lead to confusing verifier log situations, where invalid `call #0` will be treated as successfully validated. We can do better. Libbpf already has an established mechanism of poisoning instructions that failed some form of resolution (e.g., CO-RE relocation and BPF map set to not be auto-created). Libbpf doesn't fail them outright to allow users to guard them through other means, and as long as BPF verifier can prove that such poisoned instructions cannot be ever reached, this doesn't constitute an invalid BPF program. If user didn't guard such code, libbpf will extract few pieces of information to tie such poisoned instructions back to additional information about what entity wasn't resolved (e.g., BPF map name, or CO-RE relocation information). __weak unresolved kfuncs fit this model well, so this patch extends libbpf with poisoning and log fixup logic for kfunc calls. Note, this poisoning is done only for kfunc calls, not kfunc address resolution (ldimm64 instructions). The former cannot be ever valid, if reached, so it's safe to poison them. The latter is a valid mechanism to check if __weak kfunc ksym was resolved, and do necessary guarding and work arounds based on this result, supported in most recent kernels. As such, libbpf keeps such ldimm64 instructions as loading zero, never poisoning them. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	6fd310547d	libbpf: report vmlinux vs module name when dealing with ksyms Currently libbpf always reports "kernel" as a source of ksym BTF type, which is ambiguous given ksym's BTF can come from either vmlinux or kernel module BTFs. Make this explicit and log module name, if used BTF is from kernel module. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	0db753a9f8	libbpf: misc internal libbpf clean ups around log fixup Normalize internal constants, field names, and comments related to log fixup. Also add explicit `ext_idx` alias for relocation where relocation is pointing to extern description for additional information. No functional changes, just a clean up before subsequent additions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230418002148.3255690-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Dave Marchevsky	44f59ec077	bpf: Introduce opaque bpf_refcount struct and add btf_record plumbing A 'struct bpf_refcount' is added to the set of opaque uapi/bpf.h types meant for use in BPF programs. Similarly to other opaque types like bpf_spin_lock and bpf_rbtree_node, the verifier needs to know where in user-defined struct types a bpf_refcount can be located, so necessary btf_record plumbing is added to enable this. bpf_refcount is sized to hold a refcount_t. Similarly to bpf_spin_lock, the offset of a bpf_refcount is cached in btf_record as refcount_off in addition to being in the field array. Caching refcount_off makes sense for this field because further patches in the series will modify functions that take local kptrs (e.g. bpf_obj_drop) to change their behavior if the type they're operating on is refcounted. So enabling fast "is this type refcounted?" checks is desirable. No such verifier behavior changes are introduced in this patch, just logic to recognize 'struct bpf_refcount' in btf_record. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230415201811.343116-3-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	2f01564c50	libbpf: Wire through log_true_size for bpf_btf_load() API Similar to what we did for bpf_prog_load() in previous patch, wire returning of log_true_size value from kernel back to the user through OPTS out field. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230406234205.323208-17-andrii@kernel.org	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	c2fe7adb33	libbpf: Wire through log_true_size returned from kernel for BPF_PROG_LOAD Add output-only log_true_size field to bpf_prog_load_opts to return bpf_attr->log_true_size value back from bpf() syscall. Note, that we have to drop const modifier from opts in bpf_prog_load(). This could potentially cause compilation error for some users. But the usual practice is to define bpf_prog_load_opts as a local variable next to bpf_prog_load() call and pass pointer to it, so const vs non-const makes no difference and won't even come up in most (if not all) cases. There are no runtime and ABI backwards/forward compatibility issues at all. If user provides old struct bpf_prog_load_opts, libbpf won't set new fields. If old libbpf is provided new bpf_prog_load_opts, nothing will happen either as old libbpf doesn't yet know about this new field. Adding a new variant of bpf_prog_load() just for this seems like a big and unnecessary overkill. Corroborating evidence is the fact that the entire selftests/bpf code base required no adjustment whatsoever. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230406234205.323208-16-andrii@kernel.org	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	88004dd87a	bpf: Add log_true_size output field to return necessary log buffer size Add output-only log_true_size and btf_log_true_size field to BPF_PROG_LOAD and BPF_BTF_LOAD commands, respectively. It will return the size of log buffer necessary to fit in all the log contents at specified log_level. This is very useful for BPF loader libraries like libbpf to be able to size log buffer correctly, but could be used by users directly, if necessary, as well. This patch plumbs all this through the code, taking into account actual bpf_attr size provided by user to determine if these new fields are expected by users. And if they are, set them from kernel on return. We refactor btf_parse() function to accommodate this, moving attr and uattr handling inside it. The rest is very straightforward code, which is split from the logging accounting changes in the previous patch to make it simpler to review logic vs UAPI changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-13-andrii@kernel.org	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	a22abb9c85	libbpf: Don't enforce unnecessary verifier log restrictions on libbpf side This basically prevents any forward compatibility. And we either way just return -EINVAL, which would otherwise be returned from bpf() syscall anyways. Similarly, drop enforcement of non-NULL log_buf when log_level > 0. This won't be true anymore soon. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Lorenz Bauer <lmb@isovalent.com> Link: https://lore.kernel.org/bpf/20230406234205.323208-5-andrii@kernel.org	2023-04-19 16:15:07 -07:00
Herbert Xu	2c0c927a38	macvlan: Add netlink attribute for broadcast cutoff Make the broadcast cutoff configurable through netlink. Note that macvlan is weird because there is no central device for us to configure (the lowerdev could be anything). So all the options are duplicated over what could be thousands of child devices. IFLA_MACVLAN_BC_QUEUE_LEN took the approach of taking the maximum of all child device settings. This is unnecessary as we could simply store the option in the port device and take the last child device that gets updated as the value to use. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-04-19 16:15:07 -07:00
Andrii Nakryiko	d9d17f6d71	git: add .gitattributes file ignoring assets/ during archiving We don't need to archive assets/ subdir when packaging libbpf sources in retsnoop and veristat repos. Mark assets/ as export-ignore to skip it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-04-01 15:36:54 -07:00
Daniel Müller	3783577161	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 226bc6ae6405c46a6e9865835c36a1d45fc0b3bf Checkpoint bpf-next commit: 4ca13d1002f37c10038ff4ed3cfdc70dbe049d60 Baseline bpf commit: 915efd8a446b74442039d31689d5d863caf82517 Checkpoint bpf commit: a6f6a95f25803500079513780d11a911ce551d76 Andrii Nakryiko (1): libbpf: disassociate section handler on explicit bpf_program__set_type() call Arnaldo Carvalho de Melo (1): tools include UAPI: Synchronize linux/fcntl.h with the kernel sources Eduard Zingerman (1): libbpf: Fix double-free when linker processes empty sections JP Kobryn (1): libbpf: Ensure print callback usage is thread-safe Jakub Kicinski (1): ynl: broaden the license even more Lorenzo Bianconi (1): xdp: add xdp_set_features_flag utility routine include/uapi/linux/fcntl.h \| 1 + include/uapi/linux/netdev.h \| 4 +++- src/libbpf.c \| 10 +++++++--- src/libbpf.h \| 2 ++ src/linker.c \| 14 +++++++++++++- 5 files changed, 26 insertions(+), 5 deletions(-) Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Jakub Kicinski	75c14163b9	ynl: broaden the license even more I relicensed Netlink spec code to GPL-2.0 OR BSD-3-Clause but we still put a slightly different license on the uAPI header than the rest of the code. Use the Linux-syscall-note on all the specs and all generated code. It's moot for kernel code, but should not hurt. This way the licenses match everywhere. Cc: Chuck Lever <chuck.lever@oracle.com> Fixes: 37d9df224d1e ("ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause") Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Lorenzo Bianconi	056e9bcc19	xdp: add xdp_set_features_flag utility routine Introduce xdp_set_features_flag utility routine in order to update dynamically xdp_features according to the dynamic hw configuration via ethtool (e.g. changing number of hw rx/tx queues). Add xdp_clear_features_flag() in order to clear all xdp_feature flags. Reviewed-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Arnaldo Carvalho de Melo	14ae9422db	tools include UAPI: Synchronize linux/fcntl.h with the kernel sources To pick up the changes in: 6fd7353829cafc40 ("mm/memfd: add F_SEAL_EXEC") That doesn't add or change any perf tools functionality, only addresses these build warnings: Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h' diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Andrii Nakryiko	3fd6eebb2d	libbpf: disassociate section handler on explicit bpf_program__set_type() call If user explicitly overrides program's type with bpf_program__set_type() API call, we need to disassociate whatever SEC_DEF handler libbpf determined initially based on program's SEC() definition, as it's not going to be valid anymore and could lead to crashes and/or confusing failures. Also, fix up bpf_prog_test_load() helper in selftests/bpf, which is force-setting program type (even if that's completely unnecessary; this is quite a legacy piece of code), and thus should expect auto-attach to not work, yet one of the tests explicitly relies on auto-attach for testing. Instead, force-set program type only if it differs from the desired one. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230327185202.1929145-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Eduard Zingerman	4218389b1e	libbpf: Fix double-free when linker processes empty sections Double-free error in bpf_linker__free() was reported by James Hilliard. The error is caused by misuse of realloc() in extend_sec(). The error occurs when two files with empty sections of the same name are linked: - when first file is processed: - extend_sec() calls realloc(dst->raw_data, dst_align_sz) with dst->raw_data == NULL and dst_align_sz == 0; - dst->raw_data is set to a special pointer to a memory block of size zero; - when second file is processed: - extend_sec() calls realloc(dst->raw_data, dst_align_sz) with dst->raw_data == <special pointer> and dst_align_sz == 0; - realloc() "frees" dst->raw_data special pointer and returns NULL; - extend_sec() exits with -ENOMEM, and the old dst->raw_data value is preserved (it is now invalid); - eventually, bpf_linker__free() attempts to free dst->raw_data again. This patch fixes the bug by avoiding -ENOMEM exit for dst_align_sz == 0. The fix was suggested by Andrii Nakryiko <andrii.nakryiko@gmail.com>. Reported-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: James Hilliard <james.hilliard1@gmail.com> Link: https://lore.kernel.org/bpf/CADvTj4o7ZWUikKwNTwFq0O_AaX+46t_+Ca9gvWMYdWdRtTGeHQ@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230328004738.381898-3-eddyz87@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
JP Kobryn	ae32d7169d	libbpf: Ensure print callback usage is thread-safe This patch prevents races on the print function pointer, allowing the libbpf_set_print() function to become thread-safe. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230325010845.46000-1-inwardvessel@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-30 16:24:24 -07:00
Andrii Nakryiko	b362bb6e10	ci: update libbpf/ci references to use "main" Seems like default branch was renamed s/master/main/, adapt libbpf CI to not fail. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-03-27 10:45:47 -07:00
Andrii Nakryiko	f8cd00f613	ci: fallback to llvm-16 and clang-16 again Seems like upstream LLVM/Clang packaging still has issues with llvm/clang 17. Fallback to 16 again, for now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-03-23 13:10:17 -07:00
Andrii Nakryiko	dc4e7076ad	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b8a2e3f93d412114a1539ea97b59b3e6ed6e1f9a Checkpoint bpf-next commit: 226bc6ae6405c46a6e9865835c36a1d45fc0b3bf Baseline bpf commit: a33a6eaa19d3af261e8708bfc8ba62020703117f Checkpoint bpf commit: 915efd8a446b74442039d31689d5d863caf82517 Alexei Starovoitov (5): libbpf: Fix relocation of kfunc ksym in ld_imm64 insn. libbpf: Introduce bpf_ksym_exists() macro. libbpf: Fix ld_imm64 copy logic for ksym in light skeleton. libbpf: Rename RELO_EXTERN_VAR/FUNC. libbpf: Support kfunc detection in light skeleton. Daniel Müller (1): libbpf: Ignore warnings about "inefficient alignment" Kui-Feng Lee (5): bpf: Create links for BPF struct_ops maps. libbpf: Create a bpf_link in bpf_map__attach_struct_ops(). bpf: Update the struct_ops of a bpf_link. libbpf: Update a bpf_link with another struct_ops. libbpf: Use .struct_ops.link section to indicate a struct_ops with a link. Liu Pan (1): libbpf: Explicitly call write to append content to file Sreevani Sreejith (1): bpf, docs: Libbpf overview documentation docs/index.rst \| 25 +++-- docs/libbpf_overview.rst \| 228 +++++++++++++++++++++++++++++++++++++ include/uapi/linux/bpf.h \| 33 +++++- src/bpf.c \| 8 +- src/bpf.h \| 3 +- src/bpf_gen_internal.h \| 4 +- src/bpf_helpers.h \| 5 + src/gen_loader.c \| 48 ++++---- src/libbpf.c \| 235 +++++++++++++++++++++++++++++---------- src/libbpf.h \| 1 + src/libbpf.map \| 1 + src/zip.c \| 6 + 12 files changed, 501 insertions(+), 96 deletions(-) create mode 100644 docs/libbpf_overview.rst Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-03-23 13:10:17 -07:00
Kui-Feng Lee	465a73051d	libbpf: Use .struct_ops.link section to indicate a struct_ops with a link. Flag a struct_ops as backing a bpf_link by putting it in the ".struct_ops.link" section. Once it is flagged, the created struct_ops can be used to create a bpf_link or update a bpf_link that has been backed by another struct_ops. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-8-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-03-23 13:10:17 -07:00
Kui-Feng Lee	e51cdaaca0	libbpf: Update a bpf_link with another struct_ops. Introduce bpf_link__update_map(), which allows to atomically update underlying struct_ops implementation for given struct_ops BPF link. Also add old_map_fd to struct bpf_link_update_opts to handle BPF_F_REPLACE feature. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-7-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-03-23 13:10:17 -07:00
Kui-Feng Lee	055cbdcc9f	bpf: Update the struct_ops of a bpf_link. By improving the BPF_LINK_UPDATE command of bpf(), it should allow you to conveniently switch between different struct_ops on a single bpf_link. This would enable smoother transitions from one struct_ops to another. The struct_ops maps passing along with BPF_LINK_UPDATE should have the BPF_F_LINK flag. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-6-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-03-23 13:10:17 -07:00
Kui-Feng Lee	c6893dccd9	libbpf: Create a bpf_link in bpf_map__attach_struct_ops(). bpf_map__attach_struct_ops() was creating a dummy bpf_link as a placeholder, but now it is constructing an authentic one by calling bpf_link_create() if the map has the BPF_F_LINK flag. You can flag a struct_ops map with BPF_F_LINK by calling bpf_map__set_map_flags(). Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230323032405.3735486-5-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-03-23 13:10:17 -07:00
Kui-Feng Lee	077bf73900	bpf: Create links for BPF struct_ops maps. Make bpf_link support struct_ops. Previously, struct_ops were always used alone without any associated links. Upon updating its value, a struct_ops would be activated automatically. Yet other BPF program types are required to make a bpf_link with their instances before they can become active. Now, however, you can create an inactive struct_ops, and create a link to activate it later. With bpf_links, struct_ops has a behavior similar to other BPF program types. You can pin/unpin them from their links, and the struct_ops will be deactivated when its link is removed, whereas previously someone needed to delete the value for it to be deactivated. bpf_links are responsible for registering their associated struct_ops. You can only use a struct_ops that has the BPF_F_LINK flag set to create a bpf_link, while a struct_ops without this flag behaves in the same manner as before and is registered upon updating its value. The BPF_LINK_TYPE_STRUCT_OPS serves a dual purpose. Not only is it used to craft the links for BPF struct_ops programs, but also to create links for BPF struct_ops themselves. Since the links of BPF struct_ops programs are only used to create trampolines internally, they are never seen in other contexts. Thus, they can be reused for struct_ops themselves. To maintain a reference to the map supporting this link, we add bpf_struct_ops_link as an additional type. The pointer of the map is RCU and won't be necessary until later in the patchset. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Link: https://lore.kernel.org/r/20230323032405.3735486-4-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-03-23 13:10:17 -07:00
Alexei Starovoitov	68cd7cd386	libbpf: Support kfunc detection in light skeleton. Teach gen_loader to find {btf_id, btf_obj_fd} of kernel variables and kfuncs and populate corresponding ld_imm64 and bpf_call insns. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230321203854.3035-4-alexei.starovoitov@gmail.com	2023-03-23 13:10:17 -07:00
Alexei Starovoitov	a5464a5b0e	libbpf: Rename RELO_EXTERN_VAR/FUNC. RELO_EXTERN_VAR/FUNC names are not correct anymore. RELO_EXTERN_VAR represent ksym symbol in ld_imm64 insn. It can point to kernel variable or kfunc. Rename RELO_EXTERN_VAR->RELO_EXTERN_LD64 and RELO_EXTERN_FUNC->RELO_EXTERN_CALL to match what they actually represent. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230321203854.3035-2-alexei.starovoitov@gmail.com	2023-03-23 13:10:17 -07:00
Liu Pan	753e4d07d1	libbpf: Explicitly call write to append content to file When data is written to an fd by calling "vdprintf", most implementations of the standard library ultimately write it with the writev syscall. But "uprobe_events/kprobe_events" does not allow segmented writes, so switch the "append_to_file" function to an explicit write() call. Signed-off-by: Liu Pan <patteliu@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230320030720.650-1-patteliu@gmail.com	2023-03-23 13:10:17 -07:00
Alexei Starovoitov	5b45c90c49	libbpf: Fix ld_imm64 copy logic for ksym in light skeleton. Unlike normal libbpf the light skeleton 'loader' program is doing btf_find_by_name_kind() call at run-time to find ksym in the kernel and populate its {btf_id, btf_obj_fd} pair in ld_imm64 insn. To avoid doing the search multiple times for the same ksym it remembers the first patched ld_imm64 insn and copies {btf_id, btf_obj_fd} from it into subsequent ld_imm64 insn. Fix a bug in copying logic, since it may incorrectly clear BPF_PSEUDO_BTF_ID flag. Also replace always true if (btf_obj_fd >= 0) check with unconditional JMP_JA to clarify the code. Fixes: d995816b77eb ("libbpf: Avoid reload of imm for weak, unresolved, repeating ksym") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230319203014.55866-1-alexei.starovoitov@gmail.com	2023-03-23 13:10:17 -07:00
Sreevani Sreejith	2db620d982	bpf, docs: Libbpf overview documentation This patch documents overview of libbpf, including its features for developing BPF programs. Signed-off-by: Sreevani Sreejith <ssreevani@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230315195405.2051559-1-ssreevani@meta.com	2023-03-23 13:10:17 -07:00
Alexei Starovoitov	c401b96718	libbpf: Introduce bpf_ksym_exists() macro. Introduce bpf_ksym_exists() macro that can be used by BPF programs to detect at load time whether particular ksym (either variable or kfunc) is present in the kernel. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230317201920.62030-4-alexei.starovoitov@gmail.com	2023-03-23 13:10:17 -07:00
Alexei Starovoitov	fd28ca4b5b	libbpf: Fix relocation of kfunc ksym in ld_imm64 insn. void *p = kfunc; -> generates ld_imm64 insn. kfunc() -> generates bpf_call insn. libbpf patches bpf_call insn correctly while only btf_id part of ld_imm64 is set in the former case. Which means that pointers to kfuncs in modules are not patched correctly and the verifier rejects load of such programs due to btf_id being out of range. Fix libbpf to patch ld_imm64 for kfunc. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230317201920.62030-3-alexei.starovoitov@gmail.com	2023-03-23 13:10:17 -07:00
Daniel Müller	c722f76593	libbpf: Ignore warnings about "inefficient alignment" Some consumers of libbpf compile the code base with different warnings enabled. In a report for perf, for example, -Wpacked was set which caused warnings about "inefficient alignment" to be emitted on a subset of supported architectures. With this change we silence specifically those warnings, as we intentionally worked with packed structs. This is a similar resolution as in b2f10cd4e805 ("perf cpumap: Fix alignment for masks in event encoding"). Fixes: 1eebcb60633f ("libbpf: Implement basic zip archive parsing support") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/bpf/CA+G9fYtBnwxAWXi2+GyNByApxnf_DtP1-6+_zOKAdJKnJBexjg@mail.gmail.com/ Link: https://lore.kernel.org/bpf/20230315171550.1551603-1-deso@posteo.net	2023-03-23 13:10:17 -07:00
David Vernet	b5e9722ec2	ci: Regenerate latest vmlinux.h for old kernel CI tests. CI will fail without it. Signed-off-by: David Vernet <void@manifault.com>	2023-03-15 13:18:34 -07:00
David Vernet	7fdf16de6d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: db55174d05ee6bed9d0583ba08e99c891ef0ed05 Checkpoint bpf-next commit: b8a2e3f93d412114a1539ea97b59b3e6ed6e1f9a Baseline bpf commit: d900f3d20cc3169ce42ec72acc850e662a4d4db2 Checkpoint bpf commit: a33a6eaa19d3af261e8708bfc8ba62020703117f Andrii Nakryiko (1): bpf: implement numbers iterator Daniel Müller (1): libbpf: Fix theoretical u32 underflow in find_cd() function Jakub Kicinski (1): ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause Jesus Sanchez-Palencia (1): libbpf: Revert poisoning of strlcpy Menglong Dong (1): libbpf: Add support to set kprobe/uprobe attach mode Michael Weiß (1): bpf: Fix a typo for BPF_F_ANY_ALIGNMENT in bpf.h Puranjay Mohan (2): libbpf: Refactor parse_usdt_arg() to re-use code libbpf: USDT arm arg parsing support Ross Zwisler (1): bpf: use canonical ftrace path include/uapi/linux/bpf.h \| 18 +++- include/uapi/linux/netdev.h \| 2 +- src/libbpf.c \| 48 ++++++++- src/libbpf.h \| 50 ++++++--- src/libbpf_internal.h \| 4 +- src/usdt.c \| 196 ++++++++++++++++++++++-------------- src/zip.c \| 3 +- 7 files changed, 219 insertions(+), 102 deletions(-) Signed-off-by: David Vernet <void@manifault.com>	2023-03-15 13:18:34 -07:00
David Vernet	faae78aac4	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: David Vernet <void@manifault.com>	2023-03-15 13:18:34 -07:00
Jesus Sanchez-Palencia	950cffc036	libbpf: Revert poisoning of strlcpy This reverts commit 6d0c4b11e743 ("libbpf: Poison strlcpy()"). It added the pragma poison directive to libbpf_internal.h to protect against accidental usage of strlcpy but ended up breaking the build for toolchains based on libcs which provide the strlcpy() declaration from string.h (e.g. uClibc-ng). The include order which causes the issue is: string.h, from libbpf_common.h:12, from libbpf.h:20, from libbpf_internal.h:26, from strset.c:9: Fixes: 6d0c4b11e743 ("libbpf: Poison strlcpy()") Signed-off-by: Jesus Sanchez-Palencia <jesussanp@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230309004836.2808610-1-jesussanp@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-03-15 13:18:34 -07:00
Jakub Kicinski	bdc7c5e217	ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause I was intending to make all the Netlink Spec code BSD-3-Clause to ease the adoption but it appears that: - I fumbled the uAPI and used "GPL WITH uAPI note" there - it gives people pause as they expect GPL in the kernel As suggested by Chuck re-license under dual. This gives us benefit of full BSD freedom while fulfilling the broad "kernel is under GPL" expectations. Link: https://lore.kernel.org/all/20230304120108.05dd44c5@kernel.org/ Link: https://lore.kernel.org/r/20230306200457.3903854-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-03-15 13:18:34 -07:00
Ross Zwisler	e8107c3959	bpf: use canonical ftrace path The canonical location for the tracefs filesystem is at /sys/kernel/tracing. But, from Documentation/trace/ftrace.rst: Before 4.1, all ftrace tracing control files were within the debugfs file system, which is typically located at /sys/kernel/debug/tracing. For backward compatibility, when mounting the debugfs file system, the tracefs file system will be automatically mounted at: /sys/kernel/debug/tracing Many comments and samples in the bpf code still refer to this older debugfs path, so let's update them to avoid confusion. There are a few spots where the bpf code explicitly checks both tracefs and debugfs (tools/bpf/bpftool/tracelog.c and tools/lib/api/fs/fs.c) and I've left those alone so that the tools can continue to work with both paths. Signed-off-by: Ross Zwisler <zwisler@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20230313205628.1058720-2-zwisler@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-03-15 13:18:34 -07:00
Michael Weiß	c5be1b0770	bpf: Fix a typo for BPF_F_ANY_ALIGNMENT in bpf.h Fix s/BPF_PROF_LOAD/BPF_PROG_LOAD/ typo in the documentation comment for BPF_F_ANY_ALIGNMENT in bpf.h. Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230309133823.944097-1-michael.weiss@aisec.fraunhofer.de	2023-03-15 13:18:34 -07:00
Andrii Nakryiko	32d34a9415	bpf: implement numbers iterator Implement the first open-coded iterator type over a range of integers. Its public API consists of: - bpf_iter_num_new() constructor, which accepts [start, end) range (that is, start is inclusive, end is exclusive). - bpf_iter_num_next() which will keep returning read-only pointer to int until the range is exhausted, at which point NULL will be returned. If bpf_iter_num_next() keeps being called after this, NULL will be persistently returned. - bpf_iter_num_destroy() destructor, which needs to be called at some point to clean up iterator state. BPF verifier enforces that iterator destructor is called at some point before BPF program exits. Note that `start = end = X` is a valid combination to set up an empty iterator. bpf_iter_num_new() will return 0 (success) for any such combination. If bpf_iter_num_new() detects invalid combination of input arguments, it returns error, resets iterator state to, effectively, empty iterator, so any subsequent call to bpf_iter_num_next() will keep returning NULL. BPF verifier has no knowledge that returned integers are in the [start, end) value range, as both `start` and `end` are not statically known and enforced: they are runtime values. While the implementation is pretty trivial, some care needs to be taken to avoid overflows and underflows. Subsequent selftests will validate correctness of [start, end) semantics, especially around extremes (INT_MIN and INT_MAX). Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can be specified. bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded BPF loops and bpf_loop() helper and is the basis for implementing ergonomic BPF loops with no statically known or verified bounds. Subsequent patches implement bpf_for() macro, demonstrating how this can be wrapped into something that works and feels like a normal for() loop in C language. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230308184121.1165081-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-03-15 13:18:34 -07:00
Puranjay Mohan	aab5f194e1	libbpf: USDT arm arg parsing support Parsing of USDT arguments is architecture-specific; on arm it is relatively easy since registers used are r[0-10], fp, ip, sp, lr, pc. Format is slightly different compared to aarch64; forms are - "size @ [ reg, #offset ]" for dereferences, for example "-8 @ [ sp, #76 ]" ; " -4 @ [ sp ]" - "size @ reg" for register values; for example "-4@r0" - "size @ #value" for raw values; for example "-8@#1" Add support for parsing USDT arguments for ARM architecture. To test the above changes QEMU's virt[1] board with cortex-a15 CPU was used. libbpf-bootstrap's usdt example[2] was modified to attach to a test program with DTRACE_PROBE1/2/3/4... probes to test different combinations. [1] https://www.qemu.org/docs/master/system/arm/virt.html [2] https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/usdt.bpf.c Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230307120440.25941-3-puranjay12@gmail.com	2023-03-15 13:18:34 -07:00
Puranjay Mohan	c5fe344018	libbpf: Refactor parse_usdt_arg() to re-use code The parse_usdt_arg() function is defined differently for each architecture but the last part of the function is repeated verbatim for each architecture. Refactor parse_usdt_arg() to fill the arg_sz and then do the repeated post-processing in parse_usdt_spec(). Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230307120440.25941-2-puranjay12@gmail.com	2023-03-15 13:18:34 -07:00
Daniel Müller	232f42135a	libbpf: Fix theoretical u32 underflow in find_cd() function Coverity reported a potential underflow of the offset variable used in the find_cd() function. Switch to using a signed 64 bit integer for the representation of offset to make sure we can never underflow. Fixes: 1eebcb60633f ("libbpf: Implement basic zip archive parsing support") Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230307215504.837321-1-deso@posteo.net	2023-03-15 13:18:34 -07:00
Menglong Dong	cc7177624f	libbpf: Add support to set kprobe/uprobe attach mode By default, libbpf will attach the kprobe/uprobe BPF program in the latest mode that is supported by the kernel. In this patch, we add the support to let users manually attach kprobe/uprobe in legacy or perf mode. There are 3 modes supported by the kernel to attach kprobe/uprobe: LEGACY: create perf event in legacy way and don't use bpf_link PERF: create perf event with perf_event_open() and don't use bpf_link LINK: create perf event with perf_event_open() and use bpf_link Users now can manually choose the mode with bpf_program__attach_uprobe_opts()/bpf_program__attach_kprobe_opts(). Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Biao Jiang <benbjiang@tencent.com> Link: https://lore.kernel.org/bpf/20230113093427.1666466-1-imagedong@tencent.com/ Link: https://lore.kernel.org/bpf/20230306064833.7932-2-imagedong@tencent.com	2023-03-15 13:18:34 -07:00
Daniel Müller	cf46d44f0a	sync: Add section about need for Makefile adjustments When performing a sync with the kernel repository using the sync-kernel.sh script, it may be necessary to manually adjust the library's Makefile if: - new source files were added upstream - new public headers were added upstream This change adds a new section to `SYNC.md` to spell out this need. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 13:06:39 -08:00
Daniel Müller	a41e6ef325	Stop running l4lb_all test on 5.5.0 The l4lb_all/l4lb_noinline_dynptr test does not run on kernel 5.5.0, because functionality is missing there. Do not allow running it. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Daniel Müller	c2495832ce	libbpf: Properly build zip.o The sync script does not seem to automatically add files newly added to the kernel repo build to the local Makefile. Do that now. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Daniel Müller	bfb1e97426	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c8ee37bde4021a275d2e4f33bd48d54912bb00c4 Checkpoint bpf-next commit: db55174d05ee6bed9d0583ba08e99c891ef0ed05 Baseline bpf commit: 2d311f480b52eeb2e1fd432d64b78d82952c3808 Checkpoint bpf commit: d900f3d20cc3169ce42ec72acc850e662a4d4db2 Alexei Starovoitov (1): bpf: Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted. Daniel Müller (3): libbpf: Implement basic zip archive parsing support libbpf: Introduce elf_find_func_offset_from_file() function libbpf: Add support for attaching uprobes to shared objects in APKs Joanne Koong (3): bpf: Add skb dynptrs bpf: Add xdp dynptrs bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr Tero Kristo (1): bpf: Add support for absolute value BPF timers Viktor Malik (3): libbpf: Remove unnecessary ternary operator libbpf: Remove several dead assignments libbpf: Cleanup linker_append_elf_relos include/uapi/linux/bpf.h \| 33 +++- src/bpf_helpers.h \| 2 +- src/btf.c \| 2 - src/libbpf.c \| 149 ++++++++++++++---- src/linker.c \| 11 +- src/relo_core.c \| 3 - src/zip.c \| 328 +++++++++++++++++++++++++++++++++++++++ src/zip.h \| 47 ++++++ 8 files changed, 529 insertions(+), 46 deletions(-) create mode 100644 src/zip.c create mode 100644 src/zip.h Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Daniel Müller	a468b16788	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Alexei Starovoitov	6c673bb00b	bpf: Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted. __kptr meant to store PTR_UNTRUSTED kernel pointers inside bpf maps. The concept felt useful, but didn't get much traction, since bpf_rdonly_cast() was added soon after and bpf programs received a simpler way to access PTR_UNTRUSTED kernel pointers without going through restrictive __kptr usage. Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted to indicate its intended usage. The main goal of __kptr_untrusted was to read/write such pointers directly while bpf_kptr_xchg was a mechanism to access refcnted kernel pointers. The next patch will allow RCU protected __kptr access with direct read. At that point __kptr_untrusted will be deprecated. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230303041446.3630-2-alexei.starovoitov@gmail.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Tero Kristo	b6c58f7619	bpf: Add support for absolute value BPF timers Add a new flag BPF_F_TIMER_ABS that can be passed to bpf_timer_start() to start an absolute value timer instead of the default relative value. This makes the timer expire at an exact point in time, instead of a time with latencies induced by both the BPF and timer subsystems. Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Tero Kristo <tero.kristo@linux.intel.com> Link: https://lore.kernel.org/r/20230302114614.2985072-2-tero.kristo@linux.intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Daniel Müller	db26142ffb	libbpf: Add support for attaching uprobes to shared objects in APKs This change adds support for attaching uprobes to shared objects located in APKs, which is relevant for Android systems where various libraries may reside in APKs. To make that happen, we extend the syntax for the "binary path" argument to attach to with that supported by various Android tools: <archive>!/<binary-in-archive> For example: /system/app/test-app/test-app.apk!/lib/arm64-v8a/libc++_shared.so APKs need to be specified via full path, i.e., we do not attempt to resolve mere file names by searching system directories. We cannot currently test this functionality end-to-end in an automated fashion, because it relies on an Android system being present, but there is no support for that in CI. I have tested the functionality manually, by creating a libbpf program containing a uretprobe, attaching it to a function inside a shared object inside an APK, and verifying the sanity of the returned values. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230301212308.1839139-4-deso@posteo.net	2023-03-06 09:47:37 -08:00
Daniel Müller	47eb62005a	libbpf: Introduce elf_find_func_offset_from_file() function This change splits the elf_find_func_offset() function in two: elf_find_func_offset(), which now accepts an already opened Elf object instead of a path to a file that is to be opened, as well as elf_find_func_offset_from_file(), which opens a binary based on a path and then invokes elf_find_func_offset() on the Elf object. Having this split in responsibilities will allow us to call elf_find_func_offset() from other code paths on Elf objects that did not necessarily come from a file on disk. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230301212308.1839139-3-deso@posteo.net	2023-03-06 09:47:37 -08:00
Daniel Müller	9ca6f946cd	libbpf: Implement basic zip archive parsing support This change implements support for reading zip archives, including opening an archive, finding an entry based on its path and name in it, and closing it. The code was copied from https://github.com/iovisor/bcc/pull/4440, which implements similar functionality for bcc. The author confirmed that he is fine with this usage and the corresponding relicensing. I adjusted it to adhere to libbpf coding standards. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Michał Gregorczyk <michalgr@meta.com> Link: https://lore.kernel.org/bpf/20230301212308.1839139-2-deso@posteo.net	2023-03-06 09:47:37 -08:00
Viktor Malik	87695e9723	libbpf: Cleanup linker_append_elf_relos Clang Static Analyser (scan-build) reports some unused symbols and dead assignments in the linker_append_elf_relos function. Clean these up. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/c5c8fe9f411b69afada8399d23bb048ef2a70535.1677658777.git.vmalik@redhat.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Viktor Malik	3706449b1b	libbpf: Remove several dead assignments Clang Static Analyzer (scan-build) reports several dead assignments in libbpf where the assigned value is unconditionally overridden by another value before it is read. Remove these assignments. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/5503d18966583e55158471ebbb2f67374b11bf5e.1677658777.git.vmalik@redhat.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Viktor Malik	4c75268933	libbpf: Remove unnecessary ternary operator Coverity reports that the first check of 'err' in bpf_object__init_maps is always false as 'err' is initialized to 0 at that point. Remove the unnecessary ternary operator. Signed-off-by: Viktor Malik <vmalik@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/78a3702f2ea9f32a84faaae9b674c56269d330a7.1677658777.git.vmalik@redhat.com Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Joanne Koong	3fe3cccb06	bpf: Add bpf_dynptr_slice and bpf_dynptr_slice_rdwr Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr. The user must pass in a buffer to store the contents of the data slice if a direct pointer to the data cannot be obtained. For skb and xdp type dynptrs, these two APIs are the only way to obtain a data slice. However, for other types of dynptrs, there is no difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data. For skb type dynptrs, the data is copied into the user provided buffer if any of the data is not in the linear portion of the skb. For xdp type dynptrs, the data is copied into the user provided buffer if the data is between xdp frags. If the skb is cloned and a call to bpf_dynptr_data_rdwr is made, then the skb will be uncloned (see bpf_unclone_prologue()). Please note that any bpf_dynptr_write() automatically invalidates any prior data slices of the skb dynptr. This is because the skb may be cloned or may need to pull its paged buffer into the head. As such, any bpf_dynptr_write() will automatically have its prior data slices invalidated, even if the write is to data in the skb head of an uncloned skb. Please note as well that any other helper calls that change the underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data slices of the skb dynptr as well, for the same reasons. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-10-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Joanne Koong	0c5b5b5d91	bpf: Add xdp dynptrs Add xdp dynptrs, which are dynptrs whose underlying pointer points to a xdp_buff. The dynptr acts on xdp data. xdp dynptrs have two main benefits. One is that they allow operations on sizes that are not statically known at compile-time (eg variable-sized accesses). Another is that parsing the packet data through dynptrs (instead of through direct access of xdp->data and xdp->data_end) can be more ergonomic and less brittle (eg does not need manual if checking for being within bounds of data_end). For reads and writes on the dynptr, this includes reading/writing from/to and across fragments. Data slices through the bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and bpf_dynptr_slice_rdwr() should be used. For examples of how xdp dynptrs can be used, please see the attached selftests. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-9-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Joanne Koong	d16fc1f0f5	bpf: Add skb dynptrs Add skb dynptrs, which are dynptrs whose underlying pointer points to a skb. The dynptr acts on skb data. skb dynptrs have two main benefits. One is that they allow operations on sizes that are not statically known at compile-time (eg variable-sized accesses). Another is that parsing the packet data through dynptrs (instead of through direct access of skb->data and skb->data_end) can be more ergonomic and less brittle (eg does not need manual if checking for being within bounds of data_end). For bpf prog types that don't support writes on skb data, the dynptr is read-only (bpf_dynptr_write() will return an error) For reads and writes through the bpf_dynptr_read() and bpf_dynptr_write() interfaces, reading and writing from/to data in the head as well as from/to non-linear paged buffers is supported. Data slices through the bpf_dynptr_data API are not supported; instead bpf_dynptr_slice() and bpf_dynptr_slice_rdwr() (added in subsequent commit) should be used. For examples of how skb dynptrs can be used, please see the attached selftests. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20230301154953.641654-8-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net>	2023-03-06 09:47:37 -08:00
Andrii Nakryiko	37922c6fb2	sync: add sync process documentation at SYNC.md Explain sync setup expectations, necessary steps, common gotchas and necessary manual adjustments. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-02-28 09:22:25 -08:00
Yonghong Song	19cd9a1d4b	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 951bce29c8988209cc359e1fa35a4aaa35542fd5 Checkpoint bpf-next commit: c8ee37bde4021a275d2e4f33bd48d54912bb00c4 Baseline bpf commit: 3a70e0d4c9d74cb00f7c0ec022f5599f9f7ba07d Checkpoint bpf commit: 2d311f480b52eeb2e1fd432d64b78d82952c3808 Ilya Leoshkevich (1): libbpf: Document bpf_{btf,link,map,prog}_get_info_by_fd() Puranjay Mohan (1): libbpf: Fix arm syscall regs spec in bpf_tracing.h Rob Herring (1): perf: Add perf_event_attr::config3 Tariq Toukan (1): netdev-genl: fix repeated typo oflloading -> offloading Tiezhu Yang (1): libbpf: Use struct user_pt_regs to define __PT_REGS_CAST() for LoongArch Yonghong Song (1): libbpf: Fix bpf_xdp_query() in old kernels include/uapi/linux/netdev.h \| 2 +- include/uapi/linux/perf_event.h \| 3 ++ src/bpf.h \| 69 ++++++++++++++++++++++++++++++--- src/bpf_tracing.h \| 3 ++ src/netlink.c \| 8 +++- 5 files changed, 78 insertions(+), 7 deletions(-) Signed-off-by: Yonghong Song <yhs@fb.com>	2023-02-28 09:17:25 -08:00
Tariq Toukan	a6c64dbfa2	netdev-genl: fix repeated typo oflloading -> offloading Fix a repeated copy/paste typo. Fixes: d3d854fd6a1d ("netdev-genl: create a simple family for netdev stuff") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-02-28 09:17:25 -08:00
Yonghong Song	0d7ac28818	libbpf: Fix bpf_xdp_query() in old kernels Commit 04d58f1b26a4 ("libbpf: add API to get XDP/XSK supported features") added feature_flags to struct bpf_xdp_query_opts. If a user uses bpf_xdp_query_opts with the feature_flags member, bpf_xdp_query() will check whether the 'netdev' family exists in the kernel. If it does not exist, bpf_xdp_query() will return -ENOENT. But the 'netdev' family does not exist in old kernels, as it was introduced in the same patch set as commit 04d58f1b26a4. So an old kernel with a newer libbpf won't work properly with the bpf_xdp_query() API call. To fix this issue, if the return value of libbpf_netlink_resolve_genl_family_id() is -ENOENT, bpf_xdp_query() will just return 0, skipping the rest of the xdp feature query. This preserves backward compatibility. Fixes: 04d58f1b26a4 ("libbpf: add API to get XDP/XSK supported features") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230227224943.1153459-1-yhs@fb.com	2023-02-28 09:17:25 -08:00
Ilya Leoshkevich	3fdc11b883	libbpf: Document bpf_{btf,link,map,prog}_get_info_by_fd() Replace the short informal description with the proper doc comments. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230220234958.764997-1-iii@linux.ibm.com	2023-02-28 09:17:25 -08:00
Puranjay Mohan	e198fdc928	libbpf: Fix arm syscall regs spec in bpf_tracing.h The syscall register definitions for ARM in bpf_tracing.h don't define the fifth parameter for the syscalls. Because of this, some KPROBES-based selftests fail to compile for the ARM architecture. Define the fifth parameter that is passed in the R5 register (uregs[4]). Fixes: 3a95c42d65d5 ("libbpf: Define arm syscall regs spec in bpf_tracing.h") Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230223095346.10129-1-puranjay12@gmail.com	2023-02-28 09:17:25 -08:00
Tiezhu Yang	e114bd2657	libbpf: Use struct user_pt_regs to define __PT_REGS_CAST() for LoongArch LoongArch provides struct user_pt_regs instead of struct pt_regs to userspace, use struct user_pt_regs to define __PT_REGS_CAST() to fix the following build error: CLNG-BPF [test_maps] loop1.bpf.o progs/loop1.c:22:9: error: incomplete definition of type 'struct pt_regs' m = PT_REGS_RC(ctx); ^~~~~~~~~~~~~~~ tools/testing/selftests/bpf/tools/include/bpf/bpf_tracing.h:493:41: note: expanded from macro 'PT_REGS_RC' #define PT_REGS_RC(x) (__PT_REGS_CAST(x)->__PT_RC_REG) ~~~~~~~~~~~~~~~~~^ tools/testing/selftests/bpf/tools/include/bpf/bpf_helper_defs.h:20:8: note: forward declaration of 'struct pt_regs' struct pt_regs; ^ 1 error generated. make: *** [Makefile:572: tools/testing/selftests/bpf/loop1.bpf.o] Error 1 make: Leaving directory 'tools/testing/selftests/bpf' Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1677235015-21717-2-git-send-email-yangtiezhu@loongson.cn	2023-02-28 09:17:25 -08:00
Rob Herring	bb0f8b32a5	perf: Add perf_event_attr::config3 Arm SPEv1.2 adds another 64-bits of event filtering control. As the existing perf_event_attr::configN fields are all used up for SPE PMU, an additional field is needed. Add a new 'config3' field. Tested-by: James Clark <james.clark@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-7-327f860daf28@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2023-02-28 09:17:25 -08:00
Andrii Nakryiko	f9106f6bac	ci: start using llvm-17 now LLVM 17 problems were fixed upstream, so switch to using latest v17 in CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-02-24 14:11:17 -08:00
Yonghong Song	7ef34fa945	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 6c20822fada1b8adb77fa450d03a0d449686a4a9 Checkpoint bpf-next commit: 951bce29c8988209cc359e1fa35a4aaa35542fd5 Baseline bpf commit: 6c20822fada1b8adb77fa450d03a0d449686a4a9 Checkpoint bpf commit: 3a70e0d4c9d74cb00f7c0ec022f5599f9f7ba07d Ilya Leoshkevich (2): libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd() libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() Martin KaFai Lau (1): bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup include/uapi/linux/bpf.h \| 6 ++++++ src/bpf.c \| 20 ++++++++++++++++++++ src/bpf.h \| 9 +++++++++ src/btf.c \| 8 ++++---- src/libbpf.c \| 14 +++++++------- src/libbpf.map \| 5 +++++ src/netlink.c \| 2 +- src/ringbuf.c \| 4 ++-- 8 files changed, 54 insertions(+), 14 deletions(-) Signed-off-by: Yonghong Song <yhs@fb.com>	2023-02-21 22:27:55 -08:00
Yonghong Song	7cfc12cb41	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Yonghong Song <yhs@fb.com>	2023-02-21 22:27:55 -08:00
Martin KaFai Lau	c16cae9381	bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup The bpf_fib_lookup() also looks up the neigh table. This was done before bpf_redirect_neigh() was added. In the use case that does not manage the neigh table and requires bpf_fib_lookup() to look up a fib to decide if it needs to redirect or not, the bpf prog can depend only on using bpf_redirect_neigh() to look up the neigh. It also keeps the neigh entries fresh and connected. This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid the double neigh lookup when the bpf prog always calls bpf_redirect_neigh() to do the neigh lookup. The params->smac output is also skipped when SKIP_NEIGH is set, because bpf_redirect_neigh() will figure out the smac as well. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230217205515.3583372-1-martin.lau@linux.dev	2023-02-21 22:27:55 -08:00
Ilya Leoshkevich	768164af0e	libbpf: Use bpf_{btf,link,map,prog}_get_info_by_fd() Use the new type-safe wrappers around bpf_obj_get_info_by_fd(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230214231221.249277-3-iii@linux.ibm.com	2023-02-21 22:27:55 -08:00
Ilya Leoshkevich	30f6bc3c0a	libbpf: Introduce bpf_{btf,link,map,prog}_get_info_by_fd() These are type-safe wrappers around bpf_obj_get_info_by_fd(). They found one problem in selftests, and are also useful for adding Memory Sanitizer annotations. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230214231221.249277-2-iii@linux.ibm.com	2023-02-21 22:27:55 -08:00
Song Liu	ea28429902	ci: Remove xdp_info from ALLOWLIST-5.5.0 xdp_info now depends on newer functionalities. Let's skip it for 5.5.0 kernel. Signed-off-by: Song Liu <song@kernel.org>	2023-02-17 17:17:27 -08:00
Song Liu	34212c94a6	ci: regenerate vmlinux.h Regenerate latest vmlinux.h for old kernel CI tests. Signed-off-by: Song Liu <song@kernel.org>	2023-02-17 17:17:27 -08:00
Song Liu	6f1c8eddb2	sync: Add netdev.h from kernel tree Add netdev.h to include/uapi/linux to make the build succeed. Signed-off-by: Song Liu <song@kernel.org>	2023-02-17 17:17:27 -08:00
Song Liu	4b492df97e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a5f6b9d577eba18601c14bba2dbff4a9b76af962 Checkpoint bpf-next commit: 6c20822fada1b8adb77fa450d03a0d449686a4a9 Baseline bpf commit: e8c8fd9b8393d7064152c8806f5ac446d760a23e Checkpoint bpf commit: 6c20822fada1b8adb77fa450d03a0d449686a4a9 Dave Marchevsky (1): bpf: Add basic bpf_rb_{root,node} support Florian Lehner (1): bpf: fix typo in header for bpf_perf_prog_read_value Grant Seltzer (2): libbpf: Fix malformed documentation formatting libbpf: Add documentation to map pinning API functions Hao Xiang (1): libbpf: Correctly set the kernel code version in Debian kernel. Ilya Leoshkevich (4): libbpf: Simplify barrier_var() libbpf: Fix unbounded memory access in bpf_usdt_arg() libbpf: Fix BPF_PROBE_READ{_STR}_INTO() on s390x libbpf: Fix alen calculation in libbpf_nla_dump_errormsg() Jon Doron (1): libbpf: Add sample_period to creation options Lorenzo Bianconi (3): libbpf: add the capability to specify netlink proto in libbpf_netlink_send_recv libbpf: add API to get XDP/XSK supported features libbpf: Always use libbpf_err to return an error in bpf_xdp_query() Randy Dunlap (1): Documentation: bpf: correct spelling Tiezhu Yang (1): tools/bpf: Use tab instead of white spaces to sync bpf.h docs/libbpf_naming_convention.rst \| 6 +- include/uapi/linux/bpf.h \| 17 ++++- src/bpf_core_read.h \| 4 +- src/bpf_helpers.h \| 2 +- src/libbpf.c \| 46 ++---------- src/libbpf.h \| 97 +++++++++++++++++++++--- src/libbpf_probes.c \| 83 +++++++++++++++++++++ src/netlink.c \| 118 +++++++++++++++++++++++++++--- src/nlattr.c \| 2 +- src/nlattr.h \| 12 +++ src/usdt.bpf.h \| 5 +- 11 files changed, 321 insertions(+), 71 deletions(-) Signed-off-by: Song Liu <song@kernel.org>	2023-02-17 17:17:27 -08:00
Song Liu	24476fe699	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Song Liu <song@kernel.org>	2023-02-17 17:17:27 -08:00
Dave Marchevsky	d74065659a	bpf: Add basic bpf_rb_{root,node} support This patch adds special BPF_RB_{ROOT,NODE} btf_field_types similar to BPF_LIST_{HEAD,NODE}, adds the necessary plumbing to detect the new types, and adds a bpf_rb_root_free function for freeing bpf_rb_root in map_values. structs bpf_rb_root and bpf_rb_node are opaque types meant to obscure structs rb_root_cached and rb_node, respectively. btf_struct_access will prevent BPF programs from touching these special fields automatically now that they're recognized. btf_check_and_fixup_fields now groups list_head and rb_root together as "graph root" fields and {list,rb}_node as "graph node", and does the same ownership cycle checking as before. Note that this function does _not_ prevent ownership type mixups (e.g. rb_root owning list_node) - that's handled by btf_parse_graph_root. After this patch, a bpf program can have a struct bpf_rb_root in a map_value, but cannot add anything to it or do anything useful with it. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230214004017.2534011-2-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Ilya Leoshkevich	418962b686	libbpf: Fix alen calculation in libbpf_nla_dump_errormsg() The code assumes that everything that comes after nlmsgerr are nlattrs. When calculating their size, it does not account for the initial nlmsghdr. This may lead to accessing uninitialized memory. Fixes: bbf48c18ee0c ("libbpf: add error reporting in XDP") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230210001210.395194-8-iii@linux.ibm.com	2023-02-17 17:17:27 -08:00
Jon Doron	8c8243a409	libbpf: Add sample_period to creation options Add an option to set when the perf buffer should wake up. By default, the perf buffer becomes signaled for every event that is pushed to it. With a high throughput of events, it is more efficient to wake up only once X events are ready to be read, so that the application can wake up once and drain the entire perf buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230207081916.3398417-1-arilou@gmail.com	2023-02-17 17:17:27 -08:00
Lorenzo Bianconi	6333ea6a3a	libbpf: Always use libbpf_err to return an error in bpf_xdp_query() In order to properly set errno, rely on libbpf_err utility routine in bpf_xdp_query() to return an error to the caller. Fixes: 04d58f1b26a4 ("libbpf: add API to get XDP/XSK supported features") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/827d40181f9f90fb37702f44328e1614df7c0503.1675768112.git.lorenzo@kernel.org	2023-02-17 17:17:27 -08:00
Hao Xiang	855bf91055	libbpf: Correctly set the kernel code version in Debian kernel. In a previous commit, the Ubuntu kernel code version is correctly set by retrieving the information from /proc/version_signature. commit<5b3d72987701d51bf31823b39db49d10970f5c2d> (libbpf: Improve LINUX_VERSION_CODE detection) The /proc/version_signature file isn't present in at least the older versions of Debian distributions (e.g., Debian 9, 10). The Debian kernel has a similar issue where the release information from the uname() syscall doesn't give the kernel code version that matches what the kernel actually expects. Below is example content from Debian 10. release: 4.19.0-23-amd64 version: #1 SMP Debian 4.19.269-1 (2022-12-20) x86_64 Debian reports an incorrect kernel version in utsname::release returned by the uname() syscall, which in older kernels (Debian 9, 10) leads to kprobe BPF programs failing to load due to the version check mismatch. Fortunately, the correct kernel code version is present in the utsname::version returned by the uname() syscall in Debian kernels. This change adds another get-kernel-version function to handle Debian, in addition to the previously added get-kernel-version function that handles Ubuntu. Some minor refactoring work is also done to make the code more readable. Signed-off-by: Hao Xiang <hao.xiang@bytedance.com> Signed-off-by: Ho-Ren (Jack) Chuang <horenchuang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230203234842.2933903-1-hao.xiang@bytedance.com	2023-02-17 17:17:27 -08:00
Florian Lehner	5e0270f66e	bpf: fix typo in header for bpf_perf_prog_read_value Fix a simple typo in the documentation for bpf_perf_prog_read_value. Signed-off-by: Florian Lehner <dev@der-flo.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230203121439.25884-1-dev@der-flo.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-02-17 17:17:27 -08:00
Lorenzo Bianconi	547881e04e	libbpf: add API to get XDP/XSK supported features Extend bpf_xdp_query routine in order to get XDP/XSK supported features of netdev over route netlink interface. Extend libbpf netlink implementation in order to support netlink_generic protocol. Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/a72609ef4f0de7fee5376c40dbf54ad7f13bfb8d.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Lorenzo Bianconi	41b96a8c08	libbpf: add the capability to specify netlink proto in libbpf_netlink_send_recv This is a preliminary patch in order to introduce netlink_generic protocol support to libbpf. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/7878a54667e74afeec3ee519999c044bd514b44c.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Tiezhu Yang	700d755151	tools/bpf: Use tab instead of white spaces to sync bpf.h Just silence the following build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h' Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1675319486-27744-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Ilya Leoshkevich	981da2b380	libbpf: Fix BPF_PROBE_READ{_STR}_INTO() on s390x BPF_PROBE_READ_INTO() and BPF_PROBE_READ_STR_INTO() should map to bpf_probe_read() and bpf_probe_read_str() respectively in order to work correctly on architectures with !ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-24-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Ilya Leoshkevich	23898cf858	libbpf: Fix unbounded memory access in bpf_usdt_arg() Loading programs that use bpf_usdt_arg() on s390x fails with: ; if (arg_num >= BPF_USDT_MAX_ARG_CNT \|\| arg_num >= spec->arg_cnt) 128: (79) r1 = *(u64 *)(r10 -24) ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 129: (25) if r1 > 0xb goto pc+83 ; frame1: R1_w=scalar(umax=11,var_off=(0x0; 0xf)) ... ; arg_spec = &spec->args[arg_num]; 135: (79) r1 = *(u64 *)(r10 -24) ; frame1: R1_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0 ... ; switch (arg_spec->arg_type) { 139: (61) r1 = *(u32 *)(r2 +8) R2 unbounded memory access, make sure to bounds check any such access The reason is that, even though the C code enforces that arg_num < BPF_USDT_MAX_ARG_CNT, the verifier cannot propagate this constraint to the arg_spec assignment yet. Help it by forcing r1 back to stack after comparison. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-23-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Ilya Leoshkevich	7285d529cf	libbpf: Simplify barrier_var() Use a single "+r" constraint instead of the separate "=r" and "0". Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230128000650.1516334-22-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Randy Dunlap	dd460a52bc	Documentation: bpf: correct spelling Correct spelling problems for Documentation/bpf/ as reported by codespell. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: bpf@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230128195046.13327-1-rdunlap@infradead.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-02-17 17:17:27 -08:00
Grant Seltzer	44c1d381ff	libbpf: Add documentation to map pinning API functions This adds documentation for the following API functions: - bpf_map__set_pin_path() - bpf_map__pin_path() - bpf_map__is_pinned() - bpf_map__pin() - bpf_map__unpin() - bpf_object__pin_maps() - bpf_object__unpin_maps() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230126024225.520685-1-grantseltzer@gmail.com	2023-02-17 17:17:27 -08:00
Grant Seltzer	522fe6f721	libbpf: Fix malformed documentation formatting This fixes the doxygen format documentation above the user_ring_buffer__* APIs. There has to be a newline before the @brief, otherwise doxygen won't render them for libbpf.readthedocs.org. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230126024749.522278-1-grantseltzer@gmail.com	2023-02-17 17:17:27 -08:00
Andrii Nakryiko	04aafdf9c9	ci: replicate BPF CI changes for clang installation Add ability to install specified version of Clang. This replicates what was done in https://github.com/libbpf/ci/pull/86. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-02-09 16:49:06 -08:00
Andrii Nakryiko	416620416f	sync: sync include/uapi/linux/openat2.h As reported in [0], we are missing openat2.h in libbpf-local UAPI headers. Sync it and adjust sync script to keep syncing it going forward. [0] https://github.com/libbpf/libbpf/issues/649 Closes: https://github.com/libbpf/libbpf/issues/649 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-31 16:46:18 -08:00
Joanne Koong	6b4a3f3131	ci: Update default llvm version to 17 Currently, CI is unable to locate llvm-16 on aarch64/gcc, aarch64/llvm-16, and s390x/gcc [0]. This change upgrades the default llvm version to 17. [0] https://github.com/kernel-patches/bpf/actions/runs/4040302668 Signed-off-by: Joanne Koong <joannekoong@gmail.com>	2023-01-30 17:20:12 -08:00
Dave Marchevsky	d73ecc91e1	Add patch fixing s390 issues Signed-off-by: Dave Marchevsky <davemarchevsky@gmail.com>	2023-01-26 11:23:52 -08:00
Andrii Nakryiko	c2e797c8de	ci: temporarily denylist decap_sanity test It mysteriously fails in CI, so don't run it for now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	f99818dd1a	libbpf: regenerate vmlinux.h Regenerate latest vmlinux.h for old kernel CI tests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	b2e29a1026	libbpf: bump version to v1.2 in Makefile Bump LIBBPF_MINOR_VERSION to 2 for v1.2 dev cycle. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	e398e7eaf4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 7b43df6c6ec38c9097420902a1c8165c4b25bf70 Checkpoint bpf-next commit: a5f6b9d577eba18601c14bba2dbff4a9b76af962 Baseline bpf commit: 54c3f1a81421f85e60ae2eaae7be3727a09916ee Checkpoint bpf commit: e8c8fd9b8393d7064152c8806f5ac446d760a23e Alexei Starovoitov (1): libbpf: Restore errno after pr_warn. Andrii Nakryiko (24): libbpf: start v1.2 development cycle libbpf: Add support for fetching up to 8 arguments in kprobes libbpf: Add 6th argument support for x86-64 in bpf_tracing.h libbpf: Fix arm and arm64 specs in bpf_tracing.h libbpf: Complete mips spec in bpf_tracing.h libbpf: Complete powerpc spec in bpf_tracing.h libbpf: Complete sparc spec in bpf_tracing.h libbpf: Complete riscv arch spec in bpf_tracing.h libbpf: Fix and complete ARC spec in bpf_tracing.h libbpf: Complete LoongArch (loongarch) spec in bpf_tracing.h libbpf: Add BPF_UPROBE and BPF_URETPROBE macro aliases libbpf: Improve syscall tracing support in bpf_tracing.h libbpf: Define x86-64 syscall regs spec in bpf_tracing.h libbpf: Define i386 syscall regs spec in bpf_tracing.h libbpf: Define s390x syscall regs spec in bpf_tracing.h libbpf: Define arm syscall regs spec in bpf_tracing.h libbpf: Define arm64 syscall regs spec in bpf_tracing.h libbpf: Define mips syscall regs spec in bpf_tracing.h libbpf: Define powerpc syscall regs spec in bpf_tracing.h libbpf: Define sparc syscall regs spec in bpf_tracing.h libbpf: Define riscv syscall regs spec in bpf_tracing.h libbpf: Define arc syscall regs spec in bpf_tracing.h libbpf: Define loongarch syscall regs spec in bpf_tracing.h libbpf: Clean up now not needed __PT_PARM{1-6}_SYSCALL_REG defaults Changbin Du (1): libbpf: Return -ENODATA for missing btf section Daniel T. Lee (1): libbpf: Fix invalid return address register in s390 David Vernet (1): libbpf: Support sleepable struct_ops.s section Hengqi Chen (1): libbpf: Add LoongArch support to bpf_tracing.h Ludovic L'Hours (1): libbpf: Fix map creation flags sanitization Menglong Dong (1): libbpf: Replace '.' with '_' in legacy kprobe event name Rong Tao (1): libbpf: Poison strlcpy() Stanislav Fomichev (1): bpf: Introduce device-bound XDP programs Xin Liu (2): libbpf: fix errno is overwritten after being closed. libbpf: Added the description of some API functions Ziyang Xuan (1): bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() include/uapi/linux/bpf.h \| 12 ++ src/bpf_tracing.h \| 320 ++++++++++++++++++++++++++++++++++----- src/btf.c \| 2 +- src/libbpf.c \| 10 +- src/libbpf.h \| 29 +++- src/libbpf.map \| 3 + src/libbpf_internal.h \| 5 +- src/libbpf_version.h \| 2 +- 8 files changed, 341 insertions(+), 42 deletions(-) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	c93ba3907f	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-01-25 20:44:09 -08:00
David Vernet	112479afb7	libbpf: Support sleepable struct_ops.s section In a prior change, the verifier was updated to support sleepable BPF_PROG_TYPE_STRUCT_OPS programs. A caller could set the program as sleepable with bpf_program__set_flags(), but it would be more ergonomic and more in-line with other sleepable program types if we supported suffixing a struct_ops section name with .s to indicate that it's sleepable. Signed-off-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230125164735.785732-3-void@manifault.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	004ed7120b	libbpf: Clean up now not needed __PT_PARM{1-6}_SYSCALL_REG defaults Each architecture supports at least 6 syscall argument registers, so now that specs for each architecture are defined in bpf_tracing.h, remove unnecessary macro overrides, which previously were required to keep existing BPF_KSYSCALL() uses compiling and working. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-26-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	97740e5103	libbpf: Define loongarch syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-24-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	ef191974b3	libbpf: Define arc syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-23-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	ed66fb297d	libbpf: Define riscv syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Pu Lehui <pulehui@huawei.com> # RISC-V Link: https://lore.kernel.org/bpf/20230120200914.3008030-22-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	2c58ba33fb	libbpf: Define sparc syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-21-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	9a6f8da473	libbpf: Define powerpc syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Note that 7th arg is supported on 32-bit powerpc architecture, but not on powerpc64. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-20-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	a005bb2ff8	libbpf: Define mips syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-19-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	7f627a6202	libbpf: Define arm64 syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. We need PT_REGS_PARM1_[CORE_]SYSCALL macros overrides, similarly to s390x, due to orig_x0 not being present in UAPI's pt_regs, so we need to utilize BPF CO-RE and custom pt_regs___arm64 definition. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64 Link: https://lore.kernel.org/bpf/20230120200914.3008030-18-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	a095b4f04d	libbpf: Define arm syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-17-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	bd6e1ec311	libbpf: Define s390x syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Note that we need custom overrides for PT_REGS_PARM1_[CORE_]SYSCALL macros due to the need to use BPF CO-RE and custom local pt_regs definitions to fetch orig_gpr2, storing 1st argument. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20230120200914.3008030-16-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	b2d8a8d269	libbpf: Define i386 syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-15-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	df16188dc2	libbpf: Define x86-64 syscall regs spec in bpf_tracing.h Define explicit table of registers used for syscall argument passing. Remove now unnecessary overrides of PT_REGS_PARM5_[CORE_]SYSCALL macros. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-14-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	672401ae09	libbpf: Improve syscall tracing support in bpf_tracing.h Set up generic support in bpf_tracing.h for up to 7 syscall arguments tracing with BPF_KSYSCALL, which seems to be the limit according to syscall(2) manpage. Also change the way that syscall convention is specified to be more explicit. Subsequent patches will adjust and define proper per-architecture syscall conventions. __PT_PARM1_SYSCALL_REG through __PT_PARM6_SYSCALL_REG is added temporarily to keep everything working before each architecture has syscall reg tables defined. They will be removed afterwards. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64 Link: https://lore.kernel.org/bpf/20230120200914.3008030-13-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	ed8b4c90ea	libbpf: Add BPF_UPROBE and BPF_URETPROBE macro aliases Add BPF_UPROBE and BPF_URETPROBE macros, aliased to BPF_KPROBE and BPF_KRETPROBE, respectively. This makes uprobe-based BPF program code much less confusing, especially to people new to tracing, at no cost in terms of maintainability. We'll use this macro in selftests in subsequent patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-11-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	2094e1b37e	libbpf: Complete LoongArch (loongarch) spec in bpf_tracing.h Add PARM6 through PARM8 definitions. Add kernel docs link describing ABI for LoongArch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-10-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	c978366c38	libbpf: Fix and complete ARC spec in bpf_tracing.h Add PARM6 through PARM8 definitions. Also fix frame pointer (FP) register definition. Also leave a link to where to find ABI spec. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-9-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	9db84de5f0	libbpf: Complete riscv arch spec in bpf_tracing.h Add PARM6 through PARM8 definitions for RISC V (riscv) arch. Leave the link for ABI doc for future reference. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Pu Lehui <pulehui@huawei.com> # RISC-V Link: https://lore.kernel.org/bpf/20230120200914.3008030-8-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	ffbc84cf6f	libbpf: Complete sparc spec in bpf_tracing.h Add PARM6 definition for sparc architecture. Leave a link to calling convention documentation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-7-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	7b86294a90	libbpf: Complete powerpc spec in bpf_tracing.h Add definitions of PARM6 through PARM8 for powerpc architecture. Also add a link to function call sequence documentation for future reference. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-6-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	31e29d9346	libbpf: Complete mips spec in bpf_tracing.h Add registers for PARM6 through PARM8. Add a link to an ABI. We don't distinguish between O32, N32, and N64, so document that we assume N64 right now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-5-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	1b48e879a4	libbpf: Fix arm and arm64 specs in bpf_tracing.h Remove invalid support for PARM5 on 32-bit arm, as per ABI. Add three more argument registers for arm64. Also leave links to ABI specs for future reference. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64 Link: https://lore.kernel.org/bpf/20230120200914.3008030-4-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	4759a83309	libbpf: Add 6th argument support for x86-64 in bpf_tracing.h Add r9 as register containing 6th argument on x86-64 architecture, as per its ABI. Add also a link to a page describing ABI for easier future reference. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230120200914.3008030-3-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	c94a3fd806	libbpf: Add support for fetching up to 8 arguments in kprobes Add BPF_KPROBE() and PT_REGS_PARMx() support for up to 8 arguments, if target architecture supports this. Currently all architectures are limited to only 5 register-placed arguments, which is limiting even on x86-64. This patch adds generic macro machinery to support up to 8 arguments both when explicitly fetching it from pt_regs through PT_REGS_PARMx() macros, as well as more ergonomic access in BPF_KPROBE(). Also, for i386 architecture we now don't have to define fake PARM4 and PARM5 definitions, they will be generically substituted, just like for PARM6 through PARM8. Subsequent patches will fill out architecture-specific definitions, where appropriate. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64 Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x Link: https://lore.kernel.org/bpf/20230120200914.3008030-2-andrii@kernel.org	2023-01-25 20:44:09 -08:00
Stanislav Fomichev	7d68fca99c	bpf: Introduce device-bound XDP programs New flag BPF_F_XDP_DEV_BOUND_ONLY plus all the infra to have a way to associate a netdev with a BPF program at load time. netdevsim checks are dropped in favor of generic check in dev_xdp_attach. Cc: John Fastabend <john.fastabend@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Anatoly Burakov <anatoly.burakov@intel.com> Cc: Alexander Lobakin <alexandr.lobakin@intel.com> Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Cc: Maryam Tahhan <mtahhan@redhat.com> Cc: xdp-hints@xdp-project.net Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230119221536.3349901-6-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-01-25 20:44:09 -08:00
Ziyang Xuan	ed09f7e65b	bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room() Add ipip6 and ip6ip decap support for bpf_skb_adjust_room(). Main use case is for using cls_bpf on ingress hook to decapsulate IPv4 over IPv6 and IPv6 over IPv4 tunnel packets. Add two new flags BPF_F_ADJ_ROOM_DECAP_L3_IPV{4,6} to indicate the new IP header version after decapsulating the outer IP header. Suggested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/b268ec7f0ff9431f4f43b1b40ab856ebb28cb4e1.1673574419.git.william.xuanziyang@huawei.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-01-25 20:44:09 -08:00
Menglong Dong	49e950dcfa	libbpf: Replace '.' with '_' in legacy kprobe event name '.' is not allowed in the event name of kprobe. Therefore, we will get an EINVAL if the kernel function name has a '.' in legacy kprobe attach case, such as 'icmp_reply.constprop.0'. In order to handle this case, we need to replace the '.' with another character in gen_kprobe_legacy_event_name(). And I use '_' for this purpose. Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20230113093427.1666466-1-imagedong@tencent.com	2023-01-25 20:44:09 -08:00
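The renaming described in this commit can be sketched in plain C. This is a minimal sketch, not libbpf's actual code: `example_sanitize_event_name` is a hypothetical helper, and it rewrites only '.' because that is the one character the commit message calls out.

```c
#include <stddef.h>

/* Hypothetical helper (not part of libbpf): rewrite every '.' in a
 * NUL-terminated event name to '_' in place, so that a symbol such as
 * "icmp_reply.constprop.0" yields a name without dots.
 * Returns the number of characters that were replaced. */
size_t example_sanitize_event_name(char *name)
{
	size_t replaced = 0;

	for (size_t i = 0; name[i] != '\0'; i++) {
		if (name[i] == '.') {
			name[i] = '_';
			replaced++;
		}
	}
	return replaced;
}
```

Only the generated event name is rewritten in this sketch; the function name used to look up the kernel symbol would be left untouched.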
Ludovic L'Hours	ce8d078ac7	libbpf: Fix map creation flags sanitization As BPF_F_MMAPABLE flag is now conditionally set (by map_is_mmapable), it should not be toggled but disabled if not supported by kernel. Fixes: 4fcac46c7e10 ("libbpf: only add BPF_F_MMAPABLE flag for data maps with global vars") Signed-off-by: Ludovic L'Hours <ludovic.lhours@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230108182018.24433-1-ludovic.lhours@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-01-25 20:44:09 -08:00
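The difference between toggling and disabling a flag can be illustrated as follows. This is a sketch under stated assumptions: `EXAMPLE_F_MMAPABLE` and both function names are stand-ins invented for this example, and the bit value is chosen only for illustration.

```c
#include <stdint.h>

/* Stand-in for a map creation flag; name and value are assumptions
 * made for this example only. */
#define EXAMPLE_F_MMAPABLE (1U << 10)

/* Toggling with XOR: correct only if the flag is always set beforehand.
 * If the flag was never set, this turns it ON instead of off. */
uint32_t example_sanitize_toggle(uint32_t flags)
{
	return flags ^ EXAMPLE_F_MMAPABLE;
}

/* Disabling with AND-NOT: clears the flag whether or not it was set,
 * which is the safe choice once the flag is set only conditionally. */
uint32_t example_sanitize_clear(uint32_t flags)
{
	return flags & ~EXAMPLE_F_MMAPABLE;
}
```

With an input of 0, the toggling variant returns a value with the flag set, while the clearing variant returns 0.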
Rong Tao	42d77b062c	libbpf: Poison strlcpy() Commit 9fc205b413b3 ("libbpf: Add sane strncpy alternative and use it internally") introduced libbpf_strlcpy(), so add strlcpy() to the poison list to prevent accidental use of it. Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/tencent_5695A257C4D16B4413036BA1DAACDECB0B07@qq.com	2023-01-25 20:44:09 -08:00
Changbin Du	d572b6359e	libbpf: Return -ENODATA for missing btf section As discussed before, return -ENODATA (No data available) would be more meaningful than ENOENT (No such file or directory). Suggested-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221231151436.6541-1-changbin.du@gmail.com	2023-01-25 20:44:09 -08:00
Hengqi Chen	f758104b07	libbpf: Add LoongArch support to bpf_tracing.h Add PT_REGS macros for LoongArch ([0]). [0]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/bpf/20221231100757.3177034-1-hengqi.chen@gmail.com	2023-01-25 20:44:09 -08:00
Alexei Starovoitov	b92963bbe2	libbpf: Restore errno after pr_warn. pr_warn calls into user-provided callback, which can clobber errno, so `errno = saved_errno` should happen after pr_warn. Fixes: 07453245620c ("libbpf: fix errno is overwritten after being closed.") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-25 20:44:09 -08:00
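The ordering issue this commit describes can be sketched in userspace C. The names below (`example_log_fn`, `example_noisy_logger`, `example_warn_keep_errno`) are hypothetical and stand in for a user-provided print callback and the code that calls it; this is not libbpf's implementation.

```c
#include <errno.h>

/* Hypothetical callback type standing in for a user-provided printer. */
typedef void (*example_log_fn)(const char *msg);

/* A callback that clobbers errno, as arbitrary user code might. */
void example_noisy_logger(const char *msg)
{
	(void)msg;
	errno = EINVAL;
}

/* Emit a warning without losing the errno the caller cares about:
 * save it first, call the logger, and restore it only AFTER the
 * logger has returned. */
void example_warn_keep_errno(example_log_fn log, const char *msg)
{
	int saved_errno = errno;

	log(msg);
	errno = saved_errno;
}
```

Restoring errno before the logging call would leave the caller with whatever value the callback happened to set.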
Xin Liu	34fadd0fbe	libbpf: Added the description of some API functions Currently, many API functions are not described in the document. Add API descriptions for the following four API functions: - libbpf_set_print; - bpf_object__open; - bpf_object__load; - bpf_object__close. Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221224112058.12038-1-liuxin350@huawei.com	2023-01-25 20:44:09 -08:00
Daniel T. Lee	09f1324bd7	libbpf: Fix invalid return address register in s390 There is currently an invalid register mapping in the s390 return address register. As the manual[1] states, the return address can be found at r14. In bpf_tracing.h, the s390 registers were named gprs(general purpose registers). This commit fixes the problem by correcting the mistyped mapping. [1]: https://uclibc.org/docs/psABI-s390x.pdf#page=14 Fixes: 3cc31d794097 ("libbpf: Normalize PT_REGS_xxx() macro definitions") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221224071527.2292-7-danieltimlee@gmail.com	2023-01-25 20:44:09 -08:00
Xin Liu	8cd371816b	libbpf: fix errno is overwritten after being closed. In the ensure_good_fd function, if the fcntl function succeeds but the close function fails, ensure_good_fd returns a normal fd and sets errno, which may cause users to misunderstand. The close failure is not a serious problem, and the correct FD has been handed over to the upper-layer application. Let's restore errno here. Signed-off-by: Xin Liu <liuxin350@huawei.com> Link: https://lore.kernel.org/r/20221223133618.10323-1-liuxin350@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-01-25 20:44:09 -08:00
Andrii Nakryiko	7d075a739e	libbpf: start v1.2 development cycle Bump current version for new development cycle to v1.2. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20221221180049.853365-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-01-25 20:44:09 -08:00
Quentin Monnet	3423d5e7cd	sync: Remove "git format-patch" signature (version) from cover letter When syncing with the kernel, the script generates a cover letter for the latest changes using "git format-patch". Unless specified otherwise, it uses a signature (as in, email footer signature) which defaults to the Git version in use, and ends up in the commit logs. This doesn't bring any useful information in there: let's get rid of this version number. Signed-off-by: Quentin Monnet <quentin@isovalent.com>	2023-01-04 09:23:49 -08:00
Daniel Müller	e3a40329bb	ci: Add patch setting CONFIG_FUNCTION_ERROR_INJECTION in CI Similar to what we did for vmtest [0], libbpf needs the patch setting CONFIG_FUNCTION_ERROR_INJECTION in CI. Add it. [0] https://github.com/kernel-patches/vmtest/pull/181 Signed-off-by: Daniel Müller <deso@posteo.net>	2022-12-21 12:11:32 -08:00
Andrii Nakryiko	6597330c45	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0e43662e61f2569500ab83b8188c065603530785 Checkpoint bpf-next commit: 7b43df6c6ec38c9097420902a1c8165c4b25bf70 Baseline bpf commit: f506439ec3dee11e0e77b0a1f3fb3eec22c97873 Checkpoint bpf commit: 54c3f1a81421f85e60ae2eaae7be3727a09916ee Changbin Du (1): libbpf: Show error info about missing ".BTF" section Christian Ehrig (1): bpf: Add flag BPF_F_NO_TUNNEL_KEY to bpf_skb_set_tunnel_key() Khem Raj (1): libbpf: Fix build warning on ref_ctr_off for 32-bit architectures include/uapi/linux/bpf.h \| 4 ++++ src/btf.c \| 1 + src/libbpf.c \| 2 +- 3 files changed, 6 insertions(+), 1 deletion(-) -- 2.30.2 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-20 22:23:18 -08:00
Andrii Nakryiko	2e287cd201	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-20 22:23:18 -08:00
Changbin Du	49bd40e869	libbpf: Show error info about missing ".BTF" section Show the real problem instead of just saying "No such file or directory". Now will print below info: libbpf: failed to find '.BTF' ELF section in /home/changbin/work/linux/vmlinux Error: failed to load BTF from /home/changbin/work/linux/vmlinux: No such file or directory Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221217223509.88254-2-changbin.du@gmail.com	2022-12-20 22:23:18 -08:00
Khem Raj	f7dba2c313	libbpf: Fix build warning on ref_ctr_off for 32-bit architectures Clang warns on 32-bit ARM on this comparison: libbpf.c:10497:18: error: result of comparison of constant 4294967296 with expression of type 'size_t' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS)) ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Typecast ref_ctr_off to __u64 in the check conditional, it is false on 32bit anyways. Signed-off-by: Khem Raj <raj.khem@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221219191526.296264-1-raj.khem@gmail.com	2022-12-20 22:23:18 -08:00
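A small sketch of the widening cast this commit describes. `EXAMPLE_REF_CTR_OFFSET_BITS` and `example_ref_ctr_off_fits` are names made up for this example, and the width of 32 bits is an assumption inferred from the constant 4294967296 in the warning text.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed width of the offset field, for illustration only. */
#define EXAMPLE_REF_CTR_OFFSET_BITS 32

/* Returns 1 if `off` fits in the field, 0 otherwise. The value is
 * widened to 64 bits before comparing, so both operands have the same
 * type even on targets where size_t is only 32 bits wide. */
int example_ref_ctr_off_fits(size_t off)
{
	return (uint64_t)off < (1ULL << EXAMPLE_REF_CTR_OFFSET_BITS);
}
```

On a 32-bit target every `size_t` value passes this check, which matches the commit's remark that the original condition is always false there.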
Christian Ehrig	41ac436073	bpf: Add flag BPF_F_NO_TUNNEL_KEY to bpf_skb_set_tunnel_key() This patch allows to remove TUNNEL_KEY from the tunnel flags bitmap when using bpf_skb_set_tunnel_key by providing a BPF_F_NO_TUNNEL_KEY flag. On egress, the resulting tunnel header will not contain a tunnel key if the protocol and implementation supports it. At the moment bpf_tunnel_key wants a user to specify a numeric tunnel key. This will wrap the inner packet into a tunnel header with the key bit and value set accordingly. This is problematic when using a tunnel protocol that supports optional tunnel keys and a receiving tunnel device that is not expecting packets with the key bit set. The receiver won't decapsulate and drop the packet. RFC 2890 and RFC 2784 GRE tunnels are examples where this flag is useful. It allows for generating packets, that can be decapsulated by a GRE tunnel device not operating in collect metadata mode or not expecting the key bit set. Signed-off-by: Christian Ehrig <cehrig@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221218051734.31411-1-cehrig@cloudflare.com	2022-12-20 22:23:18 -08:00
Andrii Nakryiko	75987cc295	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b148c8b9b926e257a59c8eb2cd6fa3adfd443254 Checkpoint bpf-next commit: 0e43662e61f2569500ab83b8188c065603530785 Baseline bpf commit: 4121d4481b72501aa4d22680be4ea1096d69d133 Checkpoint bpf commit: f506439ec3dee11e0e77b0a1f3fb3eec22c97873 Andrii Nakryiko (1): libbpf: Fix btf_dump's packed struct determination src/btf_dump.c \| 33 ++++++--------------------------- 1 file changed, 6 insertions(+), 27 deletions(-) -- 2.30.2 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-15 14:34:52 -08:00
Andrii Nakryiko	b9f1a06c70	libbpf: Fix btf_dump's packed struct determination Fix bug in btf_dump's logic of determining if a given struct type is packed or not. The notion of "natural alignment" is not needed and is even harmful in this case, so drop it altogether. The biggest difference in btf_is_struct_packed() compared to its original implementation is that we don't really use btf__align_of() to determine overall alignment of a struct type (because it could be 1 for both packed and non-packed struct, depending on specific field definitions), and just use field's actual alignment to calculate whether any field is requiring packing or struct's size overall necessitates packing. Add two simple test cases that demonstrate the difference this change would make. Fixes: ea2ce1ba99aa ("libbpf: Fix BTF-to-C converter's padding logic") Reported-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20221215183605.4149488-1-andrii@kernel.org	2022-12-15 14:34:52 -08:00
Andrii Nakryiko	30554b08fe	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 706819495921ddad6b3780140b9d9e9293b6dedc Checkpoint bpf-next commit: b148c8b9b926e257a59c8eb2cd6fa3adfd443254 Baseline bpf commit: e931a173a685fe213127ae5aa6b7f2196c1d875d Checkpoint bpf commit: 4121d4481b72501aa4d22680be4ea1096d69d133 Andrii Nakryiko (4): libbpf: Fix single-line struct definition output in btf_dump libbpf: Handle non-standardly sized enums better in BTF-to-C dumper libbpf: Fix btf__align_of() by taking into account field offsets libbpf: Fix BTF-to-C converter's padding logic Eyal Birger (1): tools: add IFLA_XFRM_COLLECT_METADATA to uapi/linux/if_link.h Kumar Kartikeya Dwivedi (1): bpf: Rework process_dynptr_func Timo Hunziker (1): libbpf: Parse usdt args without offset on x86 (e.g. 8@(%rsp)) Xin Liu (1): libbpf: Optimized return value in libbpf_strerror when errno is libbpf errno include/uapi/linux/bpf.h \| 8 +- include/uapi/linux/if_link.h \| 1 + src/btf.c \| 13 +++ src/btf_dump.c \| 214 +++++++++++++++++++++++++++-------- src/libbpf_errno.c \| 16 ++- src/usdt.c \| 8 ++ 6 files changed, 204 insertions(+), 56 deletions(-) -- 2.30.2 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-14 22:09:00 -08:00
Andrii Nakryiko	b0ff8e90f7	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-14 22:09:00 -08:00
Andrii Nakryiko	0b80970cb6	libbpf: Fix BTF-to-C converter's padding logic Turns out that btf_dump API doesn't handle a bunch of tricky corner cases, as reported by Per, and further discovered using his testing Python script ([0]). This patch revamps btf_dump's padding logic significantly, making it more correct and also avoiding unnecessary explicit padding, where compiler would pad naturally. This overall topic turned out to be very tricky and subtle, there are lots of subtle corner cases. The comments in the code try to give some clues, but comments themselves are supposed to be paired with good understanding of C alignment and padding rules. Plus some experimentation to figure out subtle things like whether `long :0;` means that struct is now forced to be long-aligned (no, it's not, turns out). Anyways, Per's script, while not completely correct in some known situations, doesn't show any obvious cases where this logic breaks, so this is a nice improvement over the previous state of this logic. Some selftests had to be adjusted to accommodate better use of natural alignment rules, eliminating some unnecessary padding, or changing it to `type: 0;` alignment markers. Note also that for when we are in between bitfields, we emit explicit bit size, while otherwise we use `: 0`, this feels much more natural in practice. Next patch will add few more test cases, found through randomized Per's script. [0] https://lore.kernel.org/bpf/85f83c333f5355c8ac026f835b18d15060725fcb.camel@ericsson.com/ Reported-by: Per Sundström XP <per.xp.sundstrom@ericsson.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221212211505.558851-6-andrii@kernel.org	2022-12-14 22:09:00 -08:00
Andrii Nakryiko	58b164237a	libbpf: Fix btf__align_of() by taking into account field offsets btf__align_of() is supposed to return the alignment requirement of a requested BTF type. For STRUCT/UNION it doesn't always return correct value, because it calculates alignment only based on field types. But for packed structs this is not enough, we need to also check field offsets and struct size. If field offset isn't aligned according to field type's natural alignment, then struct must be packed. Similarly, if struct size is not a multiple of struct's natural alignment, then struct must be packed as well. This patch fixes this issue precisely by additionally checking these conditions. Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221212211505.558851-5-andrii@kernel.org	2022-12-14 22:09:00 -08:00
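The two conditions named in this commit message (a misaligned field offset, or a struct size that is not a multiple of the natural alignment) can be sketched on a simplified layout description. `struct example_field` and `example_struct_must_be_packed` are hypothetical and do not use real BTF types; the sketch also assumes the struct's natural alignment is the largest alignment among its fields.

```c
#include <stddef.h>

/* Hypothetical, simplified description of one struct field. */
struct example_field {
	size_t offset; /* byte offset of the field within the struct */
	size_t align;  /* natural alignment of the field's type; must be >= 1 */
};

/* Returns 1 if the described layout can only come from a packed struct:
 * either some field sits at an offset that is not a multiple of its
 * natural alignment, or the total size is not a multiple of the largest
 * field alignment. Returns 0 otherwise. */
int example_struct_must_be_packed(const struct example_field *fields,
				  size_t n, size_t struct_size)
{
	size_t max_align = 1;

	for (size_t i = 0; i < n; i++) {
		if (fields[i].align > max_align)
			max_align = fields[i].align;
		if (fields[i].offset % fields[i].align != 0)
			return 1; /* misaligned field */
	}
	return struct_size % max_align != 0;
}
```

For example, a one-byte field at offset 0 followed by a four-byte-aligned field at offset 1 is reported as packed, while the same fields at offsets 0 and 4 in an 8-byte struct are not.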
Andrii Nakryiko	e6e0e3fd85	libbpf: Handle non-standardly sized enums better in BTF-to-C dumper Turns out C allows to force enum to be 1-byte or 8-byte explicitly using mode(byte) or mode(word), respectively. Linux sources are using this in some cases. This is important to handle correctly, as enum size determines corresponding fields in a struct that use that enum type. And if enum size is incorrect, this will lead to invalid struct layout. So add mode(byte) and mode(word) attribute support to btf_dump APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221212211505.558851-3-andrii@kernel.org	2022-12-14 22:09:00 -08:00
Andrii Nakryiko	db11704944	libbpf: Fix single-line struct definition output in btf_dump btf_dump APIs emit unnecessary tabs when emitting struct/union definition that fits on the single line. Before this patch we'd get: struct blah {<tab>}; This patch fixes this and makes sure that we get more natural: struct blah {}; Fixes: 44a726c3f23c ("bpftool: Print newline before '}' for struct with padding only fields") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221212211505.558851-2-andrii@kernel.org	2022-12-14 22:09:00 -08:00
Xin Liu	8d719b0c08	libbpf: Optimized return value in libbpf_strerror when errno is libbpf errno This is a small improvement in libbpf_strerror. When libbpf_strerror is used to obtain the system error description, if the length of the buf is insufficient, libbpf_strerror returns ERANGE and sets errno to ERANGE. However, this processing is not performed when the error code customized by libbpf is obtained. Make some minor improvements here, return -ERANGE and set errno to ERANGE when buf is not enough for custom description. Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221210082045.233697-1-liuxin350@huawei.com	2022-12-14 22:09:00 -08:00
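The behaviour described here (report truncation of a custom error description with -ERANGE and errno set to ERANGE) might look roughly like the sketch below. `example_describe_error` is a hypothetical function written for this illustration, not the libbpf_strerror source.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy a custom error description into buf.
 * Returns 0 on success. If buf is too small to hold the whole message,
 * a truncated, NUL-terminated prefix is left in buf (when size > 0),
 * errno is set to ERANGE and -ERANGE is returned. */
int example_describe_error(const char *msg, char *buf, size_t size)
{
	int n;

	if (size == 0) {
		errno = ERANGE;
		return -ERANGE;
	}

	/* snprintf() returns the length the full message would need. */
	n = snprintf(buf, size, "%s", msg);
	if (n < 0 || (size_t)n >= size) {
		errno = ERANGE;
		return -ERANGE;
	}
	return 0;
}
```

A caller can then treat -ERANGE uniformly, whichever kind of error code was being described.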
Kumar Kartikeya Dwivedi	6b90604fa7	bpf: Rework process_dynptr_func Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type for use in callback state, because in case of user ringbuf helpers, there is no dynptr on the stack that is passed into the callback. To reflect such a state, a special register type was created. However, some checks have been bypassed incorrectly during the addition of this feature. First, for arg_type with MEM_UNINIT flag which initialize a dynptr, they must be rejected for such register type. Secondly, in the future, there are plans to add dynptr helpers that operate on the dynptr itself and may change its offset and other properties. In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed to such helpers, however the current code simply returns 0. The rejection for helpers that release the dynptr is already handled. For fixing this, we take a step back and rework existing code in a way that will allow fitting in all classes of helpers and have a coherent model for dealing with the variety of use cases in which dynptr is used. First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together with a DYNPTR_TYPE_* constant that denotes the only type it accepts. Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this fact. To make the distinction clear, use MEM_RDONLY flag to indicate that the helper only operates on the memory pointed to by the dynptr, not the dynptr itself. In C parlance, it would be equivalent to taking the dynptr as a pointer to const argument. When either of these flags are not present, the helper is allowed to mutate both the dynptr itself and also the memory it points to. Currently, the read only status of the memory is not tracked in the dynptr, but it would be trivial to add this support inside dynptr state of the register. With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to better reflect its usage, it can no longer be passed to helpers that initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr. A note to reviewers is that in code that does mark_stack_slots_dynptr, and unmark_stack_slots_dynptr, we implicitly rely on the fact that PTR_TO_STACK reg is the only case that can reach that code path, as one cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In both cases such helpers won't be setting that flag. The next patch will add a couple of selftest cases to make sure this doesn't break. Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper") Acked-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221207204141.308952-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-12-14 22:09:00 -08:00
Timo Hunziker	74244c5bd7	libbpf: Parse usdt args without offset on x86 (e.g. 8@(%rsp)) Parse USDT arguments like "8@(%rsp)" on x86. These are emitted by SystemTap. The argument syntax is similar to the existing "memory dereference case" but the offset is left out as it's zero (i.e. read the value from the address in the register). We treat it the same as the "memory dereference case", but set the offset to 0. I've tested that this fixes the "unrecognized arg #N spec: 8@(%rsp).." error I've run into when attaching to a probe with such an argument. Attaching and reading the correct argument values works. Something similar might be needed for the other supported architectures. [0] Closes: https://github.com/libbpf/libbpf/issues/559 Signed-off-by: Timo Hunziker <timo.hunziker@gmx.ch> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221203123746.2160-1-timo.hunziker@eclipso.ch	2022-12-14 22:09:00 -08:00
Eyal Birger	da08611c65	tools: add IFLA_XFRM_COLLECT_METADATA to uapi/linux/if_link.h Needed for XFRM metadata tests. Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Link: https://lore.kernel.org/r/20221203084659.1837829-4-eyal.birger@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-12-14 22:09:00 -08:00
Andrii Nakryiko	1e479aec4f	ci: don't run test_maps in libbpf CI It crashes often, it doesn't really test libbpf much. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-07 09:28:07 -08:00
Andrii Nakryiko	8846dc7a20	ci: fix Ubuntu version for kernel tests and pahole workflows Having too new build environment in workflows that build selftests on the host, but run them in a separate QEMU image can lead to problems with runtime linker complaining about missing new enough version of glibc and other dependencies. Until we update images, fix used Ubuntu version to ubuntu-20.04 to mitigate. Suggested-by: Manu Bretelle <chantr4@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-05 11:52:11 -08:00
Andrii Nakryiko	eb9b5c567d	sync: regenerate vmlinux.h Update checked in vmlinux.h for 5.5 and 4.9 kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-02 22:12:29 -08:00
Andrii Nakryiko	be8f15bb93	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5b1d640800de7fe02d68bf592d9d101de24c87f2 Checkpoint bpf-next commit: 706819495921ddad6b3780140b9d9e9293b6dedc Baseline bpf commit: 47df8a2f78bc34ff170d147d05b121f84e252b85 Checkpoint bpf commit: e931a173a685fe213127ae5aa6b7f2196c1d875d Alexei Starovoitov (1): selftests/bpf: Workaround for llvm nop-4 bug Andrii Nakryiko (2): libbpf: Ignore hashmap__find() result explicitly in btf_dump libbpf: Avoid enum forward-declarations in public API in C++ mode Donald Hunter (1): docs/bpf: Add table of BPF program types to libbpf docs Hou Tao (4): libbpf: Use page size as max_entries when probing ring buffer map libbpf: Handle size overflow for ringbuf mmap libbpf: Handle size overflow for user ringbuf mmap libbpf: Check the validity of size in user_ring_buffer__reserve() Ji Rongfeng (1): bpf: Update bpf_{g,s}etsockopt() documentation docs/index.rst \| 3 + docs/program_types.rst \| 203 +++++++++++++++++++++++++++++++++++++++ include/uapi/linux/bpf.h \| 23 +++-- src/bpf.h \| 7 ++ src/btf_dump.c \| 2 +- src/libbpf.c \| 3 +- src/libbpf_probes.c \| 2 +- src/ringbuf.c \| 26 +++-- 8 files changed, 250 insertions(+), 19 deletions(-) create mode 100644 docs/program_types.rst -- 2.30.2 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-02 22:12:29 -08:00
Andrii Nakryiko	2bf5ed3a48	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-02 22:12:29 -08:00
Andrii Nakryiko	0fbf777e0b	libbpf: Avoid enum forward-declarations in public API in C++ mode C++ enum forward declarations are fundamentally not compatible with pure C enum definitions, and so libbpf's use of `enum bpf_stats_type;` forward declaration in libbpf/bpf.h public API header is causing C++ compilation issues. More details can be found in [0], but it comes down to C++ supporting enum forward declaration only with explicitly specified backing type: enum bpf_stats_type: int; In C (and I believe it's a GCC extension also), such forward declaration is simply: enum bpf_stats_type; Further, in Linux UAPI this enum is defined in pure C way: enum bpf_stats_type { BPF_STATS_RUN_TIME = 0, }; And even though in both cases backing type is int, which can be confirmed by looking at DWARF information, for C++ compiler actual enum definition and forward declaration are incompatible. To eliminate this problem, for C++ mode define input argument as int, which makes enum unnecessary in libbpf public header. This solves the issue and as demonstrated by next patch doesn't cause any unwanted compiler warnings, at least with default warnings setting. [0] https://stackoverflow.com/questions/42766839/c11-enum-forward-causes-underlying-type-mismatch [1] Closes: https://github.com/libbpf/libbpf/issues/249 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221130200013.2997831-1-andrii@kernel.org	2022-12-02 22:12:29 -08:00
Hou Tao	4d21c979ce	libbpf: Check the validity of size in user_ring_buffer__reserve() The top two bits of size are used as busy and discard flags, so reject the reservation that has any of these special bits in the size. With the addition of validity check, there is also no need to check whether or not total_size is overflowed. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221116072351.1168938-5-houtao@huaweicloud.com	2022-12-02 22:12:29 -08:00
Hou Tao	11ad834557	libbpf: Handle size overflow for user ringbuf mmap Similar to the overflow problem on ringbuf mmap, in user_ringbuf_map() 2 * max_entries may overflow u32 when mapping writeable region. Fix it by casting the size of writable mmap region into a __u64 and checking whether or not there will be overflow during mmap. Fixes: b66ccae01f1d ("bpf: Add libbpf logic for user-space ring buffer") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221116072351.1168938-4-houtao@huaweicloud.com	2022-12-02 22:12:29 -08:00
Hou Tao	f056d1bd54	libbpf: Handle size overflow for ringbuf mmap The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries will overflow u32 when mapping producer page and data pages. Only casting max_entries to size_t is not enough, because for a 32-bit application on a 64-bit kernel the size of read-only mmap region also could overflow size_t. So fix it by casting the size of read-only mmap region into a __u64 and checking whether or not there will be overflow during mmap. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221116072351.1168938-3-houtao@huaweicloud.com	2022-12-02 22:12:29 -08:00
Hou Tao	b822a139e3	libbpf: Use page size as max_entries when probing ring buffer map Using page size as max_entries when probing ring buffer map, else the probe may fail on host with 64KB page size (e.g., an ARM64 host). After the fix, the output of "bpftool feature" on above host will be correct. Before : eBPF map_type ringbuf is NOT available eBPF map_type user_ringbuf is NOT available After : eBPF map_type ringbuf is available eBPF map_type user_ringbuf is available Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221116072351.1168938-2-houtao@huaweicloud.com	2022-12-02 22:12:29 -08:00
Ji Rongfeng	a5b4a53781	bpf: Update bpf_{g,s}etsockopt() documentation * append missing optnames to the end * simplify bpf_getsockopt()'s doc Signed-off-by: Ji Rongfeng <SikoJobs@outlook.com> Link: https://lore.kernel.org/r/DU0P192MB15479B86200B1216EC90E162D6099@DU0P192MB1547.EURP192.PROD.OUTLOOK.COM Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-12-02 22:12:29 -08:00
Donald Hunter	e84419ff5a	docs/bpf: Add table of BPF program types to libbpf docs Extend the libbpf documentation with a table of program types, attach points and ELF section names. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20221121121734.98329-1-donald.hunter@gmail.com	2022-12-02 22:12:29 -08:00
Alexei Starovoitov	ca515c0dda	selftests/bpf: Workaround for llvm nop-4 bug Currently LLVM fails to recognize .data.* as data section and defaults to .text section. Later BPF backend tries to emit 4-byte NOP instruction which doesn't exist in BPF ISA and aborts. The fix for LLVM is pending: https://reviews.llvm.org/D138477 While waiting for the fix, let's work around the linked_list test case by using .bss.* prefix which is properly recognized by LLVM as BSS section. Fix libbpf to support .bss. prefix and adjust tests. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-12-02 22:12:29 -08:00
Andrii Nakryiko	95959419a7	libbpf: Ignore hashmap__find() result explicitly in btf_dump Coverity is reporting that btf_dump_name_dups() doesn't check return result of hashmap__find() call. This is intentional, so make it explicit with (void) cast. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221117192824.4093553-1-andrii@kernel.org	2022-12-02 22:12:29 -08:00
Andrii Nakryiko	3c659715ec	sync: fix sync scripts commit_signature function After recent lint changes, commit_signature() function now gets optional array of paths as multiple arguments, instead of entire array as second argument. So adjust commit_signature() to handle this correctly. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-02 21:04:03 -08:00
Andrii Nakryiko	f46b17ef0e	sync: add Signed-off-by for auto-generated sync commits Now that we enforce Signed-off-by on every commit, make sure that auto-generated sync commits also get correct Signed-off-by tags. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-12-02 20:51:21 -08:00
Evgeny Vereshchagin	1596a09b5d	oss-fuzz: bump elfutils to make it less likely for the libbpf fuzz target to run into elfutils bugs that have been fixed upstream since two new fuzz targets were added there back in April. Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2022-11-18 13:54:40 -08:00
Kui-Feng Lee	5322b8e76c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b548b17a93fd18357a5a6f535c10c1e68719ad32 Checkpoint bpf-next commit: 5b1d640800de7fe02d68bf592d9d101de24c87f2 Baseline bpf commit: 9cbd48d5fa14e4c65f8580de16686077f7cea02b Checkpoint bpf commit: 47df8a2f78bc34ff170d147d05b121f84e252b85 David Michael (1): libbpf: Fix uninitialized warning in btf_dump_dump_type_data Jiri Olsa (1): libbpf: Use correct return pointer in attach_raw_tp Kang Minchul (3): libbpf: checkpatch: Fixed code alignments in btf.c libbpf: Fixed various checkpatch issues in libbpf.c libbpf: checkpatch: Fixed code alignments in ringbuf.c Kumar Kartikeya Dwivedi (1): bpf: Support bpf_list_head in map values include/uapi/linux/bpf.h \| 10 +++++++++ src/btf.c \| 5 +++-- src/btf_dump.c \| 2 +- src/libbpf.c \| 47 +++++++++++++++++++++++++--------------- src/ringbuf.c \| 4 ++-- 5 files changed, 45 insertions(+), 23 deletions(-) -- 2.30.2	2022-11-18 13:53:39 -08:00
Kui-Feng Lee	15bbaabed8	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-11-18 13:53:39 -08:00
Jiri Olsa	eb77c7210b	libbpf: Use correct return pointer in attach_raw_tp We need to pass '*link' to final libbpf_get_error, because that one holds the return value, not 'link'. Fixes: 4fa5bcfe07f7 ("libbpf: Allow BPF program auto-attach handlers to bail out") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221114145257.882322-1-jolsa@kernel.org	2022-11-18 13:53:39 -08:00
Kumar Kartikeya Dwivedi	2557efc8e1	bpf: Support bpf_list_head in map values Add the support on the map side to parse, recognize, verify, and build metadata table for a new special field of the type struct bpf_list_head. To parameterize the bpf_list_head for a certain value type and the list_node member it will accept in that value type, we use BTF declaration tags. The definition of bpf_list_head in a map value will be done as follows: struct foo { struct bpf_list_node node; int data; }; struct map_value { struct bpf_list_head head __contains(foo, node); }; Then, the bpf_list_head only allows adding to the list 'head' using the bpf_list_node 'node' for the type struct foo. The 'contains' annotation is a BTF declaration tag composed of three parts, "contains:name:node" where the name is then used to look up the type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during the lookup. The node defines name of the member in this type that has the type struct bpf_list_node, which is actually used for linking into the linked list. For now, 'kind' part is hardcoded as struct. This allows building intrusive linked lists in BPF, using container_of to obtain pointer to entry, while being completely type safe from the perspective of the verifier. The verifier knows exactly the type of the nodes, and knows that list helpers return that type at some fixed offset where the bpf_list_node member used for this list exists. The verifier also uses this information to disallow adding types that are not accepted by a certain list. For now, no elements can be added to such lists. Support for that is coming in future patches, hence draining and freeing items is done with a TODO that will be resolved in a future patch. Note that the bpf_list_head_free function moves the list out to a local variable under the lock and releases it, doing the actual draining of the list items outside the lock. 
While this helps with not holding the lock for too long pessimizing other concurrent list operations, it is also necessary for deadlock prevention: unless every function called in the critical section would be notrace, a fentry/fexit program could attach and call bpf_map_update_elem again on the map, leading to the same lock being acquired if the key matches and lead to a deadlock. While this requires some special effort on part of the BPF programmer to trigger and is highly unlikely to occur in practice, it is always better if we can avoid such a condition. While notrace would prevent this, doing the draining outside the lock has advantages of its own, hence it is used to also fix the deadlock related problem. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-18 13:53:39 -08:00
Kang Minchul	9781b9eced	libbpf: checkpatch: Fixed code alignments in ringbuf.c Fixed some checkpatch issues in ringbuf.c Signed-off-by: Kang Minchul <tegongkang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221113190648.38556-4-tegongkang@gmail.com	2022-11-18 13:53:39 -08:00
Kang Minchul	4c3b53d09c	libbpf: Fixed various checkpatch issues in libbpf.c Fixed following checkpatch issues: WARNING: Block comments use a trailing */ on a separate line + other BPF program's BTF object */ WARNING: Possible repeated word: 'be' + name. This is important to be be able to find corresponding BTF ERROR: switch and case should be at the same indent + switch (ext->kcfg.sz) { + case 1: *(__u8 *)ext_val = value; break; + case 2: *(__u16 *)ext_val = value; break; + case 4: *(__u32 *)ext_val = value; break; + case 8: *(__u64 *)ext_val = value; break; + default: ERROR: trailing statements should be on next line + case 1: *(__u8 *)ext_val = value; break; ERROR: trailing statements should be on next line + case 2: *(__u16 *)ext_val = value; break; ERROR: trailing statements should be on next line + case 4: *(__u32 *)ext_val = value; break; ERROR: trailing statements should be on next line + case 8: *(__u64 *)ext_val = value; break; ERROR: code indent should use tabs where possible + }$ WARNING: please, no spaces at the start of a line + }$ WARNING: Block comments use a trailing */ on a separate line + for faster search */ ERROR: code indent should use tabs where possible +^I^I^I^I^I^I &ext->kcfg.is_signed);$ WARNING: braces {} are not necessary for single statement blocks + if (err) { + return err; + } ERROR: code indent should use tabs where possible +^I^I^I^I sizeof(*obj->btf_modules), obj->btf_module_cnt + 1);$ Signed-off-by: Kang Minchul <tegongkang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221113190648.38556-3-tegongkang@gmail.com	2022-11-18 13:53:39 -08:00
Kang Minchul	7b18ff1212	libbpf: checkpatch: Fixed code alignments in btf.c Fixed some checkpatch issues in btf.c Signed-off-by: Kang Minchul <tegongkang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221113190648.38556-2-tegongkang@gmail.com	2022-11-18 13:53:39 -08:00
David Michael	c975797ebe	libbpf: Fix uninitialized warning in btf_dump_dump_type_data GCC 11.3.0 fails to compile btf_dump.c due to the following error, which seems to originate in btf_dump_struct_data where the returned value would be uninitialized if btf_vlen returns zero. btf_dump.c: In function ‘btf_dump_dump_type_data’: btf_dump.c:2363:12: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 2363 \| if (err < 0) \| ^ Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data") Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/87zgcu60hq.fsf@gmail.com	2022-11-18 13:53:39 -08:00
Manu Bretelle	9167308b4a	ci: remove s390x-self-hosted-builder from libbpf/libbpf Those were moved to libbpf/ci: https://github.com/libbpf/ci/tree/master/rootfs/s390x-self-hosted-builder Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-11-16 13:58:37 -08:00
Manu Bretelle	7049d3a2ea	ci: Use `s390x` label to schedule workflows on s390x The runners are having their labels uniformized across architecture. z15 is being removed in favor of s390x. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-11-16 13:55:31 -08:00
Andrii Nakryiko	ea931ec6c5	ci: drop LGTM integration LGTM is deprecated, remove it. We have CodeQL now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-16 12:17:40 -08:00
Andrii Nakryiko	3a73d6f865	readme: replace LGTM badge with CodeQL badge LGTM is going to be removed, CodeQL is supposed to be a replacement. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-16 12:17:40 -08:00
Andrii Nakryiko	7b0891ac6b	ci: build libbpf with more versions of clang and gcc Add a few more versions of clang and gcc used to compile-test libbpf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-16 12:16:17 -08:00
Andrii Nakryiko	c80f12f7f6	ci: fix Debian builds due to pkg-config dependency change Seems like we need pkgconfig dependency instead of pkg-config. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-16 11:25:17 -08:00
Andrii Nakryiko	3b6093fd43	sync: start syncing include/uapi/linux/fcntl.h UAPI header Libbpf relies on F_DUPFD_CLOEXEC constant coming from fcntl.h UAPI header, so we need to sync it along other UAPI headers. Also update sync script to keep doing this automatically going forward. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-16 10:56:59 -08:00
Andrii Nakryiko	8d358ab948	sync: make LIBBPF_PATHS and LIBBPF_VIEW_PATHS into real array variables Use correct Bash syntax to define these two variables as arrays. Drop shellcheck opt-out for unquoted use of array. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-14 21:42:37 -08:00
Andrii Nakryiko	971ad8f8d0	sync: fix sync script's use of bash array variables Don't wrap LIBBPF_PATHS[@] and LIBBPF_VIEW_PATHS[@] in quotes when passing it to git commands. Not clear how it worked before, but something recently broke. Either git commands became stricter or something. But either way, we do want to pass each element of LIBBPF_PATHS or LIBBPF_VIEW_PATHS as separate command line arguments, so putting them in quotes doesn't make sense, as that makes them look like a single argument to git. So drop all the quotes around these arrays. The only place where it's still needed is in commit_signature call, as we do want to pass array as single arg ($2) and then internally we unfold it into multiple command line arguments. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-12 18:24:12 -08:00
Andrii Nakryiko	2ed27f9e63	ci: update vmlinux.h Update vmlinux.h to get latest enums for some of selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-11-12 18:24:12 -08:00
Andrii Nakryiko	4bdbb7ea28	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 62c69e89e81bfbdb9a87ae3e0599dcc6aacf786b Checkpoint bpf-next commit: b548b17a93fd18357a5a6f535c10c1e68719ad32 Baseline bpf commit: e7b09357453a99e6f9e74c39e9ca1363c22c0b96 Checkpoint bpf commit: 9cbd48d5fa14e4c65f8580de16686077f7cea02b Alan Maguire (1): libbpf: Btf dedup identical struct test needs check for nested structs/arrays Andrii Nakryiko (2): libbpf: clean up and refactor BTF fixup step libbpf: only add BPF_F_MMAPABLE flag for data maps with global vars Anshuman Khandual (4): perf: Add system error and not in transaction branch types perf: Extend branch type classification perf: Capture branch privilege information perf: Add PERF_BR_NEW_ARCH_[N] map for BRBE on arm64 platform Eduard Zingerman (4): libbpf: Resolve enum fwd as full enum64 and vice versa libbpf: Hashmap interface update to allow both long and void* keys/values libbpf: Resolve unambigous forward declarations libbpf: Hashmap.h update to fix build issues using LLVM14 Martin KaFai Lau (1): bpf: Add hwtstamp field for the sockops prog Namhyung Kim (1): perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY Ravi Bangoria (3): perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM\|IO} perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file perf/mem: Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL Sandipan Das (1): perf/core: Add speculation info to branch entries Xu Kuohai (1): libbpf: Avoid allocating reg_name with sscanf in parse_usdt_arg() Yonghong Song (2): bpf: Implement cgroup storage available to non-cgroup-attached bpf progs libbpf: Support new cgroup local storage include/uapi/linux/bpf.h \| 51 +++++- include/uapi/linux/perf_event.h \| 57 ++++++- src/btf.c \| 267 ++++++++++++++++++++++---------- src/btf_dump.c \| 15 +- src/hashmap.c \| 18 +-- src/hashmap.h \| 91 +++++++---- src/libbpf.c \| 196 ++++++++++++++--------- src/libbpf_probes.c \| 1 + src/strset.c 
\| 18 +-- src/usdt.c \| 44 +++--- 10 files changed, 511 insertions(+), 247 deletions(-) -- 2.30.2	2022-11-12 18:24:12 -08:00
Andrii Nakryiko	4978cf9cd8	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-11-12 18:24:12 -08:00
Martin KaFai Lau	00fc9f407c	bpf: Add hwtstamp field for the sockops prog The bpf-tc prog has already been able to access the skb_hwtstamps(skb)->hwtstamp. This patch extends the same hwtstamp access to the sockops prog. In sockops, the skb is also available to the bpf prog during the BPF_SOCK_OPS_PARSE_HDR_OPT_CB event. There is a use case that the hwtstamp will be useful to the sockops prog to better measure the one-way-delay when the sender has put the tx timestamp in the tcp header option. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20221107230420.4192307-2-martin.lau@linux.dev	2022-11-12 18:24:12 -08:00
Eduard Zingerman	e1b34c589d	libbpf: Hashmap.h update to fix build issues using LLVM14 A fix for the LLVM compilation error while building bpftool. Replaces the expression: _Static_assert((p) == NULL \|\| ...) by expression: _Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) \|\| ...) When "p" is not a constant the former is not considered to be a constant expression by LLVM 14. The error was introduced in the following patch-set: [1]. The error was reported here: [2]. [1] https://lore.kernel.org/bpf/20221109142611.879983-1-eddyz87@gmail.com/ [2] https://lore.kernel.org/all/202211110355.BcGcbZxP-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Fixes: c302378bc157 ("libbpf: Hashmap interface update to allow both long and void* keys/values") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221110223240.1350810-1-eddyz87@gmail.com	2022-11-12 18:24:12 -08:00
Eduard Zingerman	7583310911	libbpf: Resolve unambigous forward declarations Resolve forward declarations that don't take part in type graphs comparisons if declaration name is unambiguous. Example: CU #1: struct foo; // standalone forward declaration struct foo some_global; CU #2: struct foo { int x; }; struct foo another_global; The `struct foo` from CU #1 is not a part of any definition that is compared against another definition while `btf_dedup_struct_types` processes structural types. After `btf_dedup_struct_types`, the BTF looks as follows: [1] STRUCT 'foo' size=4 vlen=1 ... [2] INT 'int' size=4 ... [3] PTR '(anon)' type_id=1 [4] FWD 'foo' fwd_kind=struct [5] PTR '(anon)' type_id=4 This commit adds a new pass `btf_dedup_resolve_fwds`, that maps such forward declarations to structs or unions with identical name if the name is not ambiguous. The pass is positioned before `btf_dedup_ref_types` so that types [3] and [5] could be merged as the same type after [1] and [4] are merged. The final result for the example above looks as follows: [1] STRUCT 'foo' size=4 vlen=1 'x' type_id=2 bits_offset=0 [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [3] PTR '(anon)' type_id=1 For defconfig kernel with BTF enabled this removes 63 forward declarations. Examples of removed declarations: `pt_regs`, `in6_addr`. The running time of `btf__dedup` function is increased by about 3%. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221109142611.879983-3-eddyz87@gmail.com	2022-11-12 18:24:12 -08:00
Eduard Zingerman	4a65c5d888	libbpf: Hashmap interface update to allow both long and void* keys/values An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and values. This simplifies many use cases in libbpf as hashmaps there are mostly integer to integer. Perf copies hashmap implementation from libbpf and has to be updated as well. Changes to libbpf, selftests/bpf and perf are packed as a single commit to avoid compilation issues with any future bisect. Polymorphic interface is achieved by hiding hashmap interface functions behind auxiliary macros that take care of necessary type casts, for example: #define hashmap_cast_ptr(p) \ ({ \ _Static_assert((p) == NULL \|\| sizeof(*(p)) == sizeof(long),\ #p " pointee should be a long-sized integer or a pointer"); \ (long *)(p); \ }) bool hashmap_find(const struct hashmap *map, long key, long *value); #define hashmap__find(map, key, value) \ hashmap_find((map), (long)(key), hashmap_cast_ptr(value)) - hashmap__find macro casts key and value parameters to long and long* respectively - hashmap_cast_ptr ensures that value pointer points to a memory of appropriate size. This hack was suggested by Andrii Nakryiko in [1]. This is a follow up for [2]. [1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/ [2] https://lore.kernel.org/bpf/af1facf9-7bc8-8a3d-0db4-7b3f333589a2@meta.com/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221109142611.879983-2-eddyz87@gmail.com	2022-11-12 18:24:12 -08:00
Eduard Zingerman	3a387f5a8f	libbpf: Resolve enum fwd as full enum64 and vice versa Changes de-duplication logic for enums in the following way: - update btf_hash_enum to ignore size and kind fields to get ENUM and ENUM64 types in a same hash bucket; - update btf_compat_enum to consider enum fwd to be compatible with full enum64 (and vice versa); This allows BTF de-duplication in the following case: // CU #1 enum foo; struct s { enum foo a; } x; // CU #2 enum foo { x = 0xfffffffff // big enough to force enum64 }; struct s { enum foo a; } y; De-duplicated BTF prior to this commit: [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1 'x' val=68719476735ULL [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) [3] STRUCT 's' size=8 vlen=1 'a' type_id=4 bits_offset=0 [4] PTR '(anon)' type_id=1 [5] PTR '(anon)' type_id=3 [6] STRUCT 's' size=8 vlen=1 'a' type_id=8 bits_offset=0 [7] ENUM 'foo' encoding=UNSIGNED size=4 vlen=0 [8] PTR '(anon)' type_id=7 [9] PTR '(anon)' type_id=6 De-duplicated BTF after this commit: [1] ENUM64 'foo' encoding=UNSIGNED size=8 vlen=1 'x' val=68719476735ULL [2] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none) [3] STRUCT 's' size=8 vlen=1 'a' type_id=4 bits_offset=0 [4] PTR '(anon)' type_id=1 [5] PTR '(anon)' type_id=3 Enum forward declarations in C do not provide information about enumeration values range. Thus the `btf_type->size` field is meaningless for forward enum declarations. In fact, GCC does not encode size in DWARF for forward enum declarations (but dwarves sets enumeration size to a default value of `sizeof(int) * 8` when size is not specified see dwarf_loader.c:die__create_new_enumeration). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221101235413.1824260-1-eddyz87@gmail.com	2022-11-12 18:24:12 -08:00
Ravi Bangoria	a2eba90326	perf/mem: Rename PERF_MEM_LVLNUM_EXTN_MEM to PERF_MEM_LVLNUM_CXL PERF_MEM_LVLNUM_EXTN_MEM was introduced to cover CXL devices but its name is a bit ambiguous and also not generic enough to cover cxl.cache and cxl.io devices. Rename it to PERF_MEM_LVLNUM_CXL to be more specific. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/f6268268-b4e9-9ed6-0453-65792644d953@amd.com	2022-11-12 18:24:12 -08:00
Yonghong Song	7106ebe768	libbpf: Support new cgroup local storage Add support for new cgroup local storage. Acked-by: David Vernet <void@manifault.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042856.673989-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-12 18:24:12 -08:00
Yonghong Song	3c6d127e50	bpf: Implement cgroup storage available to non-cgroup-attached bpf progs Similar to sk/inode/task storage, implement similar cgroup local storage. There already exists a local storage implementation for cgroup-attached bpf programs. See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper bpf_get_local_storage(). But there are use cases such that non-cgroup attached bpf progs wants to access cgroup local storage data. For example, tc egress prog has access to sk and cgroup. It is possible to use sk local storage to emulate cgroup local storage by storing data in socket. But this is a waste as it could be lots of sockets belonging to a particular cgroup. Alternatively, a separate map can be created with cgroup id as the key. But this will introduce additional overhead to manipulate the new map. A cgroup local storage, similar to existing sk/inode/task storage, should help for this use case. The life-cycle of storage is managed with the life-cycle of the cgroup struct. i.e. the storage is destroyed along with the owning cgroup with a call to bpf_cgrp_storage_free() when cgroup itself is deleted. The userspace map operations can be done by using a cgroup fd as a key passed to the lookup, update and delete operations. Typically, the following code is used to get the current cgroup: struct task_struct task = bpf_get_current_task_btf(); ... task->cgroups->dfl_cgrp ... and in structure task_struct definition: struct task_struct { .... struct css_set __rcu cgroups; .... } With sleepable program, accessing task->cgroups is not protected by rcu_read_lock. So the current implementation only supports non-sleepable program and supporting sleepable program will be the next step together with adding rcu_read_lock protection for rcu tagged structures. Since map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for old cgroup local storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE is used for cgroup storage available to non-cgroup-attached bpf programs. 
The old cgroup storage supports bpf_get_local_storage() helper to get the cgroup data. The new cgroup storage helper bpf_cgrp_storage_get() can provide similar functionality. While old cgroup storage pre-allocates storage memory, the new mechanism can also pre-allocate with a user space bpf_map_update_elem() call to avoid potential run-time memory allocation failure. Therefore, the new cgroup storage can provide all functionality w.r.t. the old one. So in uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is alias to BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate the old cgroup storage can be deprecated since the new one can provide the same functionality. Acked-by: David Vernet <void@manifault.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-12 18:24:12 -08:00
Alan Maguire	6ebbbacb5c	libbpf: Btf dedup identical struct test needs check for nested structs/arrays When examining module BTF, it is common to see core kernel structures such as sk_buff, net_device duplicated in the module. After adding debug messaging to BTF it turned out that much of the problem was down to the identical struct test failing during deduplication; sometimes the compiler adds identical structs. However it turns out sometimes that type ids of identical struct members can also differ, even when the containing structs are still identical. To take an example, for struct sk_buff, debug messaging revealed that the identical struct matching was failing for the anon struct "headers"; specifically for the first field: __u8 __pkt_type_offset[0]; /* 128 0 */ Looking at the code in BTF deduplication, we have code that guards against the possibility of identical struct definitions, down to type ids, and identical array definitions. However in this case we have a struct which is being defined twice but does not have identical type ids since each duplicate struct has separate type ids for the above array member. A similar problem (though not observed) could occur for struct-in-struct. The solution is to make the "identical struct" test check members not just for matching ids, but to also check if they in turn are identical structs or arrays. The results of doing this are quite dramatic (for some modules at least); I see the number of type ids drop from around 10000 to just over 1000 in one module for example. For testing use latest pahole or apply [1], otherwise dedups can fail for the reasons described there. Also fix return type of btf_dedup_identical_arrays() as suggested by Andrii to match boolean return type used elsewhere. 
Fixes: efdd3eb8015e ("libbpf: Accommodate DWARF/compiler bug with duplicated structs") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1666622309-22289-1-git-send-email-alan.maguire@oracle.com [1] https://lore.kernel.org/bpf/1666364523-9648-1-git-send-email-alan.maguire	2022-11-12 18:24:12 -08:00
Xu Kuohai	1bb7a8349a	libbpf: Avoid allocating reg_name with sscanf in parse_usdt_arg() The reg_name in parse_usdt_arg() is used to hold register name, which is short enough to be held in a 16-byte array, so we could define reg_name as char reg_name[16] to avoid dynamically allocating reg_name with sscanf. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20221018145538.2046842-1-xukuohai@huaweicloud.com	2022-11-12 18:24:12 -08:00
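The fixed-size buffer approach can be illustrated with a small standalone parser. This is a sketch under assumptions: `parse_reg_off()` and the `[reg, off]` operand shape are hypothetical and much simpler than the real parse_usdt_arg(); only the idea of a width-limited `sscanf` conversion into a 16-byte array comes from the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not libbpf's parse_usdt_arg()): parse an
 * operand of the form "[reg, off]" into a caller-provided 16-byte
 * buffer, so nothing is heap-allocated and nothing can leak. */
static int parse_reg_off(const char *spec, char reg_name[16], long *off)
{
	assert(spec != NULL && reg_name != NULL && off != NULL);

	/* %15[a-z0-9] stores at most 15 characters plus the NUL
	 * terminator, which fits exactly into char[16]. */
	if (sscanf(spec, " [ %15[a-z0-9] , %ld ]", reg_name, off) != 2)
		return -1;
	return 0;
}
```

The width in the conversion has to stay one less than the array size. A bare `%[a-z0-9]` without a width would allow an overflow on a long register name.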
Andrii Nakryiko	3cd45b660c	libbpf: only add BPF_F_MMAPABLE flag for data maps with global vars Teach libbpf to not add BPF_F_MMAPABLE flag unnecessarily for ARRAY maps that are backing data sections, if such data sections don't expose any variables to user-space. Exposed variables are those that have STB_GLOBAL or STB_WEAK ELF binding and correspond to BTF VAR's BTF_VAR_GLOBAL_ALLOCATED linkage. The overall idea is that if some data section doesn't have any variable that is exposed through BPF skeleton, then there is no reason to make such BPF array mmapable. Making BPF array mmapable is not a free no-op action, because BPF verifier doesn't allow users to put special objects (such as BPF spin locks, RB tree nodes, linked list nodes, kptrs, etc; anything that has a sensitive internal state that should not be modified arbitrarily from user space) into mmapable arrays, as there is no way to prevent user space from corrupting such sensitive state through direct memory access through memory-mapped region. By making sure that libbpf doesn't add BPF_F_MMAPABLE flag to BPF array maps corresponding to data sections that only have static variables (which are not supposed to be visible to user space according to libbpf and BPF skeleton rules), users now can have spinlocks, kptrs, etc in either default .bss/.data sections or custom .data.* sections (assuming there are no global variables in such sections). The only possible hiccup with this approach is the need to use global variables during BPF static linking, even if it's not intended to be shared with user space through BPF skeleton. To allow such scenarios, extend libbpf's STV_HIDDEN ELF visibility attribute handling to variables. Libbpf is already treating global hidden BPF subprograms as static subprograms and adjusts BTF accordingly to make BPF verifier verify such subprograms as static subprograms with preserving entire BPF verifier state between subprog calls. 
This patch teaches libbpf to treat global hidden variables as static ones and adjust BTF information accordingly as well. This allows to share variables between multiple object files during static linking, but still keep them internal to BPF program and not get them exposed through BPF skeleton. Note, that if the user has some advanced scenario where they absolutely need BPF_F_MMAPABLE flag on .data/.bss/.rodata BPF array map despite only having static variables, they still can achieve this by forcing it through explicit bpf_map__set_map_flags() API. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20221019002816.359650-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-12 18:24:12 -08:00
Andrii Nakryiko	0e195e4597	libbpf: clean up and refactor BTF fixup step Refactor libbpf's BTF fixup step during BPF object open phase. The only functional change is that we now ignore BTF_VAR_GLOBAL_EXTERN variables during fix up, not just BTF_VAR_STATIC ones, which shouldn't cause any change in behavior as there shouldn't be any extern variable in data sections for valid BPF object anyways. Otherwise it's just collapsing two functions that have no reason to be separate, and switching find_elf_var_offset() helper to return entire symbol pointer, not just its offset. This will be used by next patch to get ELF symbol visibility. While refactoring, also "normalize" debug messages inside btf_fixup_datasec() to follow general libbpf style and print out data section name consistently, where it's available. Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20221019002816.359650-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-11-12 18:24:12 -08:00
Ravi Bangoria	08830e9d2f	perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file PERF_MEM_SNOOPX_PEER is defined only in tools uapi header. Although it's used only by perf tool, not defining it in kernel header can create problems in future. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-8-ravi.bangoria@amd.com	2022-11-12 18:24:12 -08:00
Ravi Bangoria	1022f26d04	perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM\|IO} PERF_MEM_LVLNUM_EXTN_MEM can be used to indicate accesses to extension memory such as CXL. PERF_MEM_LVL_IO can be used for IO accesses, but it cannot distinguish between local and remote IO. Introduce a new field PERF_MEM_LVLNUM_IO which can be combined with PERF_MEM_REMOTE_REMOTE to indicate remote IO accesses. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220928095805.596-2-ravi.bangoria@amd.com	2022-11-12 18:24:12 -08:00
Namhyung Kim	b4ca1f6407	perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY There's no in-tree user anymore. Let's get rid of it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220908214104.3851807-3-namhyung@kernel.org	2022-11-12 18:24:12 -08:00
Anshuman Khandual	fd71ca941b	perf: Add PERF_BR_NEW_ARCH_[N] map for BRBE on arm64 platform BRBE captured branch types will overflow perf_branch_entry.type and generic branch types in perf_branch_entry.new_type. So override each available arch specific branch type in the following manner to comprehensively process all reported branch types in BRBE. PERF_BR_ARM64_FIQ PERF_BR_NEW_ARCH_1 PERF_BR_ARM64_DEBUG_HALT PERF_BR_NEW_ARCH_2 PERF_BR_ARM64_DEBUG_EXIT PERF_BR_NEW_ARCH_3 PERF_BR_ARM64_DEBUG_INST PERF_BR_NEW_ARCH_4 PERF_BR_ARM64_DEBUG_DATA PERF_BR_NEW_ARCH_5 Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-5-anshuman.khandual@arm.com	2022-11-12 18:24:12 -08:00
Anshuman Khandual	a14b39bd31	perf: Capture branch privilege information Platforms like arm64 could capture privilege level information for all the branch records. Hence this adds a new element in the struct branch_entry to record the privilege level information, which could be requested through a new event.attr.branch_sample_type based flag PERF_SAMPLE_BRANCH_PRIV_SAVE. This flag helps user choose whether privilege information is captured. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-4-anshuman.khandual@arm.com	2022-11-12 18:24:12 -08:00
Anshuman Khandual	ade228b8f0	perf: Extend branch type classification branch_entry.type has now run out of space to accommodate more branch type classifications. This will prevent the perf branch stack implementation on arm64 (via BRBE) from capturing all available branch types. Extending this bit field, i.e. branch_entry.type [4 bits], is not an option as it will break user space ABI for both little and big endian perf tools. Extend branch classification with a new field branch_entry.new_type via a new branch type PERF_BR_EXTEND_ABI in branch_entry.type. Perf tools which can decode PERF_BR_EXTEND_ABI will then parse branch_entry.new_type as well. branch_entry.new_type is a 4 bit field which can hold up to 16 branch types. The first three branch types will hold various generic page faults followed by five architecture specific branch types, which can be overridden by the platform for specific use cases. These architecture specific branch types get overridden on arm64 platform for BRBE implementation. New generic branch types - PERF_BR_NEW_FAULT_ALGN - PERF_BR_NEW_FAULT_DATA - PERF_BR_NEW_FAULT_INST New arch specific branch types - PERF_BR_NEW_ARCH_1 - PERF_BR_NEW_ARCH_2 - PERF_BR_NEW_ARCH_3 - PERF_BR_NEW_ARCH_4 - PERF_BR_NEW_ARCH_5 Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-3-anshuman.khandual@arm.com	2022-11-12 18:24:12 -08:00
Anshuman Khandual	41ab246bdf	perf: Add system error and not in transaction branch types This expands generic branch type classification by adding two more entries there in i.e system error and not in transaction. This also updates the x86 implementation to process X86_BR_NO_TX records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. -------------------------------------------------------------------------- \| kernel \| perf tool \| Impact \| -------------------------------------------------------------------------- \| old \| old \| Works as before \| -------------------------------------------------------------------------- \| old \| new \| PERF_BR_UNKNOWN is processed \| -------------------------------------------------------------------------- \| new \| old \| PERF_BR_NO_TX is blocked via old PERF_BR_MAX \| -------------------------------------------------------------------------- \| new \| new \| PERF_BR_NO_TX is recognized \| -------------------------------------------------------------------------- When PERF_BR_NO_TX is blocked via old PERF_BR_MAX (new kernel with old perf tool) the user space might throw up an warning complaining about an unrecognized branch types being reported, but it's expected. PERF_BR_SERROR & PERF_BR_NO_TX branch types will be used for BRBE implementation on arm64 platform. PERF_BR_NO_TX complements 'abort' and 'in_tx' elements in perf_branch_entry which represent other transaction states for a given branch record. Because this completes the transaction state classification. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-2-anshuman.khandual@arm.com	2022-11-12 18:24:12 -08:00
Sandipan Das	d918025bc8	perf/core: Add speculation info to branch entries Add a new "spec" bitfield to branch entries for providing speculation information. This will be populated using hints provided by branch sampling features on supported hardware. The following cases are covered: * No branch speculation information is available * Branch is speculative but taken on the wrong path * Branch is non-speculative but taken on the correct path * Branch is speculative and taken on the correct path Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/834088c302faf21c7b665031dd111f424e509a64.1660211399.git.sandipan.das@amd.com	2022-11-12 18:24:12 -08:00
Daniel Müller	918d7712c0	ci: Make sure to keep ci/diffs/ directory around Commit `837664758d` ("ci: Allow usage of .patch patches") removed the ci/diffs/.do_not_use_dot_patch_here marker file. Given that we currently have no CI patches present and that git does not track (empty) directories, ci/diffs/ got removed. That's fine functionality-wise, but it makes for a bit of a discoverability hurdle. Add back a marker file to keep the directory around. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-11-08 08:33:47 -08:00
Daniel Müller	4a84a7619f	ci: Provide KBUILD_OUTPUT to actions asking for it As of https://github.com/libbpf/ci/pull/67 a bunch of actions honor KBUILD_OUTPUT. Doing so will make it possible to separate source code from build artifacts, which in turn may allow us to support incremental kernel compilation in CI down the line. Irrespective of these future changes, actions pertaining to the kernel build now ask for an additional input defining where to store or expect build artifacts. Provide it. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-11-07 11:02:01 -08:00
Daniel Müller	837664758d	ci: Allow usage of .patch patches With https://github.com/libbpf/ci/pull/68 merged we can now keep the .patch extension for patches and don't have to worry about forgetting the rename to .diff. Remove the marker file reminding us of that need. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-11-07 11:00:56 -08:00
Daniel Müller	11bf829873	ci: Remove no longer needed patches Patch "selftests/bpf: Fix OOB write in test_verifier" has made it to the bpf branch (after originally landing on bpf-next). Remove it from CI, as it is no longer necessary. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-11-07 11:00:56 -08:00
Matteo Croce	c97b16d96c	ci: enable shellcheck linter Run the shellcheck linter in a GitHub action, as in https://github.com/libbpf/ci/pull/61 Signed-off-by: Matteo Croce <teknoraver@meta.com>	2022-10-27 16:46:38 -07:00
Matteo Croce	1c17672353	shellcheck: fix errors Signed-off-by: Matteo Croce <teknoraver@meta.com>	2022-10-27 16:46:38 -07:00
Tobias Waldekranz	68e6f83f22	Makefile: Fix cross-compilation for 32-bit targets Determining the correct library installation path (lib vs. lib64) using uname(1) breaks in cross compilation scenarios where word widths differ between the host and target system. Instead, source the information from the compilers '-dumpmachine' option (supported by both GCC and Clang). We call this the "host" architecture, using the same nomenclature as Autotools (--host configure option). Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>	2022-10-18 17:33:04 -07:00
grantseltzer	383ffb79a6	Add documentation badge to README This adds a documentation badge that links to libbpf.readthedocs.org. When rendered on GitHub, it displays the status of the docs build. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>	2022-10-17 13:55:59 -07:00
David Vernet	50315fd763	README: Fix Arch packaging link libbpf is now packaged as part of the core repository, not the extra repository. Fix the current link which gets a 404. Signed-off-by: David Vernet <void@manifault.com>	2022-10-17 13:17:40 -07:00
Andrii Nakryiko	534a2c6f53	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 87dbdc230d162bf9ee1ac77c8ade178b6b1e199e Checkpoint bpf-next commit: 62c69e89e81bfbdb9a87ae3e0599dcc6aacf786b Baseline bpf commit: 60240bc26114543fcbfcd8a28466e67e77b20388 Checkpoint bpf commit: e7b09357453a99e6f9e74c39e9ca1363c22c0b96 Andrii Nakryiko (1): bpf: explicitly define BPF_FUNC_xxx integer values Eduard Zingerman (1): bpftool: Print newline before '}' for struct with padding only fields Kui-Feng Lee (2): bpf: Parameterize task iterators. bpf: Handle bpf_link_info for the parameterized task BPF iterators. Roberto Sassu (5): libbpf: Fix LIBBPF_1.0.0 declaration in libbpf.map libbpf: Introduce bpf_get_fd_by_id_opts and bpf_map_get_fd_by_id_opts() libbpf: Introduce bpf_prog_get_fd_by_id_opts() libbpf: Introduce bpf_btf_get_fd_by_id_opts() libbpf: Introduce bpf_link_get_fd_by_id_opts() Shung-Hsi Yu (3): libbpf: Use elf_getshdrnum() instead of e_shnum libbpf: Deal with section with no data gracefully libbpf: Fix null-pointer dereference in find_prog_by_sec_insn() Xin Liu (1): libbpf: Fix overrun in netlink attribute iteration Xu Kuohai (2): libbpf: Fix use-after-free in btf_dump_name_dups libbpf: Fix memory leak in parse_usdt_arg() include/uapi/linux/bpf.h \| 442 ++++++++++++++++++++------------------- src/bpf.c \| 48 ++++- src/bpf.h \| 16 ++ src/btf_dump.c \| 35 +++- src/libbpf.c \| 22 +- src/libbpf.map \| 6 +- src/nlattr.c \| 2 +- src/usdt.c \| 11 +- 8 files changed, 347 insertions(+), 235 deletions(-) -- 2.30.2	2022-10-17 13:13:02 -07:00
Shung-Hsi Yu	3a3ef0c1d0	libbpf: Fix null-pointer dereference in find_prog_by_sec_insn() When there are no program sections, obj->programs is left unallocated, and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL, and will cause null-pointer dereference in the following access to prog->sec_idx. Guard the search with obj->nr_programs similar to what's being done in __bpf_program__iter() to prevent null-pointer access from happening. Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections") Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221012022353.7350-4-shung-hsi.yu@suse.com	2022-10-17 13:13:02 -07:00
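The guard described above can be sketched with simplified types. `struct toy_prog`, `struct toy_obj` and `toy_find_prog()` are hypothetical names. The search shape (binary search over programs ordered by section and instruction offset, followed by a final check at index `l`) is an assumption modeled on the description, not a copy of libbpf's function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified program and object records. */
struct toy_prog {
	int sec_idx;       /* ELF section index the program lives in */
	int sec_insn_off;  /* first instruction, relative to the section */
	int insn_cnt;      /* number of instructions */
};

struct toy_obj {
	struct toy_prog *programs; /* NULL when nr_programs == 0 */
	int nr_programs;
};

static struct toy_prog *toy_find_prog(const struct toy_obj *obj,
				      int sec_idx, int insn_idx)
{
	int l = 0, r, m;
	struct toy_prog *prog;

	assert(obj != NULL);

	/* The guard: without it, the code after the loop would
	 * dereference &obj->programs[0], which is NULL here. */
	if (obj->nr_programs == 0)
		return NULL;

	r = obj->nr_programs - 1;
	while (l < r) {
		m = l + (r - l + 1) / 2;
		prog = &obj->programs[m];
		if (prog->sec_idx < sec_idx ||
		    (prog->sec_idx == sec_idx && prog->sec_insn_off <= insn_idx))
			l = m;
		else
			r = m - 1;
	}

	/* Index l is only a candidate, so verify it before returning. */
	prog = &obj->programs[l];
	if (prog->sec_idx == sec_idx &&
	    insn_idx >= prog->sec_insn_off &&
	    insn_idx < prog->sec_insn_off + prog->insn_cnt)
		return prog;
	return NULL;
}
```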
Shung-Hsi Yu	3ee4823fcb	libbpf: Deal with section with no data gracefully ELF section data pointer returned by libelf may be NULL (if section has SHT_NOBITS), so null check section data pointer before attempting to copy license and kversion section. Fixes: cb1e5e961991 ("bpf tools: Collect version and license from ELF sections") Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221012022353.7350-3-shung-hsi.yu@suse.com	2022-10-17 13:13:02 -07:00
Shung-Hsi Yu	7412775110	libbpf: Use elf_getshdrnum() instead of e_shnum This commit replaces e_shnum with the elf_getshdrnum() helper to fix two oss-fuzz-reported heap-buffer overflows in __bpf_object__open. Both reports are incorrectly marked as fixed while still being reproducible in the latest libbpf. # clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704 libbpf: loading object 'fuzz-object' from buffer libbpf: sec_cnt is 0 libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2 libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1 ================================================================= ==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8 WRITE of size 4 at 0x6020000000c0 thread T0 SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds) #0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24 #1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16 #2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20 ... The issue lies in libbpf's direct use of the e_shnum field in the ELF header as the section header count, whereas libelf implements extra logic that, when e_shnum == 0 && e_shoff != 0, uses the sh_size member of the initial section header as the real section header count (part of the ELF spec, to accommodate situations where the section header count is larger than SHN_LORESERVE). This inconsistency leads to libbpf writing into a zero-entry calloc area. So instead of using e_shnum directly, use the elf_getshdrnum() helper provided by libelf to retrieve the section header count into sec_cnt. 
Fixes: 0d6988e16a12 ("libbpf: Fix section counting logic") Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps") Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40868 Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40957 Link: https://lore.kernel.org/bpf/20221012022353.7350-2-shung-hsi.yu@suse.com	2022-10-17 13:13:02 -07:00
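The section-count rule that this fix relies on can be written out as a few lines of C. This is a sketch of the rule as stated in the commit message, not libelf's implementation: `struct toy_ehdr`, `struct toy_shdr` and `toy_section_count()` are hypothetical and carry only the fields the rule needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cut-down ELF header and section header. */
struct toy_ehdr {
	uint64_t e_shoff;  /* file offset of the section header table */
	uint16_t e_shnum;  /* section header count, or 0 if it does not fit */
};

struct toy_shdr {
	uint64_t sh_size;
};

/* Return the real section header count. When e_shnum is 0 but a
 * section header table exists, the count is stored in sh_size of
 * the initial (index 0) section header. */
static uint64_t toy_section_count(const struct toy_ehdr *eh,
				  const struct toy_shdr *first)
{
	assert(eh != NULL);

	if (eh->e_shnum == 0 && eh->e_shoff != 0 && first != NULL)
		return first->sh_size;
	return eh->e_shnum;
}
```

Code that sizes an array from `e_shnum` alone while another layer iterates over the count derived this way ends up with the mismatch the commit describes.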
Xu Kuohai	881a10980b	libbpf: Fix memory leak in parse_usdt_arg() In the arm64 version of parse_usdt_arg(), when sscanf returns 2, reg_name is allocated but not freed. Fix it. Fixes: 0f8619929c57 ("libbpf: Usdt aarch64 arg parsing support") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20221011120108.782373-3-xukuohai@huaweicloud.com	2022-10-17 13:13:02 -07:00
Xu Kuohai	54caf920db	libbpf: Fix use-after-free in btf_dump_name_dups ASAN reports an use-after-free in btf_dump_name_dups: ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928 READ of size 2 at 0xffff927006db thread T0 #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614) #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127 #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143 #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212 #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525 #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552 #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567 #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912 #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798 #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282 #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236 #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #15 0xaaaab5d65990 (test_progs+0x185990) 0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0) freed by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032 #7 
0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232 #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #12 0xaaaab5d65990 (test_progs+0x185990) previously allocated by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070 #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102 #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162 #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #13 0xaaaab5d65990 (test_progs+0x185990) The reason is that the key stored in hash table name_map is a string address, and the string memory is allocated by realloc() function, when the memory is resized by realloc() later, the old memory may be freed, so the address stored in name_map references to a freed memory, causing use-after-free. Fix it by storing duplicated string address in name_map. 
Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com	2022-10-17 13:13:02 -07:00
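The fix pattern (keying the table on a private copy of the string rather than on a pointer into reallocatable storage) can be sketched as follows. `struct toy_name_map`, `toy_name_dups()` and `toy_name_map_free()` are hypothetical. A fixed-size linear table stands in for libbpf's hashmap purely to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_NAMES 32

/* Hypothetical name-to-count table that owns its keys. */
struct toy_name_map {
	char *keys[TOY_MAX_NAMES];
	int counts[TOY_MAX_NAMES];
	size_t n;
};

/* Return how many times name has been seen, including this call,
 * or 0 on failure. The key is duplicated on first insertion, so the
 * caller's buffer may later be freed or moved by realloc(). */
static int toy_name_dups(struct toy_name_map *m, const char *name)
{
	size_t i, len;
	char *copy;

	assert(m != NULL && name != NULL);

	for (i = 0; i < m->n; i++)
		if (strcmp(m->keys[i], name) == 0)
			return ++m->counts[i];

	if (m->n == TOY_MAX_NAMES)
		return 0;

	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return 0;
	memcpy(copy, name, len);

	m->keys[m->n] = copy;
	m->counts[m->n] = 1;
	m->n++;
	return 1;
}

/* Owning the keys means the table must also release them. */
static void toy_name_map_free(struct toy_name_map *m)
{
	size_t i;

	for (i = 0; i < m->n; i++)
		free(m->keys[i]);
	m->n = 0;
}
```

The cost of this design is one extra allocation per distinct name plus an explicit cleanup step. In exchange, lookups no longer depend on the lifetime of the string storage.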
Roberto Sassu	0d6c47523c	libbpf: Introduce bpf_link_get_fd_by_id_opts() Introduce bpf_link_get_fd_by_id_opts(), for symmetry with bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced data structure bpf_get_fd_by_id_opts. Keep the existing bpf_link_get_fd_by_id(), and call bpf_link_get_fd_by_id_opts() with NULL as opts argument, to prevent setting open_flags. Currently, the kernel does not support non-zero open_flags for bpf_link_get_fd_by_id_opts(), and a call with them will result in an error returned by the bpf() system call. The caller should always pass zero open_flags. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221006110736.84253-6-roberto.sassu@huaweicloud.com	2022-10-17 13:13:02 -07:00
Roberto Sassu	998282f179	libbpf: Introduce bpf_btf_get_fd_by_id_opts() Introduce bpf_btf_get_fd_by_id_opts(), for symmetry with bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced data structure bpf_get_fd_by_id_opts. Keep the existing bpf_btf_get_fd_by_id(), and call bpf_btf_get_fd_by_id_opts() with NULL as opts argument, to prevent setting open_flags. Currently, the kernel does not support non-zero open_flags for bpf_btf_get_fd_by_id_opts(), and a call with them will result in an error returned by the bpf() system call. The caller should always pass zero open_flags. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221006110736.84253-5-roberto.sassu@huaweicloud.com	2022-10-17 13:13:02 -07:00
Roberto Sassu	d6d1ec5b25	libbpf: Introduce bpf_prog_get_fd_by_id_opts() Introduce bpf_prog_get_fd_by_id_opts(), for symmetry with bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced data structure bpf_get_fd_by_id_opts. Keep the existing bpf_prog_get_fd_by_id(), and call bpf_prog_get_fd_by_id_opts() with NULL as opts argument, to prevent setting open_flags. Currently, the kernel does not support non-zero open_flags for bpf_prog_get_fd_by_id_opts(), and a call with them will result in an error returned by the bpf() system call. The caller should always pass zero open_flags. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221006110736.84253-4-roberto.sassu@huaweicloud.com	2022-10-17 13:13:02 -07:00
Roberto Sassu	a719cae6aa	libbpf: Introduce bpf_get_fd_by_id_opts and bpf_map_get_fd_by_id_opts() Define a new data structure called bpf_get_fd_by_id_opts, with the member open_flags, to be used by callers of the _opts variants of bpf_*_get_fd_by_id() to specify the permissions needed for the file descriptor to be obtained. Also, introduce bpf_map_get_fd_by_id_opts(), to let the caller pass a bpf_get_fd_by_id_opts structure. Finally, keep the existing bpf_map_get_fd_by_id(), and call bpf_map_get_fd_by_id_opts() with NULL as opts argument, to request read-write permissions (current behavior). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221006110736.84253-3-roberto.sassu@huaweicloud.com	2022-10-17 13:13:02 -07:00
Roberto Sassu	07024c87de	libbpf: Fix LIBBPF_1.0.0 declaration in libbpf.map Add the missing LIBBPF_0.8.0 at the end of the LIBBPF_1.0.0 declaration, similarly to other version declarations. Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221006110736.84253-2-roberto.sassu@huaweicloud.com	2022-10-17 13:13:02 -07:00
Andrii Nakryiko	19ef40cee6	bpf: explicitly define BPF_FUNC_xxx integer values Historically enum bpf_func_id's BPF_FUNC_xxx enumerators relied on implicit sequential values being assigned by compiler. This is convenient, as new BPF helpers are always added at the very end, but it also has its downsides, some of them being: - with over 200 helpers now it's very hard to know what's each helper's ID, which is often important to know when working with BPF assembly (e.g., by dumping raw bpf assembly instructions with llvm-objdump -d command). It's possible to work around this by looking into vmlinux.h, dumping /sys/kernel/btf/vmlinux, looking at libbpf-provided bpf_helper_defs.h, etc. But it always feels like an unnecessary step and one should be able to quickly figure this out from UAPI header. - when backporting and cherry-picking only some BPF helpers onto older kernels it's important to be able to skip some enum values for helpers that weren't backported, but preserve absolute integer IDs to keep BPF helper IDs stable so that BPF programs stay portable across upstream and backported kernels. While neither problem is insurmountable, they come up frequently enough and are annoying enough to warrant improving the situation. And for the backporting the problem can easily go unnoticed for a while, especially if backport is done with people not very familiar with BPF subsystem overall. Anyways, it's easy to fix this by making sure that __BPF_FUNC_MAPPER macro provides explicit helper IDs. Unfortunately that would potentially break existing users that use UAPI-exposed __BPF_FUNC_MAPPER and are expected to pass macro that accepts only symbolic helper identifier (e.g., map_lookup_elem for bpf_map_lookup_elem() helper). As such, we need to introduce a new macro (___BPF_FUNC_MAPPER) which would specify both identifier and integer ID, but in such a way as to allow existing __BPF_FUNC_MAPPER be expressed in terms of new ___BPF_FUNC_MAPPER macro. 
And that's what this patch is doing. To avoid duplication and allow __BPF_FUNC_MAPPER stay exactly the same, ___BPF_FUNC_MAPPER accepts arbitrary "context" arguments, which can be used to pass any extra macros, arguments, and whatnot. In our case we use this to pass original user-provided macro that expects single argument and __BPF_FUNC_MAPPER is using its own three-argument __BPF_FUNC_MAPPER_APPLY intermediate macro to impedance-match new and old "callback" macros. Once we resolve this, we use new ___BPF_FUNC_MAPPER to define enum bpf_func_id with explicit values. The other users of __BPF_FUNC_MAPPER in kernel (namely in kernel/bpf/disasm.c) are kept exactly the same both as demonstration that backwards compat works, but also to avoid unnecessary code churn. Note that new ___BPF_FUNC_MAPPER() doesn't forcefully insert comma between values, as that might not be appropriate in all possible cases where ___BPF_FUNC_MAPPER might be used by users. This doesn't reduce usability, as it's trivial to insert that comma inside "callback" macro. To validate all the manually specified IDs are exactly right, we used BTF to compare before and after values: $ bpftool btf dump file ~/linux-build/default/vmlinux \| rg bpf_func_id -A 211 > after.txt $ git stash # stash UAPI changes $ make -j90 ... re-building kernel without UAPI changes ... $ bpftool btf dump file ~/linux-build/default/vmlinux \| rg bpf_func_id -A 211 > before.txt $ diff -u before.txt after.txt --- before.txt 2022-10-05 10:48:18.119195916 -0700 +++ after.txt 2022-10-05 10:46:49.446615025 -0700 @@ -1,4 +1,4 @@ -[14576] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211 +[9560] ENUM 'bpf_func_id' encoding=UNSIGNED size=4 vlen=211 'BPF_FUNC_unspec' val=0 'BPF_FUNC_map_lookup_elem' val=1 'BPF_FUNC_map_update_elem' val=2 As can be seen from diff above, the only thing that changed was resulting BTF type ID of ENUM bpf_func_id, not any of the enumerators, their names or integer values. 
The only other place that needed fixing was scripts/bpf_doc.py used to generate man pages and bpf_helper_defs.h header for libbpf and selftests. That script is tightly-coupled to exact shape of ___BPF_FUNC_MAPPER macro definition, so had to be trivially adapted. Cc: Quentin Monnet <quentin@isovalent.com> Reported-by: Andrea Terzolo <andrea.terzolo@polito.it> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20221006042452.2089843-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-10-17 13:13:02 -07:00
Eduard Zingerman	3d3ff49213	bpftool: Print newline before '}' for struct with padding only fields btf_dump_emit_struct_def attempts to print empty structures at a single line, e.g. `struct empty {}`. However, it has to account for a case when there are no regular but some padding fields in the struct. In such case `vlen` would be zero, but size would be non-zero. E.g. here is struct bpf_timer from vmlinux.h before this patch: struct bpf_timer { long: 64; long: 64;}; And after this patch: struct bpf_dynptr { long: 64; long: 64; }; Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20221001104425.415768-1-eddyz87@gmail.com	2022-10-17 13:13:02 -07:00
Xin Liu	3745a20b28	libbpf: Fix overrun in netlink attribute iteration I accidentally found that a change in commit 1045b03e07d8 ("netlink: fix overrun in attribute iteration") was not synchronized to the function `nla_ok` in tools/lib/bpf/nlattr.c, I think it is necessary to modify, this patch will do it. Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220930090708.62394-1-liuxin350@huawei.com	2022-10-17 13:13:02 -07:00
Kui-Feng Lee	b9e909dd41	bpf: Handle bpf_link_info for the parameterized task BPF iterators. Add new fields to bpf_link_info that users can query it through bpf_obj_get_info_by_fd(). Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20220926184957.208194-3-kuifeng@fb.com	2022-10-17 13:13:02 -07:00
Kui-Feng Lee	73c0c44b67	bpf: Parameterize task iterators. Allow creating an iterator that loops through resources of one thread/process. People could only create iterators to loop through all resources of files, vma, and tasks in the system, even though they were interested in only the resources of a specific task or process. Passing the additional parameters, people can now create an iterator to go through all resources or only the resources of a task. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20220926184957.208194-2-kuifeng@fb.com	2022-10-17 13:13:02 -07:00
Daniel Müller	abde7fb314	Remove lru_bug from DENYLIST-latest.s390x The comment associated with the entry is a bit confusing. It stemmed from the test being denylisted on bpf, but not bpf-next in the past. Regardless, by now said change has propagated to both trees, so we no longer need to carry around this deny list entry here. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-10-12 09:29:08 -07:00
Manu Bretelle	63389d32f6	ci: remove mkrootfs from libbpf/libbpf This is being moved to libbpf/ci instead https://github.com/libbpf/ci/pull/44 Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-10-11 09:14:31 -07:00
Frantisek Sumsal	59080bd06c	ci: use CodeQL instead of LGTM As LGTM is going to be shut down by EOY[0], let's move the code scanning to CodeQL as recommended. Thanks to GH integration the results from such scans will be shown both in the respective PR and in the Security -> Code Scanning tab[1]. [0] https://github.blog/2022-08-15-the-next-step-for-lgtm-com-github-code-scanning/ [1] https://github.com/libbpf/libbpf/security/code-scanning	2022-10-10 16:31:14 -07:00
Daniel Müller	8b0b41f812	Remove travis-ci symlink With https://github.com/libbpf/ci/pull/41 merged we no longer require the travis-ci symlink in this repository. Remove it. Also, it turns out we still have a few locations referencing travis-ci/ instead of ci/. Convert those. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-10-07 11:54:39 -07:00
chantra	6bd5b40bcd	ci: install wget package on s390x runners `wget` is installed by default in GH runners. It is used in [`get-linux-source`](`79c799d6fb/get-linux-source/checkout_latest_kernel.sh (L32)`) to download source faster than through a git fetch. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-10-06 14:24:50 -07:00
chantra	6cd8907a4a	ci: update actions-runner to 2.298.2 on s390x Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-10-06 14:24:50 -07:00
chantra	fa2875be8a	ci: install zstd on s390x runners zstd is installed by [default in GH runners](https://github.com/actions/runner-images/blob/main/images/linux/Ubuntu2004-Readme.md). Having it by default, we can start leveraging it when uploading artifacts. It has a better compression ratio and is multithreaded. Signed-off-by: Manu Bretelle <chantr4@gmail.com>	2022-10-06 14:24:50 -07:00
chantra	27a93eae7c	[s390x][ci] Force replacing workers when a worker already exist with same name. This is essentially aligning with what is done in `0f2883e196/entrypoint.sh (L90-L91)` The issue at hand did manifest on s390x host when restarting a runner and GH having an existing runner with the same name. The logic was to default to not replace it and the runner would be started with some defaults, which means the name would change, and the labels would be lost, making the runner unusable (while still running): https://gist.github.com/chantra/ef0bd3e0c9e35bb82619636acf2f7c98 By replacing the existing runner, we will not get into that state.	2022-10-04 10:52:07 -07:00
Andrii Nakryiko	1714037104	vmtest: regenerate latest vmlinux.h Update checked in vmlinux.h for 5.5 kernel tests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	d598cb20c7	libbpf: bump version to 1.1.0 Bump LIBBPF_MINOR_VERSION to 1 for v1.1.0. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	ce321d6fd4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e34cfee65ec891a319ce79797dda18083af33a76 Checkpoint bpf-next commit: 87dbdc230d162bf9ee1ac77c8ade178b6b1e199e Baseline bpf commit: 14b20b784f59bdd95f6f1cfb112c9818bcec4d84 Checkpoint bpf commit: 60240bc26114543fcbfcd8a28466e67e77b20388 Andrii Nakryiko (3): libbpf: Fix crash if SEC("freplace") programs don't have attach_prog_fd set libbpf: restore memory layout of bpf_object_open_opts libbpf: Don't require full struct enum64 in UAPI headers Benjamin Tissoires (1): libbpf: add map_get_fd_by_id and map_delete_elem in light skeleton Daniel Borkmann (1): libbpf: Remove gcc support for bpf_tail_call_static for now David Vernet (3): bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type bpf: Add bpf_user_ringbuf_drain() helper bpf: Add libbpf logic for user-space ring buffer Hao Luo (2): bpf: Introduce cgroup iter bpf: Add CGROUP prefix to cgroup_iter_order James Hilliard (1): libbpf: Add GCC support for bpf_tail_call_static Jiri Olsa (1): bpf: Return value in kprobe get_func_ip only for entry address Jon Doron (1): libbpf: Fix the case of running as non-root with capabilities Pu Lehui (1): bpf, cgroup: Reject prog_attach_flags array when effective query Quentin Monnet (1): bpf: Fix a few typos in BPF helpers documentation Shmulik Ladkani (2): bpf, flow_dissector: Introduce BPF_FLOW_DISSECTOR_CONTINUE retcode for bpf progs bpf: Support getting tunnel flags Stanislav Fomichev (1): bpf: update bpf_{g,s}et_retval documentation Tao Chen (1): libbpf: Support raw BTF placed in the default search path Wang Yufen (1): libbpf: Add pathname_concat() helper Xin Liu (2): libbpf: Clean up legacy bpf maps declaration in bpf_helpers libbpf: Fix NULL pointer exception in API btf_dump__dump_type_data Yonghong Song (3): bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() libbpf: Add new BPF_PROG2 macro libbpf: Improve BPF_PROG2 macro code 
quality and description include/uapi/linux/bpf.h \| 139 +++++++++++++++++--- src/bpf_helpers.h \| 12 -- src/bpf_tracing.h \| 107 ++++++++++++++++ src/btf.c \| 32 ++--- src/btf.h \| 25 +++- src/btf_dump.c \| 2 +- src/libbpf.c \| 106 ++++++++------- src/libbpf.h \| 111 +++++++++++++++- src/libbpf.map \| 10 ++ src/libbpf_probes.c \| 1 + src/libbpf_version.h \| 2 +- src/ringbuf.c \| 271 +++++++++++++++++++++++++++++++++++++++ src/skel_internal.h \| 23 ++++ src/usdt.c \| 2 +- 14 files changed, 731 insertions(+), 112 deletions(-) -- 2.30.2	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	0f5b3a10ae	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-09-27 15:23:45 -07:00
Pu Lehui	5859c59e50	bpf, cgroup: Reject prog_attach_flags array when effective query Attach flags is only valid for attached progs of this layer cgroup, but not for effective progs. For querying with EFFECTIVE flags, exporting attach flags does not make sense. So when effective query, we reject prog_attach_flags array and don't need to populate it. Also we limit attach_flags to output 0 during effective query. Fixes: b79c9fc9551b ("bpf: implement BPF_PROG_QUERY for BPF_LSM_CGROUP") Signed-off-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/r/20220921104604.2340580-2-pulehui@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	85f8b7c4dc	libbpf: Don't require full struct enum64 in UAPI headers Drop the requirement for system-wide kernel UAPI headers to provide full struct btf_enum64 definition. This is an unexpected requirement that slipped in libbpf 1.0 and put unnecessary pressure ([0]) on users to have a bleeding-edge kernel UAPI header from unreleased Linux 6.0. To achieve this, we forward declare struct btf_enum64. But that's not enough as there is btf_enum64_value() helper that expects to know the layout of struct btf_enum64. So we get a bit creative with reinterpreting memory layout as array of __u32 and accessing lo32/hi32 fields as array elements. Alternative way would be to have a local pointer variable for anonymous struct with exactly the same layout as struct btf_enum64, but that gets us into C++ compiler errors complaining about invalid type casts. So play it safe, if ugly. [0] Closes: https://github.com/libbpf/libbpf/issues/562 Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump") Reported-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://lore.kernel.org/bpf/20220927042940.147185-1-andrii@kernel.org	2022-09-27 15:23:45 -07:00
Jon Doron	9da0dcb621	libbpf: Fix the case of running as non-root with capabilities When running rootless with special capabilities like: FOWNER / DAC_OVERRIDE / DAC_READ_SEARCH The "access" API will not make the proper check if there is really access to a file or not. From the access man page: " The check is done using the calling process's real UID and GID, rather than the effective IDs as is done when actually attempting an operation (e.g., open(2)) on the file. Similarly, for the root user, the check uses the set of permitted capabilities rather than the set of effective capabilities; *and for non-root users, the check uses an empty set of capabilities.* " What that means is that for non-root user the access API will not do the proper validation if the process really has permission to a file or not. To resolve this, this patch replaces all the access API calls with faccessat with AT_EACCESS flag. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220925070431.1313680-1-arilou@gmail.com	2022-09-27 15:23:45 -07:00
Jiri Olsa	82c4054376	bpf: Return value in kprobe get_func_ip only for entry address Changing return value of kprobe's version of bpf_get_func_ip to return zero if the attach address is not on the function's entry point. For kprobes attached in the middle of the function we can't easily get to the function address especially now with the CONFIG_X86_KERNEL_IBT support. If user cares about current IP for kprobes attached within the function body, they can get it with PT_REGS_IP(ctx). Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220926153340.1621984-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	b3a117773d	libbpf: restore memory layout of bpf_object_open_opts When attach_prog_fd field was removed in libbpf 1.0 and replaced with `long: 0` placeholder, it actually shifted all the subsequent fields by 8 bytes. This is due to `long: 0` promising to adjust next field's offset to long-aligned offset. But in this case we were already long-aligned as pin_root_path is a pointer. So `long: 0` had no effect, and thus didn't fill the gap created by removed attach_prog_fd. Non-zero bitfield should have been used instead. I validated using pahole. Originally kconfig field was at offset 40. With `long: 0` it's at offset 32, which is wrong. With this change it's back at offset 40. While technically libbpf 1.0 is allowed to break backwards compatibility and applications should have been recompiled against libbpf 1.0 headers, given how trivial it is to preserve memory layout, let's fix this. Reported-by: Grant Seltzer Richman <grantseltzer@gmail.com> Fixes: 146bf811f5ac ("libbpf: remove most other deprecated high-level APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220923230559.666608-1-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2022-09-27 15:23:45 -07:00
Wang Yufen	fc2577c54c	libbpf: Add pathname_concat() helper Move snprintf and len check to common helper pathname_concat() to make the code simpler. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1663828124-10437-1-git-send-email-wangyufen@huawei.com	2022-09-27 15:23:45 -07:00
Tao Chen	0420f75dbc	libbpf: Support raw BTF placed in the default search path Currently, the default vmlinux files at '/boot/vmlinux-*', '/lib/modules/*/vmlinux-*' etc. are parsed with 'btf__parse_elf()' to extract BTF. It is possible that these files are actually raw BTF files similar to /sys/kernel/btf/vmlinux. So parse these files with 'btf__parse' which tries both raw format and ELF format. This might be useful in some scenarios where users put their custom BTF into known locations and don't want to specify btf_custom_path option. Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/3f59fb5a345d2e4f10e16fe9e35fbc4c03ecaa3e.1662999860.git.chentao.kernel@linux.alibaba.com	2022-09-27 15:23:45 -07:00
Yonghong Song	aa25f218b4	libbpf: Improve BPF_PROG2 macro code quality and description Commit 34586d29f8df ("libbpf: Add new BPF_PROG2 macro") added BPF_PROG2 macro for trampoline based programs with struct arguments. Andrii made a few suggestions to improve code quality and description. This patch implemented these suggestions including better internal macro name, consistent usage pattern for __builtin_choose_expr(), simpler macro definition for always-inline func arguments and better macro description. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20220910025214.1536510-1-yhs@fb.com	2022-09-27 15:23:45 -07:00
David Vernet	9e9bf46c92	bpf: Add libbpf logic for user-space ring buffer Now that all of the logic is in place in the kernel to support user-space produced ring buffers, we can add the user-space logic to libbpf. This patch therefore adds the following public symbols to libbpf: struct user_ring_buffer *user_ring_buffer__new(int map_fd, const struct user_ring_buffer_opts *opts); void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size); void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, __u32 size, int timeout_ms); void user_ring_buffer__submit(struct user_ring_buffer *rb, void *sample); void user_ring_buffer__discard(struct user_ring_buffer *rb, void *sample); void user_ring_buffer__free(struct user_ring_buffer *rb); A user-space producer must first create a struct user_ring_buffer object with user_ring_buffer__new(), and can then reserve samples in the ring buffer using one of the following two symbols: void *user_ring_buffer__reserve(struct user_ring_buffer *rb, __u32 size); void *user_ring_buffer__reserve_blocking(struct user_ring_buffer *rb, __u32 size, int timeout_ms); With user_ring_buffer__reserve(), a pointer to a 'size' region of the ring buffer will be returned if sufficient space is available in the buffer. user_ring_buffer__reserve_blocking() provides similar semantics, but will block for up to 'timeout_ms' in epoll_wait if there is insufficient space in the buffer. This function has the guarantee from the kernel that it will receive at least one event-notification per invocation to bpf_user_ringbuf_drain(), provided that at least one sample is drained, and the BPF program did not pass the BPF_RB_NO_WAKEUP flag to bpf_user_ringbuf_drain(). Once a sample is reserved, it must either be committed to the ring buffer with user_ring_buffer__submit(), or discarded with user_ring_buffer__discard(). 
Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-4-void@manifault.com	2022-09-27 15:23:45 -07:00
David Vernet	28903eb40e	bpf: Add bpf_user_ringbuf_drain() helper In a prior change, we added a new BPF_MAP_TYPE_USER_RINGBUF map type which will allow user-space applications to publish messages to a ring buffer that is consumed by a BPF program in kernel-space. In order for this map-type to be useful, it will require a BPF helper function that BPF programs can invoke to drain samples from the ring buffer, and invoke callbacks on those samples. This change adds that capability via a new BPF helper function: bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void *ctx, u64 flags) BPF programs may invoke this function to run callback_fn() on a series of samples in the ring buffer. callback_fn() has the following signature: long callback_fn(struct bpf_dynptr *dynptr, void *context); Samples are provided to the callback in the form of struct bpf_dynptr's, which the program can read using BPF helper functions for querying struct bpf_dynptr's. In order to support bpf_user_ringbuf_drain(), a new PTR_TO_DYNPTR register type is added to the verifier to reflect a dynptr that was allocated by a helper function and passed to a BPF program. Unlike PTR_TO_STACK dynptrs which are allocated on the stack by a BPF program, PTR_TO_DYNPTR dynptrs need not use reference tracking, as the BPF helper is trusted to properly free the dynptr before returning. The verifier currently only supports PTR_TO_DYNPTR registers that are also DYNPTR_TYPE_LOCAL. Note that while the corresponding user-space libbpf logic will be added in a subsequent patch, this patch does contain an implementation of the .map_poll() callback for BPF_MAP_TYPE_USER_RINGBUF maps. This .map_poll() callback guarantees that an epoll-waiting user-space producer will receive at least one event notification whenever at least one sample is drained in an invocation of bpf_user_ringbuf_drain(), provided that the function is not invoked with the BPF_RB_NO_WAKEUP flag. 
If the BPF_RB_FORCE_WAKEUP flag is provided, a wakeup notification is sent even if no sample was drained. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-3-void@manifault.com	2022-09-27 15:23:45 -07:00
David Vernet	8138aa78bd	bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type We want to support a ringbuf map type where samples are published from user-space, to be consumed by BPF programs. BPF currently supports a kernel -> user-space circular ring buffer via the BPF_MAP_TYPE_RINGBUF map type. We'll need to define a new map type for user-space -> kernel, as none of the helpers exported for BPF_MAP_TYPE_RINGBUF will apply to a user-space producer ring buffer, and we'll want to add one or more helper functions that would not apply for a kernel-producer ring buffer. This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type definition. The map type is useless in its current form, as there is no way to access or use it for anything until we add one or more BPF helpers. A follow-on patch will therefore add a new helper function that allows BPF programs to run callbacks on samples that are published to the ring buffer. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220920000100.477320-2-void@manifault.com	2022-09-27 15:23:45 -07:00
Xin Liu	8ac9773f52	libbpf: Fix NULL pointer exception in API btf_dump__dump_type_data We found that function btf_dump__dump_type_data can be called by the user as an API, but in this function, the `opts` parameter may be used as a null pointer. This causes `opts->indent_str` to trigger a NULL pointer exception. Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts") Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Weibin Kong <kongweibin2@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220917084809.30770-1-liuxin350@huawei.com	2022-09-27 15:23:45 -07:00
Xin Liu	b63791cbde	libbpf: Clean up legacy bpf maps declaration in bpf_helpers Legacy BPF map declarations are no longer supported in libbpf v1.0 [0]. Only BTF-defined maps are supported starting from v1.0, so it is time to remove the definition of bpf_map_def in bpf_helpers.h. [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0 Signed-off-by: Xin Liu <liuxin350@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20220913073643.19960-1-liuxin350@huawei.com	2022-09-27 15:23:45 -07:00
Andrii Nakryiko	0ff6d28aec	libbpf: Fix crash if SEC("freplace") programs don't have attach_prog_fd set Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF for freplace programs. It's wrong to search in vmlinux BTF and libbpf doesn't even mark vmlinux BTF as required for freplace programs. So trying to search anything in obj->vmlinux_btf might cause NULL dereference if nothing else in BPF object requires vmlinux BTF. Instead, error out if freplace (EXT) program doesn't specify attach_prog_fd at load time. Fixes: 91abb4a6d79d ("libbpf: Support attachment of BPF tracing programs to kernel modules") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org	2022-09-27 15:23:45 -07:00
Daniel Borkmann	861364fa45	libbpf: Remove gcc support for bpf_tail_call_static for now This reverts commit 14e5ce79943a ("libbpf: Add GCC support for bpf_tail_call_static"). The reason is that gcc invented its own BPF asm which does not conform to the LLVM one, and going forward this would be more painful to maintain here and in other areas of the library. Thus remove it; the ask to gcc folks is to align with the LLVM one and use the exact same syntax. Fixes: 14e5ce79943a ("libbpf: Add GCC support for bpf_tail_call_static") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: James Hilliard <james.hilliard1@gmail.com> Cc: Jose E. Marchesi <jose.marchesi@oracle.com>	2022-09-27 15:23:45 -07:00
Yonghong Song	21ec5ca723	libbpf: Add new BPF_PROG2 macro To support struct arguments in trampoline based programs, existing BPF_PROG doesn't work any more since the type size is needed to find whether a parameter takes one or two registers. So this patch added a new BPF_PROG2 macro to support such trampoline programs. The idea is suggested by Andrii. For example, if the to-be-traced function has signature like typedef struct { void *x; int t; } sockptr; int blah(sockptr x, char y); In the new BPF_PROG2 macro, the argument can be represented as __bpf_prog_call( ({ union { struct { __u64 x, y; } ___z; sockptr x; } ___tmp = { .___z = { ctx[0], ctx[1] }}; ___tmp.x; }), ({ union { struct { __u8 x; } ___z; char y; } ___tmp = { .___z = { ctx[2] }}; ___tmp.y; })); In the above, the values stored on the stack are properly assigned to the actual argument type value by using 'union' magic. Note that the macro also works even if no arguments are with struct types. Note that new BPF_PROG2 works for both llvm16 and pre-llvm16 compilers where llvm16 supports bpf target passing value with struct up to 16 byte size and pre-llvm16 will pass by reference by storing values on the stack. With static functions with struct argument as always inline, the compiler is able to optimize and remove additional stack saving of struct values. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152707.2079473-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Yonghong Song	255690da57	bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() Now instead of the number of arguments, the number of registers holding argument values are stored in trampoline. Update the description of bpf_get_func_arg[_cnt]() helpers. Previous programs without struct arguments should continue to work as usual. Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220831152657.2078805-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Shmulik Ladkani	b1753eaf3b	bpf: Support getting tunnel flags Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters (id, ttl, tos, local and remote) but does not expose ip_tunnel_info's tun_flags to the BPF program. It makes sense to expose tun_flags to the BPF program. Assume for example multiple GRE tunnels maintained on a single GRE interface in collect_md mode. The program expects origins to initiate over GRE, however different origins use different GRE characteristics (e.g. some prefer to use GRE checksum, some do not; some pass a GRE key, some do not, etc..). A BPF program getting tun_flags can therefore remember the relevant flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote. In the reply path, the program can use 'bpf_skb_set_tunnel_key' in order to correctly reply to the remote, using similar characteristics, based on the stored tunnel flags. Introduce BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key. If specified, 'bpf_tunnel_key->tunnel_flags' is set with the tun_flags. Decided to use the existing unused 'tunnel_ext' as the storage for the 'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout. Also, the following has been considered during the design: 1. Convert the "interesting" internal TUNNEL_xxx flags back to BPF_F_yyy and place into the new 'tunnel_flags' field. This has 2 drawbacks: - The BPF_F_yyy flags are from set_tunnel_key enumeration space, e.g. BPF_F_ZERO_CSUM_TX. It is awkward that it is "returned" into tunnel_flags from a get_tunnel_key call. - Not all "interesting" TUNNEL_xxx flags can be mapped to existing BPF_F_yyy flags, and it doesn't make sense to create new BPF_F_yyy flags just for purposes of the returned tunnel_flags. 2. Place key.tun_flags into 'tunnel_flags' but mask them, keeping only "interesting" flags. That's ok, but the drawback is that what's "interesting" for my usecase might be limiting for other usecases. 
Therefore I decided to expose what's in key.tun_flags as is, which seems most flexible. The BPF user can just choose to ignore bits he's not interested in. The TUNNEL_xxx are also UAPI, so no harm exposing them back in the get_tunnel_key call. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220831144010.174110-1-shmulik.ladkani@gmail.com	2022-09-27 15:23:45 -07:00
James Hilliard	eeb2bc4061	libbpf: Add GCC support for bpf_tail_call_static The bpf_tail_call_static function is currently not defined unless using clang >= 8. To support bpf_tail_call_static on GCC we can check if __clang__ is not defined to enable bpf_tail_call_static. We need to use GCC assembly syntax when the compiler does not define __clang__ as LLVM inline assembly is not fully compatible with GCC. Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829210546.755377-1-james.hilliard1@gmail.com	2022-09-27 15:23:45 -07:00
Quentin Monnet	a11587cc01	bpf: Fix a few typos in BPF helpers documentation Address a few typos in the documentation for the BPF helper functions. They were reported by Jakub [0], who ran spell checkers on the generated man page [1]. [0] https://lore.kernel.org/linux-man/d22dcd47-023c-8f52-d369-7b5308e6c842@gmail.com/T/#mb02e7d4b7fb61d98fa914c77b581184e9a9537af [1] https://lore.kernel.org/linux-man/eb6a1e41-c48e-ac45-5154-ac57a2c76108@gmail.com/T/#m4a8d1b003616928013ffcd1450437309ab652f9f v3: Do not copy unrelated (and breaking) elements to tools/ header v2: Turn a ',' into a ';' Reported-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220825220806.107143-1-quentin@isovalent.com	2022-09-27 15:23:45 -07:00
Benjamin Tissoires	7fb6138fae	libbpf: add map_get_fd_by_id and map_delete_elem in light skeleton This allows to have a better control over maps from the kernel when preloading eBPF programs. Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220824134055.1328882-8-benjamin.tissoires@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Hao Luo	c918b3e724	bpf: Add CGROUP prefix to cgroup_iter_order bpf_cgroup_iter_order is globally visible but the entries do not have CGROUP prefix. As requested by Andrii, put a CGROUP in the names in bpf_cgroup_iter_order. This patch fixes two previous commits: one introduced the API and the other uses the API in bpf selftest (that is, the selftest cgroup_hierarchical_stats). I tested this patch via the following command: test_progs -t cgroup,iter,btf_dump Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Fixes: 88886309d2e8 ("selftests/bpf: add a selftest for cgroup hierarchical stats collection") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220825223936.1865810-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <kafai@fb.com>	2022-09-27 15:23:45 -07:00
Hao Luo	981001bf46	bpf: Introduce cgroup iter Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes: - walking a cgroup's descendants in pre-order. - walking a cgroup's descendants in post-order. - walking a cgroup's ancestors. - process only the given cgroup. When attaching cgroup_iter, one can set a cgroup to the iter_link created from attaching. This cgroup is passed as a file descriptor or cgroup id and serves as the starting point of the walk. If no cgroup is specified, the starting point will be the root cgroup v2. For walking descendants, one can specify the order: either pre-order or post-order. For walking ancestors, the walk starts at the specified cgroup and ends at the root. One can also terminate the walk early by returning 1 from the iter program. Note that because walking cgroup hierarchy holds cgroup_mutex, the iter program is called with cgroup_mutex held. Currently only one session is supported, which means, depending on the volume of data bpf program intends to send to user space, the number of cgroups that can be walked is limited. For example, given the current buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can be walked is 512. This is a limitation of cgroup_iter. If the output data is larger than the kernel buffer size, after all data in the kernel buffer is consumed by user space, the subsequent read() syscall will signal EOPNOTSUPP. In order to work around, the user may have to update their program to reduce the volume of data sent to output. For example, skip some uninteresting cgroups. In future, we may extend bpf_iter flags to allow customizing buffer size. Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-2-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Stanislav Fomichev	ee7d295f83	bpf: update bpf_{g,s}et_retval documentation * replace 'syscall' with 'upper layers', still mention that it's being exported via syscall errno * describe what happens in set_retval(-EPERM) + return 1 * describe what happens with bind's 'return 3' Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-5-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-09-27 15:23:45 -07:00
Shmulik Ladkani	94d69cc07f	bpf, flow_dissector: Introduce BPF_FLOW_DISSECTOR_CONTINUE retcode for bpf progs Currently, attaching BPF_PROG_TYPE_FLOW_DISSECTOR programs completely replaces the flow-dissector logic with custom dissection logic. This forces implementors to write programs that handle dissection for any flows expected in the namespace. It makes sense for flow-dissector BPF programs to just augment the dissector with custom logic (e.g. dissecting certain flows or custom protocols), while enjoying the broad capabilities of the standard dissector for any other traffic. Introduce BPF_FLOW_DISSECTOR_CONTINUE retcode. Flow-dissector BPF programs may return this to indicate no dissection was made, and fallback to the standard dissector is requested. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-3-shmulik.ladkani@gmail.com	2022-09-27 15:23:45 -07:00
Mikhail Tuzikov	12a41a80c5	Adding network diag utils into actions-runner-libbpf container	2022-09-27 11:06:30 -07:00
Daniel Müller	10a32130e7	Clean up local allow/deny lists Now that we are including the upstream allow/deny lists we can remove any duplicates from our local lists. While at it, we also add some usdt tests to the denylist, which are currently failing. This is the same step we took in the vmtest repository [0]. [0] https://github.com/kernel-patches/vmtest/pull/133 Signed-off-by: Daniel Müller <deso@posteo.net>	2022-09-06 15:01:05 -07:00
Daniel Müller	fad270918d	Use deny/allow lists from upstream So far we have relied on allow/deny lists maintained in this repository to decide which tests to explicitly include/exclude from running in CI. With recent changes [0] this information is now available in upstream Linux. As such, this change switches us over to using the upstream allow/deny lists in addition to the local ones. We unconditionally honor the upstream lists for all kernel versions. [0] https://lore.kernel.org/bpf/165893461358.29339.11641967418379627671.git-patchwork-notify@kernel.org/T/#m2a97b0ea9ef0ddee7a53bbf7919e3f324b233937 Signed-off-by: Daniel Müller <deso@posteo.net>	2022-09-06 15:01:05 -07:00
Daniel Müller	c091b07808	Fix comment: WHITELIST -> ALLOWLIST Commit `693de729d0` ("Rename blacklists and whitelists") renamed the black and white lists but missed the adjustment of a comment, referencing a file name. Update it accordingly. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-09-06 14:07:51 -07:00
Daniel Müller	efd33720cd	Set KERNEL and REPO_ROOT environment variable for run-qemu action With an upcoming change we would like to invoke bpftool checks from the run-qemu action (https://github.com/libbpf/ci/pull/37). This action requires two environment variables, KERNEL and REPO_ROOT, set in order to function. Make sure to set them now. Long term we should probably make them explicit input arguments instead of implicit global state, but there are many more such instances that we need to clean up. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-09-01 11:00:13 -07:00
Daniel Müller	9aedff8d03	Provide kernel-root argument to run-qemu action With https://github.com/libbpf/ci/pull/36 merged the run-qemu action now accepts an additional argument, `kernel-root`. Provide it to the action with the value appropriate for this repository. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-09-01 10:36:35 -07:00
Daniel Müller	51e63f7229	Explicitly provide kernel-root argument to prepare-rootfs action Let's make the "kernel-root" explicit when using the prepare-rootfs action, instead of relying on the default, .kernel. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-29 11:14:39 -07:00
chantra	c53af98d1a	[s390x][runner] update action runner to 2.296.0 (latest)	2022-08-27 17:14:28 -07:00
chantra	2c44349e09	[s390x][runners] Use consistent runner name across restarts Currently, the runner name is taken from the docker container's hostname. This changes across restarts, causing the runner name to change across restarts too. This uses the host name to keep a consistent name.	2022-08-27 17:14:28 -07:00
Daniel Müller	58361243ec	Fix sourcing of helpers.sh in coverity workflow The path to the helpers.sh script to source was put one level too deep by `cfbd763ef8` ("Use foldable helpers where applicable") and the GITHUB_ACTION_PATH variable is not actually defined in a workflow. Fix up both issues. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-26 11:30:12 -07:00
Andrii Nakryiko	c32e1cf948	README: add dark background logo image Add auto-selectable libbpf logo for light and dark themes. Suggested-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-08-24 22:09:09 -07:00
Andrii Nakryiko	c4f44c7c11	assets: add libbpf logo images Add three layouts of libbpf logos (sparse, compact, sideways) with three color variants (light bg, dark bg, monochrome). Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-08-24 21:51:42 -07:00
Daniel Müller	a7a525d47a	Rename test_progs_noalu function to test_progs_no_alu32 As a follow up to `66b788c1a4` ("Factor out test_progs_noalu function") and taking into account feedback [0], this change renames the test_progs_noalu function to test_progs_no_alu32, to stay closer to the name of the binary being invoked. [0] https://github.com/kernel-patches/vmtest/pull/124#discussion_r953175641 Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-24 08:08:21 -07:00
Daniel Müller	cfbd763ef8	Use foldable helpers where applicable As discussed at some earlier point in time, some of the actions/workflow logic does not use our foldable helpers despite being able to. Switch them over. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-23 12:04:38 -07:00
Andrii Nakryiko	a0325403af	readme: add logo and clarify initial section Add libbpf logo to the header and restructure and rewrite a bit the intro part about libbpf, its bpf-next origins, etc. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-08-22 12:10:03 -07:00
Andrii Nakryiko	7436656dbf	README: add link to readthedocs doc site Add link to https://libbpf.readthedocs.io/en/latest/api.html for API documentation.	2022-08-19 10:37:43 -07:00
Daniel Müller	7984737fbf	Support running of individual tests This change adjusts the run_selftests.sh script to accept an optional list of arguments specifying the tests to run. We will make use of it once we run selftests in parallel. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-18 15:31:52 -07:00
Andrii Nakryiko	a0d1e22c77	ci: blacklist lru_bug selftest on s390x Make sure we don't fail on lru_bug selftests as it relies on BPF trampoline, not supported by s390x. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	e58c615210	ci: update vmlinux.h to latest config Some selftests require conn->mark, regenerate vmlinux.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	aec0b1cd7d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 73cf09a36bf7bfb3e5a3ff23755c36d49137c44d Checkpoint bpf-next commit: e34cfee65ec891a319ce79797dda18083af33a76 Baseline bpf commit: e7c677bdd03d54e9a1bafcaf1faf5c573a506bba Checkpoint bpf commit: 14b20b784f59bdd95f6f1cfb112c9818bcec4d84 Andrii Nakryiko (3): libbpf: Fix potential NULL dereference when parsing ELF libbpf: Streamline bpf_attr and perf_event_attr initialization libbpf: Clean up deprecated and legacy aliases Hangbin Liu (2): libbpf: Add names for auxiliary maps libbpf: Making bpf_prog_load() ignore name if kernel doesn't support Hao Luo (1): libbpf: Allows disabling auto attach Quentin Monnet (1): bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation include/uapi/linux/bpf.h \| 6 +- src/bpf.c \| 186 ++++++++++++++++++++++----------------- src/btf.c \| 2 - src/btf.h \| 1 - src/libbpf.c \| 81 ++++++++++++----- src/libbpf.h \| 2 + src/libbpf.map \| 2 + src/libbpf_internal.h \| 3 + src/libbpf_legacy.h \| 2 + src/netlink.c \| 3 +- src/skel_internal.h \| 10 ++- 11 files changed, 183 insertions(+), 115 deletions(-) -- 2.30.2	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	a202bd7433	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	ba81a5b778	libbpf: Clean up deprecated and legacy aliases Remove three missed deprecated APIs that were aliased to new APIs: bpf_object__unload, bpf_prog_attach_xattr and btf__load. Also move legacy API libbpf_find_kernel_btf (aliased to btf__load_vmlinux_btf) into libbpf_legacy.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-4-andrii@kernel.org	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	f7cee4152f	libbpf: Streamline bpf_attr and perf_event_attr initialization Make sure that entire libbpf code base is initializing bpf_attr and perf_event_attr with memset(0). Also for bpf_attr make sure we clear and pass to kernel only relevant parts of bpf_attr. bpf_attr is a huge union of independent sub-command attributes, so there is no need to clear and pass entire union bpf_attr, which over time grows quite a lot and for most commands this growth is completely irrelevant. Few cases where we were relying on compiler initialization of BPF UAPI structs (like bpf_prog_info, bpf_map_info, etc) with `= {};` were switched to memset(0) pattern for future-proofing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-3-andrii@kernel.org	2022-08-18 15:29:04 -07:00
Andrii Nakryiko	06c4624c8c	libbpf: Fix potential NULL dereference when parsing ELF Fix if condition filtering empty ELF sections to prevent NULL dereference. Fixes: 47ea7417b074 ("libbpf: Skip empty sections in bpf_object__init_global_data_maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20220816001929.369487-2-andrii@kernel.org	2022-08-18 15:29:04 -07:00
Hao Luo	c8f4b9c878	libbpf: Allows disabling auto attach Adds libbpf APIs for disabling auto-attach for individual functions. This is motivated by the use case of cgroup iter [1]. Some iter types require their parameters to be non-zero, therefore applying auto-attach on them will fail. With these two new APIs, users who want to use auto-attach and these types of iters can disable auto-attach on the program and perform manual attach. [1] https://lore.kernel.org/bpf/CAEf4BzZ+a2uDo_t6kGBziqdz--m2gh2_EUwkGLDtMd65uwxUjA@mail.gmail.com/ Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220816234012.910255-1-haoluo@google.com	2022-08-18 15:29:04 -07:00
Hangbin Liu	079bc8536d	libbpf: Making bpf_prog_load() ignore name if kernel doesn't support Similar with commit 10b62d6a38f7 ("libbpf: Add names for auxiliary maps"), let's make bpf_prog_load() also ignore name if kernel doesn't support program name. To achieve this, we need to call sys_bpf_prog_load() directly in probe_kern_prog_name() to avoid circular dependency. sys_bpf_prog_load() also need to be exported in the libbpf_internal.h file. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220813000936.6464-1-liuhangbin@gmail.com	2022-08-18 15:29:04 -07:00
Quentin Monnet	8be13ee80b	bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation Adding or removing room space _below_ layers 2 or 3, as the description mentions, is ambiguous. This was written with a mental image of the packet with layer 2 at the top, layer 3 under it, and so on. But it has led users to believe that it was on lower layers (before the beginning of the L2 and L3 headers respectively). Let's make it more explicit, and specify between which layers the room space is adjusted. Reported-by: Rumen Telbizov <rumen.telbizov@menlosecurity.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220812153727.224500-3-quentin@isovalent.com	2022-08-18 15:29:04 -07:00
Hangbin Liu	3db7585378	libbpf: Add names for auxiliary maps The bpftool self-created maps can appear in final map show output due to deferred removal in kernel. These maps don't have a name, which would make users confused about where it comes from. With a libbpf_ prefix name, users could know who created these maps. It also could make some tests (like test_offload.py, which skip base maps without names as a workaround) filter them out. Kernel adds bpf prog/map name support in the same merge commit fadad670a8ab ("Merge branch 'bpf-extend-info'"). So we can also use kernel_supports(NULL, FEAT_PROG_NAME) to check if kernel supports map name. As discussed [1], Let's make bpf_map_create accept non-null name string, and silently ignore the name if kernel doesn't support. [1] https://lore.kernel.org/bpf/CAEf4BzYL1TQwo1231s83pjTdFPk9XWWhfZC5=KzkU-VO0k=0Ug@mail.gmail.com/ Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220811034020.529685-1-liuhangbin@gmail.com	2022-08-18 15:29:04 -07:00
Daniel Müller	69938da6d7	Explicitly specify Qemu image path to use The path to the file system image used by our invocation of Qemu is currently hard coded to /tmp/root.img somewhere in a different repository. With `da44c0b6ee` landed we have the option of specifying it explicitly from here. Let's do just that, so that we can remove the default value from libbpf/ci altogether. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-18 14:38:23 -07:00
Daniel Müller	bfdf7653e0	Rename travis-ci/ directory to ci/ We are no longer using Travis. As such, we should move away from a lot of CI functionality located in a folder called travis-ci/. This change renames the travis-ci/ directory to the more generic ci/. To preserve backwards compatibility until all "consumers" have transitioned, we add a symbolic link called travis-ci back. It will be removed in the near term future. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-18 09:02:13 -07:00
Daniel Müller	d700dcf162	Print allow and denylists We should include the deny and allow lists used somewhere in the output of our CI runs in order to improve debuggability in general. With this change we print out these lists once assembled. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-17 11:41:22 -07:00
Daniel Müller	c03b9f6d0b	Move kernel version check inwards The run_selftests.sh script defines functions for running individual tests. However, not all tests are run in all configurations. E.g., test_progs is not run on 4.9.0 kernels and test_maps is only run when testing on the "latest" kernel version. The checks for these conditions, however, are applied inconsistently: some are in the functions themselves and others on the call site. This change unifies all checks to happen within the test function itself. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-17 11:41:22 -07:00
Daniel Müller	66b788c1a4	Factor out test_progs_noalu function This change factors out a new function, test_progs_noalu, in the run_selftests.sh script. Having this function available will make it easier for us to run tests conditionally later on, but it's also a matter of having one function for one binary. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-17 11:41:22 -07:00
Daniel Müller	e3c2b8a48d	Re-enable test_maps selftest Back in 2020, we disabled the test_maps selftest with `e05f9be4f4` ("vmtests: temporarily disable test_maps") for reasons not closely elaborated. It appears that by now the test is succeeding again, so let's enable it back. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-15 15:50:55 -07:00
Andrii Nakryiko	13a26d78f3	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 71930846b36f8e4e68267f8a3d47e33435c3657a Checkpoint bpf-next commit: 73cf09a36bf7bfb3e5a3ff23755c36d49137c44d Baseline bpf commit: f946964a9f79f8dcb5a6329265281eebfc23aee5 Checkpoint bpf commit: e7c677bdd03d54e9a1bafcaf1faf5c573a506bba Alexei Starovoitov (1): bpf: Disallow bpf programs call prog_run command. Andrii Nakryiko (2): libbpf: Reject legacy 'maps' ELF section libbpf: preserve errno across pr_warn/pr_info/pr_debug Dave Marchevsky (1): bpf: Improve docstring for BPF_F_USER_BUILD_ID flag Florian Fainelli (1): libbpf: Initialize err in probe_map_create Gustavo A. R. Silva (1): treewide: uapi: Replace zero-length arrays with flexible-array members Hengqi Chen (1): libbpf: Do not require executable permission for shared libraries James Hilliard (2): libbpf: Skip empty sections in bpf_object__init_global_data_maps libbpf: Ensure functions with always_inline attribute are inline Jesper Dangaard Brouer (1): bpf: Add BPF-helper for accessing CLOCK_TAI Namhyung Kim (1): perf/core: Add a new read format to get a number of lost samples include/uapi/linux/bpf.h \| 27 +++++++++++++++++++++++++-- include/uapi/linux/perf_event.h \| 7 +++++-- include/uapi/linux/pkt_cls.h \| 4 ++-- src/bpf_tracing.h \| 14 +++++++------- src/libbpf.c \| 25 +++++++++++++++++-------- src/libbpf_probes.c \| 2 +- src/skel_internal.h \| 4 ++-- src/usdt.bpf.h \| 4 ++-- 8 files changed, 61 insertions(+), 26 deletions(-) -- 2.30.2	2022-08-10 14:07:19 -07:00
Andrii Nakryiko	6b92311c3a	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-08-10 14:07:19 -07:00
Alexei Starovoitov	6fdbfb00f1	bpf: Disallow bpf programs call prog_run command. The verifier cannot perform sufficient validation of bpf_attr->test.ctx_in pointer, therefore bpf programs should not be allowed to call BPF_PROG_RUN command from within the program. To fix this issue split bpf_sys_bpf() bpf helper into normal kern_sys_bpf() kernel function that can only be used by the kernel light skeleton directly. Reported-by: YiFei Zhu <zhuyifei@google.com> Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-08-10 14:07:19 -07:00
Andrii Nakryiko	45dca19bd2	libbpf: preserve errno across pr_warn/pr_info/pr_debug As suggested in [0], make sure that libbpf_print saves and restores errno, and as such guarantees that no matter what actual print callback user installs, macros like pr_warn/pr_info/pr_debug are completely transparent as far as errno goes. While libbpf code is pretty careful about not clobbering important errno values accidentally with pr_warn(), it's a trivial change to make sure that pr_warn can be used anywhere without a risk of clobbering errno. No functional changes, just future proofing. [0] https://github.com/libbpf/libbpf/pull/536 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Müller <deso@posteo.net> Link: https://lore.kernel.org/r/20220810183425.1998735-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-08-10 14:07:19 -07:00
Jesper Dangaard Brouer	2fe1958ec8	bpf: Add BPF-helper for accessing CLOCK_TAI Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time sensitive networks (TSN), where all nodes are synchronized by Precision Time Protocol (PTP), it's helpful to have the possibility to generate timestamps based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in place, it becomes very convenient to correlate activity across different machines in the network. Use cases for such a BPF helper include functionalities such as Tx launch time (e.g. ETF and TAPRIO Qdiscs) and timestamping. Note: CLOCK_TAI is nothing new per se, only the NMI-safe variant of it is. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> [Kurt: Wrote changelog and renamed helper] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Link: https://lore.kernel.org/r/20220809060803.5773-2-kurt@linutronix.de Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-08-10 14:07:19 -07:00
Dave Marchevsky	cbd9b7e5d8	bpf: Improve docstring for BPF_F_USER_BUILD_ID flag Most tools which use bpf_get_stack or bpf_get_stackid symbolicate the stack - meaning the stack of addresses in the target process' address space is transformed into meaningful symbol names. The BPF_F_USER_BUILD_ID flag eases this process by finding the build_id of the file-backed vma which the address falls in and translating the address to an offset within the backing file. To be more specific, the offset is a "file offset" from the beginning of the backing file. The symbols in ET_DYN ELF objects have a st_value which is also described as an "offset" - but an offset in the process address space, relative to the base address of the object. It's necessary to translate between the "file offset" and "virtual address offset" during symbolication before they can be directly compared. Failure to do so can lead to confusing bugs, so this patch clarifies language in the documentation in an attempt to keep this from happening. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220808164723.3107500-1-davemarchevsky@fb.com	2022-08-10 14:07:19 -07:00
Hengqi Chen	0cc6bfab39	libbpf: Do not require executable permission for shared libraries Currently, resolve_full_path() requires executable permission for both programs and shared libraries. This causes failures on distros like Debian since the shared libraries are not installed executable and Linux is not requiring shared libraries to have executable permissions. Let's remove executable permission check for shared libraries. Reported-by: Goro Fuji <goro@fastly.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220806102021.3867130-1-hengqi.chen@gmail.com	2022-08-10 14:07:19 -07:00
Andrii Nakryiko	41c612167e	libbpf: Reject legacy 'maps' ELF section Add explicit error message if BPF object file is still using legacy BPF map definitions in SEC("maps"). Before this change, if BPF object file is still using legacy map definition user will see a bit confusing: libbpf: elf: skipping unrecognized data section(4) maps libbpf: prog 'handler': bad map relo against 'server_map' in section 'maps' Now libbpf will be explicit about rejecting "maps" ELF section: libbpf: elf: legacy map definitions in 'maps' section are not supported by libbpf v1.0+ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220803214202.23750-1-andrii@kernel.org	2022-08-10 14:07:19 -07:00
James Hilliard	69d537ba0b	libbpf: Ensure functions with always_inline attribute are inline GCC expects the always_inline attribute to only be set on inline functions, as such we should make all functions with this attribute use the __always_inline macro which makes the function inline and sets the attribute. Fixes errors like: /home/buildroot/bpf-next/tools/testing/selftests/bpf/tools/include/bpf/bpf_tracing.h:439:1: error: ‘always_inline’ function might not be inlinable [-Werror=attributes] 439 \| ____##name(unsigned long long *ctx, ##args) \| ^~~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220803151403.793024-1-james.hilliard1@gmail.com	2022-08-10 14:07:19 -07:00
Florian Fainelli	bd1e5cff31	libbpf: Initialize err in probe_map_create GCC-11 warns about the possibly uninitialized err variable in probe_map_create: libbpf_probes.c: In function 'probe_map_create': libbpf_probes.c:361:38: error: 'err' may be used uninitialized in this function [-Werror=maybe-uninitialized] 361 \| return fd < 0 && err == exp_err ? 1 : 0; \| ~~~~^~~~~~~~~~ Fixes: 878d8def0603 ("libbpf: Rework feature-probing APIs") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220801025109.1206633-1-f.fainelli@gmail.com	2022-08-10 14:07:19 -07:00
James Hilliard	3d484ca473	libbpf: Skip empty sections in bpf_object__init_global_data_maps The GNU assembler generates an empty .bss section. This is a well established behavior in GAS that happens in all supported targets. The LLVM assembler doesn't generate an empty .bss section. bpftool chokes on the empty .bss section. Additionally in bpf_object__elf_collect the sec_desc->data is not initialized when a section is not recognized. In this case, this happens with .comment. So we must check that sec_desc->data is initialized before checking if the size is 0. Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220731232649.4668-1-james.hilliard1@gmail.com	2022-08-10 14:07:19 -07:00
Gustavo A. R. Silva	c25544735b	treewide: uapi: Replace zero-length arrays with flexible-array members There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; -fstrict-flex-arrays=3 is coming and we need to land these changes to prevent issues like these in the short future: ../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^ Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Build-tested-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%25lkp@intel.com/ Acked-by: Dan Williams <dan.j.williams@intel.com> # For ndctl.h Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2022-08-10 14:07:19 -07:00
Namhyung Kim	179c7940eb	perf/core: Add a new read format to get a number of lost samples Sometimes we want to know an accurate number of samples even if some are lost. Currently PERF_RECORD_LOST is generated for a ring-buffer which might be shared with other events. So it's hard to know per-event lost count. Add event->lost_samples field and PERF_FORMAT_LOST to retrieve it from userspace. Original-patch-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220616180623.1358843-1-namhyung@kernel.org	2022-08-10 14:07:19 -07:00
Daniel Müller	f6692dc4e8	Remove checked-in configuration Both the bpf and bpf-next tree have suitable BPF selftest configurations available for usage with the latest kernel now upstream. While we do test on 4.9 and 5.5 kernels as well, there we just download prebuilt binaries. The configuration we use for building selftests is always the upstream one. With this change we remove the checked-in configuration, as it is now no longer needed. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-10 10:24:28 -07:00
Daniel Müller	693de729d0	Rename blacklists and whitelists Upstream uses denylist and allowlist terminology instead of blacklist and whitelist. It also has established a less deeply nested directory structure. This change renames the blacklist & whitelist files accordingly and moves them one level up out of their containing directory to mirror the layout we have upstream as well as in kernel-patches/vmtest. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-10 08:31:17 -07:00
Daniel Müller	0667206913	Use checkout action in version v3 The current version of actions/checkout is v3. That means that v2, which we currently use, has been superseded. Update the version we use accordingly. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-09 14:02:50 -07:00
Daniel Müller	a2ebd9ceff	Rely on upstream kernel configuration So far we have relied on the kernel configuration as checked into this repository. However, a suitable configuration is now included in upstream Linux [0]. With this change we add support for using the configuration from there. [0] https://lore.kernel.org/bpf/165893461358.29339.11641967418379627671.git-patchwork-notify@kernel.org/T/#m2a97b0ea9ef0ddee7a53bbf7919e3f324b233937 Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-09 09:23:59 -07:00
Daniel Müller	0e43565ad8	ci: Bump LLVM version we use to 16 Development on LLVM 16 has started and version 15 is no longer available in the repository we install it from. Bump the version we use accordingly. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-08-01 13:10:42 -07:00
Andrii Nakryiko	5b795f7b30	ci: blacklist skeleton selftest Selftest relies on new 5.19+ kernel support for big ARRAY maps. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	3fa2c28d2c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b0d93b44641a83c28014ca38001e85bf6dc8501e Checkpoint bpf-next commit: 71930846b36f8e4e68267f8a3d47e33435c3657a Baseline bpf commit: d28b25a62a47a8c8aa19bd543863aab6717e68c9 Checkpoint bpf commit: f946964a9f79f8dcb5a6329265281eebfc23aee5 Andrii Nakryiko (7): libbpf: add bpf_core_type_matches() helper macro libbpf: Remove unnecessary usdt_rel_ip assignments libbpf: generalize virtual __kconfig externs and use it for USDT libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL libbpf: add ksyscall/kretsyscall sections support for syscall kprobes libbpf: fallback to tracefs mount point if debugfs is not mounted libbpf: make RINGBUF map size adjustments more eagerly Anquan Wu (1): libbpf: Fix the name of a reused map Chuang Wang (3): libbpf: Cleanup the legacy kprobe_event on failed add/attach_event() libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy() libbpf: Cleanup the legacy uprobe_event on failed add/attach_event() Dan Carpenter (3): libbpf: fix an snprintf() overflow check libbpf: Fix sign expansion bug in btf_dump_get_enum_value() libbpf: Fix str_has_sfx()'s return value Daniel Müller (4): bpf: Introduce TYPE_MATCH related constants/macros bpf, libbpf: Add type match support bpf: Correctly propagate errors up from bpf_core_composites_match libbpf: Support PPC in arch_specific_syscall_pfx Hangbin Liu (1): Bonding: add per-port priority for failover re-selection Hengqi Chen (1): libbpf: Error out when binary_path is NULL for uprobe and USDT Ilya Leoshkevich (1): libbpf: Extend BPF_KSYSCALL documentation James Hilliard (1): libbpf: Disable SEC pragma macro on GCC Joanne Koong (2): bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs bpf: fix bpf_skb_pull_data documentation Joe Burton (1): libbpf: Add bpf_obj_get_opts() Jon Doron (1): libbpf: perfbuf: Add API to get the ring buffer 
Pu Lehui (1): bpf, docs: Remove deprecated xsk libbpf APIs description Yixun Lan (1): libbpf, riscv: Use a0 for RC register docs/libbpf_naming_convention.rst \| 13 +- include/uapi/linux/bpf.h \| 15 +- include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 9 + src/bpf.h \| 11 + src/bpf_core_read.h \| 11 + src/bpf_helpers.h \| 13 + src/bpf_tracing.h \| 60 +++- src/btf_dump.c \| 2 +- src/gen_loader.c \| 2 +- src/libbpf.c \| 440 ++++++++++++++++++++++-------- src/libbpf.h \| 62 +++++ src/libbpf.map \| 3 + src/libbpf_internal.h \| 8 +- src/relo_core.c \| 286 ++++++++++++++++++- src/relo_core.h \| 4 + src/usdt.bpf.h \| 16 +- src/usdt.c \| 6 +- 18 files changed, 793 insertions(+), 169 deletions(-) -- 2.30.2	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	0fa013e705	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-07-31 16:45:48 -07:00
Joe Burton	d8e2c9d965	libbpf: Add bpf_obj_get_opts() Add an extensible variant of bpf_obj_get() capable of setting the `file_flags` parameter. This parameter is needed to enable unprivileged access to BPF maps. Without a method like this, users must manually make the syscall. Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com	2022-07-31 16:45:48 -07:00
Daniel Müller	b2d7228d7c	libbpf: Support PPC in arch_specific_syscall_pfx Commit 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") added the arch_specific_syscall_pfx() function, which returns a string representing the architecture in use. As it turns out this function is currently not aware of Power PC, where NULL is returned. That's being flagged by the libbpf CI system, which builds for ppc64le and the compiler sees a NULL pointer being passed in to a %s format string. With this change we add representations for two more architectures, for Power PC and Power PC 64, and also adjust the string format logic to handle NULL pointers gracefully, in an attempt to prevent similar issues with other architectures in the future. Fixes: 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes") Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net	2022-07-31 16:45:48 -07:00
Ilya Leoshkevich	427f2a0c83	libbpf: Extend BPF_KSYSCALL documentation Explicitly list known quirks. Mention that socket-related syscalls can be invoked via socketcall(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220726134008.256968-2-iii@linux.ibm.com	2022-07-31 16:45:48 -07:00
Dan Carpenter	8663289b51	libbpf: Fix str_has_sfx()'s return value The return from strcmp() is inverted so it wrongly returns true instead of false and vice versa. Fixes: a1c9d61b19cb ("libbpf: Improve library identification for uprobe binary path resolution") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/YtZ+/dAA195d99ak@kili	2022-07-31 16:45:48 -07:00
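The inverted strcmp() result described in this commit can be illustrated with a minimal standalone sketch. The function name `has_suffix` is hypothetical and this is not a copy of libbpf's internal helper; it only shows the corrected comparison, where strcmp() returning 0 means the suffix matches.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when str ends with sfx.
 * strcmp() returns 0 on equality, so the result must be compared
 * against 0 rather than used directly as a boolean. */
bool has_suffix(const char *str, const char *sfx)
{
	size_t str_len = strlen(str);
	size_t sfx_len = strlen(sfx);

	if (sfx_len > str_len)
		return false;
	return strcmp(str + str_len - sfx_len, sfx) == 0;
}
```

Returning the raw strcmp() value instead of comparing it with 0 would report a match exactly when the suffix differs, which is the inversion the commit fixes.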
Dan Carpenter	77e514d626	libbpf: Fix sign expansion bug in btf_dump_get_enum_value() The code here is supposed to take a signed int and store it in a signed long long. Unfortunately, the way that the type promotion works with this conditional statement is that it takes a signed int, type promotes it to a __u32, and then stores that as a signed long long. The result is never negative. This is from static analysis, but I made a little test program just to test it before I sent the patch: #include <stdio.h> int main(void) { unsigned long long src = -1ULL; signed long long dst1, dst2; int is_signed = 1; dst1 = is_signed ? *(int *)&src : *(unsigned int *)0; dst2 = is_signed ? (signed long long)*(int *)&src : *(unsigned int *)0; printf("%lld\n", dst1); printf("%lld\n", dst2); return 0; } Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/YtZ+LpgPADm7BeEd@kili	2022-07-31 16:45:48 -07:00
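The promotion rule behind this bug can be shown without dereferencing any pointers. The sketch below assumes a platform with 32-bit int and 64-bit long long; the function names `widen_buggy` and `widen_fixed` are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>

/* Both arms of ?: are converted to a common type first. With int and
 * unsigned int arms, that common type is unsigned int, so a negative
 * int becomes a large positive value before it is widened. */
long long widen_buggy(bool is_signed, int s, unsigned int u)
{
	return is_signed ? s : u;
}

/* Casting the signed arm to long long makes long long the common
 * type, so the sign is preserved. */
long long widen_fixed(bool is_signed, int s, unsigned int u)
{
	return is_signed ? (long long)s : u;
}
```

Under the stated assumption, `widen_buggy(true, -1, 0)` yields 4294967295 while `widen_fixed(true, -1, 0)` yields -1.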
Dan Carpenter	b44b214118	libbpf: fix an snprintf() overflow check The snprintf() function returns the number of bytes it would have copied if there were enough space. So it can return > the sizeof(gen->attach_target). Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/YtZ+oAySqIhFl6/J@kili Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
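A truncation check along the lines this commit describes might look as follows. This is a sketch, not the gen_loader.c code: the function `format_target` and its parameters are hypothetical. The point is that snprintf() returns the length the full output would have had, so a return value greater than or equal to the buffer size signals truncation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats prefix + name into buf and reports
 * whether the whole string fit, including the terminating NUL. */
bool format_target(char *buf, size_t buf_sz, const char *prefix,
		   const char *name)
{
	int n = snprintf(buf, buf_sz, "%s%s", prefix, name);

	/* n < 0 is an output error; n >= buf_sz means truncation. */
	return n >= 0 && (size_t)n < buf_sz;
}
```

Comparing the return value only for equality with the buffer size would miss every case where the output is more than one byte too long.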
Andrii Nakryiko	610707057a	libbpf: make RINGBUF map size adjustments more eagerly Make libbpf adjust RINGBUF map size (rounding it up to closest power-of-2 of page_size) more eagerly: during open phase when initializing the map and on explicit calls to bpf_map__set_max_entries(). Such approach allows user to check actual size of BPF ringbuf even before it's created in the kernel, but also it prevents various edge case scenarios where BPF ringbuf size can get out of sync with what it would be in kernel. One of them (reported in [0]) is during an attempt to pin/reuse BPF ringbuf. Move adjust_ringbuf_sz() helper closer to its first actual use. The implementation of the helper is unchanged. Also make detection of whether bpf_object is already loaded more robust by checking obj->loaded explicitly, given that map->fd can be < 0 even if bpf_object is already loaded due to ability to disable map creation with bpf_map__set_autocreate(map, false). [0] Closes: https://github.com/libbpf/libbpf/pull/530 Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
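The rounding rule mentioned here (size rounded up to a power-of-2 multiple of the page size) can be sketched as below. This is an illustration under assumptions, not libbpf's adjust_ringbuf_sz(): the function name `round_ringbuf_size` is hypothetical, and the page size is passed as a parameter so the example does not depend on the host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool is_pow2(size_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: returns sz unchanged if it is already a
 * power-of-2 multiple of page_sz, otherwise the smallest such
 * multiple that is larger than sz. A zero size is passed through so
 * the caller still sees an error later. */
size_t round_ringbuf_size(size_t sz, uint32_t page_sz)
{
	uint32_t mul;

	if (sz == 0 || page_sz == 0)
		return sz;
	if (sz % page_sz == 0 && is_pow2(sz / page_sz))
		return sz;
	for (mul = 1; mul != 0 && mul <= UINT32_MAX / page_sz; mul <<= 1) {
		if ((size_t)mul * page_sz > sz)
			return (size_t)mul * page_sz;
	}
	/* No valid size fits in 32 bits; return the input as-is. */
	return sz;
}
```

With a 4096-byte page, a requested size of 5000 becomes 8192, and 12288 (three pages) becomes 16384 because three is not a power of two.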
Joanne Koong	7e567b8761	bpf: fix bpf_skb_pull_data documentation Fix documentation for bpf_skb_pull_data() helper for when len == 0. Fixes: fa15601ab31e ("bpf: add documentation for eBPF helpers (33-41)") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220715193800.3940070-1-joannelkoong@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	1fe0248c61	libbpf: fallback to tracefs mount point if debugfs is not mounted Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if debugfs (/sys/kernel/debug/tracing) isn't mounted. Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Connor O'Brien <connoro@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
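The fallback described in this commit can be approximated by a simple existence check. The sketch below is an assumption about the general shape of such logic, not libbpf's actual code; `pick_tracing_root` is a hypothetical name, and both paths are parameters so the behavior can be exercised on any system.

```c
#include <unistd.h>

/* Hypothetical helper: prefer the debugfs tracing directory when it
 * is reachable, otherwise fall back to the tracefs mount point. */
const char *pick_tracing_root(const char *debugfs_tracing,
			      const char *tracefs)
{
	return access(debugfs_tracing, F_OK) == 0 ? debugfs_tracing
						  : tracefs;
}
```

A caller would pass "/sys/kernel/debug/tracing" and "/sys/kernel/tracing", the two paths named in the commit message.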
Andrii Nakryiko	0862e4e54d	libbpf: add ksyscall/kretsyscall sections support for syscall kprobes Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding kretsyscall variants (for return kprobes) to allow users to kprobe syscall functions in kernel. These special sections allow to ignore complexities and differences between kernel versions and host architectures when it comes to syscall wrapper and corresponding __<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf itself doesn't rely on /proc/config.gz for detecting this, see BPF_KSYSCALL patch for how it's done internally). Combined with the use of BPF_KSYSCALL() macro, this allows to just specify intended syscall name and expected input arguments and leave dealing with all the variations to libbpf. In addition to SEC("ksyscall+") and SEC("kretsyscall+") add bpf_program__attach_ksyscall() API which allows to specify syscall name at runtime and provide associated BPF cookie value. At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not handle all the calling convention quirks for mmap(), clone() and compat syscalls. It also only attaches to "native" syscall interfaces. If host system supports compat syscalls or defines 32-bit syscalls in 64-bit kernel, such syscall interfaces won't be attached to by libbpf. These limitations may or may not change in the future. Therefore it is recommended to use SEC("kprobe") for these syscalls or if working with compat and 32-bit interfaces is required. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
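The naming difference between wrapped and unwrapped syscall entry points (`__<arch>_sys_<syscall>` versus `__se_sys_<syscall>`) can be sketched as a small name builder. The function `syscall_func_name` and its parameters are hypothetical; how the wrapper flag and the architecture prefix are detected is outside this example. A NULL prefix is replaced with an empty string, mirroring the NULL-safe formatting motivated by the PPC fix above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the kernel function name to probe for
 * a syscall into buf. Returns buf on success or NULL if the name did
 * not fit. */
char *syscall_func_name(char *buf, size_t buf_sz, bool has_wrapper,
			const char *arch_pfx, const char *syscall_name)
{
	int n;

	if (has_wrapper)
		n = snprintf(buf, buf_sz, "__%s_sys_%s",
			     arch_pfx ? arch_pfx : "", syscall_name);
	else
		n = snprintf(buf, buf_sz, "__se_sys_%s", syscall_name);

	return (n >= 0 && (size_t)n < buf_sz) ? buf : NULL;
}
```

For example, with the wrapper present and prefix "x64", the syscall name "bpf" maps to `__x64_sys_bpf`, the symbol cited in the BPF_KSYSCALL commit.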
Andrii Nakryiko	fd6c9d906a	libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to match libbpf's SEC("ksyscall") section name, added in next patch) to use __kconfig variable to determine how to properly fetch syscall arguments. Instead of relying on hard-coded knowledge of whether kernel's architecture uses syscall wrapper or not (which only reflects the latest kernel versions, but is not necessarily true for older kernels and won't necessarily hold for later kernel versions on some particular host architecture), determine this at runtime by attempting to create perf_event (with fallback to kprobe event creation through tracefs on legacy kernels, just like kprobe attachment code is doing) for kernel function that would correspond to bpf() syscall on a system that has CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try '__x64_sys_bpf'). If host kernel uses syscall wrapper, syscall kernel function's first argument is a pointer to struct pt_regs that then contains syscall arguments. In such case we need to use bpf_probe_read_kernel() to fetch actual arguments (which we do through BPF_CORE_READ() macro) from inner pt_regs. But if the kernel doesn't use syscall wrapper approach, input arguments can be read from struct pt_regs directly with no probe reading. All this feature detection is done without requiring /proc/config.gz existence and parsing, and BPF-side helper code uses newly added LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with user-side feature detection of libbpf. BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf")) and SEC("ksyscall") program added in the next patch (which are the same kprobe program with added benefit of libbpf determining correct kernel function name automatically). 
Kretprobe and kretsyscall (added in next patch) programs don't need BPF_KSYSCALL as they don't provide access to input arguments. Normal BPF_KRETPROBE is completely sufficient and is recommended. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	d56d93baff	libbpf: generalize virtual __kconfig externs and use it for USDT Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION. LINUX_KERNEL_VERSION isn't coming from /proc/config.gz and is instead filled out by libbpf itself. This patch generalizes this approach to support more such virtual __kconfig externs. One such extern added in this patch is LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in usdt.bpf.h instead of using CO-RE-based enum detection approach for detecting bpf_get_attach_cookie() BPF helper. This allows to remove otherwise not needed CO-RE dependency and keeps user-space and BPF-side parts of libbpf's USDT support strictly in sync in terms of their feature detection. We'll use similar approach for syscall wrapper detection for BPF_KSYSCALL() BPF-side macro in follow up patch. Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values and LINUX_ for virtual libbpf-backed externs. In the future we might extend the set of prefixes that are supported. This can be done without any breaking changes, as currently any __kconfig extern with unrecognized name is rejected. For LINUX_xxx externs we support the normal "weak rule": if libbpf doesn't recognize given LINUX_xxx extern but such extern is marked as __weak, it is not rejected and defaults to zero. This follows CONFIG_xxx handling logic and will allow BPF applications to opportunistically use newer libbpf virtual externs without breaking on older libbpf versions unnecessarily. Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-31 16:45:48 -07:00
Jon Doron	1648fa16b5	libbpf: perfbuf: Add API to get the ring buffer Add support for writing a custom event reader, by exposing the ring buffer. With the new API perf_buffer__buffer() you will get access to the raw mmap()'ed per-cpu underlying memory of the ring buffer. This region contains both the perf buffer data and header (struct perf_event_mmap_page), which manages the ring buffer state (head/tail positions, when accessing the head/tail position it's important to take into consideration SMP). With this type of low level access one can implement different types of consumers. Here are a few simple examples where this API helps: 1. perf_event_read_simple is allocating using malloc, perhaps you want to handle the wrap-around in some other way. 2. Since perf buf is per-cpu, the order of the events is not guaranteed, for example: Given 3 events where each event has a timestamp t0 < t1 < t2, and the events are spread on more than 1 CPU, then we can end up with the following state in the ring buf: CPU[0] => [t0, t2] CPU[1] => [t1] When you consume the events from CPU[0], you could know there is a t1 missing, (assuming there are no drops, and your event data contains a sequential index). So now one can simply do the following, for CPU[0], you can store the address of t0 and t2 in an array (without moving the tail, so the data is not lost) then move on to CPU[1] and set the address of t1 in the same array. So you end up with something like: void **arr[] = [&t0, &t1, &t2], now you can consume it in order and move the tails as you process in order. 3. Assuming there are multiple CPUs and we want to start draining the messages from them, then we can "pick" which one to start with according to the remaining free space in the ring buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com	2022-07-31 16:45:48 -07:00
Anquan Wu	9b6f4eb157	libbpf: Fix the name of a reused map BPF map name is limited to BPF_OBJ_NAME_LEN. If a map name is defined as being longer than BPF_OBJ_NAME_LEN, it will be truncated to BPF_OBJ_NAME_LEN when a userspace program calls libbpf to create the map. A pinned map also generates a path in /sys. If a program later wants to reuse the map, it cannot get the bpf_map by name, because the name of the map is only partially the same as the name obtained from the pinned path. The syscall information below shows that map name "process_pinned_map" is truncated to "process_pinned_". bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map", bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory) bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4,max_entries=1024, map_flags=0, inner_map_fd=0, map_name="process_pinned_",map_ifindex=0, btf_fd=3, btf_key_type_id=6, btf_value_type_id=10,btf_vmlinux_value_type_id=0}, 72) = 4 This patch checks whether the name of the pinned map matches the actual name for the first (BPF_OBJ_NAME_LEN - 1) characters; if so, the bpf map still uses the name which is included in the bpf object. Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse") Signed-off-by: Anquan Wu <leiqi96@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM	2022-07-31 16:45:48 -07:00
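The name check described in this commit can be sketched in isolation. The code below is an assumption about the general idea, not the patch itself: `reused_map_name` is a hypothetical function, and `OBJ_NAME_LEN` is a local stand-in for the kernel's BPF_OBJ_NAME_LEN (16, including the terminating NUL).

```c
#include <string.h>

/* Local stand-in for BPF_OBJ_NAME_LEN from the kernel UAPI. */
#define OBJ_NAME_LEN 16

/* Hypothetical helper: decides which name a reused map should carry.
 * If the kernel-reported name is exactly the maximum length and is a
 * prefix of the name from the BPF object, the kernel name was
 * probably truncated, so the full object name is kept. Otherwise the
 * kernel-reported name is used. */
const char *reused_map_name(const char *obj_name, const char *kernel_name)
{
	size_t len = strlen(kernel_name);

	if (len == OBJ_NAME_LEN - 1 &&
	    strncmp(obj_name, kernel_name, len) == 0)
		return obj_name;
	return kernel_name;
}
```

Using the commit's own example, the object name "process_pinned_map" and the kernel name "process_pinned_" share the first 15 characters, so the full object name is retained.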
Hengqi Chen	b3fe4be0b3	libbpf: Error out when binary_path is NULL for uprobe and USDT binary_path is a required non-null parameter for bpf_program__attach_usdt and bpf_program__attach_uprobe_opts. Check it against NULL to prevent coredump on strchr. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com	2022-07-31 16:45:48 -07:00
Joanne Koong	6d5026e434	bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs Commit 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write") added the bpf_dynptr_write() and bpf_dynptr_read() APIs. However, it will be needed for some dynptr types to pass in flags as well (e.g. when writing to a skb, the user may like to invalidate the hash or recompute the checksum). This patch adds a "u64 flags" arg to the bpf_dynptr_read() and bpf_dynptr_write() APIs before their UAPI signature freezes where we then cannot change them anymore with a 5.19.x released kernel. Fixes: 13bbbfbea759 ("bpf: Add bpf_dynptr_read and bpf_dynptr_write") Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20220706232547.4016651-1-joannelkoong@gmail.com	2022-07-31 16:45:48 -07:00
Daniel Müller	ca60209447	bpf: Correctly propagate errors up from bpf_core_composites_match This change addresses a comment made earlier [0] about a missing return of an error when __bpf_core_types_match is invoked from bpf_core_composites_match, which could have led to us erroneously ignoring errors. Regarding the typedef name check pointed out in the same context, it is not actually an issue, because callers of the function perform a name check for the root type anyway. To make that more obvious, let's add comments to the function (similar to what we have for bpf_core_types_are_compat, which is called in pretty much the same context). [0]: https://lore.kernel.org/bpf/165708121449.4919.13204634393477172905.git-patchwork-notify@kernel.org/T/#m55141e8f8cfd2e8d97e65328fa04852870d01af6 Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220707211931.3415440-1-deso@posteo.net	2022-07-31 16:45:48 -07:00
James Hilliard	b31ca3fa0e	libbpf: Disable SEC pragma macro on GCC It seems the gcc preprocessor breaks with pragmas when surrounding __attribute__. Disable these pragmas on GCC due to upstream bugs see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400 Fixes errors like: error: expected identifier or '(' before '#pragma' 106 \| SEC("cgroup/bind6") \| ^~~ error: expected '=', ',', ';', 'asm' or '__attribute__' before '#pragma' 114 \| char _license[] SEC("license") = "GPL"; \| ^~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220706111839.1247911-1-james.hilliard1@gmail.com	2022-07-31 16:45:48 -07:00
Pu Lehui	295a4aae35	bpf, docs: Remove deprecated xsk libbpf APIs description Since xsk APIs have been removed from libbpf, let's clean up the BPF docs simultaneously. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20220708042736.669132-1-pulehui@huawei.com	2022-07-31 16:45:48 -07:00
Yixun Lan	8498996f9f	libbpf, riscv: Use a0 for RC register According to the RISC-V calling convention register usage here [0], a0 is used as return value register, so rename it to make it consistent with the spec. [0] section 18.2, table 18.2 https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf Fixes: 589fed479ba1 ("riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h") Signed-off-by: Yixun Lan <dlan@gentoo.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Amjad OULED-AMEUR <ouledameur.amjad@gmail.com> Link: https://lore.kernel.org/bpf/20220706140204.47926-1-dlan@gentoo.org	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	aa13a6ff58	libbpf: Remove unnecessary usdt_rel_ip assignments Coverity detected that usdt_rel_ip is unconditionally overwritten anyways, so there is no need to unnecessarily initialize it with unused value. Clean this up. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20220705224818.4026623-4-andrii@kernel.org	2022-07-31 16:45:48 -07:00
Chuang Wang	bace4782cd	libbpf: Cleanup the legacy uprobe_event on failed add/attach_event() In a potential scenario, when an error is returned after add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or when bpf_program__attach_perf_event_opts() in bpf_program__attach_uprobe_opts() returns an error, the uprobe_event that was previously created is not cleaned up. With this patch, fix this by calling remove_uprobe_event_legacy() when an error is returned. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-4-nashuiliang@gmail.com	2022-07-31 16:45:48 -07:00
Chuang Wang	ab2221de84	libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy() Use "type" as opposed to "err" in pr_warn() after determine_uprobe_perf_type_legacy() returns an error. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-3-nashuiliang@gmail.com	2022-07-31 16:45:48 -07:00
Chuang Wang	d8a50bfe35	libbpf: Cleanup the legacy kprobe_event on failed add/attach_event() Before the 0bc11ed5ab60 commit ("kprobes: Allow kprobes coexist with livepatch"), in a scenario where livepatch and kprobe coexist on the same function entry, the creation of kprobe_event using add_kprobe_event_legacy() will be successful, and a trace event (e.g. /debugfs/tracing/events/kprobe/XXX) will exist, but perf_event_open() will return an error because both livepatch and kprobe use FTRACE_OPS_FL_IPMODIFY. As follows: 1) add a livepatch $ insmod livepatch-XXX.ko 2) add a kprobe using tracefs API (i.e. add_kprobe_event_legacy) $ echo 'p:mykprobe XXX' > /sys/kernel/debug/tracing/kprobe_events 3) enable this kprobe (i.e. sys_perf_event_open) This will return an error, -EBUSY. Following Andrii Nakryiko's comment, a few error paths in bpf_program__attach_kprobe_opts() also need to call remove_kprobe_event_legacy(). With this patch, whenever an error is returned after add_kprobe_event_legacy() or bpf_program__attach_perf_event_opts(), this ensures that the created kprobe_event is cleaned up. Signed-off-by: Chuang Wang <nashuiliang@gmail.com> Signed-off-by: Jingren Zhou <zhoujingren@didiglobal.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220629151848.65587-2-nashuiliang@gmail.com	2022-07-31 16:45:48 -07:00
Andrii Nakryiko	95971ddd48	libbpf: add bpf_core_type_matches() helper macro This patch finalizes support for the proposed type match relation in libbpf by adding bpf_core_type_matches() macro which emits TYPE_MATCH relocation. Clang support for this relocation was added in [0]. [0] https://reviews.llvm.org/D126838 Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-7-deso@posteo.net	2022-07-31 16:45:48 -07:00
Daniel Müller	7410ddc0f4	bpf, libbpf: Add type match support This patch adds support for the proposed type match relation to relo_core where it is shared between userspace and kernel. It plumbs through both kernel-side and libbpf-side support. The matching relation is defined as follows (copy from source): - modifiers and typedefs are stripped (and, hence, effectively ignored) - generally speaking types need to be of same kind (struct vs. struct, union vs. union, etc.) - exceptions are struct/union behind a pointer which could also match a forward declaration of a struct or union, respectively, and enum vs. enum64 (see below) Then, depending on type: - integers: - match if size and signedness match - arrays & pointers: - target types are recursively matched - structs & unions: - local members need to exist in target with the same name - for each member we recursively check match unless it is already behind a pointer, in which case we only check matching names and compatible kind - enums: - local variants have to have a match in target by symbolic name (but not numeric value) - size has to match (but enum may match enum64 and vice versa) - function pointers: - number and position of arguments in local type has to match target - for each argument and the return value we recursively check match Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-5-deso@posteo.net	2022-07-31 16:45:48 -07:00
Daniel Müller	1b80b97a30	bpf: Introduce TYPE_MATCH related constants/macros In order to provide type match support we require a new type of relocation which, in turn, requires toolchain support. Recent LLVM/Clang versions support a new value for the last argument to the __builtin_preserve_type_info builtin, for example. With this change we introduce the necessary constants into relevant header files, mirroring what the compiler may support. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220628160127.607834-2-deso@posteo.net	2022-07-31 16:45:48 -07:00
Hangbin Liu	434b56c497	Bonding: add per-port priority for failover re-selection Add per port priority support for bonding active slave re-selection during failover. A higher number means higher priority in selection. The primary slave still has the highest priority. This option also follows the primary_reselect rules. This option could only be configured via netlink. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-07-31 16:45:48 -07:00
Daniel Müller	d060a88aa5	Remove Travis specific folding logic The foldable function from the CI helper infrastructure conceptually supports emitting both GitHub and Travis fold markers. However, given that we no longer run anything on Travis, let's remove its special case, as it's effectively dead code. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-07-25 11:45:46 -07:00
Daniel Müller	9340d9b650	Rename travis_fold function to foldable We are no longer using Travis. As such, it is confusing to anyone reading the code to see a function prefixed 'travis_' in GitHub actions code. This change renames the travis_fold function to 'foldable', as a first step towards eliminating such confusing constructs from the repository where possible. Signed-off-by: Daniel Müller <deso@posteo.net>	2022-07-25 11:45:46 -07:00
Andrii Nakryiko	b78c75fcb3	Makefile: remove xsk.c and xsk.h xsk.{c,h} are not part of libbpf anymore, remove them from Makefile. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	f42d136c1c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: bb7a4257892717caf82fe6da45b259b35f73445c Checkpoint bpf-next commit: b0d93b44641a83c28014ca38001e85bf6dc8501e Baseline bpf commit: a2b1a5d40bd12b44322c2ccd40bb0ec1699708b6 Checkpoint bpf commit: d28b25a62a47a8c8aa19bd543863aab6717e68c9 Andrii Nakryiko (14): libbpf: move xsk.{c,h} into selftests/bpf libbpf: remove deprecated low-level APIs libbpf: remove deprecated XDP APIs libbpf: remove deprecated probing APIs libbpf: remove deprecated BTF APIs libbpf: clean up perfbuf APIs libbpf: remove prog_info_linear APIs libbpf: remove most other deprecated high-level APIs libbpf: remove multi-instance and custom private data APIs libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x libbpf: remove internal multi-instance prog support libbpf: clean up SEC() handling libbpf: enforce strict libbpf 1.0 behaviors libbpf: fix up few libbpf.map problems Daniel Müller (1): bpf: Merge "types_are_compat" logic into relo_core.c Stanislav Fomichev (4): bpf: per-cgroup lsm flavor tools/bpf: Sync btf_ids.h to tools libbpf: add lsm_cgoup_sock type libbpf: implement bpf_prog_query_opts include/uapi/linux/bpf.h \| 4 + src/bpf.c \| 200 +---- src/bpf.h \| 98 +-- src/btf.c \| 183 +---- src/btf.h \| 86 +-- src/btf_dump.c \| 23 +- src/libbpf.c \| 1500 ++++---------------------------------- src/libbpf.h \| 469 +----------- src/libbpf.map \| 114 +-- src/libbpf_common.h \| 16 +- src/libbpf_internal.h \| 24 +- src/libbpf_legacy.h \| 28 +- src/libbpf_probes.c \| 125 +--- src/netlink.c \| 62 +- src/relo_core.c \| 80 ++ src/relo_core.h \| 2 + src/xsk.c \| 1260 -------------------------------- src/xsk.h \| 336 --------- 18 files changed, 339 insertions(+), 4271 deletions(-) delete mode 100644 src/xsk.c delete mode 100644 src/xsk.h -- 2.30.2	2022-07-03 20:23:34 -07:00
Stanislav Fomichev	812a95fdf7	libbpf: implement bpf_prog_query_opts Implement bpf_prog_query_opts as a more extensible version of bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as well: * prog_attach_flags is a per-program attach_type; relevant only for lsm cgroup program which might have different attach_flags per attach_btf_id * attach_btf_func_id is a new field exposed for prog_query which specifies real btf function id for lsm cgroup attachments Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-10-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Stanislav Fomichev	f9f7f2d30a	libbpf: add lsm_cgoup_sock type lsm_cgroup/ is the prefix for BPF_LSM_CGROUP. Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Stanislav Fomichev	25ba007681	tools/bpf: Sync btf_ids.h to tools Has been slowly getting out of sync, let's update it. resolve_btfids usage has been updated to match the header changes. Also bring new parts of tools/include/uapi/linux/bpf.h. Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-8-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Stanislav Fomichev	9bdb296ec6	bpf: per-cgroup lsm flavor Allow attaching to lsm hooks in the cgroup context. Attaching to per-cgroup LSM works exactly like attaching to other per-cgroup hooks. New BPF_LSM_CGROUP is added to trigger new mode; the actual lsm hook we attach to is signaled via existing attach_btf_id. For the hooks that have 'struct socket' or 'struct sock' as their first argument, we use the cgroup associated with that socket. For the rest, we use 'current' cgroup (this is all on default hierarchy == v2 only). Note that for some hooks that work on 'struct sock' we still take the cgroup from 'current' because some of them work on the socket that hasn't been properly initialized yet. Behind the scenes, we allocate a shim program that is attached to the trampoline and runs cgroup effective BPF programs array. This shim has some rudimentary ref counting and can be shared between several programs attaching to the same lsm hook from different cgroups. Note that this patch bloats cgroup size because we add 211 cgroup_bpf_attach_type(s) for simplicity's sake. This will be addressed in the subsequent patch. Also note that we only add non-sleepable flavor for now. To enable sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu, shim programs have to be freed via trace rcu, cgroup_bpf.effective should be also trace-rcu-managed + maybe some other changes that I'm not aware of. Reviewed-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220628174314.1216643-4-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	f009af7889	libbpf: fix up few libbpf.map problems Seems like we forgot to add 2 APIs to libbpf.map and another API was misspelled. Fix it in libbpf.map. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-16-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	62e8af46d2	libbpf: enforce strict libbpf 1.0 behaviors Remove support for legacy features and behaviors that previously had to be disabled by calling libbpf_set_strict_mode(): - legacy BPF map definitions are not supported now; - RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see libbpf_set_memlock_rlim()); - program name is used for program pinning (instead of section name); - cleaned up error returning logic; - entry BPF programs should have SEC() always. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	fcd1b668c6	libbpf: clean up SEC() handling Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap} sections. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	0eb12dca7e	libbpf: remove internal multi-instance prog support Clean up internals that had to deal with the possibility of multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all this is not necessary now and can be simplified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	fedeba74b7	libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x Keep the LIBBPF_DEPRECATED_SINCE macro "framework" for future deprecations, but clean up 0.x related helper macros. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	bf51e3c336	libbpf: remove multi-instance and custom private data APIs Remove all the public APIs that are related to creating multi-instance bpf_programs through custom preprocessing callback and generally working with them. Also remove all the bpf_{object,map,program}__[set_]priv() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	d8454ba8ad	libbpf: remove most other deprecated high-level APIs Remove a bunch of high-level bpf_object/bpf_map/bpf_program related APIs. All the APIs related to private per-object/map/prog state, program preprocessing callback, and generally everything multi-instance related is removed in a separate patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	ec3bbc05c0	libbpf: remove prog_info_linear APIs Remove prog_info_linear-related APIs previously used by perf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	d32e7ea952	libbpf: clean up perfbuf APIs Remove deprecated perfbuf APIs and clean up opts structs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	6abeb4203d	libbpf: remove deprecated BTF APIs Get rid of deprecated BTF-related APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	e28a540c59	libbpf: remove deprecated probing APIs Get rid of deprecated feature-probing APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	e8802d6319	libbpf: remove deprecated XDP APIs Get rid of deprecated bpf_set_link() and bpf_get_link() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	9476dce6fe	libbpf: remove deprecated low-level APIs Drop low-level APIs as well as high-level (and very confusingly named) BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Andrii Nakryiko	8ee1202ff4	libbpf: move xsk.{c,h} into selftests/bpf Remove deprecated xsk APIs from libbpf. But given we have selftests relying on this, move those files (with minimal adjustments to make them compilable) under selftests/bpf. We also remove all the removed APIs from libbpf.map, while overall keeping version inheritance chain, as most APIs are backwards compatible so there is no need to reassign them as LIBBPF_1.0.0 versions. Cc: Magnus Karlsson <magnus.karlsson@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220627211527.2245459-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-07-03 20:23:34 -07:00
Daniel Müller	7013b92fef	bpf: Merge "types_are_compat" logic into relo_core.c BPF type compatibility checks (bpf_core_types_are_compat()) are currently duplicated between kernel and user space. That's a historical artifact more than intentional doing and can lead to subtle bugs where one implementation is adjusted but another is forgotten. That happened with the enum64 work, for example, where the libbpf side was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation support")) to use the btf_kind_core_compat() helper function but the kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64 support")). This patch addresses both the duplication issue, by merging both implementations and moving them into relo_core.c, and fixes the alluded to kind check (by giving preference to libbpf's already adjusted logic). For discussion of the topic, please refer to: https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665 Changelog: v1 -> v2: - limited libbpf recursion limit to 32 - changed name to __bpf_core_types_are_compat - included warning previously present in libbpf version - merged kernel and user space changes into a single patch Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net	2022-07-03 20:23:34 -07:00
Daniel Müller	20f0330235	Remove unused .travis.yml configuration Checking earlier pull requests, to the best of my understanding nothing is using Travis anymore -- all CI checks are GitHub Actions based. Further checking the Travis repository [0] the last CI run there was 2 years ago. Hence, let's remove stale configuration for Travis, as it's seemingly only bitrotting and causing confusion. [0]: https://travis-ci.org/github/libbpf/libbpf/builds Signed-off-by: Daniel Müller <deso@posteo.net>	2022-06-28 18:26:00 -07:00
Andrii Nakryiko	29869d6ef0	ci: disable attach_probe test on 5.5 It's assuming kprobe w/ sleepable flag is loadable, which is failing on 5.5 kernel. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-24 13:32:31 -07:00
Andrii Nakryiko	72dbaf2ac3	ci: update vmlinux.h for 5.5 and 4.9 kernels Update vmlinux.h to fix selftests build on 5.5 and 4.9 kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-24 13:32:31 -07:00
Andrii Nakryiko	bc3673cdd5	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3e6fe5ce4d4860c3a111c246fddc6f31492f4fb0 Checkpoint bpf-next commit: bb7a4257892717caf82fe6da45b259b35f73445c Baseline bpf commit: 5e0b0a4c52d30bb09659446f40b77a692361600d Checkpoint bpf commit: a2b1a5d40bd12b44322c2ccd40bb0ec1699708b6 Delyan Kratunov (1): libbpf: add support for sleepable uprobe programs Maxim Mikityanskiy (2): bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie bpf: Add helpers to issue and check SYN cookies in XDP include/uapi/linux/bpf.h \| 88 ++++++++++++++++++++++++++++++++++++++-- src/libbpf.c \| 5 ++- 2 files changed, 88 insertions(+), 5 deletions(-) -- 2.30.2	2022-06-24 13:32:31 -07:00
Andrii Nakryiko	78909b8caf	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-06-24 13:32:31 -07:00
Maxim Mikityanskiy	ec718073b0	bpf: Add helpers to issue and check SYN cookies in XDP The new helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} allow an XDP program to generate SYN cookies in response to TCP SYN packets and to check those cookies upon receiving the first ACK packet (the final packet of the TCP handshake). Unlike bpf_tcp_{gen,check}_syncookie these new helpers don't need a listening socket on the local machine, which allows using them together with synproxy to accelerate SYN cookie generation. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-4-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-24 13:32:31 -07:00
Maxim Mikityanskiy	9c73b6d422	bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie bpf_tcp_gen_syncookie expects the full length of the TCP header (with all options), and bpf_tcp_check_syncookie accepts lengths bigger than sizeof(struct tcphdr). Fix the documentation that says these lengths should be exactly sizeof(struct tcphdr). While at it, fix a typo in the name of struct ipv6hdr. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20220615134847.3753567-2-maximmi@nvidia.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-24 13:32:31 -07:00
Delyan Kratunov	0c84902331	libbpf: add support for sleepable uprobe programs Add section mappings for u(ret)probe.s programs. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Delyan Kratunov <delyank@fb.com> Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-24 13:32:31 -07:00
Roberto Sassu	4cb682229d	configs: Enable CONFIG_MODULE_SIG Enable CONFIG_MODULE_SIG to test the new helper bpf_verify_pkcs7_signature(). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>	2022-06-17 22:05:28 -07:00
Eyal Birger	0304a3c027	ci: enable vrf configs for x86_64 Set CONFIG_NET_L3_MASTER_DEV=y, CONFIG_NET_VRF=y for x86_64. These options are needed for performing LWT BPF tests in test_progs. Signed-off-by: Eyal Birger <eyal.birger@gmail.com>	2022-06-17 09:58:20 -07:00
Mykola Lysenko	a459010926	ci: temporarily disable varlen test	2022-06-17 09:47:40 -07:00
Andrii Nakryiko	e5ff285a44	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: fe92833524e368e59bba9c57e00f7359f133667f Checkpoint bpf-next commit: 3e6fe5ce4d4860c3a111c246fddc6f31492f4fb0 Baseline bpf commit: 825464e79db4aac936e0fdae62cdfb7546d0028f Checkpoint bpf commit: 5e0b0a4c52d30bb09659446f40b77a692361600d Andrii Nakryiko (1): libbpf: Fix internal USDT address translation logic for shared libraries Yonghong Song (1): libbpf: Fix an unsigned < 0 bug src/libbpf.c \| 2 +- src/usdt.c \| 123 ++++++++++++++++++++++++++------------------------- 2 files changed, 64 insertions(+), 61 deletions(-) -- 2.30.2	2022-06-16 16:58:52 -07:00
Andrii Nakryiko	2d91c46d1a	libbpf: Fix internal USDT address translation logic for shared libraries Perform the same virtual address to file offset translation that libbpf is doing for executable ELF binaries also for shared libraries. Currently libbpf is making a simplifying and sometimes wrong assumption that for shared libraries relative virtual addresses inside ELF are always equal to file offsets. Unfortunately, this is not always the case with LLVM's lld linker, which now by default generates a considerably more complicated ELF segment layout. E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from readelf output listing ELF segments (a.k.a. program headers): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R 0x8 LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R 0x1000 LOAD 0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000 LOAD 0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW 0x1000 LOAD 0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW 0x1000 Compare that to what is generated by GNU ld (or LLVM's lld with an extra -znoseparate-code argument which disables this cleverness in the name of file size reduction): Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R 0x1000 LOAD 0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000 LOAD 0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R 0x1000 LOAD 0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW 0x1000 You can see from the first example above that for the executable (Flg == "R E") PT_LOAD segment (LOAD #2), the Offset column doesn't match the VirtAddr column. And it does in the second case (GNU ld output).
This is important because all the addresses, including USDT specs, operate in a virtual address space, while the kernel is expecting file offsets when performing uprobe attach. So such mismatches have to be properly taken care of and compensated by libbpf, which is what this patch is fixing. The patch also clarifies a few function and variable names, as well as updates comments to reflect this important distinction (virtaddr vs file offset) and to emphasize that shared libraries are not all that different from executables in this regard. This patch also changes selftests/bpf Makefile to force urand_read and liburand_read.so to be built with Clang and LLVM's lld (and explicitly request this ELF file size optimization through -znoseparate-code linker parameter) to validate libbpf logic and ensure regressions don't happen in the future. I've bundled these selftests changes together with libbpf changes to keep the above description tied with both libbpf and selftests changes. Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org	2022-06-16 16:58:52 -07:00
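The virtual address to file offset translation described in the commit above can be sketched as follows. This is only an illustration, not libbpf's actual code: `struct load_segment` and `vaddr_to_file_offset()` are hypothetical names, and the two segment values are copied from the executable ("R E") LOAD rows of the readelf excerpts in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF PT_LOAD program header. */
struct load_segment {
	uint64_t offset; /* file offset of the segment (readelf "Offset") */
	uint64_t vaddr;  /* virtual address of the segment ("VirtAddr") */
	uint64_t memsz;  /* size of the segment in memory ("MemSiz") */
};

/* Executable segments taken from the two readelf excerpts above. */
static const struct load_segment lld_text = { 0x0005f0, 0x15f0, 0x000160 };
static const struct load_segment gnu_text = { 0x001000, 0x1000, 0x000131 };

/*
 * Translate a virtual address into a file offset using the segment that
 * contains it. Returns 0 on success, -1 if vaddr lies outside the segment.
 */
static int vaddr_to_file_offset(const struct load_segment *seg,
				uint64_t vaddr, uint64_t *file_off)
{
	assert(seg != NULL && file_off != NULL);

	if (vaddr < seg->vaddr || vaddr >= seg->vaddr + seg->memsz)
		return -1;

	/* Distance from the segment start, rebased onto the file offset. */
	*file_off = vaddr - seg->vaddr + seg->offset;
	return 0;
}
```

With the lld layout, virtual address 0x1600 maps to file offset 0x600, whereas with the GNU ld layout virtual addresses and file offsets coincide, which is why assuming they are always equal works only some of the time.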
Yonghong Song	d3e41fc1aa	libbpf: Fix an unsigned < 0 bug Andrii reported a bug with the following information: 2859 if (enum64_placeholder_id == 0) { 2860 enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0); >>> CID 394804: Control flow issues (NO_EFFECT) >>> This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U". 2861 if (enum64_placeholder_id < 0) 2862 return enum64_placeholder_id; 2863 ... Here enum64_placeholder_id is declared as '__u32', so enum64_placeholder_id < 0 is always false. Declare enum64_placeholder_id as 'int' in order to capture the potential error properly. Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com	2022-06-16 16:58:52 -07:00
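The class of bug fixed by the commit above can be reproduced in isolation. The sketch below is an assumption-laden illustration: `add_type_or_fail()` is a hypothetical stand-in for an API that returns a positive ID or a negative errno, and it does not call libbpf.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in: returns a positive type ID or a negative errno. */
static int add_type_or_fail(int should_fail)
{
	return should_fail ? -ENOMEM : 7;
}

/* Buggy pattern: the result is stored in an unsigned variable. */
static int check_with_unsigned(int should_fail)
{
	uint32_t id = add_type_or_fail(should_fail);

	if (id < 0)	/* never true for an unsigned value */
		return -1;
	return 0;	/* the error is silently treated as success */
}

/* Fixed pattern: a signed variable preserves the negative error code. */
static int check_with_signed(int should_fail)
{
	int id = add_type_or_fail(should_fail);

	if (id < 0)
		return -1;
	return 0;
}
```

A compiler may warn about the comparison in `check_with_unsigned()`; that warning is the same diagnosis the static analyzer reported.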
Andrii Nakryiko	645500dd7d	ci: blacklist mptcp test on s390x It is also blacklisted in kernel-patches CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-10 14:13:02 -07:00
Andrii Nakryiko	5497411f48	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 02f4afebf8a54ba16f99f4f6ca10df3efeac6229 Checkpoint bpf-next commit: fe92833524e368e59bba9c57e00f7359f133667f Baseline bpf commit: d08af2c46881b62f4efad8ebb7eae381fa1f1033 Checkpoint bpf commit: 825464e79db4aac936e0fdae62cdfb7546d0028f Andrii Nakryiko (1): libbpf: Fix uprobe symbol file offset calculation logic Yonghong Song (10): bpf: Add btf enum64 support libbpf: Permit 64bit relocation value libbpf: Fix an error in 64bit relocation value computation libbpf: Refactor btf__add_enum() for future code sharing libbpf: Add enum64 parsing and new enum64 public API libbpf: Add enum64 deduplication support libbpf: Add enum64 support for btf_dump libbpf: Add enum64 sanitization libbpf: Add enum64 support for bpf linking libbpf: Add enum64 relocation support include/uapi/linux/btf.h \| 17 +++- src/btf.c \| 201 +++++++++++++++++++++++++++++++++++---- src/btf.h \| 32 ++++++- src/btf_dump.c \| 137 +++++++++++++++++++------- src/libbpf.c \| 126 ++++++++++++++---------- src/libbpf.map \| 2 + src/libbpf_internal.h \| 2 + src/linker.c \| 2 + src/relo_core.c \| 105 ++++++++++++-------- src/relo_core.h \| 4 +- 10 files changed, 483 insertions(+), 145 deletions(-) -- 2.30.2	2022-06-10 14:13:02 -07:00
Andrii Nakryiko	74b22b6c8a	libbpf: Fix uprobe symbol file offset calculation logic Fix libbpf's bpf_program__attach_uprobe() logic of determining a function's file offset (which is what the kernel is actually expecting) when attaching uprobe/uretprobe by function name. Previously, the calculation determined the virtual address offset relative to the base load address, which is not always the same as the file offset (though it very frequently is, which is why this went unnoticed for a while). Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Riham Selim <rihams@fb.com> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org	2022-06-10 14:13:02 -07:00
Yonghong Song	416351822c	libbpf: Add enum64 relocation support The enum64 relocation support is added. The bpf local type could be either enum or enum64 and the remote type could be either enum or enum64 too. All combinations of local enum/enum64 and remote enum/enum64 are supported. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	3f9d041e19	libbpf: Add enum64 support for bpf linking Add BTF_KIND_ENUM64 support for bpf linking, which is very similar to BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	a945df2439	libbpf: Add enum64 sanitization When an old kernel does not support enum64 but user space btf contains non-zero enum kflag or enum64, libbpf needs to do proper sanitization so modified btf can be accepted by the kernel. Sanitization for enum kflag can be achieved by clearing the kflag bit. For enum64, the type is replaced with a union of integer member types and the integer member size must be smaller than enum64 size. If such an integer type cannot be found, a new type is created and used for union members. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
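The kflag half of the sanitization described above amounts to clearing one bit in the BTF type's info word. The sketch below assumes the documented BTF layout of that word (vlen in bits 0-15, kind in bits 24-28, kind_flag in bit 31); the helper names are hypothetical and the enum64-to-union rewrite is not shown.

```c
#include <stdint.h>

#define KFLAG_BIT (UINT32_C(1) << 31)

/* Pack kind, member count and kflag into a BTF-style info word. */
static uint32_t make_info(uint32_t kind, uint32_t vlen, int kflag)
{
	return ((kind & 0x1f) << 24) | (vlen & 0xffff) |
	       (kflag ? KFLAG_BIT : 0);
}

static uint32_t info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

static uint32_t info_vlen(uint32_t info)
{
	return info & 0xffff;
}

static int info_kflag(uint32_t info)
{
	return (info >> 31) & 1;
}

/* Sanitize for an old kernel: drop the signedness flag, keep the rest. */
static uint32_t clear_kflag(uint32_t info)
{
	return info & ~KFLAG_BIT;
}
```

Clearing the bit leaves the kind and the number of enumerators untouched, so an older kernel sees an ordinary enum.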
Yonghong Song	f429a582bf	libbpf: Add enum64 support for btf_dump Add enum64 btf dumping support. For long long and unsigned long long dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors in some cases. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
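A minimal sketch of the suffixing idea from the commit above, assuming a hypothetical helper `format_enum64_value()`; the real btf_dump code is organized differently, and only the 'LL'/'ULL' suffixes are taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit enumerator value as a C literal, appending 'LL' for
 * signed values and 'ULL' for unsigned ones. Returns what snprintf returns.
 */
static int format_enum64_value(char *buf, size_t len, uint64_t value,
			       int is_signed)
{
	assert(buf != NULL && len > 0);

	if (is_signed)
		return snprintf(buf, len, "%lldLL", (long long)value);
	return snprintf(buf, len, "%lluULL", (unsigned long long)value);
}
```

For example, an all-ones value dumped as signed becomes `-1LL`, while `0xfffffULL << 32` dumped as unsigned becomes `4503595332403200ULL`.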
Yonghong Song	25238de149	libbpf: Add enum64 deduplication support Add enum64 deduplication support. BTF_KIND_ENUM64 handling is very similar to BTF_KIND_ENUM. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062626.3720166-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	c3f8eecb16	libbpf: Add enum64 parsing and new enum64 public API Add enum64 parsing support and two new enum64 public APIs: btf__add_enum64 btf__add_enum64_value Also add support of signedness for BTF_KIND_ENUM. The BTF_KIND_ENUM API signatures are not changed. The signedness will be changed from unsigned to signed if btf__add_enum_value() finds any negative values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	25fd7a1cf5	libbpf: Refactor btf__add_enum() for future code sharing Refactor btf__add_enum() function to create a separate function btf_add_enum_common() so later the common function can be used to add enum64 btf type. There is no functionality change for this patch. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	0167a88355	libbpf: Fix an error in 64bit relocation value computation Currently, the 64bit relocation value in the instruction is computed as follows: __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32) Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1. With the above computation, insn[0].imm will first sign-extend to 64bit -1 (0xffffffffFFFFFFFF) and then have 0x100000000 added to it, producing incorrect value 0xFFFFFFFF. The correct value should be 0x1FFFFFFFF. Changing insn[0].imm to __u32 first will prevent 64bit sign extension and fix the issue. Merging high and low 32bit values also changed from '+' to '\|' to be consistent with other similar occurrences in kernel and libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
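The sign-extension problem described above can be reproduced with two small functions. This is a standalone sketch: `imm64_buggy()` and `imm64_fixed()` are hypothetical names, and the two `int32_t` parameters stand in for `insn[0].imm` and `insn[1].imm`.

```c
#include <stdint.h>

/* Old computation: the signed low half is sign-extended before the add. */
static uint64_t imm64_buggy(int32_t lo, int32_t hi)
{
	return lo + ((uint64_t)hi << 32);
}

/* Fixed computation: the low half is made unsigned first, then OR-ed in. */
static uint64_t imm64_fixed(int32_t lo, int32_t hi)
{
	return (uint32_t)lo | ((uint64_t)hi << 32);
}
```

With `lo = -1` and `hi = 1`, the old form yields 0xFFFFFFFF and the fixed form yields 0x1FFFFFFFF; for a non-negative low half both forms agree, which helps explain how the bug could stay hidden.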
Yonghong Song	23e3d8cf31	libbpf: Permit 64bit relocation value Currently, libbpf limits the relocation value to 32 bits since all current relocations have such a limit. But with BTF_KIND_ENUM64 support, the enum value could be 64bit. So let us permit 64bit relocation value in libbpf. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Yonghong Song	9a976c6b98	bpf: Add btf enum64 support Currently, BTF only supports up to 32bit enum values with BTF_KIND_ENUM. But in the kernel, some enums do have 64bit values, e.g., in uapi bpf.h, we have enum { BPF_F_INDEX_MASK = 0xffffffffULL, BPF_F_CURRENT_CPU = BPF_F_INDEX_MASK, BPF_F_CTXLEN_MASK = (0xfffffULL << 32), }; In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK as 0, which certainly is incorrect. This patch added a new btf kind, BTF_KIND_ENUM64, which permits 64bit value to cover the above use case. The BTF_KIND_ENUM64 has the following three fields followed by the common type: struct btf_enum64 { __u32 name_off; __u32 val_lo32; __u32 val_hi32; }; Currently, btf type section has an alignment of 4 as all element types are u32. Representing the value with __u64 will introduce a pad for btf_enum64 and may also introduce misalignment for the 64bit value. Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues. The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate whether the value is signed or unsigned. The kflag intends to provide consistent output of BTF C format with the original source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff. The format C has two choices, printing out 0xffffffff or -1 and current libbpf prints out as unsigned value. But if the signedness is preserved in btf, the value can be printed the same as the original source code. The kflag value 0 means unsigned values, which is consistent with the default by libbpf and should also cover most cases as well. The new BTF_KIND_ENUM64 is intended to support the enum value represented as 64bit value. But it can represent all BTF_KIND_ENUM values as well. The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has to be represented with 64 bits. In addition, a static inline function btf_kind_core_compat() is introduced which will be used later when libbpf relo_core.c changed.
Here the kernel shares the same relo_core.c with libbpf. [1] https://reviews.llvm.org/D124641 Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-10 14:13:02 -07:00
Andrii Nakryiko	e93b1010f3	ci: disable unpriv_bpf_disabled test on s390x Seems like it's relying on fentry which is not supported on s390x. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	76fc1ad6d5	ci: make sure to not override CFLAGS Use EXTRA_CFLAGS instead of overriding CFLAGS. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	33c5f2bec3	libbpf: bump Makefile version to 1.0.0 to match libbpf.map We are now in v1.0 dev cycle. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	d4998cbb6c	ci: update Kconfigs to make all selftests working Also disable fexit_stress which is using test_run's support for TRACING progs now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	eb1d1ad83f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ac6a65868a5a45db49d5ee8524df3b701110d844 Checkpoint bpf-next commit: 02f4afebf8a54ba16f99f4f6ca10df3efeac6229 Baseline bpf commit: f3f19f939c11925dadd3f4776f99f8c278a7017b Checkpoint bpf commit: d08af2c46881b62f4efad8ebb7eae381fa1f1033 Andrii Nakryiko (2): libbpf: start 1.0 development cycle libbpf: remove bpf_create_map*() APIs Daniel Müller (5): libbpf: Introduce libbpf_bpf_prog_type_str libbpf: Introduce libbpf_bpf_map_type_str libbpf: Introduce libbpf_bpf_attach_type_str libbpf: Introduce libbpf_bpf_link_type_str libbpf: Fix a couple of typos Douglas Raillard (1): libbpf: Fix determine_ptr_size() guessing Eric Dumazet (1): net: add IFLA_TSO_{MAX_SIZE\|SEGS} attributes Geliang Tang (1): bpf: Add bpf_skc_to_mptcp_sock_proto Joanne Koong (5): bpf: Add verifier support for dynptrs bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Add dynptr data slices Julia Lawall (1): libbpf: Fix typo in comment Yuze Chi (1): libbpf: Fix is_pow_of_2 include/uapi/linux/bpf.h \| 90 +++++++++++++++++++ include/uapi/linux/if_link.h \| 2 + src/bpf.c \| 80 ----------------- src/bpf.h \| 42 --------- src/btf.c \| 28 ++++-- src/libbpf.c \| 167 +++++++++++++++++++++++++++++++++-- src/libbpf.h \| 38 +++++++- src/libbpf.map \| 10 +++ src/libbpf_internal.h \| 5 ++ src/libbpf_version.h \| 4 +- src/linker.c \| 5 -- src/relo_core.c \| 8 +- 12 files changed, 332 insertions(+), 147 deletions(-) -- 2.30.2	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	8aa946389d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-06-07 17:39:28 -07:00
Yuze Chi	ad0783c430	libbpf: Fix is_pow_of_2 Move the correct definition from linker.c into libbpf_internal.h. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Yuze Chi <chiyuze@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com	2022-06-07 17:39:28 -07:00
Daniel Müller	55638904af	libbpf: Fix a couple of typos This change fixes a couple of typos that were encountered while studying the source code. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220601154025.3295035-1-deso@posteo.net	2022-06-07 17:39:28 -07:00
Douglas Raillard	a5d75daa8c	libbpf: Fix determine_ptr_size() guessing One strategy employed by libbpf to guess the pointer size is by finding the size of "unsigned long" type. This is achieved by looking for a type with the expected name and checking its size. Unfortunately, the C syntax is friendlier to humans than to computers as there is some variety in how such a type can be named. Specifically, gcc and clang do not use the same names for integer types in debug info: - clang uses "unsigned long" - gcc uses "long unsigned int" Look up all the names for such a type so that libbpf can hope to find the information it wants. Signed-off-by: Douglas Raillard <douglas.raillard@arm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com	2022-06-07 17:39:28 -07:00
Daniel Müller	37218f49fa	libbpf: Introduce libbpf_bpf_link_type_str This change introduces a new function, libbpf_bpf_link_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_link_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net	2022-06-07 17:39:28 -07:00
Daniel Müller	bdbce77631	libbpf: Introduce libbpf_bpf_attach_type_str This change introduces a new function, libbpf_bpf_attach_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_attach_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net	2022-06-07 17:39:28 -07:00
Daniel Müller	242c116f04	libbpf: Introduce libbpf_bpf_map_type_str This change introduces a new function, libbpf_bpf_map_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_map_type enum variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net	2022-06-07 17:39:28 -07:00
Daniel Müller	4d9cd51e7e	libbpf: Introduce libbpf_bpf_prog_type_str This change introduces a new function, libbpf_bpf_prog_type_str, to the public libbpf API. The function allows users to get a string representation for a bpf_prog_type variant. Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net	2022-06-07 17:39:28 -07:00
Joanne Koong	f035838503	bpf: Add dynptr data slices This patch adds a new helper function void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len); which returns a pointer to the underlying data of a dynptr. len must be a statically known value. The bpf program may access the returned data slice as a normal buffer (eg can do direct reads and writes), since the verifier associates the length with the returned pointer, and enforces that no out of bounds accesses occur. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com	2022-06-07 17:39:28 -07:00
Joanne Koong	7ed5bf8f4c	bpf: Add bpf_dynptr_read and bpf_dynptr_write This patch adds two helper functions, bpf_dynptr_read and bpf_dynptr_write: long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset); long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len); The dynptrs passed into these functions must be valid dynptrs that have been initialized. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-5-joannelkoong@gmail.com	2022-06-07 17:39:28 -07:00
Joanne Koong	1a0f5d1c87	bpf: Dynptr support for ring buffers Currently, our only way of writing dynamically-sized data into a ring buffer is through bpf_ringbuf_output but this incurs an extra memcpy cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra memcpy, but it can only safely support reservation sizes that are statically known since the verifier cannot guarantee that the bpf program won’t access memory outside the reserved space. The bpf_dynptr abstraction allows for dynamically-sized ring buffer reservations without the extra memcpy. There are 3 new APIs: long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr); void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags); void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags); These closely follow the functionalities of the original ringbuf APIs. For example, all ringbuffer dynptrs that have been reserved must be either submitted or discarded before the program exits. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com	2022-06-07 17:39:28 -07:00
Joanne Koong	c68a2738fd	bpf: Add bpf_dynptr_from_mem for local dynptrs This patch adds a new api bpf_dynptr_from_mem: long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr); which initializes a dynptr to point to a bpf program's local memory. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com	2022-06-07 17:39:28 -07:00
Joanne Koong	97009215cb	bpf: Add verifier support for dynptrs This patch adds the bulk of the verifier work for supporting dynamic pointers (dynptrs) in bpf. A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure defined internally as: struct bpf_dynptr_kern { void *data; u32 size; u32 offset; } __aligned(8); The upper 8 bits of *size* is reserved (it contains extra metadata about read-only status and dynptr type). Consequently, a dynptr only supports memory less than 16 MB. There are different types of dynptrs (eg malloc, ringbuf, ...). In this patchset, the most basic one, dynptrs to a bpf program's local memory, is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE is supported. In the verifier, dynptr state information will be tracked in stack slots. When the program passes in an uninitialized dynptr (ARG_PTR_TO_DYNPTR \| MEM_UNINIT), the stack slots corresponding to the frame pointer where the dynptr resides at are marked STACK_DYNPTR. For helper functions that take in initialized dynptrs (eg bpf_dynptr_read + bpf_dynptr_write which are added later in this patchset), the verifier enforces that the dynptr has been initialized properly by checking that their corresponding stack slots have been marked as STACK_DYNPTR. The 6th patch in this patchset adds test cases that the verifier should successfully reject, such as for example attempting to use a dynptr after doing a direct write into it inside the bpf program. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com	2022-06-07 17:39:28 -07:00
Julia Lawall	4c39a3e1aa	libbpf: Fix typo in comment Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Daniel Müller <deso@posteo.net> Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr	2022-06-07 17:39:28 -07:00
Geliang Tang	cb11988cf4	bpf: Add bpf_skc_to_mptcp_sock_proto This patch implements a new struct bpf_func_proto, named bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP, and a new helper bpf_skc_to_mptcp_sock(), which invokes another new helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct mptcp_sock from a given subflow socket. v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c, remove build check for CONFIG_BPF_JIT v5: Drop EXPORT_SYMBOL (Martin) Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	7e8d4234ac	libbpf: remove bpf_create_map*() APIs To test API removal, get rid of bpf_create_map*() APIs. Perf defines __weak implementation of bpf_map_create() that redirects to old bpf_create_map() and that seems to compile and run fine. Cc: Jiri Olsa <jolsa@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220518185915.3529475-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 17:39:28 -07:00
Andrii Nakryiko	00f40c01fb	libbpf: start 1.0 development cycle Start libbpf 1.0 development cycle by adding LIBBPF_1.0.0 section to libbpf.map file and marking all current symbols as local. As we remove all the deprecated APIs we'll populate global list before the final 1.0 release. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220518185915.3529475-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-06-07 17:39:28 -07:00
Eric Dumazet	881eba7ef5	net: add IFLA_TSO_{MAX_SIZE\|SEGS} attributes New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS are used to report to user-space the device TSO limits. ip -d link sh dev eth1 ... tso_max_size 65536 tso_max_segs 65535 Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-06-07 17:39:28 -07:00
wangjie	4eb6485c08	Makefile: add support for cross compilation Support CROSS_COMPILE and EXTRA_CFLAGS/EXTRA_LDFLAGS environments, to make cross compiling more flexible. Signed-off-by: Jie Wang <wangjie22@lixiang.com>	2022-05-24 23:24:54 -07:00
Ilya Leoshkevich	eaf9123419	vmtest: add netfilter to s390x config This is required for the new synproxy test. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-05-23 17:39:33 -07:00
Ilya Leoshkevich	cc904c1a74	vmtest: keep coreutils Kernel's vmtest.sh uses stdbuf, which is unfortunately not present in busybox. Do not delete coreutils, which has it. As a result, the compressed image grows by 1M (~5%). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-05-23 17:39:33 -07:00
Ilya Leoshkevich	f3b96c873d	vmtest: add iptables iptables is required by the new selftests for raw syncookie helpers. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-05-23 17:39:33 -07:00
Maxim Mikityanskiy	47595c2f08	ci: blacklist xdp_syncookie on s390x The xdp_syncookie test uses kfunc, and BPF JIT doesn't support kfunc on s390x. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>	2022-05-20 17:14:41 -07:00
Andrii Nakryiko	86eb09863c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b2531d4bdce19f28364b45aac9132e153b1f23a4 Checkpoint bpf-next commit: ac6a65868a5a45db49d5ee8524df3b701110d844 Baseline bpf commit: f3f19f939c11925dadd3f4776f99f8c278a7017b Checkpoint bpf commit: f3f19f939c11925dadd3f4776f99f8c278a7017b Andrii Nakryiko (1): libbpf: fix memory leak in attach_tp for target-less tracepoint program src/libbpf.c \| 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) -- 2.30.2	2022-05-16 13:46:05 -07:00
Andrii Nakryiko	d43fc5a42f	libbpf: fix memory leak in attach_tp for target-less tracepoint program Fix sec_name memory leak if user defines target-less SEC("tp"). Fixes: 9af8efc45eb1 ("libbpf: Allow "incomplete" basic tracing SEC() definitions") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20220516184547.3204674-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-16 13:46:05 -07:00
Andrii Nakryiko	12e932ac0e	ci: whitelist 'usdt' test on 5.5 and update vmlinux.h Update vmlinux.h for latest selftests. Also whitelist usdt test on 5.5, as it should work. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	75452cd290	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: d54d06a4c4bc5d76815d02e4b041b31d9dbb3fef Checkpoint bpf-next commit: b2531d4bdce19f28364b45aac9132e153b1f23a4 Baseline bpf commit: ba3beec2ec1d3b4fd8672ca6e781dac4b3267f6e Checkpoint bpf commit: f3f19f939c11925dadd3f4776f99f8c278a7017b Andrii Nakryiko (12): libbpf: Allow "incomplete" basic tracing SEC() definitions libbpf: Support target-less SEC() definitions for BTF-backed programs libbpf: Append "..." in fixed up log if CO-RE spec is truncated libbpf: Use libbpf_mem_ensure() when allocating new map libbpf: Allow to opt-out from creating BPF maps libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr libbpf: Improve usability of field-based CO-RE helpers libbpf: Complete field-based CO-RE helpers with field offset helper libbpf: Provide barrier() and barrier_var() in bpf_helpers.h libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary libbpf: Clean up ringbuf size adjustment implementation libbpf: Add safer high-level wrappers for map operations Feng Zhou (1): bpf: add bpf_map_lookup_percpu_elem for percpu map Jiri Olsa (1): libbpf: Add bpf_program__set_insns function Kaixi Fan (1): bpf: Add source ip in "struct bpf_tunnel_key" Kui-Feng Lee (3): bpf, x86: Generate trampolines from bpf_tramp_links bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm. libbpf: Assign cookies to links in libbpf. include/uapi/linux/bpf.h \| 23 ++ src/bpf.c \| 22 ++ src/bpf.h \| 4 + src/bpf_core_read.h \| 37 ++- src/bpf_helpers.h \| 29 ++- src/libbpf.c \| 473 ++++++++++++++++++++++++++++++++------- src/libbpf.h \| 156 +++++++++++++ src/libbpf.map \| 12 +- 8 files changed, 659 insertions(+), 97 deletions(-) -- 2.30.2	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	ae67bfbae3	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	650adc5118	libbpf: Add safer high-level wrappers for map operations Add high-level API wrappers for most common and typical BPF map operations that work directly on instances of struct bpf_map * (so you don't have to call bpf_map__fd()) and validate key/value size expectations. These helpers require users to specify key (and value, where appropriate) sizes when performing lookup/update/delete/etc. This forces user to actually think and validate (for themselves) those. This is a good thing as user is expected by kernel to implicitly provide correct key/value buffer sizes and kernel will just read/write necessary amount of data. If it so happens that user doesn't set up buffers correctly (which bit people for per-CPU maps especially) kernel either randomly overwrites stack data or returns -EFAULT, depending on user's luck and circumstances. These high-level APIs are meant to prevent such unpleasant and hard to debug bugs. This patch also adds bpf_map_delete_elem_flags() low-level API and requires passing flags to bpf_map__delete_elem() API for consistency across all similar APIs, even though currently kernel doesn't expect any extra flags for BPF_MAP_DELETE_ELEM operation. List of map operations that get these high-level APIs: - bpf_map_lookup_elem; - bpf_map_update_elem; - bpf_map_delete_elem; - bpf_map_lookup_and_delete_elem; - bpf_map_get_next_key. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220512220713.2617964-1-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Feng Zhou	babc92b9f1	bpf: add bpf_map_lookup_percpu_elem for percpu map Add new eBPF helper bpf_map_lookup_percpu_elem. The implementation is relatively simple: it follows the implementation of map_lookup_elem for percpu maps, adds a cpu parameter, and returns the value for the specified cpu. Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-13 16:13:31 -07:00
Jiri Olsa	e335f3fa5f	libbpf: Add bpf_program__set_insns function Adding bpf_program__set_insns that allows to set new instructions for a BPF program. This is a very advanced libbpf API and users need to know what they are doing. This should be used from prog_prepare_load_fn callback only. We can have changed instructions after calling prog_prepare_load_fn callback, reloading them. One of the users of this new API will be perf's internal BPF prologue generation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	7062757357	libbpf: Clean up ringbuf size adjustment implementation Drop unused iteration variable, move overflow prevention check into the for loop. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Kui-Feng Lee	aec48fffee	libbpf: Assign cookies to links in libbpf. Add a cookie field to the attributes of bpf_link_create(). Add bpf_program__attach_trace_opts() to attach a cookie to a link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com	2022-05-13 16:13:31 -07:00
Kui-Feng Lee	c116ae6130	bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm. Pass a cookie along with BPF_LINK_CREATE requests. Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie. The cookie of a bpf_tracing_link is available by calling bpf_get_attach_cookie when running the BPF program of the attached link. The value of a cookie will be set at bpf_tramp_run_ctx by the trampoline of the link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com	2022-05-13 16:13:31 -07:00
Kui-Feng Lee	99b21d41e3	bpf, x86: Generate trampolines from bpf_tramp_links Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect struct bpf_tramp_link(s) for a trampoline. struct bpf_tramp_link extends bpf_link to act as a linked list node. arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to collect all bpf_tramp_link(s) that a trampoline should call. Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links instead of bpf_tramp_progs. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com	2022-05-13 16:13:31 -07:00
Kaixi Fan	7a443259de	bpf: Add source ip in "struct bpf_tunnel_key" Add tunnel source ip field in "struct bpf_tunnel_key". Add related code to set and get tunnel source field. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	b3197662ba	libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary Kernel imposes a pretty particular restriction on ringbuf map size. It has to be a power-of-2 multiple of page size. While generally this isn't hard for user to satisfy, sometimes it's impossible to do this declaratively in BPF source code or just plain inconvenient to do at runtime. One such example might be BPF libraries that are supposed to work on different architectures, which might not agree on what the common page size is. Let libbpf find the right size for user instead, if it turns out to not satisfy kernel requirements. If user didn't set size at all, that's most probably a mistake so don't upsize such zero size to one full page, though. Also we need to be careful about not overflowing __u32 max_entries. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	486b1a080b	libbpf: Provide barrier() and barrier_var() in bpf_helpers.h Add barrier() and barrier_var() macros into bpf_helpers.h to be used by end users. While a bit advanced and specialized instruments, they are sometimes indispensable. Instead of requiring each user to figure out exact asm volatile incantations for themselves, provide them from bpf_helpers.h. Also remove conflicting definitions from selftests. Some tests rely on barrier_var() definition being nothing, those will still work as libbpf does the #ifndef/#endif guarding for barrier() and barrier_var(), allowing users to redefine them, if necessary. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-8-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	ba9850c048	libbpf: Complete field-based CO-RE helpers with field offset helper Add bpf_core_field_offset() helper to complete field-based CO-RE helpers. This helper can be useful for feature-detection and for some more advanced cases of field reading (e.g., reading flexible array members). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-6-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	5c1d6799df	libbpf: Improve usability of field-based CO-RE helpers Allow to specify field reference in two ways: - if user has variable of necessary type, they can use variable-based reference (my_var.my_field or my_var_ptr->my_field). This was the only supported syntax up till now. - now, bpf_core_field_exists() and bpf_core_field_size() support also specifying field in a fashion similar to offsetof() macro, by specifying type of the containing struct/union separately and field name separately: bpf_core_field_exists(struct my_type, my_field). This form is quite often more convenient in practice and it matches type-based CO-RE helpers that support specifying type by its name without requiring any variables. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-4-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	1f30788b41	libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr It will be annoying and surprising for users of __kptr and __kptr_ref if libbpf silently ignores them just because Clang used for compilation didn't support btf_type_tag(). It's much better to get clear compiler error than debug BPF verifier failures later on. Fixes: ef89654f2bc7 ("libbpf: Add kptr type tag macros to bpf_helpers.h") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-3-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	a8bc578af9	libbpf: Allow to opt-out from creating BPF maps Add bpf_map__set_autocreate() API that allows user to opt-out from libbpf automatically creating BPF map during BPF object load. This is a useful feature when building CO-RE-enabled BPF application that takes advantage of some new-ish BPF map type (e.g., socket-local storage) if kernel supports it, but otherwise uses some alternative way (e.g., extra HASH map). In such case, being able to disable the creation of a map that kernel doesn't support allows to successfully create and load BPF object file with all its other maps and programs. It's still up to user to make sure that no "live" code in any of their BPF programs is referencing such map instance, which can be achieved by guarding such code with CO-RE relocation check or by using .rodata global variables. If user fails to properly guard such code to turn it into "dead code", libbpf will helpfully post-process BPF verifier log and will provide more meaningful error and map name that needs to be guarded properly. As such, instead of: ; value = bpf_map_lookup_elem(&missing_map, &zero); 4: (85) call unknown#2001000000 invalid func unknown#2001000000 ... user will see: ; value = bpf_map_lookup_elem(&missing_map, &zero); 4: <invalid BPF map reference> BPF map 'missing_map' is referenced but wasn't created Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-4-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	d46f1aaa7c	libbpf: Use libbpf_mem_ensure() when allocating new map Reuse libbpf_mem_ensure() when adding a new map to the list of maps inside bpf_object. It takes care of proper resizing and reallocating of map array and zeroing out newly allocated memory. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-3-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	1a18c6f051	libbpf: Append "..." in fixed up log if CO-RE spec is truncated Detect CO-RE spec truncation and append "..." to make user aware that there was supposed to be more of the spec there. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220428041523.4089853-2-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	97ab064bc0	libbpf: Support target-less SEC() definitions for BTF-backed programs Similar to previous patch, support target-less definitions like SEC("fentry"), SEC("freplace"), etc. For such BTF-backed program types it is expected that user will specify BTF target programmatically at runtime using bpf_program__set_attach_target() before load phase. If not, libbpf will report this as an error. Also use SEC_ATTACH_BTF flag instead of explicitly listing a set of types that are expected to require attach_btf_id. This was an accidental omission during custom SEC() support refactoring. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220428185349.3799599-3-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Andrii Nakryiko	eee09dc704	libbpf: Allow "incomplete" basic tracing SEC() definitions In a lot of cases the target of kprobe/kretprobe, tracepoint, raw tracepoint, etc BPF program might not be known at the compilation time and will be discovered at runtime. This was always a supported case by libbpf, with APIs like bpf_program__attach_{kprobe,tracepoint,etc}() accepting full target definition, regardless of what was defined in SEC() definition in BPF source code. Unfortunately, up till now libbpf still enforced users to specify at least something for the fake target, e.g., SEC("kprobe/whatever"), which is cumbersome and somewhat misleading. This patch allows target-less SEC() definitions for basic tracing BPF program types: - kprobe/kretprobe; - multi-kprobe/multi-kretprobe; - tracepoints; - raw tracepoints. Such target-less SEC() definitions are meant to specify declaratively proper BPF program type only. Attachment of them will have to be handled programmatically using correct APIs. As such, skeleton's auto-attachment of such BPF programs is skipped and generic bpf_program__attach() will fail, if attempted, due to the lack of enough target information. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220428185349.3799599-2-andrii@kernel.org	2022-05-13 16:13:31 -07:00
Ilya Leoshkevich	87dff0a2c7	vmtest: allow building foreign debian rootfs This would allow building s390x images without access to an IBM Z. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-05-09 12:17:29 -07:00
Ilya Leoshkevich	14777c3784	vmtest: use debian bookworm A newer iproute2 version is required for MPTCP tests. Use a newer distro version, which has it. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-05-09 12:17:29 -07:00
Andrii Nakryiko	3a4e26307d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 34ba23b44c664792a4308ec37b5788a3162944ec Checkpoint bpf-next commit: d54d06a4c4bc5d76815d02e4b041b31d9dbb3fef Baseline bpf commit: 8de8b71b787f38983d414d2dba169a3bfefa668a Checkpoint bpf commit: ba3beec2ec1d3b4fd8672ca6e781dac4b3267f6e Alan Maguire (1): libbpf: Usdt aarch64 arg parsing support Andrii Nakryiko (10): libbpf: Support opting out from autoloading BPF programs declaratively libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() libbpf: Fix anonymous type check in CO-RE logic libbpf: Drop unhelpful "program too large" guess libbpf: Fix logic for finding matching program for CO-RE relocation libbpf: Avoid joining .BTF.ext data with BPF programs by section name libbpf: Record subprog-resolved CO-RE relocations unconditionally libbpf: Refactor CO-RE relo human description formatting routine libbpf: Simplify bpf_core_parse_spec() signature libbpf: Fix up verifier log for unguarded failed CO-RE relos Gaosheng Cui (1): libbpf: Remove redundant non-null checks on obj_elf Grant Seltzer (4): libbpf: Add error returns to two API functions libbpf: Update API functions usage to check error libbpf: Add documentation to API functions libbpf: Improve libbpf API documentation link position Kumar Kartikeya Dwivedi (2): bpf: Allow storing referenced kptr in map libbpf: Add kptr type tag macros to bpf_helpers.h Pu Lehui (2): libbpf: Fix usdt_cookie being cast to 32 bits libbpf: Support riscv USDT argument parsing logic Runqing Yang (1): libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels Vladimir Isaev (1): libbpf: Add ARC support to bpf_tracing.h Yuntao Wang (1): libbpf: Remove unnecessary type cast docs/index.rst \| 3 +- include/uapi/linux/bpf.h \| 12 ++ src/bpf.c \| 34 ++++- src/bpf_helpers.h \| 7 + src/bpf_tracing.h \| 23 +++ src/btf.c \| 9 +- src/libbpf.c \| 322 
++++++++++++++++++++++++++++++--------- src/libbpf.h \| 82 +++++++++- src/libbpf_internal.h \| 9 +- src/relo_core.c \| 104 +++++++------ src/relo_core.h \| 6 + src/usdt.c \| 191 ++++++++++++++++++++++- 12 files changed, 668 insertions(+), 134 deletions(-) -- 2.30.2	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	ef6f1fdfff	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	c3f58eb6cf	libbpf: Fix up verifier log for unguarded failed CO-RE relos Teach libbpf to post-process BPF verifier log on BPF program load failure and detect known error patterns to provide user with more context. Currently there is one such common situation: an "unguarded" failed BPF CO-RE relocation. While failing CO-RE relocation is expected, it is expected to be properly guarded in BPF code such that BPF verifier always eliminates BPF instructions corresponding to such failed CO-RE relos as dead code. In cases when user failed to take such precautions, BPF verifier provides the best log it can: 123: (85) call unknown#195896080 invalid func unknown#195896080 Such incomprehensible log error is due to libbpf "poisoning" BPF instruction that corresponds to failed CO-RE relocation by replacing it with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads "bad relo" if you squint hard enough). Luckily, libbpf has all the necessary information to look up CO-RE relocation that failed and provide more human-readable description of what's going on: 5: <invalid CO-RE relocation> failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8) This hopefully makes it much easier to understand what's wrong with user's BPF program without googling magic constants. This BPF verifier log fixup is set up to be extensible and is going to be used for at least one other upcoming feature of libbpf in follow up patches. Libbpf is parsing lines of BPF verifier log starting from the very end. Currently it processes up to 10 lines of code looking for familiar patterns. This avoids wasting lots of CPU processing huge verifier logs (especially for log_level=2 verbosity level). Actual verification error should normally be found in last few lines, so this should work reliably. If libbpf needs to expand log beyond available log_buf_size, it truncates the end of the verifier log. 
Given verifier log normally ends with something like: processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 ... truncating this on program load error isn't too bad (end user can always increase log size, if it needs to get complete log). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	2c3a55bfe7	libbpf: Simplify bpf_core_parse_spec() signature Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as an input instead of requiring callers to decompose them into type_id, relo, spec_str, etc. This makes using and reusing this helper easier. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	e2d8a820cb	libbpf: Refactor CO-RE relo human description formatting routine Refactor how CO-RE relocation is formatted. Now it dumps human-readable representation, currently used by libbpf in either debug or error message output during CO-RE relocation resolution process, into provided buffer. This approach allows for better reuse of this functionality outside of CO-RE relocation resolution, which we'll use in next patch for providing better error message for BPF verifier rejecting BPF program due to unguarded failed CO-RE relocation. It also gets rid of annoying "stitching" of libbpf_print() calls, which was the only place where we did this. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	aaaeea6499	libbpf: Record subprog-resolved CO-RE relocations unconditionally Previously, libbpf recorded CO-RE relocations with insns_idx resolved according to finalized subprog locations (which are appended at the end of entry BPF program) to simplify the job of light skeleton generator. This is necessary because once subprogs' instructions are appended to main entry BPF program all the subprog instruction indices are shifted and that shift is different for each entry (main) BPF program, so it's generally impossible to map final absolute insn_idx of the finalized BPF program to their original locations inside subprograms. This information is now going to be used not only during light skeleton generation, but also to map absolute instruction index to subprog's instruction and its corresponding CO-RE relocation. So start recording these relocations always, not just when obj->gen_loader is set. This information is going to be freed at the end of bpf_object__load() step, as before (but this can change in the future if there will be a need for this information post load step). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	f2e994e0b7	libbpf: Avoid joining .BTF.ext data with BPF programs by section name Instead of using ELF section names as a joining key between .BTF.ext and corresponding BPF programs, pre-build .BTF.ext section number to ELF section index mapping during bpf_object__open() and use it later for matching .BTF.ext information (func/line info or CO-RE relocations) to their respective BPF programs and subprograms. This simplifies corresponding joining logic and lets libbpf do manipulations with BPF program's ELF sections like dropping leading '?' character for non-autoloaded programs. Original joining logic in bpf_object__relocate_core() (see relevant comment that's now removed) was never elegant, so it's a good improvement regardless. But it also avoids unnecessary internal assumptions about preserving original ELF section name as BPF program's section name (which was broken when SEC("?abc") support was added). Fixes: a3820c481112 ("libbpf: Support opting out from autoloading BPF programs declaratively") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	eb22de1f7d	libbpf: Fix logic for finding matching program for CO-RE relocation Fix the bug in bpf_object__relocate_core() which can lead to finding invalid matching BPF program when processing CO-RE relocation. If matching program is not found, last encountered program will be assumed to be correct program and thus error detection won't detect the problem. Fixes: 9c82a63cf370 ("libbpf: Fix CO-RE relocs against .text section") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	0a901dd1cd	libbpf: Drop unhelpful "program too large" guess libbpf pretends it knows actual limit of BPF program instructions based on UAPI headers it compiled with. There is neither any guarantee that UAPI headers match host kernel, nor BPF verifier actually uses BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier will emit actual reason for failure in its logs anyways. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	36582ee432	libbpf: Fix anonymous type check in CO-RE logic Use type name for checking whether CO-RE relocation is referring to anonymous type. Using spec string makes no sense. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Kumar Kartikeya Dwivedi	e7f46e2cae	libbpf: Add kptr type tag macros to bpf_helpers.h Include convenience definitions: __kptr: Unreferenced kptr __kptr_ref: Referenced kptr Users can use them to tag the pointer type meant to be used with the new support directly in the map value definition. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com	2022-04-27 15:19:08 -07:00
Kumar Kartikeya Dwivedi	179ca056b0	bpf: Allow storing referenced kptr in map Extending the code in previous commits, introduce referenced kptr support, which needs to be tagged using 'kptr_ref' tag instead. Unlike unreferenced kptr, referenced kptr have a lot more restrictions. In addition to the type matching, only a newly introduced bpf_kptr_xchg helper is allowed to modify the map value at that offset. This transfers the referenced pointer being stored into the map, releasing the reference state for the program, and returning the old value and creating new reference state for the returned pointer. Similar to unreferenced pointer case, return value for this case will also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer must either be eventually released by calling the corresponding release function, or be transferred into another map. It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear the value, and obtain the old value if any. BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future commit will permit using BPF_LDX for such pointers, but will attempt to make it safe, since the lifetime of the object won't be guaranteed. There are valid reasons to enforce the restriction of permitting only bpf_kptr_xchg to operate on referenced kptr. The pointer value must be consistent in face of concurrent modification, and any prior values contained in the map must also be released before a new one is moved into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg returns the old value, which the verifier would require the user to either free or move into another map, and releases the reference held for the pointer being moved in. In the future, direct BPF_XCHG instruction may also be permitted to work like bpf_kptr_xchg helper. 
Note that process_kptr_func doesn't have to call check_helper_mem_access, since we already disallow rdonly/wronly flags for map, which is what check_map_access_type checks, and we already ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc, so check_map_access is also not required. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com	2022-04-27 15:19:08 -07:00
Yuntao Wang	56dff81d46	libbpf: Remove unnecessary type cast The link variable is already of type 'struct bpf_link *', casting it to 'struct bpf_link *' is redundant, drop it. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	0d4cefc4fc	libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() on older kernels for programs that are attachable through BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() more unified and convenient interface for creating bpf_link-based attachments. With this approach end users can just use bpf_link_create() for tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to care about kernel support, as libbpf will handle this transparently. On the other hand, as newer features (like BPF cookie) are added to LINK_CREATE interface, they will be readily usable through the same bpf_link_create() API without any major refactoring from user's standpoint. bpf_program__attach_btf_id() is now using bpf_link_create() internally as well and will take advantage of this unified interface when BPF cookie is added for fentry/fexit. Doing proactive feature detection of LINK_CREATE support for fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF, determining some stable and guaranteed to be in all kernels versions target BTF type (either raw tracepoint or fentry target function), actually attaching this program and thus potentially affecting the performance of the host kernel briefly, etc. So instead we are taking much simpler "lazy" approach of falling back to bpf_raw_tracepoint_open() call only if initial LINK_CREATE command fails. For modern kernels this will mean zero added overhead, while older kernels will incur minimal overhead with a single fast-failing LINK_CREATE call. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Kui-Feng Lee <kuifeng@fb.com> Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Grant Seltzer	5954a6c4aa	libbpf: Improve libbpf API documentation link position This puts the link for libbpf API documentation into the sidebar for much easier navigation. You can preview this change at: https://libbpf-test.readthedocs.io/en/latest/ Note that the link is hardcoded to the production version, so you can see that it self references itself here for now: https://libbpf-test.readthedocs.io/en/latest/api.html This will need to make its way into the libbpf mirror, before being deployed to libbpf.readthedocs.org Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220422031050.303984-1-grantseltzer@gmail.com	2022-04-27 15:19:08 -07:00
Gaosheng Cui	38be0379c9	libbpf: Remove redundant non-null checks on obj_elf Obj_elf is already non-null checked at the function entry, so remove redundant non-null checks on obj_elf. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com	2022-04-27 15:19:08 -07:00
Grant Seltzer	5fa8bb6b42	libbpf: Add documentation to API functions This adds documentation for the following API functions: - bpf_program__set_expected_attach_type() - bpf_program__set_type() - bpf_program__set_attach_target() - bpf_program__attach() - bpf_program__pin() - bpf_program__unpin() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com	2022-04-27 15:19:08 -07:00
Grant Seltzer	c5b91a333e	libbpf: Update API functions usage to check error This updates usage of the following API functions within libbpf so their newly added error return is checked: - bpf_program__set_expected_attach_type() - bpf_program__set_type() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com	2022-04-27 15:19:08 -07:00
Grant Seltzer	8073e03491	libbpf: Add error returns to two API functions This adds an error return to the following API functions: - bpf_program__set_expected_attach_type() - bpf_program__set_type() In both cases, the error occurs when the BPF object has already been loaded when the function is called. In this case -EBUSY is returned. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com	2022-04-27 15:19:08 -07:00
Pu Lehui	eb2b216081	libbpf: Support riscv USDT argument parsing logic Add riscv-specific USDT argument specification parsing logic. riscv USDT argument format is shown below: - Memory dereference case: "size@off(reg)", e.g. "-8@-88(s0)" - Constant value case: "size@val", e.g. "4@5" - Register read case: "size@reg", e.g. "-8@a1" s8 is marked as poison even though it is also a riscv register name, so we need to alias it in advance. Both RV32 and RV64 have been tested. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com	2022-04-27 15:19:08 -07:00
Pu Lehui	bddd106e80	libbpf: Fix usdt_cookie being cast to 32 bits The usdt_cookie is defined as __u64, which should not be used as a long type because it will be cast to 32 bits in 32-bit platforms. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com	2022-04-27 15:19:08 -07:00
Andrii Nakryiko	e205664ddb	libbpf: Support opting out from autoloading BPF programs declaratively Establish SEC("?abc") naming convention (i.e., adding question mark in front of otherwise normal section name) that allows to set corresponding program's autoload property to false. This is effectively just a declarative way to do bpf_program__set_autoload(prog, false). Having a way to do this declaratively in BPF code itself is useful and convenient for various scenarios. E.g., for testing, when BPF object consists of multiple independent BPF programs that each needs to be tested separately. Opting out all of them by default and then setting autoload to true for just one of them at a time simplifies testing code (see next patch for few conversions in BPF selftests taking advantage of this new feature). Another real-world use case is in libbpf-tools for cases when different BPF programs have to be picked depending on particulars of the host kernel due to various incompatible changes (like kernel function renames or signature change, or to pick kprobe vs fentry depending on corresponding kernel support for the latter). Marking all the different BPF program candidates as non-autoloaded declaratively makes this more obvious in BPF source code and allows simpler user-space code. When BPF program is marked as SEC("?abc") it is otherwise treated just like SEC("abc") and bpf_program__section_name() reported will be "abc". Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org	2022-04-27 15:19:08 -07:00
Alan Maguire	557499a13e	libbpf: Usdt aarch64 arg parsing support Parsing of USDT arguments is architecture-specific. On aarch64 it is relatively easy since registers used are x[0-31] and sp. Format is slightly different compared to x86_64. Possible forms are: - "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]"; - "size@reg" for register values, e.g. "-4@x0"; - "size@value" for raw values, e.g. "-8@1". Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com	2022-04-27 15:19:08 -07:00
Runqing Yang	ffd4015f3b	libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels Background: Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that kernel doesn't support new APIs. Specifically, libbpf invokes the probe_kern_probe_read_kernel function to load a small eBPF program into the kernel in which bpf_probe_read_kernel API is invoked and lets the kernel check whether the new API is valid. If the loading fails, libbpf considers the new API invalid and replaces it with the old API. static int probe_kern_probe_read_kernel(void) { struct bpf_insn insns[] = { BPF_MOV64_REG(BPF_REG_1, BPF_REG_10), /* r1 = r10 (fp) */ BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8), /* r1 += -8 */ BPF_MOV64_IMM(BPF_REG_2, 8), /* r2 = 8 */ BPF_MOV64_IMM(BPF_REG_3, 0), /* r3 = 0 */ BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel), BPF_EXIT_INSN(), }; int fd, insn_cnt = ARRAY_SIZE(insns); fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL", insns, insn_cnt, NULL); return probe_fd(fd); } Bug: On older kernel versions [0], the kernel checks whether the version number provided in the bpf syscall matches the LINUX_VERSION_CODE. If not matched, the bpf syscall fails. However, the probe_kern_probe_read_kernel code does not set the kernel version number provided to the bpf syscall, which causes the loading process to always fail on old versions. It means that libbpf will replace the new API with the old one even if the kernel supports the new one. Solution: After a discussion in [1], the solution is using BPF_PROG_TYPE_TRACEPOINT program type instead of BPF_PROG_TYPE_KPROBE because kernel does not enforce version check for tracepoint programs. I tested the patch in old kernels (4.18 and 4.19) and it works well. 
[0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360 [1] Closes: https://github.com/libbpf/libbpf/issues/473 Signed-off-by: Runqing Yang <rainkin1993@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com	2022-04-27 15:19:08 -07:00
Vladimir Isaev	68e7624e9f	libbpf: Add ARC support to bpf_tracing.h Add PT_REGS macros suitable for ARCompact and ARCv2. Signed-off-by: Vladimir Isaev <isaev@synopsys.com> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com	2022-04-27 15:19:08 -07:00
Maxim Mikityanskiy	b221db664f	ci: enable synproxy config for all architectures Enable the following options in Kconfig for x86-64 and s390x: CONFIG_NETFILTER_SYNPROXY=y CONFIG_NETFILTER_XT_TARGET_CT=y CONFIG_NETFILTER_XT_MATCH_STATE=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_TARGET_SYNPROXY=y CONFIG_IP_NF_RAW=y These options are needed to run the selftests for the new BPF SYN cookie helpers. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>	2022-04-27 15:18:22 -07:00
chantra	7bf9ee2dba	[rootfs] update rootfs to ship with ethtool Add `ethtool` as a dependency to the rootfs image. Tested by running and building the rootfs images with both `sudo ./mkrootfs_arch.sh` and `sudo ./mkrootfs_debian.sh` and running in qemu with: ``` wget https://libbpf-ci.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0 rootfs_img=rootfs.img kernel_bzimage=vmlinuz-5.5.0 mkdir rootfs touch rootfs.img truncate -s 2G rootfs.img sudo mount -o loop rootfs.img rootfs cat ~/Downloads/libbpf-vmtest-rootfs-2022.04.25.tar.zst \| sudo tar -C rootfs -I zstd -xvf - sudo install -m 755 -o root -g root /dev/stdin rootfs/etc/rcS.d/S50-startup <<'EOF' ethtool -h cat /etc/issue EOF qemu-system-x86_64 -nodefaults -display none -serial mon:stdio -enable-kvm -m 4G -drive file="${rootfs_img}",format=raw,index=1,media=disk,if=virtio,cache=none -kernel "${kernel_bzimage}" -append "root=/dev/vda rw console=ttyS0,115200" ``` The last block printed ethtool's help, confirming the presence of ethtool in the rootfs. `libbpf-vmtest-rootfs-2022.04.25.tar.zst` was generated and uploaded to S3. INDEX in libbpf/ci needs to be changed to make the CI pick it up.	2022-04-25 16:30:32 -07:00
grantseltzer	533c7666eb	Fix downloads formats Signed-off-by: grantseltzer <grantseltzer@gmail.com>	2022-04-22 14:30:27 -07:00
grantseltzer	dea5ae9fc9	Enable downloads feature Signed-off-by: grantseltzer <grantseltzer@gmail.com>	2022-04-19 16:08:19 -07:00
Evgeny Vereshchagin	8bc3e510fc	ci: turn off _FORTIFY_SOURCE explicitly libelf is compiled with _FORTIFY_SOURCE by default and it isn't compatible with MSan. It was borrowed from https://github.com/google/oss-fuzz/pull/7422	2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin	14414c6ea5	ci: turn on the alignment check to catch issues like https://github.com/libbpf/libbpf/issues/391	2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin	ea10235072	ci: point elfutils to a commit where a couple bugs are fixed Fixes ``` ./out/bpf-object-fuzzer: Running 1 inputs 1 time(s) each. Running: CORPUS/036ff286c13e4590646c7ef59435ec642432da8e elf_begin.c:232:20: runtime error: member access within misaligned address 0x000001655e71 for type 'Elf64_Shdr', which requires 8 byte alignment 0x000001655e71: note: pointer points here 00 00 00 7f 45 4c 46 02 02 01 00 00 00 07 fb 00 1d 00 00 6c 69 63 65 42 fb 00 41 00 57 03 00 20 ^ #0 0x574d51 in get_shnum /home/libbpf/elfutils/libelf/elf_begin.c:232:20 #1 0x574d51 in file_read_elf /home/libbpf/elfutils/libelf/elf_begin.c:296:19 #2 0x569c2c in __libelf_read_mmaped_file /home/libbpf/elfutils/libelf/elf_begin.c:559:14 #3 0x58e812 in elf_memory /home/libbpf/elfutils/libelf/elf_memory.c:49:10 #4 0x4905b4 in bpf_object__elf_init /home/libbpf/src/libbpf.c:1255:9 #5 0x4905b4 in bpf_object_open /home/libbpf/src/libbpf.c:7104:8 #6 0x49144e in bpf_object__open_mem /home/libbpf/src/libbpf.c:7171:20 #7 0x483018 in LLVMFuzzerTestOneInput /home/libbpf/fuzz/bpf-object-fuzzer.c:16:8 #8 0x439389 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x439389) #9 0x419e2f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x419e2f) #10 0x421aee in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/libbpf/out/bpf-object-fuzzer+0x421aee) #11 0x410f96 in main (/home/libbpf/out/bpf-object-fuzzer+0x410f96) #12 0x7f153e21255f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f) #13 0x7f153e21260b in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2d60b) #14 0x410fe4 in _start (/home/libbpf/out/bpf-object-fuzzer+0x410fe4) SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior elf_begin.c:232:20 in ``` and ``` ./out/bpf-object-fuzzer: Running 1 inputs 1 time(s) each. 
Running: CORPUS/446b578d82c47fe177de6fd675f4cb6bae8d1ea9 elf_begin.c:485:40: runtime error: addition of unsigned offset to 0x000002277e70 overflowed to 0x0000021d7e6f #0 0x5748f1 in file_read_elf /home/libbpf/elfutils/libelf/elf_begin.c:485:40 #1 0x569c2c in __libelf_read_mmaped_file /home/libbpf/elfutils/libelf/elf_begin.c:559:14 #2 0x58e812 in elf_memory /home/libbpf/elfutils/libelf/elf_memory.c:49:10 #3 0x4905b4 in bpf_object__elf_init /home/libbpf/src/libbpf.c:1255:9 #4 0x4905b4 in bpf_object_open /home/libbpf/src/libbpf.c:7104:8 #5 0x49144e in bpf_object__open_mem /home/libbpf/src/libbpf.c:7171:20 #6 0x483018 in LLVMFuzzerTestOneInput /home/libbpf/fuzz/bpf-object-fuzzer.c:16:8 #7 0x439389 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x439389) #8 0x419e2f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x419e2f) #9 0x421aee in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/libbpf/out/bpf-object-fuzzer+0x421aee) #10 0x410f96 in main (/home/libbpf/out/bpf-object-fuzzer+0x410f96) #11 0x7f753e38255f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f) #12 0x7f753e38260b in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2d60b) #13 0x410fe4 in _start (/home/libbpf/out/bpf-object-fuzzer+0x410fe4) SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior elf_begin.c:485:40 in ```	2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin	f3cc144922	ci: turn off unaligned access in libelf explicitly	2022-04-10 18:57:38 -07:00
Andrii Nakryiko	b69f8ee93e	ci: allow usdt selftest on s390x libbpf now has s390x support for USDT, so enable corresponding selftest. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	bbfb018473	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2d0df01974ce2b59b6f7d5bd3ea58d74f12ddf85 Checkpoint bpf-next commit: 34ba23b44c664792a4308ec37b5788a3162944ec Baseline bpf commit: 0a210af6d0a0595fef566e7eeb072f10f37774be Checkpoint bpf commit: 8de8b71b787f38983d414d2dba169a3bfefa668a Alan Maguire (2): libbpf: Improve library identification for uprobe binary path resolution libbpf: Improve string parsing for uprobe auto-attach Andrii Nakryiko (5): libbpf: Fix use #ifdef instead of #if to avoid compiler warning libbpf: Use strlcpy() in path resolution fallback logic libbpf: Allow WEAK and GLOBAL bindings during BTF fixup libbpf: Don't error out on CO-RE relos for overriden weak subprogs libbpf: Use weak hidden modifier for USDT BPF-side API functions Colin Ian King (1): libbpf: Fix spelling mistake "libaries" -> "libraries" Haowen Bai (1): libbpf: Potential NULL dereference in usdt_manager_attach_usdt() Ilya Leoshkevich (3): libbpf: Minor style improvements in USDT code libbpf: Make BPF-side of USDT support work on big-endian machines libbpf: Add s390-specific USDT arg spec parsing logic src/libbpf.c \| 105 ++++++++++++++++++++---------------------- src/libbpf_internal.h \| 11 +++++ src/usdt.bpf.h \| 13 ++++-- src/usdt.c \| 79 ++++++++++++++++++++++++++----- 4 files changed, 136 insertions(+), 72 deletions(-) -- 2.30.2	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	1ce956ab3a	libbpf: Use weak hidden modifier for USDT BPF-side API functions Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more confusing `static inline __noinline`. This was previously impossible due to libbpf erroring out on CO-RE relocations pointing to eliminated weak subprogs. Now that previous patch fixed this issue, switch back to __weak __hidden as it's a more direct way of specifying the desired behavior. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	5016f30a24	libbpf: Don't error out on CO-RE relos for overriden weak subprogs During BPF static linking, all the ELF relocations and .BTF.ext information (including CO-RE relocations) are preserved for __weak subprograms that were logically overriden by either previous weak subprogram instance or by corresponding "strong" (non-weak) subprogram. This is just how native user-space linkers work, nothing new. But libbpf is over-zealous when processing CO-RE relocation to error out when CO-RE relocation belonging to such eliminated weak subprogram is encountered. Instead of erroring out on this expected situation, log debug-level message and skip the relocation. Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	075c96c298	libbpf: Allow WEAK and GLOBAL bindings during BTF fixup During BTF fix up for global variables, global variable can be global weak and will have STB_WEAK binding in ELF. Support such global variables in addition to non-weak ones. This is not the problem when using BPF static linking, as BPF static linker "fixes up" BTF during generation so that libbpf doesn't have to do it anymore during bpf_object__open(), which led to this not being noticed for a while, along with a pretty rare (currently) use of __weak variables and maps. Reported-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	f044607934	libbpf: Use strlcpy() in path resolution fallback logic Coverity static analyzer complains that strcpy() can cause buffer overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't happen. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org	2022-04-09 09:17:51 -07:00
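A minimal sketch of the bounded-copy idea behind this fix. The helper name `bounded_copy` is hypothetical and is not libbpf's own `libbpf_strlcpy()`; it only illustrates a copy that truncates and always NUL-terminates, which is the property that rules out the overflow Coverity warned about.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not libbpf's implementation): copy at most sz - 1
 * bytes of src into dst and always NUL-terminate, so dst never overflows. */
static void bounded_copy(char *dst, const char *src, size_t sz)
{
	size_t i;

	assert(dst && src);
	if (sz == 0)
		return;	/* no room even for the terminator */
	for (i = 0; i + 1 < sz && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
}
```

Unlike strcpy(), the destination size is an explicit argument, so a source string longer than the buffer is truncated rather than written past the end.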
Ilya Leoshkevich	4fd682d358	libbpf: Add s390-specific USDT arg spec parsing logic The logic is superficially similar to that of x86, but the small differences (no need for register table and dynamic allocation of register names, no $ sign before constants) make maintaining a common implementation too burdensome. Therefore simply add a s390x-specific version of parse_usdt_arg(). Note that while bcc supports index registers, this patch does not. This should not be a problem in most cases, since s390 uses a default value "nor" for STAP_SDT_ARG_CONSTRAINT. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com	2022-04-09 09:17:51 -07:00
Ilya Leoshkevich	3663820dda	libbpf: Make BPF-side of USDT support work on big-endian machines BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of the actual argument size. On little-endian the relevant argument bits end up in the lower bits of val, and later on the code that handles all the argument types expects them to be there. On big-endian they end up in the upper bits of val, breaking that expectation. Fix by right-shifting val on big-endian. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com	2022-04-09 09:17:51 -07:00
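The effect described above can be sketched in plain C. This is an illustration under stated assumptions, not the code from usdt.bpf.h: `read_arg_8bytes` is a hypothetical stand-in for the 8-byte read, and the `__BYTE_ORDER__` macros are assumed to be predefined by the compiler (as GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an 8-byte read of a USDT argument that is
 * only arg_sz bytes wide; mem must have at least 8 readable bytes. */
static uint64_t read_arg_8bytes(const unsigned char *mem, int arg_sz)
{
	uint64_t val;

	assert(arg_sz == 1 || arg_sz == 2 || arg_sz == 4 || arg_sz == 8);
	memcpy(&val, mem, sizeof(val));
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	/* Big-endian: the argument landed in the upper bytes, move it down. */
	val >>= 64 - arg_sz * 8;
#endif
	/* The argument's bits now sit in the low arg_sz * 8 bits; on
	 * little-endian the higher bits may still hold neighbouring bytes. */
	return val;
}
```

After this step the later size and sign handling can treat both byte orders the same way, because the relevant bits are always at the bottom of the 64-bit value.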
Ilya Leoshkevich	fcb67a3e70	libbpf: Minor style improvements in USDT code Fix several typos and references to non-existing headers. Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with the rest of the bpf code - see commit 45f2bebc8079 ("libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for rationale). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com	2022-04-09 09:17:51 -07:00
Andrii Nakryiko	73b8386f2e	libbpf: Fix use #ifdef instead of #if to avoid compiler warning As reported by Naresh: perf build errors on i386 [1] on Linux next-20220407 [2] usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0 [-Werror=undef] 1181 | #if __x86_64__ | ^~~~~~~~~~ usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0 [-Werror=undef] 1196 | #if __x86_64__ | ^~~~~~~~~~ cc1: all warnings being treated as errors Use #ifdef instead of #if to avoid this. Fixes: 4c59e584d158 ("libbpf: Add x86-specific USDT arg spec parsing logic") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org	2022-04-09 09:17:51 -07:00
Haowen Bai	462e3f600a	libbpf: Potential NULL dereference in usdt_manager_attach_usdt() link can be NULL, yet it is still dereferenced in the bpf_link__destroy(&link->link) call, which leads to a NULL pointer access. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com	2022-04-09 09:17:51 -07:00
Alan Maguire	13fe7fedfa	libbpf: Improve string parsing for uprobe auto-attach For uprobe auto-attach, the parsing can be simplified for the SEC() name to a single sscanf(); the return value of the sscanf can then be used to distinguish between sections that simply specify "u[ret]probe" (and thus cannot auto-attach), those that specify "u[ret]probe/binary_path:function+offset" etc. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com	2022-04-09 09:17:51 -07:00
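A sketch of the single-sscanf() approach, using the return value to tell the section forms apart. The struct, function name, buffer sizes, and exact format string are assumptions made for illustration and are not taken from libbpf; the `a-z` style ranges inside the scanset are not guaranteed by the C standard but are accepted by glibc.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the pieces of a uprobe SEC() name. */
struct uprobe_sec {
	char type[16];		/* "uprobe" or "uretprobe" */
	char binary_path[256];
	char func[128];
	long offset;
};

/* Returns the number of fields sscanf() matched:
 *   1    - bare "u[ret]probe", nothing to auto-attach to
 *   2    - binary path without a function, incomplete
 *   3, 4 - path and function, optionally followed by "+offset"
 */
static int parse_uprobe_sec(const char *sec_name, struct uprobe_sec *out)
{
	assert(sec_name && out);
	out->type[0] = out->binary_path[0] = out->func[0] = '\0';
	out->offset = 0;
	return sscanf(sec_name, "%15[^/]/%255[^:]:%127[a-zA-Z0-9_.]+%li",
		      out->type, out->binary_path, out->func, &out->offset);
}
```

For example, "uprobe" yields 1, "uprobe//usr/bin/foo:foo2" yields 3, and "uretprobe/libc.so.6:malloc+8" yields 4 with the offset set to 8.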
Alan Maguire	b974879969	libbpf: Improve library identification for uprobe binary path resolution In the process of doing path resolution for uprobe attach, libraries are identified by matching a ".so" substring in the binary_path. This matches a lot of patterns that do not conform to library.so[.version] format, so instead match a ".so" _suffix_, and if that fails match a ".so." substring for the versioned library case. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com	2022-04-09 09:17:51 -07:00
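The matching rule described above (a ".so" suffix first, then a ".so." substring for versioned libraries) could look roughly like this. The function name is hypothetical and the sketch is not the libbpf implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if path looks like library.so[.version]. */
static bool looks_like_shared_lib(const char *path)
{
	size_t len;

	assert(path);
	len = strlen(path);
	/* Unversioned case: the path ends in ".so". */
	if (len >= 3 && strcmp(path + len - 3, ".so") == 0)
		return true;
	/* Versioned case: ".so." appears somewhere, e.g. "libc.so.6". */
	return strstr(path, ".so.") != NULL;
}
```

A path such as "/opt/my.solver/bin/run" contains ".so" as a plain substring, which the old check would have treated as a library; with the suffix and ".so." rules it is treated as a regular binary.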
Colin Ian King	2b674f2b21	libbpf: Fix spelling mistake "libaries" -> "libraries" There is a spelling mistake in a pr_warn message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220406080835.14879-1-colin.i.king@gmail.com	2022-04-09 09:17:51 -07:00
Hengqi Chen	5810af7446	Makefile: Add usdt.bpf.h to list of HEADERS Add usdt.bpf.h to HEADERS so that it can be installed and included by users. Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-04-06 20:28:15 -07:00
Andrii Nakryiko	042471d356	ci: blacklist usdt selftest on s390x libbpf doesn't support USDTs on s390x yet, blacklist corresponding selftest. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	f7833c0819	ci: ensure CONFIG_DEBUG_INFO_BTF=y by choosing DWARF debug info With recent upstream changes, the default for debug info is CONFIG_DEBUG_INFO_NONE=y, which prevents BTF from being generated. Choose CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y to make sure we do get DWARF generated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	c562444fb0	Makefile: add usdt.o to list of OBJS Compile user-space parts of USDT support. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	750c9fb595	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9492450fd28736262dea9143ebb3afc2c131ace1 Checkpoint bpf-next commit: 2d0df01974ce2b59b6f7d5bd3ea58d74f12ddf85 Baseline bpf commit: 6bd0c76bd70447aedfeafa9e1fcc249991d6c678 Checkpoint bpf commit: 0a210af6d0a0595fef566e7eeb072f10f37774be Alan Maguire (3): libbpf: auto-resolve programs/libraries when necessary for uprobes libbpf: Support function name-based attach uprobes libbpf: Add auto-attach for uprobes based on section name Andrii Nakryiko (6): libbpf: Avoid NULL deref when initializing map BTF info libbpf: Add BPF-side of USDT support libbpf: Wire up USDT API and bpf_link integration libbpf: Add USDT notes parsing and resolution logic libbpf: Wire up spec management and other arch-independent USDT logic libbpf: Add x86-specific USDT arg spec parsing logic Anshuman Khandual (1): perf: Add irq and exception return branch types Geliang Tang (1): bpf: Sync comments for bpf_get_stack Haiyue Wang (1): bpf: Correct the comment for BTF kind bitfield Hengqi Chen (1): libbpf: Close fd in bpf_object__reuse_map Ilya Leoshkevich (1): libbpf: Support Debian in resolve_full_path() Yuntao Wang (1): libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len) include/uapi/linux/bpf.h \| 8 +- include/uapi/linux/btf.h \| 4 +- include/uapi/linux/perf_event.h \| 2 + src/btf.c \| 6 +- src/libbpf.c \| 486 +++++++++++- src/libbpf.h \| 41 +- src/libbpf.map \| 1 + src/libbpf_internal.h \| 19 + src/usdt.bpf.h \| 256 +++++++ src/usdt.c \| 1280 +++++++++++++++++++++++++++++++ 10 files changed, 2080 insertions(+), 23 deletions(-) create mode 100644 src/usdt.bpf.h create mode 100644 src/usdt.c -- 2.30.2	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	08cc701fae	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	fa323673c5	libbpf: Add x86-specific USDT arg spec parsing logic Add x86/x86_64-specific USDT argument specification parsing. Each architecture will require their own logic, as all this is arch-specific assembly-based notation. Architectures that libbpf doesn't support for USDTs will pr_warn() with specific error and return -ENOTSUP. We use sscanf() as a very powerful and easy to use string parser. Those spaces in sscanf's format string mean "skip any whitespaces", which is pretty nifty (and somewhat little known) feature. All this was tested on little-endian architecture, so bit shifts are probably off on big-endian, which our CI will hopefully prove. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	876b933999	libbpf: Wire up spec management and other arch-independent USDT logic Last part of architecture-agnostic user-space USDT handling logic is to set up BPF spec and, optionally, IP-to-ID maps from user-space. usdt_manager performs a compact spec ID allocation to utilize fixed-sized BPF maps as efficiently as possible. We also use hashmap to deduplicate USDT arg spec strings and map identical strings to single USDT spec, minimizing the necessary BPF map size. usdt_manager supports arbitrary sequences of attachment and detachment, both of the same USDT and multiple different USDTs and internally maintains a free list of unused spec IDs. bpf_link_usdt's logic is extended with proper setup and teardown of this spec ID free list and supporting BPF maps. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org	2022-04-06 07:34:58 -07:00
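The compact spec ID allocation with a free list can be sketched as follows. All names here are hypothetical and the structure is deliberately simplified; it is not libbpf's struct usdt_manager, and the cap of 256 is only chosen to echo the default spec map size.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPEC_CNT 256	/* illustrative cap, echoing the default spec map size */

/* Hypothetical, simplified allocator state. */
struct spec_id_alloc {
	int free_ids[MAX_SPEC_CNT];	/* stack of released IDs */
	size_t free_cnt;
	int next_id;			/* lowest ID never handed out */
};

/* Returns a spec ID, or -1 if all MAX_SPEC_CNT IDs are in use. */
static int spec_id_get(struct spec_id_alloc *a)
{
	assert(a);
	if (a->free_cnt > 0)
		return a->free_ids[--a->free_cnt];	/* reuse first, keeps IDs compact */
	if (a->next_id >= MAX_SPEC_CNT)
		return -1;
	return a->next_id++;
}

/* Returns an ID to the free list, e.g. when a USDT is detached. */
static void spec_id_put(struct spec_id_alloc *a, int id)
{
	assert(a && id >= 0 && id < a->next_id && a->free_cnt < MAX_SPEC_CNT);
	a->free_ids[a->free_cnt++] = id;
}
```

Reusing released IDs before growing `next_id` is what lets arbitrary attach and detach sequences fit in a fixed-size BPF map.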
Andrii Nakryiko	406386b441	libbpf: Add USDT notes parsing and resolution logic Implement architecture-agnostic parts of USDT parsing logic. The code is the documentation in this case, it's futile to try to succinctly describe how USDT parsing is done in any sort of concreteness. But still, USDTs are recorded in special ELF notes section (.note.stapsdt), where each USDT call site is described separately. Along with USDT provider and USDT name, each such note contains USDT argument specification, which uses assembly-like syntax to describe how to fetch value of USDT argument. USDT arg spec could be just a constant, or a register, or a register dereference (most common cases in x86_64), but it technically can be much more complicated cases, like offset relative to global symbol and stuff like that. One of the later patches will implement most common subset of this for x86 and x86-64 architectures, which seems to handle a lot of real-world production application. USDT arg spec contains a compact encoding allowing usdt.bpf.h from previous patch to handle the above 3 cases. Instead of recording which register might be needed, we encode register's offset within struct pt_regs to simplify BPF-side implementation. USDT argument can be of different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle this, libbpf pre-calculates necessary bit shifts to do proper casting and sign-extension in a short sequences of left and right shifts. The rest is in the code with sometimes extensive comments and references to external "documentation" for USDTs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org	2022-04-06 07:34:58 -07:00
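The "left shift, then right shift" casting trick mentioned above can be illustrated in ordinary C. This is a sketch with a hypothetical function name, not the code in usdt.bpf.h, and it assumes that right-shifting a negative signed value is an arithmetic shift, which is implementation-defined in C but holds for GCC and Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: narrow a raw 64-bit value to an argument of arg_sz
 * bytes, sign- or zero-extending it back to 64 bits. */
static uint64_t cast_usdt_arg(uint64_t raw, int arg_sz, bool is_signed)
{
	int shift;
	uint64_t val;

	assert(arg_sz == 1 || arg_sz == 2 || arg_sz == 4 || arg_sz == 8);
	shift = 64 - arg_sz * 8;	/* can be pre-computed once per argument */
	val = raw << shift;		/* drop everything above the argument */
	if (is_signed)
		return (uint64_t)((int64_t)val >> shift);	/* sign-extend */
	return val >> shift;					/* zero-extend */
}
```

Because the shift amount depends only on the argument size, it can be computed once when the spec is parsed and stored, leaving only two shifts to execute per argument read.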
Andrii Nakryiko	1b4b798916	libbpf: Wire up USDT API and bpf_link integration Wire up libbpf USDT support APIs without yet implementing all the nitty-gritty details of USDT discovery, spec parsing, and BPF map initialization. User-visible user-space API is simple and is conceptually very similar to uprobe API. bpf_program__attach_usdt() API allows programmatically attaching a given BPF program to a USDT, specified through binary path (executable or shared lib), USDT provider and name. Also, just like in uprobe case, PID filter is specified (0 - self, -1 - any process, or specific PID). Optionally, USDT cookie value can be specified. Such single API invocation will try to discover given USDT in specified binary and will use (potentially many) BPF uprobes to attach this program in correct locations. Just like any bpf_program__attach_xxx() APIs, bpf_link is returned that represents this attachment. It is a virtual BPF link that doesn't have direct kernel object, as it can consist of multiple underlying BPF uprobe links. As such, attachment is not atomic operation and there can be brief moment when some USDT call sites are attached while others are still in the process of attaching. This should be taken into consideration by user. But bpf_program__attach_usdt() guarantees that in the case of success all USDT call sites are successfully attached, or all the successful attachments will be detached as soon as some USDT call sites failed to be attached. So, in theory, there could be cases of failed bpf_program__attach_usdt() call which did trigger few USDT program invocations. This is unavoidable due to multi-uprobe nature of USDT and has to be handled by user, if it's important to create an illusion of atomicity.
USDT BPF programs themselves are marked in BPF source code as either SEC("usdt"), in which case they won't be auto-attached through skeleton's <skel>__attach() method, or it can have a full definition, which follows the spirit of fully-specified uprobes: SEC("usdt/<path>:<provider>:<name>"). In the latter case skeleton's attach method will attempt auto-attachment. Similarly, generic bpf_program__attach() will have enough information to go off of for parameterless attachment. USDT BPF programs are actually uprobes, and as such for kernel they are marked as BPF_PROG_TYPE_KPROBE. Another part of this patch is USDT-related feature probing: - BPF cookie support detection from user-space; - detection of kernel support for auto-refcounting of USDT semaphore. The latter is optional. If kernel doesn't support such feature and USDT doesn't rely on USDT semaphores, no error is returned. But if libbpf detects that USDT requires setting semaphores and kernel doesn't support this, libbpf errors out with explicit pr_warn() message. Libbpf doesn't support poking process's memory directly to increment semaphore value, like BCC does on legacy kernels, due to inherent raciness and danger of such process memory manipulation. Libbpf lets the kernel take care of this properly or gives up. Logistically, all the extra USDT-related infrastructure of libbpf is put into a separate usdt.c file and abstracted behind struct usdt_manager. Each bpf_object has lazily-initialized usdt_manager pointer, which is only instantiated if USDT programs are attempted to be attached. Closing BPF object frees up usdt_manager resources. usdt_manager keeps track of USDT spec ID assignment and few other small things. Subsequent patches will fill out remaining missing pieces of USDT initialization and setup logic.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	f5390e4f07	libbpf: Add BPF-side of USDT support Add BPF-side implementation of libbpf-provided USDT support. This consists of single header library, usdt.bpf.h, which is meant to be used from user's BPF-side source code. This header is added to the list of installed libbpf header, along bpf_helpers.h and others. BPF-side implementation consists of two BPF maps: - spec map, which contains "a USDT spec" which encodes information necessary to be able to fetch USDT arguments and other information (argument count, user-provided cookie value, etc) at runtime; - IP-to-spec-ID map, which is only used on kernels that don't support BPF cookie feature. It allows to lookup spec ID based on the place in user application that triggers USDT program. These maps have default sizes, 256 and 1024, which are chosen conservatively to not waste a lot of space, but handling a lot of common cases. But there could be cases when user application needs to either trace a lot of different USDTs, or USDTs are heavily inlined and their arguments are located in a lot of differing locations. For such cases it might be necessary to size those maps up, which libbpf allows to do by overriding BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT macros. It is an important aspect to keep in mind. Single USDT (user-space equivalent of kernel tracepoint) can have multiple USDT "call sites". That is, single logical USDT is triggered from multiple places in user application. This can happen due to function inlining. Each such inlined instance of USDT invocation can have its own unique USDT argument specification (instructions about the location of the value of each of USDT arguments). So while USDT looks very similar to usual uprobe or kernel tracepoint, under the hood it's actually a collection of uprobes, each potentially needing different spec to know how to fetch arguments. 
User-visible API consists of three helper functions: - bpf_usdt_arg_cnt(), which returns number of arguments of current USDT; - bpf_usdt_arg(), which reads value of specified USDT argument (by its zero-indexed position) and returns it as 64-bit value; - bpf_usdt_cookie(), which functions like BPF cookie for USDT programs; this is necessary as libbpf doesn't allow specifying actual BPF cookie and utilizes it internally for USDT support implementation. Each of the bpf_usdt_xxx() APIs expects struct pt_regs * context, passed into BPF program. On kernels that don't support BPF cookie it is used to fetch absolute IP address of the underlying uprobe. usdt.bpf.h also provides BPF_USDT() macro, which functions like BPF_PROG() and BPF_KPROBE() and allows much more user-friendly way to get access to USDT arguments, if USDT definition is static and known to the user. It is expected that majority of use cases won't have to use bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will cover all their needs. Last, usdt.bpf.h is utilizing BPF CO-RE for one single purpose: to detect kernel support for BPF cookie. If BPF CO-RE dependency is undesirable, user application can redefine BPF_USDT_HAS_BPF_COOKIE to either a boolean constant (or equivalently zero and non-zero), or even point it to its own .rodata variable that can be specified from user's application user-space code. It is important that BPF_USDT_HAS_BPF_COOKIE is known to BPF verifier as static value (thus .rodata and not just .data), as otherwise BPF code will still contain bpf_get_attach_cookie() BPF helper call and will fail validation at runtime, if not dead-code eliminated. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org	2022-04-06 07:34:58 -07:00
Ilya Leoshkevich	00cd090f81	libbpf: Support Debian in resolve_full_path() attach_probe selftest fails on Debian-based distros with `failed to resolve full path for 'libc.so.6'`. The reason is that these distros embraced multiarch to the point where even for the "main" architecture they store libc in /lib/<triple>. This is configured in /etc/ld.so.conf and in theory it's possible to replicate the loader's parsing and processing logic in libbpf, however a much simpler solution is to just enumerate the known library paths. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220404225020.51029-1-iii@linux.ibm.com	2022-04-06 07:34:58 -07:00
Yuntao Wang	0167a314e7	libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len) Since core relos is an optional part of the .BTF.ext ELF section, we should skip parsing it instead of returning -EINVAL if header size is less than offsetofend(struct btf_ext_header, core_relo_len). Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com	2022-04-06 07:34:58 -07:00
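A sketch of the length check described above. The struct is a hypothetical mirror of the .BTF.ext header written out only for illustration, and `has_core_relo_fields` is an invented name; the point is that a header too short to contain the CO-RE fields means "nothing to parse", not an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define offsetofend(TYPE, FIELD) \
	(offsetof(TYPE, FIELD) + sizeof(((TYPE *)0)->FIELD))

/* Hypothetical mirror of the .BTF.ext header layout, for illustration only. */
struct ext_header {
	uint16_t magic;
	uint8_t version;
	uint8_t flags;
	uint32_t hdr_len;
	uint32_t func_info_off, func_info_len;
	uint32_t line_info_off, line_info_len;
	uint32_t core_relo_off, core_relo_len;	/* optional trailing part */
};

/* Returns 1 if the header is long enough to carry CO-RE relocation fields,
 * 0 if they are simply absent (which is not an error). */
static int has_core_relo_fields(const struct ext_header *hdr)
{
	assert(hdr);
	return hdr->hdr_len >= offsetofend(struct ext_header, core_relo_len);
}
```

A caller would skip CO-RE relocation parsing when this returns 0 and carry on with the rest of the section, instead of failing with -EINVAL.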
Alan Maguire	8dcb95d509	libbpf: Add auto-attach for uprobes based on section name Now that u[ret]probes can use name-based specification, it makes sense to add support for auto-attach based on SEC() definition. The format proposed is SEC("u[ret]probe/binary:[raw_offset|[function_name[+offset]]") For example, to trace malloc() in libc: SEC("uprobe/libc.so.6:malloc") ...or to trace function foo2 in /usr/bin/foo: SEC("uprobe//usr/bin/foo:foo2") Auto-attach is done for all tasks (pid -1). prog can be an absolute path or simply a program/library name; in the latter case, we use PATH/LD_LIBRARY_PATH to resolve the full path, falling back to standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if the file is not found via environment-variable specified locations. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-4-git-send-email-alan.maguire@oracle.com	2022-04-06 07:34:58 -07:00
Alan Maguire	d112c9ce24	libbpf: Support function name-based attach uprobes kprobe attach is name-based, using lookups of kallsyms to translate a function name to an address. Currently uprobe attach is done via an offset value as described in [1]. Extend uprobe opts for attach to include a function name which can then be converted into a uprobe-friendly offset. The calculation is done in several steps: 1. First, determine the symbol address using libelf; this gives us the offset as reported by objdump 2. If the function is a shared library function - and the binary provided is a shared library - no further work is required; the address found is the required address 3. Finally, if the function is local, subtract the base address associated with the object, retrieved from ELF program headers. The resultant value is then added to the func_offset value passed in to specify the uprobe attach address. So specifying a func_offset of 0 along with a function name "printf" will attach to printf entry. The modes of operation supported are then 1. to attach to a local function in a binary; function "foo1" in "/usr/bin/foo" 2. to attach to a shared library function in a shared library - function "malloc" in libc. [1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-3-git-send-email-alan.maguire@oracle.com	2022-04-06 07:34:58 -07:00
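The three steps above reduce to a small piece of arithmetic, sketched here with a hypothetical function and parameter names. Looking up the symbol address and the base address via libelf is assumed to have happened already, and the addresses in the usage note are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: turn a symbol address into a uprobe attach offset.
 * sym_addr  - symbol address found in the ELF symbol table (step 1)
 * base_addr - base address taken from the ELF program headers (step 3)
 */
static uint64_t uprobe_attach_offset(uint64_t sym_addr, uint64_t base_addr,
				     bool is_shared_lib, uint64_t func_offset)
{
	uint64_t off = sym_addr;

	if (!is_shared_lib) {
		/* Local function in an executable: make it base-relative. */
		assert(sym_addr >= base_addr);
		off -= base_addr;
	}
	/* Shared library symbols are used as found (step 2). */
	return off + func_offset;
}
```

For instance, a local symbol at 0x401136 in an executable whose base address is 0x400000, with a func_offset of 0, gives an attach offset of 0x1136.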
Alan Maguire	4a7fa5b2bc	libbpf: auto-resolve programs/libraries when necessary for uprobes bpf_program__attach_uprobe_opts() requires a binary_path argument specifying binary to instrument. Supporting simply specifying "libc.so.6" or "foo" should be possible too. Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib. This allows users to run BPF programs prefixed with LD_LIBRARY_PATH=/path2/lib while still searching standard locations. Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin. Path determination will be useful for auto-attach of BPF uprobe programs using SEC() definition. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1648654000-21758-2-git-send-email-alan.maguire@oracle.com	2022-04-06 07:34:58 -07:00
Haiyue Wang	ff845b85e8	bpf: Correct the comment for BTF kind bitfield The commit 8fd886911a6a ("bpf: Add BTF_KIND_FLOAT to uapi") has extended the BTF kind bitfield from 4 to 5 bits, correct the comment. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com	2022-04-06 07:34:58 -07:00
Geliang Tang	fee7b9400a	bpf: Sync comments for bpf_get_stack Commit ee2a098851bf missed updating the comments for helper bpf_get_stack in tools/include/uapi/linux/bpf.h. Sync it. Fixes: ee2a098851bf ("bpf: Adjust BPF stack helper functions to accommodate skip > 0") Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ce54617746b7ed5e9ba3b844e55e74cb8a60e0b5.1648110794.git.geliang.tang@suse.com	2022-04-06 07:34:58 -07:00
Hengqi Chen	360ed84faa	libbpf: Close fd in bpf_object__reuse_map pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it in bpf_object__reuse_map after reuse. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com	2022-04-06 07:34:58 -07:00
Anshuman Khandual	3fbed0f1b2	perf: Add irq and exception return branch types This expands generic branch type classification by adding two more entries, i.e. irq and exception return. Also updates the x86 implementation to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com	2022-04-06 07:34:58 -07:00
Andrii Nakryiko	67a4b14643	ci: remove subprogs from 5.5 whitelist It seems like it started to cause kernel panic in CI, so drop it from whitelist. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-19 23:08:50 -07:00
Andrii Nakryiko	7db9ce5fda	libbpf: avoid NULL deref when initializing map BTF info If BPF object doesn't have BTF info, don't attempt to search for BTF types describing BPF map key or value layout. Fixes: 262cfb74ffda ("libbpf: Init btf_{key,value}_type_id on internal map open") Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-19 23:08:50 -07:00
Andrii Nakryiko	f1b6bc31a5	ci: update s390x blacklist Sync s390x blacklist with the one currently used for kernel-patches CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-19 23:08:50 -07:00
Andrii Nakryiko	3ef1813702	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c344b9fc2108eeaa347c387219886cf87e520e93 Checkpoint bpf-next commit: 9492450fd28736262dea9143ebb3afc2c131ace1 Baseline bpf commit: 18b1ab7aa76bde181bdb1ab19a87fa9523c32f21 Checkpoint bpf commit: 6bd0c76bd70447aedfeafa9e1fcc249991d6c678 Delyan Kratunov (3): libbpf: .text routines are subprograms in strict mode libbpf: Init btf_{key,value}_type_id on internal map open libbpf: Add subskeleton scaffolding Guo Zhengkui (1): libbpf: Fix array_size.cocci warning Hengqi Chen (1): bpf: Fix comment for helper bpf_current_task_under_cgroup() Jiri Olsa (5): bpf: Add multi kprobe link bpf: Add cookie support to programs attached with kprobe multi link libbpf: Add libbpf_kallsyms_parse function libbpf: Add bpf_link_create support for multi kprobes libbpf: Add bpf_program__attach_kprobe_multi_opts function Martin KaFai Lau (1): bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/ Roberto Sassu (1): bpf-lsm: Introduce new helper bpf_ima_file_hash() Toke Høiland-Jørgensen (2): bpf: Add "live packet" mode for XDP in BPF_PROG_RUN libbpf: Support batch_size option to bpf_prog_test_run lic121 (1): libbpf: Unmap rings when umem deleted include/uapi/linux/bpf.h \| 72 +++++--- src/bpf.c \| 13 +- src/bpf.h \| 12 +- src/libbpf.c \| 383 ++++++++++++++++++++++++++++++++++----- src/libbpf.h \| 52 ++++++ src/libbpf.map \| 3 + src/libbpf_internal.h \| 5 + src/libbpf_legacy.h \| 4 + src/xsk.c \| 15 +- 9 files changed, 487 insertions(+), 72 deletions(-) -- 2.30.2	2022-03-19 23:08:50 -07:00
Andrii Nakryiko	d580bc49d1	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-03-19 23:08:50 -07:00
Delyan Kratunov	cc4ef17c78	libbpf: Add subskeleton scaffolding In symmetry with bpf_object__open_skeleton(), bpf_object__open_subskeleton() performs the actual walking and linking of maps, progs, and globals described by bpf_*_skeleton objects. Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com	2022-03-19 23:08:50 -07:00
Delyan Kratunov	e7084d4363	libbpf: Init btf_{key,value}_type_id on internal map open For internal and user maps, look up the key and value btf types on open() and not load(), so that `bpf_map_btf_value_type_id` is usable in `bpftool gen`. Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com	2022-03-19 23:08:50 -07:00
Delyan Kratunov	c2ec92f0ee	libbpf: .text routines are subprograms in strict mode Currently, libbpf considers a single routine in .text to be a program. This is particularly confusing when it comes to library objects - a single routine meant to be used as an extern will instead be considered a bpf_program. This patch hides this compatibility behavior behind the pre-existing SEC_NAME strict mode flag. Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com	2022-03-19 23:08:50 -07:00
Jiri Olsa	05acce9e03	libbpf: Add bpf_program__attach_kprobe_multi_opts function Adding bpf_program__attach_kprobe_multi_opts function for attaching kprobe program to multiple functions. struct bpf_link * bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog, const char *pattern, const struct bpf_kprobe_multi_opts *opts); User can specify functions to attach with 'pattern' argument that allows wildcards ('*' and '?' supported) or provide symbols or addresses directly through opts argument. These 3 options are mutually exclusive. When using symbols or addresses, user can also provide cookie value for each symbol/address that can be retrieved later in bpf program with bpf_get_attach_cookie helper. struct bpf_kprobe_multi_opts { size_t sz; const char **syms; const unsigned long *addrs; const __u64 *cookies; size_t cnt; bool retprobe; size_t :0; }; Symbols, addresses and cookies are provided through opts object (syms/addrs/cookies) as array pointers with specified count (cnt). Each cookie value is paired with provided function address or symbol with the same array index. The program can be also attached as return probe if 'retprobe' is set. For quick usage with NULL opts argument, like: bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL) the 'prog' will be attached as kprobe to 'ksys_*' functions. Also adding new program sections for automatic attachment: kprobe.multi/<symbol_pattern> kretprobe.multi/<symbol_pattern> The symbol_pattern is used as 'pattern' argument in bpf_program__attach_kprobe_multi_opts function. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org	2022-03-19 23:08:50 -07:00
Jiri Olsa	2e6e39ef80	libbpf: Add bpf_link_create support for multi kprobes Adding new kprobe_multi struct to bpf_link_create_opts object to pass multiple kprobe data to link_create attr uapi. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org	2022-03-19 23:08:50 -07:00
Jiri Olsa	42f78dd5ac	libbpf: Add libbpf_kallsyms_parse function Move the kallsyms parsing into the internal libbpf_kallsyms_parse function, so it can be used from other places. It will be used in following changes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org	2022-03-19 23:08:50 -07:00
Jiri Olsa	50ae8c25d2	bpf: Add cookie support to programs attached with kprobe multi link Adding support to call bpf_get_attach_cookie helper from kprobe programs attached with kprobe multi link. The cookie is provided by array of u64 values, where each value is paired with provided function address or symbol with the same array index. When cookie array is provided it's sorted together with addresses (check bpf_kprobe_multi_cookie_swap). This way we can find cookie based on the address in bpf_get_attach_cookie helper. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org	2022-03-19 23:08:50 -07:00
Jiri Olsa	e85e26492d	bpf: Add multi kprobe link Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe program through fprobe API. The fprobe API allows attaching probes to multiple functions at once very quickly, because it works on top of ftrace. On the other hand, this limits the probe point to the function entry or return. The kprobe program gets the same pt_regs input ctx as when it's attached through the perf API. Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attaching a kprobe to multiple functions with the new link. User provides array of addresses or symbols with count to attach the kprobe program to. The new link_create uapi interface looks like: struct { __u32 flags; __u32 cnt; __aligned_u64 syms; __aligned_u64 addrs; } kprobe_multi; The flags field allows single BPF_TRACE_KPROBE_MULTI bit to create return multi kprobe. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org	2022-03-19 23:08:50 -07:00
Roberto Sassu	9fb154ee77	bpf-lsm: Introduce new helper bpf_ima_file_hash() ima_file_hash() has been modified to calculate the measurement of a file on demand, if it has not been already performed by IMA or the measurement is not fresh. For compatibility reasons, ima_inode_hash() remains unchanged. Keep the same approach in eBPF and introduce the new helper bpf_ima_file_hash() to take advantage of the modified behavior of ima_file_hash(). Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220302111404.193900-4-roberto.sassu@huawei.com	2022-03-19 23:08:50 -07:00
Hengqi Chen	34d57cc0eb	bpf: Fix comment for helper bpf_current_task_under_cgroup() Fix the descriptions of the return values of helper bpf_current_task_under_cgroup(). Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.com	2022-03-19 23:08:50 -07:00
Martin KaFai Lau	a557610d11	bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/ This patch is to simplify the uapi bpf.h regarding to the tstamp type and use a similar way as the kernel to describe the value stored in __sk_buff->tstamp. My earlier thought was to avoid describing the semantic and clock base for the rcv timestamp until there is more clarity on the use case, so the __sk_buff->delivery_time_type naming instead of __sk_buff->tstamp_type. With some thoughts, it can reuse the UNSPEC naming. This patch first removes BPF_SKB_DELIVERY_TIME_NONE and also rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC and BPF_SKB_DELIVERY_TIME_MONO to BPF_SKB_TSTAMP_DELIVERY_MONO. The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same: __sk_buff->tstamp has delivery time in mono clock base. BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv) tstamp at ingress and the delivery time at egress. At egress, the clock base could be found from skb->sk->sk_clockid. __sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed. With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress, the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type which was also suggested in the earlier discussion: https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/ The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type the same as its kernel skb->tstamp and skb->mono_delivery_time counter part. The internal kernel function bpf_skb_convert_dtime_type_read() is then renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified with the BPF_SKB_DELIVERY_TIME_NONE gone. A BPF_ALU32_IMM(BPF_AND) insn is also saved by using BPF_JMP32_IMM(BPF_JSET). The bpf helper bpf_skb_set_delivery_time() is also renamed to bpf_skb_set_tstamp(). The arg name is changed from dtime to tstamp also. 
It only allows setting tstamp 0 for BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later if there is use case to change mono delivery time to non mono. prog->delivery_time_access is also renamed to prog->tstamp_type_access. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com	2022-03-19 23:08:50 -07:00
Toke Høiland-Jørgensen	5ad674a007	libbpf: Support batch_size option to bpf_prog_test_run Add support for setting the new batch_size parameter to BPF_PROG_TEST_RUN to libbpf; just add it as an option and pass it through to the kernel. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220309105346.100053-4-toke@redhat.com	2022-03-19 23:08:50 -07:00
Toke Høiland-Jørgensen	d647265e4b	bpf: Add "live packet" mode for XDP in BPF_PROG_RUN This adds support for running XDP programs through BPF_PROG_RUN in a mode that enables live packet processing of the resulting frames. Previous uses of BPF_PROG_RUN for XDP returned the XDP program return code and the modified packet data to userspace, which is useful for unit testing of XDP programs. The existing BPF_PROG_RUN for XDP allows userspace to set the ingress ifindex and RXQ number as part of the context object being passed to the kernel. This patch reuses that code, but adds a new mode with different semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES flag. When running BPF_PROG_RUN in this mode, the XDP program return codes will be honoured: returning XDP_PASS will result in the frame being injected into the networking stack as if it came from the selected networking interface, while returning XDP_TX and XDP_REDIRECT will result in the frame being transmitted out that interface. XDP_TX is translated into an XDP_REDIRECT operation to the same interface, since the real XDP_TX action is only possible from within the network drivers themselves, not from the process context where BPF_PROG_RUN is executed. Internally, this new mode of operation creates a page pool instance while setting up the test run, and feeds pages from that into the XDP program. The setup cost of this is amortised over the number of repetitions specified by userspace. To support the performance testing use case, we further optimise the setup step so that all pages in the pool are pre-initialised with the packet data, and pre-computed context and xdp_frame objects stored at the start of each page. This makes it possible to entirely avoid touching the page content on each XDP program invocation, and enables sending up to 9 Mpps/core on my test box. 
Because the data pages are recycled by the page pool, and the test runner doesn't re-initialise them for each run, subsequent invocations of the XDP program will see the packet data in the state it was after the last time it ran on that particular page. This means that an XDP program that modifies the packet before redirecting it has to be careful about which assumptions it makes about the packet content, but that is only an issue for the most naively written programs. Enabling the new flag is only allowed when not setting ctx_out and data_out in the test specification, since using it means frames will be redirected somewhere else, so they can't be returned. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com	2022-03-19 23:08:50 -07:00
Guo Zhengkui	21cd83a1d1	libbpf: Fix array_size.cocci warning Fix the following coccicheck warning: tools/lib/bpf/bpf.c:114:31-32: WARNING: Use ARRAY_SIZE tools/lib/bpf/xsk.c:484:34-35: WARNING: Use ARRAY_SIZE tools/lib/bpf/xsk.c:485:35-36: WARNING: Use ARRAY_SIZE It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220306023426.19324-1-guozhengkui@vivo.com	2022-03-19 23:08:50 -07:00
lic121	6e77ef94f0	libbpf: Unmap rings when umem deleted xsk_umem__create() does mmap for fill/comp rings, but xsk_umem__delete() doesn't do the unmap. This works fine for regular cases, because xsk_socket__delete() does unmap for the rings. But for the case that xsk_socket__create_shared() fails, umem rings are not unmapped. fill_save/comp_save are checked to determine if rings have already been unmapped by xsk. If fill_save and comp_save are NULL, it means that the rings have already been used by xsk. Then they are supposed to be unmapped by xsk_socket__delete(). Otherwise, xsk_umem__delete() does the unmap. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Cheng Li <lic121@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220301132623.GA19995@vscode.7~	2022-03-19 23:08:50 -07:00
Andrii Nakryiko	c84815ee37	ci: enable CONFIG_FPROBE=y for multi-attach kprobe tests Recently landed multi-attach kprobe functionality expects CONFIG_FPROBE=y. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-18 00:52:43 -07:00
Mykola Lysenko	4282f3cdec	ci: Add troubleshooting steps to s390x setup readme Related to libbpf CI. Added more information on how to set up and troubleshoot GitHub action runners for s390x platform. Signed-off-by: Mykola Lysenko <mykolal@fb.com>	2022-03-17 21:19:03 -07:00
Andrii Nakryiko	3591deb9bc	ci: blacklist s390x tests Blacklist timer_crash_mode as requiring BPF trampoline. Temporarily blacklist sk_lookup due to big-endian problems that haven't been resolved upstream yet. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	767badc609	Makefile: update libbpf version to 0.8.0 New version cycle, bump LIBBPF_MINOR_VERSION to 8 in Makefile. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	8e654d74c4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b75dacaac4650478ed5a9d33975b91b99016daff Checkpoint bpf-next commit: c344b9fc2108eeaa347c387219886cf87e520e93 Baseline bpf commit: 75134f16e7dd0007aa474b281935c5f42e79f2c8 Checkpoint bpf commit: 18b1ab7aa76bde181bdb1ab19a87fa9523c32f21 Andrii Nakryiko (2): libbpf: Allow BPF program auto-attach handlers to bail out libbpf: Support custom SEC() handlers Hangbin Liu (1): bonding: add new option ns_ip6_target Martin KaFai Lau (1): bpf: Add __sk_buff->delivery_time_type and bpf_skb_set_skb_delivery_time() Stijn Tintel (1): libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning Xu Kuohai (1): libbpf: Skip forward declaration when counting duplicated type names Yuntao Wang (3): libbpf: Remove redundant check in btf_fixup_datasec() libbpf: Simplify the find_elf_sec_sz() function libbpf: Add a check to ensure that page_cnt is non-zero include/uapi/linux/bpf.h \| 41 +++- include/uapi/linux/if_link.h \| 1 + src/btf_dump.c \| 5 + src/libbpf.c \| 388 +++++++++++++++++++++++------------ src/libbpf.h \| 109 ++++++++++ src/libbpf.map \| 6 + src/libbpf_version.h \| 2 +- 7 files changed, 423 insertions(+), 129 deletions(-) -- 2.30.2	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	dac1e23c97	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	dc679587eb	libbpf: Support custom SEC() handlers Allow registering and unregistering custom handlers for BPF program. This allows user applications and libraries to plug into libbpf's declarative SEC() definition handling logic. This allows offloading complex and intricate custom logic into external libraries, while still providing a great user experience. One such example is USDT handling library, which has a lot of code and complexity which doesn't make sense to put into libbpf directly, but it would be really great for users to be able to specify BPF programs with something like SEC("usdt/<path-to-binary>:<usdt_provider>:<usdt_name>") and have correct BPF program type set (BPF_PROG_TYPE_KPROBE, as it is uprobe) and even support BPF skeleton's auto-attach logic. In some cases, it might even be a good idea to override libbpf's default handling, like for SEC("perf_event") programs. With custom library, it's possible to extend logic to support specifying perf event specification right there in SEC() definition without burdening libbpf with lots of custom logic or extra library dependencies (e.g., libpfm4). With current patch it's possible to override libbpf's SEC("perf_event") handling and specify a completely custom one. Further, it's possible to specify a generic fallback handling for any SEC() that doesn't match any other custom or standard libbpf handlers. This makes it possible to accommodate whatever legacy use cases there might be, if necessary. See doc comments for libbpf_register_prog_handler() and libbpf_unregister_prog_handler() for detailed semantics. This patch also bumps libbpf development version to v0.8 and adds new APIs there. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220305010129.1549719-3-andrii@kernel.org	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	0d834905d8	libbpf: Allow BPF program auto-attach handlers to bail out Allow some BPF program types to support auto-attach only in subset of cases. Currently, if some BPF program type specifies attach callback, it is assumed that during skeleton attach operation all such programs either successfully attach or entire skeleton attachment fails. If some program doesn't support auto-attachment from skeleton, such BPF program types shouldn't have attach callback specified. This is limiting for cases when, depending on how full the SEC("") definition is, there could either be enough details to support auto-attach or there might not be and user has to use some specific API to provide more details at runtime. One specific example of such desired behavior might be SEC("uprobe"). If it's specified as just uprobe, auto-attach isn't possible. But if it's SEC("uprobe/<some_binary>:<some_func>") then there are enough details to support auto-attach. Note that there is a somewhat subtle difference between auto-attach behavior of BPF skeleton and using "generic" bpf_program__attach(prog) (which uses the same attach handlers under the cover). Skeleton allows some programs within bpf_object to not have auto-attach implemented and doesn't treat that as an error. Instead such BPF programs are just skipped during skeleton's (optional) attach step. bpf_program__attach(), on the other hand, is called when user expects auto-attach to work, so if specified program doesn't implement or doesn't support auto-attach functionality, that will be treated as an error. Another improvement to the way libbpf is handling SEC()s would be to not require providing dummy kernel function name for kprobe. Currently, SEC("kprobe/whatever") is necessary even if actual kernel function is determined by user at runtime and bpf_program__attach_kprobe() is used to specify it. 
With changes in this patch, it's possible to support both SEC("kprobe") and SEC("kprobe/<actual_kernel_function>"), while only in the latter case auto-attach will be performed. In the former one, such kprobe will be skipped during skeleton attach operation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20220305010129.1549719-2-andrii@kernel.org	2022-03-07 22:16:11 -08:00
Yuntao Wang	0a43bc8905	libbpf: Add a check to ensure that page_cnt is non-zero The page_cnt parameter specifies the number of memory pages allocated for each per-CPU buffer; it must be non-zero and a power of 2. Currently, the __perf_buffer__new() function attempts to validate that page_cnt is a power of 2 but forgets to check for the case where page_cnt is zero. We can fix it by replacing 'page_cnt & (page_cnt - 1)' with 'page_cnt == 0 \|\| (page_cnt & (page_cnt - 1))'. With that change, we also don't need to add a check in perf_buffer__new_v0_6_0() to make sure that page_cnt is non-zero, and the check for zero in perf_buffer__new_raw_v0_6_0() can also be removed. The code will be cleaner and more readable. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220303005921.53436-1-ytcoode@gmail.com	2022-03-07 22:16:11 -08:00
Xu Kuohai	5d491d5d07	libbpf: Skip forward declaration when counting duplicated type names Currently if a declaration appears in the BTF before the definition, the definition is dumped as a conflicting name, e.g.: $ bpftool btf dump file vmlinux format raw \| grep "'unix_sock'" [81287] FWD 'unix_sock' fwd_kind=struct [89336] STRUCT 'unix_sock' size=1024 vlen=14 $ bpftool btf dump file vmlinux format c \| grep "struct unix_sock" struct unix_sock; struct unix_sock___2 { <--- conflict, the "___2" is unexpected struct unix_sock___2 *unix_sk; This causes a compilation error if the dump output is used as a header file. Fix it by skipping declaration when counting duplicated type names. Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220301053250.1464204-2-xukuohai@huawei.com	2022-03-07 22:16:11 -08:00
Stijn Tintel	9b53decb02	libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning When a BPF map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY doesn't have the max_entries parameter set, the map will be created with max_entries set to the number of available CPUs. When we try to reuse such a pinned map, map_is_reuse_compat will return false, as max_entries in the map definition differs from max_entries of the existing map, causing the following error: libbpf: couldn't reuse pinned map at '/sys/fs/bpf/m_logging': parameter mismatch Fix this by overwriting max_entries in the map definition. For this to work, we need to do this in bpf_object__create_maps, before calling bpf_object__reuse_map. Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects") Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220225152355.315204-1-stijn@linux-ipv6.be	2022-03-07 22:16:11 -08:00
Yuntao Wang	426672106e	libbpf: Simplify the find_elf_sec_sz() function The check in the last return statement is unnecessary, we can just return the ret variable. But we can simplify the function further by returning 0 immediately if we find the section size and -ENOENT otherwise. Thus we can also remove the ret variable. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220223085244.3058118-1-ytcoode@gmail.com	2022-03-07 22:16:11 -08:00
Yuntao Wang	c85a8bbe9c	libbpf: Remove redundant check in btf_fixup_datasec() The check 't->size && t->size != size' is redundant because if t->size compares unequal to 0, we will just skip straight to sorting variables. Signed-off-by: Yuntao Wang <ytcoode@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220220072750.209215-1-ytcoode@gmail.com	2022-03-07 22:16:11 -08:00
Martin KaFai Lau	e7997e49ea	bpf: Add __sk_buff->delivery_time_type and bpf_skb_set_skb_delivery_time() * __sk_buff->delivery_time_type: This patch adds __sk_buff->delivery_time_type. It tells if the delivery_time is stored in __sk_buff->tstamp or not. It will be most useful for ingress to tell if the __sk_buff->tstamp has the (rcv) timestamp or delivery_time. If delivery_time_type is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp. Two non-zero types are defined for the delivery_time_type, BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC. For UNSPEC, it can only happen in egress because only mono delivery_time can be forwarded to ingress now. The clock of UNSPEC delivery_time can be deduced from the skb->sk->sk_clockid which is how the sch_etf doing it also. * Provide forwarded delivery_time to tc-bpf@ingress: With the help of the new delivery_time_type, the tc-bpf has a way to tell if the __sk_buff->tstamp has the (rcv) timestamp or the delivery_time. During bpf load time, the verifier will learn if the bpf prog has accessed the new __sk_buff->delivery_time_type. If it does, it means the tc-bpf@ingress is expecting the skb->tstamp could have the delivery_time. The kernel will then read the skb->tstamp as-is during bpf insn rewrite without checking the skb->mono_delivery_time. This is done by adding a new prog->delivery_time_access bit. The same goes for writing skb->tstamp. * bpf_skb_set_delivery_time(): The bpf_skb_set_delivery_time() helper is added to allow setting both delivery_time and the delivery_time_type at the same time. If the tc-bpf does not need to change the delivery_time_type, it can directly write to the __sk_buff->tstamp as the existing tc-bpf has already been doing. It will be most useful at ingress to change the __sk_buff->tstamp from the (rcv) timestamp to a mono delivery_time and then bpf_redirect_*(). 
bpf only has mono clock helper (bpf_ktime_get_ns), and the current known use case is the mono EDT for fq, and only mono delivery time can be kept during forward now, so bpf_skb_set_delivery_time() only supports setting BPF_SKB_DELIVERY_TIME_MONO. It can be extended later when use cases come up and the forwarding path also supports other clock bases. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-07 22:16:11 -08:00
Hangbin Liu	4c560383a6	bonding: add new option ns_ip6_target This patch adds a new bonding option ns_ip6_target, which corresponds to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS request to determine the health of the link. For other related options like the validation, we still use arp_validate, and will change to ns_validate later. Note: the sysfs configuration support was removed based on https://lore.kernel.org/netdev/8863.1645071997@famine Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-03-07 22:16:11 -08:00
Andrii Nakryiko	9c44c8a8e0	LICENSE: fix BSD-2-Clause by adding year and authors Seems like 2015 is the year of the first libbpf commit. So use Lorenz's suggestion and add "(c) 2015 The Libbpf Authors". Closes: https://github.com/libbpf/libbpf/issues/461 Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-02-23 17:55:21 -08:00
Andrii Nakryiko	1c173e5fc8	libbpf: fix libbpf.pc generation w.r.t. patch versions Ensure that libbpf.pc gets full libbpf's version, including patch releases. Also add some mechanism to ensure that official released version (e.g., 0.7.1) and the one recorded in libbpf.map (which never bumps patch version, so will be 0.7.0) are in sync up to major and minor versions. This should ensure that major mistakes are captured. We'll still need to be very careful with zeroing out patch version on minor version bumps. Closes: https://github.com/libbpf/libbpf/issues/455 Reported-by: Michel Salim <michel@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-02-22 20:06:42 -08:00
Andrii Nakryiko	93c570ca4b	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2e3f7bed28376a1a41ce4a58b7163b586e97a546 Checkpoint bpf-next commit: b75dacaac4650478ed5a9d33975b91b99016daff Baseline bpf commit: 45ce4b4f9009102cd9f581196d480a59208690c1 Checkpoint bpf commit: 75134f16e7dd0007aa474b281935c5f42e79f2c8 Andrii Nakryiko (1): libbpf: Fix memleak in libbpf_netlink_recv() src/netlink.c \| 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) -- 2.30.2	2022-02-17 11:33:57 -08:00
Andrii Nakryiko	33201b7ebd	libbpf: Fix memleak in libbpf_netlink_recv() Ensure that libbpf_netlink_recv() frees dynamically allocated buffer in all code paths. Fixes: 9c3de619e13e ("libbpf: Use dynamically allocated buffer when receiving netlink messages") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20220217073958.276959-1-andrii@kernel.org	2022-02-17 11:33:57 -08:00
Andrii Nakryiko	6edaacad4f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 8cbf062a250ed52148badf6f3ffd03657dd4a3f0 Checkpoint bpf-next commit: 2e3f7bed28376a1a41ce4a58b7163b586e97a546 Baseline bpf commit: 61d06f01f9710b327a53492e5add9f972eb909b3 Checkpoint bpf commit: 45ce4b4f9009102cd9f581196d480a59208690c1 Mauricio Vásquez (2): libbpf: Split bpf_core_apply_relo() libbpf: Expose bpf_core_{add,free}_cands() to bpftool src/libbpf.c \| 88 ++++++++++++++++++++++++------------------- src/libbpf_internal.h \| 9 +++++ src/relo_core.c \| 79 +++++++++++--------------------------- src/relo_core.h \| 42 ++++++++++++++++++--- 4 files changed, 118 insertions(+), 100 deletions(-) -- 2.30.2	2022-02-16 13:58:30 -08:00
Mauricio Vásquez	af29a83fe2	libbpf: Expose bpf_core_{add,free}_cands() to bpftool Expose bpf_core_add_cands() and bpf_core_free_cands() to handle candidates list. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-3-mauricio@kinvolk.io	2022-02-16 13:58:30 -08:00
Mauricio Vásquez	6387d3900f	libbpf: Split bpf_core_apply_relo() BTFGen needs to run the core relocation logic in order to understand which types are involved in a given relocation. Currently bpf_core_apply_relo() calculates and applies a relocation to an instruction. Having both operations in the same function makes it difficult to only calculate the relocation without patching the instruction. This commit splits that logic into two different phases: (1) calculate the relocation and (2) patch the instruction. For the first phase bpf_core_apply_relo() is renamed to bpf_core_calc_relo_insn(), which is now only in charge of calculating the relocation; the second phase uses the already existing bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and the BTFGen will use only bpf_core_calc_relo_insn(). Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io	2022-02-16 13:58:30 -08:00
Andrii Nakryiko	196da61f1d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca Checkpoint bpf-next commit: 8cbf062a250ed52148badf6f3ffd03657dd4a3f0 Baseline bpf commit: fe68195daf34d5dddacd3f93dd3eafc4beca3a0e Checkpoint bpf commit: 61d06f01f9710b327a53492e5add9f972eb909b3 Alexei Starovoitov (1): libbpf: Prepare light skeleton for the kernel. Jakub Sitnicki (1): selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup Marco Elver (1): perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures Toke Høiland-Jørgensen (1): libbpf: Use dynamically allocated buffer when receiving netlink messages include/uapi/linux/bpf.h \| 3 +- include/uapi/linux/perf_event.h \| 2 + src/gen_loader.c \| 15 ++- src/netlink.c \| 55 +++++++++- src/skel_internal.h \| 185 ++++++++++++++++++++++++++++---- 5 files changed, 234 insertions(+), 26 deletions(-) -- 2.30.2	2022-02-15 22:32:04 -08:00
Marco Elver	db8dc47ce8	perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures Due to the alignment requirements of siginfo_t, as described in 3ddb3fd8cdb0 ("signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures"), siginfo_t::si_perf_data is limited to an unsigned long. However, perf_event_attr::sig_data is an u64, to avoid having to deal with compat conversions. Due to being an u64, it may not immediately be clear to users that sig_data is truncated on 32 bit architectures. Add a comment to explicitly point this out, and hopefully help some users save time by not having to deduce themselves what's happening. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/20220131103407.1971678-3-elver@google.com	2022-02-15 22:32:04 -08:00
Toke Høiland-Jørgensen	f7d89c3910	libbpf: Use dynamically allocated buffer when receiving netlink messages When receiving netlink messages, libbpf was using a statically allocated stack buffer of 4k bytes. This happened to work fine on systems with a 4k page size, but on systems with larger page sizes it can lead to truncated messages. The user-visible impact of this was that libbpf would insist no XDP program was attached to some interfaces because that bit of the netlink message got chopped off. Fix this by switching to a dynamically allocated buffer; we borrow the approach from iproute2 of using recvmsg() with MSG_PEEK\|MSG_TRUNC to get the actual size of the pending message before receiving it, adjusting the buffer as necessary. While we're at it, also add retries on interrupted system calls around the recvmsg() call. v2: - Move peek logic to libbpf_netlink_recv(), don't double free on ENOMEM. Fixes: 8bbb77b7c7a2 ("libbpf: Add various netlink helpers") Reported-by: Zhiqian Guan <zhguan@redhat.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20220211234819.612288-1-toke@redhat.com	2022-02-15 22:32:04 -08:00
Alexei Starovoitov	0d6262ad0a	libbpf: Prepare light skeleton for the kernel. Prepare light skeleton to be used in the kernel module and in the user space. The look and feel of lskel.h is mostly the same with the difference that for user space the skel->rodata is the same pointer before and after skel_load operation, while in the kernel the skel->rodata after skel_open and the skel->rodata after skel_load are different pointers. Typical usage of skeleton remains the same for kernel and user space: skel = my_bpf__open(); skel->rodata->my_global_var = init_val; err = my_bpf__load(skel); err = my_bpf__attach(skel); // access skel->rodata->my_global_var; // access skel->bss->another_var; Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209232001.27490-3-alexei.starovoitov@gmail.com	2022-02-15 22:32:04 -08:00
Jakub Sitnicki	7593fc7a85	selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup Extend the context access tests for sk_lookup prog to cover the surprising case of a 4-byte load from the remote_port field, where the expected value is actually shifted by 16 bits. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220209184333.654927-3-jakub@cloudflare.com	2022-02-15 22:32:04 -08:00
Andrii Nakryiko	67f813c8a8	README: add libbpf distro packaging badge Add badge displaying libbpf's packaging status across various Linux distros.	2022-02-11 21:21:10 -08:00
Andrii Nakryiko	2cd2d03f63	libbpf: Fix libbpf.map inheritance chain for LIBBPF_0.7.0 Ensure that LIBBPF_0.7.0 inherits everything from LIBBPF_0.6.0. Fixes: dbdd2c7f8cec ("libbpf: Add API to get/set log_level at per-program level") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220211205235.2089104-1-andrii@kernel.org	2022-02-11 13:01:37 -08:00
Andrii Nakryiko	528094c0c1	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 227a0713b319e7a8605312dee1c97c97a719a9fc Checkpoint bpf-next commit: dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca Baseline bpf commit: 77b1b8b43ec3c060ecf7e926a92b0f8772171046 Checkpoint bpf commit: fe68195daf34d5dddacd3f93dd3eafc4beca3a0e Andrii Nakryiko (1): libbpf: Fix compilation warning due to mismatched printf format Dan Carpenter (1): libbpf: Fix signedness bug in btf_dump_array_data() Hengqi Chen (1): libbpf: Add BPF_KPROBE_SYSCALL macro Ilya Leoshkevich (7): libbpf: Add PT_REGS_SYSCALL_REGS macro libbpf: Fix accessing syscall arguments on powerpc libbpf: Fix riscv register names libbpf: Fix accessing syscall arguments on riscv libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL libbpf: Fix accessing the first syscall argument on arm64 libbpf: Fix accessing the first syscall argument on s390 Mauricio Vásquez (1): libbpf: Remove mode check in libbpf_set_strict_mode() src/bpf_tracing.h \| 85 +++++++++++++++++++++++++++++++++++++++++------ src/btf_dump.c \| 6 ++-- src/libbpf.c \| 8 ----- 3 files changed, 79 insertions(+), 20 deletions(-) -- 2.30.2	2022-02-09 09:48:32 -08:00
Andrii Nakryiko	37493e639f	libbpf: Fix compilation warning due to mismatched printf format On ppc64le architecture __s64 is long int and requires %ld. Cast to ssize_t and use %zd to avoid architecture-specific specifiers. Fixes: 4172843ed4a3 ("libbpf: Fix signedness bug in btf_dump_array_data()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220209063909.1268319-1-andrii@kernel.org	2022-02-09 09:48:32 -08:00
Hengqi Chen	b0a3b9e8fe	libbpf: Add BPF_KPROBE_SYSCALL macro Add syscall-specific variant of BPF_KPROBE named BPF_KPROBE_SYSCALL ([0]). The new macro hides the underlying way of getting syscall input arguments. With the new macro, the following code: SEC("kprobe/__x64_sys_close") int BPF_KPROBE(do_sys_close, struct pt_regs *regs) { int fd; fd = PT_REGS_PARM1_CORE(regs); /* do something with fd */ } can be written as: SEC("kprobe/__x64_sys_close") int BPF_KPROBE_SYSCALL(do_sys_close, int fd) { /* do something with fd */ } [0] Closes: https://github.com/libbpf/libbpf/issues/425 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220207143134.2977852-2-hengqi.chen@gmail.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	f7e08b4a8f	libbpf: Fix accessing the first syscall argument on s390 On s390, the first syscall argument should be accessed via orig_gpr2 (see arch/s390/include/asm/syscall.h). Currently gpr[2] is used instead, leading to bpf_syscall_macro test failure. orig_gpr2 cannot be added to user_pt_regs, since its layout is a part of the ABI. Therefore provide access to it only through PT_REGS_PARM1_CORE_SYSCALL() by using a struct pt_regs flavor. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-11-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	9f6e3a7a59	libbpf: Fix accessing the first syscall argument on arm64 On arm64, the first syscall argument should be accessed via orig_x0 (see arch/arm64/include/asm/syscall.h). Currently regs[0] is used instead, leading to bpf_syscall_macro test failure. orig_x0 cannot be added to struct user_pt_regs, since its layout is a part of the ABI. Therefore provide access to it only through PT_REGS_PARM1_CORE_SYSCALL() by using a struct pt_regs flavor. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-10-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	f1a756d793	libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL arm64 and s390 need a special way to access the first syscall argument. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-9-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	32c19d8505	libbpf: Fix accessing syscall arguments on riscv riscv does not select ARCH_HAS_SYSCALL_WRAPPER, so its syscall handlers take "unpacked" syscall arguments. Indicate this to libbpf using PT_REGS_SYSCALL_REGS macro. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-7-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	497ec1d35c	libbpf: Fix riscv register names riscv registers are accessed via struct user_regs_struct, not struct pt_regs. The program counter member in this struct is called pc, not epc. The frame pointer is called s0, not fp. Fixes: 3cc31d794097 ("libbpf: Normalize PT_REGS_xxx() macro definitions") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-6-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	8a28842a20	libbpf: Fix accessing syscall arguments on powerpc powerpc does not select ARCH_HAS_SYSCALL_WRAPPER, so its syscall handlers take "unpacked" syscall arguments. Indicate this to libbpf using PT_REGS_SYSCALL_REGS macro. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Link: https://lore.kernel.org/bpf/20220209021745.2215452-5-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Ilya Leoshkevich	50b4d99bbc	libbpf: Add PT_REGS_SYSCALL_REGS macro Architectures that select ARCH_HAS_SYSCALL_WRAPPER pass a pointer to struct pt_regs to syscall handlers, others unpack it into individual function parameters. Introduce a macro to describe what a particular arch does. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220209021745.2215452-3-iii@linux.ibm.com	2022-02-09 09:48:32 -08:00
Dan Carpenter	f5bd7054f9	libbpf: Fix signedness bug in btf_dump_array_data() The btf__resolve_size() function returns negative error codes so "elem_size" must be signed for the error handling to work. Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220208071552.GB10495@kili	2022-02-09 09:48:32 -08:00
Mauricio Vásquez	1ceec54cb0	libbpf: Remove mode check in libbpf_set_strict_mode() libbpf_set_strict_mode() checks that the passed mode doesn't contain extra bits for LIBBPF_STRICT_* flags that don't exist yet. It makes it difficult for applications to disable some strict flags as something like "LIBBPF_STRICT_ALL & ~LIBBPF_STRICT_MAP_DEFINITIONS" is rejected by this check and they have to use a rather complicated formula to calculate it.[0] One possibility is to change LIBBPF_STRICT_ALL to only contain the bits of all existing LIBBPF_STRICT_* flags instead of 0xffffffff. However it's not possible because the idea is that applications compiled against older libbpf_legacy.h would still be opting into latest LIBBPF_STRICT_ALL features.[1] The other possibility is to remove that check so something like "LIBBPF_STRICT_ALL & ~LIBBPF_STRICT_MAP_DEFINITIONS" is allowed. It's what this commit does. [0]: https://lore.kernel.org/bpf/20220204220435.301896-1-mauricio@kinvolk.io/ [1]: https://lore.kernel.org/bpf/CAEf4BzaTWa9fELJLh+bxnOb0P1EMQmaRbJVG0L+nXZdy0b8G3Q@mail.gmail.com/ Fixes: 93b8952d223a ("libbpf: deprecate legacy BPF map definitions") Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220207145052.124421-2-mauricio@kinvolk.io	2022-02-09 09:48:32 -08:00
Andrii Nakryiko	d6783c28b4	README: update Ubuntu package link Point to libbpf package in Ubuntu Impish, so it doesn't expire soon. Closes: https://github.com/libbpf/libbpf/issues/451	2022-02-07 09:37:49 -08:00
Andrii Nakryiko	cdef8257a8	ci: bump LLVM_VER to 15 Another bump of latest clang version to be used during BPF selftests build. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-02-04 15:28:06 -08:00
Andrii Nakryiko	fd181bc349	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b3dddab2ff10853aa3ef70483415d07fee3034ba Checkpoint bpf-next commit: 227a0713b319e7a8605312dee1c97c97a719a9fc Baseline bpf commit: e2bcbd7769ee8f05e1b3d10848aace98973844e4 Checkpoint bpf commit: 77b1b8b43ec3c060ecf7e926a92b0f8772171046 Alexei Starovoitov (2): libbpf: Open code low level bpf commands. libbpf: Open code raw_tp_open and link_create commands. Andrii Nakryiko (2): libbpf: Stop using deprecated bpf_map__is_offload_neutral() libbpf: Deprecate forgotten btf__get_map_kv_tids() Dave Marchevsky (1): libbpf: Deprecate btf_ext rec_size APIs Delyan Kratunov (2): libbpf: Deprecate bpf_prog_test_run_xattr and bpf_prog_test_run libbpf: Deprecate priv/set_priv storage Jakub Sitnicki (1): selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads Lorenzo Bianconi (1): libbpf: Deprecate xdp_cpumap, xdp_devmap and classifier sec definitions include/uapi/linux/bpf.h \| 3 +- src/bpf.h \| 4 ++- src/btf.h \| 7 ++-- src/libbpf.c \| 19 ++++++++--- src/libbpf.h \| 7 +++- src/skel_internal.h \| 70 ++++++++++++++++++++++++++++++++++++++-- 6 files changed, 99 insertions(+), 11 deletions(-) -- 2.30.2	2022-02-04 10:27:02 -08:00
Andrii Nakryiko	47673bd255	libbpf: Deprecate forgotten btf__get_map_kv_tids() btf__get_map_kv_tids() is in the same group of APIs as btf_ext__reloc_func_info()/btf_ext__reloc_line_info() which were only used by BCC. It was not marked as deprecated in [0] by oversight. Fix that to complete [1]. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20220201014610.3522985-1-davemarchevsky@fb.com/ [1] Closes: https://github.com/libbpf/libbpf/issues/277 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220203225017.1795946-1-andrii@kernel.org	2022-02-04 10:27:02 -08:00
Delyan Kratunov	20442822c0	libbpf: Deprecate priv/set_priv storage Arbitrary storage via bpf_*__set_priv/__priv is being deprecated without a replacement ([1]). perf uses this capability, but most of that is going away with the removal of prologue generation ([2]). perf is already suppressing deprecation warnings, so the remaining cleanup will happen separately. [1]: Closes: https://github.com/libbpf/libbpf/issues/294 [2]: https://lore.kernel.org/bpf/20220123221932.537060-1-jolsa@kernel.org/ Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220203180032.1921580-1-delyank@fb.com	2022-02-04 10:27:02 -08:00
Andrii Nakryiko	a9cd83ae25	libbpf: Stop using deprecated bpf_map__is_offload_neutral() Open-code bpf_map__is_offload_neutral() logic in one place in to-be-deprecated bpf_prog_load_xattr2. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20220202225916.3313522-2-andrii@kernel.org	2022-02-04 10:27:02 -08:00
Delyan Kratunov	352c13cdee	libbpf: Deprecate bpf_prog_test_run_xattr and bpf_prog_test_run Deprecate non-extendable bpf_prog_test_run{,_xattr} in favor of OPTS-based bpf_prog_test_run_opts ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/286 Signed-off-by: Delyan Kratunov <delyank@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220202235423.1097270-5-delyank@fb.com	2022-02-04 10:27:02 -08:00
Alexei Starovoitov	a7a3a8811c	libbpf: Open code raw_tp_open and link_create commands. Open code raw_tracepoint_open and link_create used by light skeleton to be able to avoid full libbpf eventually. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220131220528.98088-4-alexei.starovoitov@gmail.com	2022-02-04 10:27:02 -08:00
Alexei Starovoitov	bd402dccaf	libbpf: Open code low level bpf commands. Open code low level bpf commands used by light skeleton to be able to avoid full libbpf eventually. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20220131220528.98088-3-alexei.starovoitov@gmail.com	2022-02-04 10:27:02 -08:00
Lorenzo Bianconi	c245b0eeaf	libbpf: Deprecate xdp_cpumap, xdp_devmap and classifier sec definitions Deprecate xdp_cpumap, xdp_devmap and classifier sec definitions. Introduce xdp/devmap and xdp/cpumap definitions according to the standard for SEC("") in libbpf: - prog_type.prog_flags/attach_place Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/5c7bd9426b3ce6a31d9a4b1f97eb299e1467fc52.1643727185.git.lorenzo@kernel.org	2022-02-04 10:27:02 -08:00
Dave Marchevsky	74dcd1bf6a	libbpf: Deprecate btf_ext rec_size APIs btf_ext__{func,line}_info_rec_size functions are used in conjunction with already-deprecated btf_ext__reloc_{func,line}_info functions. Since struct btf_ext is opaque to the user it was necessary to expose rec_size getters in the past. btf_ext__reloc_{func,line}_info were deprecated in commit 8505e8709b5ee ("libbpf: Implement generalized .BTF.ext func/line info adjustment") as they're not compatible with support for multiple programs per section. It was decided[0] that users of these APIs should implement their own .btf.ext parsing to access this data, in which case the rec_size getters are unnecessary. So deprecate them from libbpf 0.7.0 onwards. [0] Closes: https://github.com/libbpf/libbpf/issues/277 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220201014610.3522985-1-davemarchevsky@fb.com	2022-02-04 10:27:02 -08:00
Jakub Sitnicki	9f276b240b	selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads Add coverage to the verifier tests and tests for reading bpf_sock fields to ensure that 32-bit, 16-bit, and 8-bit loads from dst_port field are allowed only at intended offsets and produce expected values. While 16-bit and 8-bit access to dst_port field is straightforward, 32-bit wide loads need to be allowed and produce a zero-padded 16-bit value for backward compatibility. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/r/20220130115518.213259-3-jakub@cloudflare.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-02-04 10:27:02 -08:00
Andrii Nakryiko	9a64065733	sync: move bpf-next checkpoint to include selftests fixes There are no relevant libbpf commits, but new checkpoint commit contains important BPF selftests commits fixing CI failures from kernel repo side. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-31 23:37:33 -08:00
Evgeny Vereshchagin	ae220adbb2	ci: no longer remove elfutils while building the fuzzer Without it coverage reports can't be built ``` [2022-01-31 00:05:36,094 DEBUG] Generating file view html index file as: "/out/report/linux/file_view_index.html". Traceback (most recent call last): File "/opt/code_coverage/coverage_utils.py", line 829, in <module> sys.exit(Main()) File "/opt/code_coverage/coverage_utils.py", line 823, in Main return _CmdPostProcess(args) File "/opt/code_coverage/coverage_utils.py", line 780, in _CmdPostProcess processor.PrepareHtmlReport() File "/opt/code_coverage/coverage_utils.py", line 577, in PrepareHtmlReport self.GenerateFileViewHtmlIndexFile(per_file_coverage_summary, File "/opt/code_coverage/coverage_utils.py", line 450, in GenerateFileViewHtmlIndexFile self.GetCoverageHtmlReportPathForFile(file_path), File "/opt/code_coverage/coverage_utils.py", line 422, in GetCoverageHtmlReportPathForFile assert os.path.isfile( AssertionError: "/tmp/tmp.UYax4l19Gh/lib/system.h" is not a file. ``` It's a follow-up to `393a058d06` Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2022-01-31 15:45:11 -08:00
Andrii Nakryiko	fec0813359	sync: regenerate vmlinux.h to include TASK_COMM_LEN constant TASK_COMM_LEN is now part of vmlinux.h on latest kernel. Regenerate vmlinux.h to have it on 5.5 and 4.9 kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	1e702e8ffe	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 820e6e227c4053b6b631ae65ef1f65d560cb392b Checkpoint bpf-next commit: c446fdacb10dcb3b9a9ed3b91d91e72d71d94b03 Baseline bpf commit: baa59504c1cd0cca7d41954a45ee0b3dc78e41a0 Checkpoint bpf commit: e2bcbd7769ee8f05e1b3d10848aace98973844e4 Andrii Nakryiko (3): libbpf: hide and discourage inconsistently named getters libbpf: deprecate bpf_map__resize() libbpf: deprecate bpf_program__is_<type>() and bpf_program__set_<type>() APIs Christy Lee (2): libbpf: Mark bpf_object__open_buffer() API deprecated libbpf: Mark bpf_object__open_xattr() deprecated Kajol Jain (1): perf: Add new macros for mem_hops field Kenny Yu (2): bpf: Add bpf_copy_from_user_task() helper libbpf: Add "iter.s" section for sleepable bpf iterator programs Kenta Tada (1): libbpf: Fix the incorrect register read for syscalls on x86_64 Lorenzo Bianconi (4): bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program bpf: introduce bpf_xdp_get_buff_len helper libbpf: Add SEC name for xdp frags programs net: xdp: introduce bpf_xdp_pointer utility routine include/uapi/linux/bpf.h \| 41 +++++++++++++++++++++++++++++++++ include/uapi/linux/perf_event.h \| 5 +++- src/bpf_tracing.h \| 34 +++++++++++++++++++++++++++ src/btf.h \| 5 +--- src/libbpf.c \| 32 +++++++++++++++++-------- src/libbpf.h \| 34 ++++++++++++++++++++++++--- src/libbpf.map \| 2 ++ src/libbpf_internal.h \| 3 +++ src/libbpf_legacy.h \| 17 ++++++++++++++ 9 files changed, 155 insertions(+), 18 deletions(-) -- 2.30.2	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	bad4fa116c	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	b2a7b16287	libbpf: deprecate bpf_program__is_<type>() and bpf_program__set_<type>() APIs Not sure why these APIs were added in the first place instead of a completely generic (and not requiring constantly adding new APIs with each new BPF program type) bpf_program__type() and bpf_program__set_type() APIs. But as it is right now, there are 13 such specialized is_type/set_type APIs, while latest kernel is already at 30+ BPF program types. Instead of completing the set of APIs and keep chasing kernel's bpf_prog_type enum, deprecate existing subset and recommend generic bpf_program__type() and bpf_program__set_type() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124194254.2051434-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	638311dcb8	libbpf: deprecate bpf_map__resize() Deprecate bpf_map__resize() in favor of bpf_map__set_max_entries() setter. In addition to having a surprising name (users often don't realize that they need to use bpf_map__resize()), the name also implies some magic way of resizing BPF map after it is created, which is clearly not the case. Another minor annoyance is that bpf_map__resize() disallows 0 value for max_entries, which in some cases is totally acceptable (e.g., like for BPF perf buf case to let libbpf auto-create one buffer per each available CPU core). [0] Closes: https://github.com/libbpf/libbpf/issues/304 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124194254.2051434-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	e72ac682ae	libbpf: hide and discourage inconsistently named getters Move a bunch of "getters" into libbpf_legacy.h to keep them there in libbpf 1.0. See [0] for discussion of "Discouraged APIs". These getters don't add any maintenance burden and are simple aliases, but they are inconsistent in naming. So keep them in libbpf_legacy.h instead of libbpf.h to "hide" them in favor of preferred getters ([1]). Also add two missing getters: bpf_program__type() and bpf_program__expected_attach_type(). [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#handling-deprecation-of-apis-and-functionality [1] Closes: https://github.com/libbpf/libbpf/issues/307 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124194254.2051434-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Kenta Tada	d2818a8f2c	libbpf: Fix the incorrect register read for syscalls on x86_64 Currently, rcx is read as the fourth parameter of syscall on x86_64. But the x86_64 Linux system call convention actually uses r10. This commit adds the wrapper for users who want to access syscall params to analyze the user space. Signed-off-by: Kenta Tada <Kenta.Tada@sony.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220124141622.4378-3-Kenta.Tada@sony.com	2022-01-25 23:37:04 -08:00
Christy Lee	c0c3f46ca6	libbpf: Mark bpf_object__open_xattr() deprecated Mark bpf_object__open_xattr() as deprecated, use bpf_object__open_file() instead. [0] Closes: https://github.com/libbpf/libbpf/issues/287 Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220125010917.679975-1-christylee@fb.com	2022-01-25 23:37:04 -08:00
Christy Lee	1af0f62fac	libbpf: Mark bpf_object__open_buffer() API deprecated Deprecate bpf_object__open_buffer() API in favor of the unified opts-based bpf_object__open_mem() API. Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220125005923.418339-2-christylee@fb.com	2022-01-25 23:37:04 -08:00
Kenny Yu	04aef6ce9b	libbpf: Add "iter.s" section for sleepable bpf iterator programs This adds a new bpf section "iter.s" to allow bpf iterator programs to be sleepable. Signed-off-by: Kenny Yu <kennyyu@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124185403.468466-4-kennyyu@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Kenny Yu	a9491bb920	bpf: Add bpf_copy_from_user_task() helper This adds a helper for bpf programs to read the memory of other tasks. As an example use case at Meta, we are using a bpf task iterator program and this new helper to print C++ async stack traces for all threads of a given process. Signed-off-by: Kenny Yu <kennyyu@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220124185403.468466-3-kennyyu@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Lorenzo Bianconi	fb0b8a7cea	net: xdp: introduce bpf_xdp_pointer utility routine Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine to return a pointer to a given position in the xdp_buff if the requested area (offset + len) is contained in a contiguous memory area otherwise it will be copied in a bounce buffer provided by the caller. Similar to the tc counterpart, introduce the two following xdp helpers: - bpf_xdp_load_bytes - bpf_xdp_store_bytes Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Lorenzo Bianconi	4ae89237d4	libbpf: Add SEC name for xdp frags programs Introduce support for the following SEC entries for XDP frags property: - SEC("xdp.frags") - SEC("xdp.frags/devmap") - SEC("xdp.frags/cpumap") Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/af23b6e4841c171ad1af01917839b77847a4bc27.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Lorenzo Bianconi	9af0e19376	bpf: introduce bpf_xdp_get_buff_len helper Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer total size (linear and paged area) Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Lorenzo Bianconi	57df70e180	bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program Introduce BPF_F_XDP_HAS_FRAGS and the related field in bpf_prog_aux in order to notify the driver that the loaded program supports xdp frags. Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/db2e8075b7032a356003f407d1b0deb99adaa0ed.1642758637.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-25 23:37:04 -08:00
Kajol Jain	da06e3efcc	perf: Add new macros for mem_hops field Add new macros for mem_hops field which can be used to represent remote-node, socket and board level details. Currently the code has a macro for HOPS_0, which corresponds to data coming from another core but same node. Add new macros for HOPS_1 to HOPS_3 to represent remote-node, socket and board level data. For example, encodings for mem_hops fields with L2 cache: L2 - local L2; L2 \| REMOTE \| HOPS_0 - remote core, same node L2; L2 \| REMOTE \| HOPS_1 - remote node, same socket L2; L2 \| REMOTE \| HOPS_2 - remote socket, same board L2; L2 \| REMOTE \| HOPS_3 - remote board L2. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211206091749.87585-2-kjain@linux.ibm.com	2022-01-25 23:37:04 -08:00
Andrii Nakryiko	b4b6e4dc20	sync: start syncing perf_event.h UAPI header as well This header is necessary for libbpf-sys to generate perf-related Rust bindings. It's more convenient to have it available locally with libbpf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-25 23:37:04 -08:00
Ilya Leoshkevich	0ee425cdd7	ci: add Ubuntu instructions for s390x-self-hosted-builder There is a new Ubuntu-based builder, and setup steps for it are slightly different from what we had on the old RHEL-based one. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-01-25 17:21:08 -08:00
Ilya Leoshkevich	f1085fe3c3	ci: add s390x-self-hosted-builder's user to kvm group RHEL's podman sets /dev/kvm permissions to 0666, while Ubuntu's docker sets them to 0660. Therefore, in order to use KVM from a container, the user within must belong to the kvm group. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2022-01-25 17:21:08 -08:00
Evgeny Vereshchagin	393a058d06	tests: move the fuzzer upstream It should make it easier to start using CFLite or something like that to fuzz libbpf without getting pointless CVEs :-) More importantly, now it's possible to build the fuzzer by just cloning the repository, installing clang and running `./scripts/build-fuzzers.sh`: ``` git clone https://github.com/libbpf/libbpf ./scripts/build-fuzzers.sh unzip -d CORPUS fuzz/bpf-object-fuzzer_seed_corpus.zip ./out/bpf-object-fuzzer CORPUS ``` It should make it easier (for me at least) to report some elfutils bugs because they are much easier to reproduce manually now.	2022-01-24 15:37:36 -08:00
Andrii Nakryiko	3febb8a165	ci: update s390x blacklist Add bpf_mod_race and bpf_nf selftests to blacklist for s390x. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	5daafdccf9	ci: regenerate and check in latest vmlinux.h Add a bunch of new kernel types that are needed for successful selftest compilation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	78e816a15d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e80f2a0d194605553315de68284fc41969f81f62 Checkpoint bpf-next commit: 820e6e227c4053b6b631ae65ef1f65d560cb392b Baseline bpf commit: 343e53754b21ae45530623222aa079fecd3cf942 Checkpoint bpf commit: baa59504c1cd0cca7d41954a45ee0b3dc78e41a0 Andrii Nakryiko (2): libbpf: deprecate legacy BPF map definitions libbpf: streamline low-level XDP APIs Kui-Feng Lee (1): libbpf: Improve btf__add_btf() with an additional hashmap for strings. Toke Høiland-Jørgensen (1): libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation errors Usama Arif (1): uapi/bpf: Add missing description and returns for helper documentation YiFei Zhu (1): bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return value include/uapi/linux/bpf.h \| 33 +++++++++++ src/bpf_helpers.h \| 2 +- src/btf.c \| 31 ++++++++++- src/btf.h \| 22 +++++++- src/libbpf.c \| 8 +++ src/libbpf.h \| 29 ++++++++++ src/libbpf.map \| 4 ++ src/libbpf_legacy.h \| 5 ++ src/netlink.c \| 117 ++++++++++++++++++++++++++++----------- 9 files changed, 215 insertions(+), 36 deletions(-) -- 2.30.2	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	d2ea0e2d03	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	8fbe7eec3a	libbpf: streamline low-level XDP APIs Introduce 4 new netlink-based XDP APIs for attaching, detaching, and querying XDP programs: - bpf_xdp_attach; - bpf_xdp_detach; - bpf_xdp_query; - bpf_xdp_query_id. These APIs replace bpf_set_link_xdp_fd, bpf_set_link_xdp_fd_opts, bpf_get_link_xdp_id, and bpf_get_link_xdp_info APIs ([0]). The latter don't follow a consistent naming pattern and some of them use non-extensible approaches (e.g., struct xdp_link_info which can't be modified without breaking libbpf ABI). The approach I took with these low-level XDP APIs is similar to what we did with low-level TC APIs. There is a nice duality of bpf_tc_attach vs bpf_xdp_attach, and so on. I left bpf_xdp_attach() to support detaching when -1 is specified for prog_fd for generality and convenience, but bpf_xdp_detach() is preferred due to clearer naming and associated semantics. Both bpf_xdp_attach() and bpf_xdp_detach() accept the same opts struct allowing to specify expected old_prog_fd. While doing the refactoring, I noticed that old APIs require users to specify opts with old_fd == -1 to declare "don't care about already attached XDP prog fd" condition. Otherwise, FD 0 is assumed, which is essentially never an intended behavior. So I made this behavior consistent with other kernel and libbpf APIs, in which zero FD means "no FD". This seems to be more in line with the latest thinking in BPF land and should cause less user confusion, hopefully. For querying, I left two APIs, both more generic bpf_xdp_query() allowing to query multiple IDs and attach mode, but also a specialization of it, bpf_xdp_query_id(), which returns only requested prog_id. Uses of prog_id returning bpf_get_link_xdp_id() were so prevalent across selftests and samples, that it seemed a very common use case and using bpf_xdp_query() for doing it felt very cumbersome with a highly branched if/else chain based on flags and attach mode. Old APIs are scheduled for deprecation in libbpf 0.8 release. [0] Closes: https://github.com/libbpf/libbpf/issues/309 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20220120061422.2710637-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	d788cd57b5	libbpf: deprecate legacy BPF map definitions Enact deprecation of legacy BPF map definition in SEC("maps") ([0]). For the definitions themselves introduce LIBBPF_STRICT_MAP_DEFINITIONS flag for libbpf strict mode. If it is set, error out on any struct bpf_map_def-based map definition. If not set, libbpf will print out a warning for each legacy BPF map to raise awareness that it goes away. For any use of BPF_ANNOTATE_KV_PAIR() macro providing a legacy way to associate BTF key/value type information with legacy BPF map definition, warn through libbpf's pr_warn() error message (but don't fail BPF object open). BPF-side struct bpf_map_def is marked as deprecated. User-space struct bpf_map_def has to be used internally in libbpf, so it is left untouched. It should be enough for bpf_map__def() to be marked deprecated to raise awareness that it goes away. bpftool is an interesting case that utilizes libbpf to open BPF ELF object to generate skeleton. As such, even though bpftool itself uses full on strict libbpf mode (LIBBPF_STRICT_ALL), it has to relax it a bit for BPF map definition handling to minimize unnecessary disruptions. So opt-out of LIBBPF_STRICT_MAP_DEFINITIONS for bpftool. User's code that will later use generated skeleton will make its own decision whether to enforce LIBBPF_STRICT_MAP_DEFINITIONS or not. There are few tests in selftests/bpf that are consciously using legacy BPF map definitions to test libbpf functionality. For those, temporary opt out of LIBBPF_STRICT_MAP_DEFINITIONS mode for the duration of those tests. [0] Closes: https://github.com/libbpf/libbpf/issues/272 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220120060529.1890907-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 16:54:32 -08:00
YiFei Zhu	ab022c8eb4	bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return value The helpers continue to use int for retval because all the hooks are int-returning rather than long-returning. The return value of bpf_set_retval is int for future-proofing, in case in the future there may be errors trying to set the retval. After the previous patch, if a program rejects a syscall by returning 0, an -EPERM will be generated no matter if the retval is already set to -err. This patch changes it to be forced only if retval is not -err. This is because we want to support, for example, invoking bpf_set_retval(-EINVAL) and return 0, and have the syscall return value be -EINVAL not -EPERM. For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is that, if the return value is NET_XMIT_DROP, the packet is silently dropped. We preserve this behavior for backward compatibility reasons, so even if an errno is set, the errno does not return to caller. However, setting a non-err to retval cannot propagate so this is not allowed and we return a -EFAULT in that case. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/b4013fd5d16bed0b01977c1fafdeae12e1de61fb.1639619851.git.zhuyifei@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2022-01-21 16:54:32 -08:00
Kui-Feng Lee	75c7c722f5	libbpf: Improve btf__add_btf() with an additional hashmap for strings. Add a hashmap to map the string offsets from a source btf to the string offsets from a target btf to reduce overheads. btf__add_btf() calls btf__add_str() to add strings from a source to a target btf. It causes many string comparisons, and it is a major hotspot when adding a big btf. btf__add_str() uses strcmp() to check if a hash entry is the right one. The extra hashmap here compares offsets of strings, that are much cheaper. It remembers the results of btf__add_str() for later uses to reduce the cost. We are parallelizing BTF encoding for pahole by creating separated btf instances for worker threads. These per-thread btf instances will be added to the btf instance of the main thread by calling btf__add_str() to deduplicate and write out. With this patch and -j4, the running time of pahole drops to about 6.0s from 6.6s. The following lines are the summary of 'perf stat' w/o the change. 6.668126396 seconds time elapsed 13.451054000 seconds user 0.715520000 seconds sys The following lines are the summary w/ the change. 5.986973919 seconds time elapsed 12.939903000 seconds user 0.724152000 seconds sys V4 fixes a bug of error checking against the pointer returned by hashmap__new(). [v3] https://lore.kernel.org/bpf/20220118232053.2113139-1-kuifeng@fb.com/ [v2] https://lore.kernel.org/bpf/20220114193713.461349-1-kuifeng@fb.com/ Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220119180214.255634-1-kuifeng@fb.com	2022-01-21 16:54:32 -08:00
Usama Arif	0f99d97a9c	uapi/bpf: Add missing description and returns for helper documentation Both description and returns section will become mandatory for helpers and syscalls in a later commit to generate man pages. This commit also adds in the documentation that BPF_PROG_RUN is an alias for BPF_PROG_TEST_RUN for anyone searching for the syscall in the generated man pages. Signed-off-by: Usama Arif <usama.arif@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220119114442.1452088-1-usama.arif@bytedance.com	2022-01-21 16:54:32 -08:00
Toke Høiland-Jørgensen	8404d1396c	libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation errors The btf.h header included with libbpf contains inline helper functions to check for various BTF kinds. These helpers directly reference the BTF_KIND_* constants defined in the kernel header, and because the header file is included in user applications, this happens in the user application compile units. This presents a problem if a user application is compiled on a system with older kernel headers because the constants are not available. To avoid this, add #defines of the constants directly in btf.h before using them. Since the kernel header moved to an enum for BTF_KIND_*, the #defines can shadow the enum values without any errors, so we only need #ifndef guards for the constants that predate the conversion to enum. We group these so there's only one guard for groups of values that were added together. [0] Closes: https://github.com/libbpf/libbpf/issues/436 Fixes: 223f903e9c83 ("bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG") Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/bpf/20220118141327.34231-1-toke@redhat.com	2022-01-21 16:54:32 -08:00
Andrii Nakryiko	be89b28f96	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 44bab87d8ca6f0544a9f8fc97bdf33aa5b3c899e Checkpoint bpf-next commit: e80f2a0d194605553315de68284fc41969f81f62 Baseline bpf commit: d6d86830705f173fca6087a3e67ceaf68db80523 Checkpoint bpf commit: 343e53754b21ae45530623222aa079fecd3cf942 Christy Lee (2): libbpf: Rename bpf_prog_attach_xattr() to bpf_prog_attach_opts() libbpf: Deprecate bpf_map__def() API Coco Li (1): gro: add ability to control gro max packet size Mauricio Vásquez (1): libbpf: Use IS_ERR_OR_NULL() in hashmap__free() Yafang Shao (1): libbpf: Fix possible NULL pointer dereference when destroying skeleton include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 9 +++++++-- src/bpf.h \| 4 ++++ src/hashmap.c \| 3 +-- src/libbpf.c \| 3 +++ src/libbpf.h \| 3 ++- src/libbpf.map \| 1 + 7 files changed, 19 insertions(+), 5 deletions(-) -- 2.30.2	2022-01-14 22:08:26 -08:00
Christy Lee	7b8e97bffc	libbpf: Deprecate bpf_map__def() API All fields accessed via bpf_map_def can now be accessed via appropriate getters and setters. Mark bpf_map__def() API as deprecated. Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220108004218.355761-6-christylee@fb.com	2022-01-14 22:08:26 -08:00
Yafang Shao	5b6dfd7f6b	libbpf: Fix possible NULL pointer dereference when destroying skeleton When I checked the code in skeleton header file generated with my own bpf prog, I found there may be possible NULL pointer dereference when destroying skeleton. Then I checked the in-tree bpf progs, finding that is a common issue. Let's take the generated samples/bpf/xdp_redirect_cpu.skel.h for example. Below is the generated code in xdp_redirect_cpu__create_skeleton(): xdp_redirect_cpu__create_skeleton struct bpf_object_skeleton *s; s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s)); if (!s) goto error; ... error: bpf_object__destroy_skeleton(s); return -ENOMEM; After goto error, the NULL 's' will be dereferenced in bpf_object__destroy_skeleton(). We can simply fix this issue by just adding a NULL check in bpf_object__destroy_skeleton(). Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220108134739.32541-1-laoar.shao@gmail.com	2022-01-14 22:08:26 -08:00
Christy Lee	e0de05d1b1	libbpf: Rename bpf_prog_attach_xattr() to bpf_prog_attach_opts() All xattr APIs are being dropped, let's converge to the convention used in high-level APIs and rename bpf_prog_attach_xattr to bpf_prog_attach_opts. Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220107184604.3668544-2-christylee@fb.com	2022-01-14 22:08:26 -08:00
Mauricio Vásquez	d5daf275c7	libbpf: Use IS_ERR_OR_NULL() in hashmap__free() hashmap__new() uses ERR_PTR() to return an error so it's better to use IS_ERR_OR_NULL() in order to check the pointer before calling free(). This will prevent freeing an invalid pointer if somebody calls hashmap__free() with the result of a failed hashmap__new() call. Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20220107152620.192327-1-mauricio@kinvolk.io	2022-01-14 22:08:26 -08:00
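The error-pointer convention behind the hashmap__free() fix above can be sketched in a few lines. This is a minimal illustration, not libbpf's code: `err_ptr`, `is_err`, `is_err_or_null`, `MAX_ERRNO`, and the `toy_map` type are hypothetical stand-ins for the kernel-style helpers, and the sketch assumes the usual encoding where a small negative errno is stored in the pointer value itself.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel-style error-pointer helpers. */
#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the top MAX_ERRNO addresses. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static inline int is_err_or_null(const void *p)
{
	return !p || is_err(p);
}

/* Toy container used only to demonstrate the pattern. */
struct toy_map {
	size_t cap;
	int *slots;
};

/* Returns a valid map, or an encoded error pointer on failure. */
struct toy_map *toy_map_new(size_t cap)
{
	struct toy_map *m;

	if (cap == 0)
		return err_ptr(-EINVAL);
	m = calloc(1, sizeof(*m));
	if (!m)
		return err_ptr(-ENOMEM);
	m->slots = calloc(cap, sizeof(*m->slots));
	if (!m->slots) {
		free(m);
		return err_ptr(-ENOMEM);
	}
	m->cap = cap;
	return m;
}

/* Safe to call with NULL or with the result of a failed toy_map_new(). */
void toy_map_free(struct toy_map *m)
{
	if (is_err_or_null(m))
		return;
	free(m->slots);
	free(m);
}
```

With only a NULL check, `toy_map_free(toy_map_new(0))` would pass an encoded `-EINVAL` to `free()`; checking for both cases makes the destructor safe to call unconditionally on the constructor's result.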
Coco Li	cde5b418dd	gro: add ability to control gro max packet size Eric Dumazet suggested to allow users to modify max GRO packet size. We have seen GRO being disabled by users of appliances (such as wifi access points) because of claimed bufferbloat issues, or some workarounds in sch_cake, to split GRO/GSO packets. Instead of disabling GRO completely, one can choose to limit the maximum packet size of GRO packets, depending on their latency constraints. This patch adds a per device gro_max_size attribute that can be changed with ip link command. ip link set dev eth0 gro_max_size 16000 Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Coco Li <lixiaoyan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-01-14 22:08:26 -08:00
Adam Jensen	060c8a99c4	include: Include linux/stddef.h This fixes the build in environments such as Alpine Linux. See [0] for discussion. [0] https://github.com/libbpf/libbpf/pull/41 Signed-off-by: Adam Jensen	2022-01-14 12:21:53 -08:00
Kumar Kartikeya Dwivedi	22411acc4b	ci: Add userfaultfd kernel config Add necessary kernel config values to run BPF mod race test in conntrack-bpf series. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>	2022-01-11 12:40:59 -08:00
Andrii Nakryiko	e99f34e144	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ecf45e60a62dfeb65658abac02f0bdb45b786911 Checkpoint bpf-next commit: 44bab87d8ca6f0544a9f8fc97bdf33aa5b3c899e Baseline bpf commit: 819d11507f6637731947836e6308f5966d64cf9d Checkpoint bpf commit: d6d86830705f173fca6087a3e67ceaf68db80523 Andrii Nakryiko (3): libbpf: Normalize PT_REGS_xxx() macro definitions libbpf: Use 100-character limit to make bpf_tracing.h easier to read libbpf: Improve LINUX_VERSION_CODE detection Christy Lee (3): libbpf: Deprecate bpf_perf_event_read_simple() API libbpf 1.0: Deprecate bpf_map__is_offload_neutral() libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API Grant Seltzer (1): libbpf: Add documentation for bpf_map batch operations Qiang Wang (2): libbpf: Use probe_name for legacy kprobe libbpf: Support repeated legacy kprobes on same function src/bpf.c \| 8 +- src/bpf.h \| 115 ++++++++++- src/bpf_tracing.h \| 431 +++++++++++++++++------------------------- src/libbpf.c \| 56 ++++-- src/libbpf.h \| 5 +- src/libbpf_internal.h \| 2 + src/libbpf_probes.c \| 16 -- 7 files changed, 342 insertions(+), 291 deletions(-) -- 2.30.2	2022-01-06 16:20:54 -08:00
Grant Seltzer	4449d71509	libbpf: Add documentation for bpf_map batch operations This adds documentation for: - bpf_map_delete_batch() - bpf_map_lookup_batch() - bpf_map_lookup_and_delete_batch() - bpf_map_update_batch() This also updates the public API for the `keys` parameter of `bpf_map_delete_batch()`, and both the `keys` and `values` parameters of `bpf_map_update_batch()` to be constants. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220106201304.112675-1-grantseltzer@gmail.com	2022-01-06 16:20:54 -08:00
Christy Lee	12932191c6	libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API API created with simplistic assumptions about BPF map definitions. It hasn’t worked for a while, deprecate it in preparation for libbpf 1.0. [0] Closes: https://github.com/libbpf/libbpf/issues/302 Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220105003120.2222673-1-christylee@fb.com	2022-01-06 16:20:54 -08:00
Christy Lee	8440112546	libbpf 1.0: Deprecate bpf_map__is_offload_neutral() Deprecate bpf_map__is_offload_neutral(). It’s most probably broken already. PERF_EVENT_ARRAY isn’t the only map that’s not suitable for hardware offloading. Applications can directly check map type instead. [0] Closes: https://github.com/libbpf/libbpf/issues/306 Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220105000601.2090044-1-christylee@fb.com	2022-01-06 16:20:54 -08:00
Qiang Wang	b0c3d7133f	libbpf: Support repeated legacy kprobes on same function If legacy kprobes are attached repeatedly to the same function in one process, libbpf registers them under the same probe name and gets an -EBUSY error. So append index to the probe name format to fix this problem. Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211227130713.66933-2-wangqiang.wq.frank@bytedance.com	2022-01-06 16:20:54 -08:00
Qiang Wang	c2f2c26cb2	libbpf: Use probe_name for legacy kprobe Fix a bug in commit 46ed5fc33db9, which wrongly used the func_name instead of probe_name to register legacy kprobe. Fixes: 46ed5fc33db9 ("libbpf: Refactor and simplify legacy kprobe code") Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20211227130713.66933-1-wangqiang.wq.frank@bytedance.com	2022-01-06 16:20:54 -08:00
Christy Lee	2e52e09bc2	libbpf: Deprecate bpf_perf_event_read_simple() API With perf_buffer__poll() and perf_buffer__consume() APIs available, there is no reason to expose bpf_perf_event_read_simple() API to users. If users need custom perf buffer, they could re-implement the function. Mark bpf_perf_event_read_simple() as deprecated and move the logic to a new static function so it can still be called by other functions in the same file. [0] Closes: https://github.com/libbpf/libbpf/issues/310 Signed-off-by: Christy Lee <christylee@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211229204156.13569-1-christylee@fb.com	2022-01-06 16:20:54 -08:00
Andrii Nakryiko	0171976dc5	libbpf: Improve LINUX_VERSION_CODE detection Ubuntu reports incorrect kernel version through uname(), which on older kernels leads to kprobe BPF programs failing to load due to the version check mismatch. Accommodate Ubuntu's quirks with LINUX_VERSION_CODE by using Ubuntu-specific /proc/version_signature to fetch major/minor/patch versions to form LINUX_VERSION_CODE. While at it, consolidate libbpf's kernel version detection code between libbpf.c and libbpf_probes.c. [0] Closes: https://github.com/libbpf/libbpf/issues/421 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222231003.2334940-1-andrii@kernel.org	2022-01-06 16:20:54 -08:00
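As a rough illustration of how major/minor/patch numbers are packed into a LINUX_VERSION_CODE-style integer, the sketch below parses a release string and combines the three fields. It is not libbpf's implementation: `KVER`, `parse_kernel_version`, and `parse_ubuntu_signature` are hypothetical names, the packing assumes the conventional `(major << 16) + (minor << 8) + patch` layout with the patch level clamped to 255, and the Ubuntu line format ("distro, package version, upstream version") is an assumption.

```c
#include <stdio.h>

/* Hypothetical macro mirroring the conventional KERNEL_VERSION packing;
 * the patch level is clamped so it cannot overflow into the minor field. */
#define KVER(major, minor, patch) \
	(((major) << 16) + ((minor) << 8) + ((patch) > 255 ? 255 : (patch)))

/* Parse a uname()-style release such as "5.4.0-42-generic".
 * Returns the packed version, or 0 if the string cannot be parsed. */
unsigned int parse_kernel_version(const char *release)
{
	unsigned int major, minor, patch;

	if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) != 3)
		return 0;
	return KVER(major, minor, patch);
}

/* Parse a line assumed to look like "Ubuntu 5.4.0-12.15-generic 5.4.8":
 * skip the first two words and read the trailing upstream version.
 * Returns the packed version, or 0 if the line cannot be parsed. */
unsigned int parse_ubuntu_signature(const char *line)
{
	unsigned int major, minor, patch;

	if (sscanf(line, "%*s %*s %u.%u.%u", &major, &minor, &patch) != 3)
		return 0;
	return KVER(major, minor, patch);
}
```

A caller would typically try the distribution-specific source first and fall back to the uname() release only when that source is missing or unparsable.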
Andrii Nakryiko	3f592a59d7	libbpf: Use 100-character limit to make bpf_tracing.h easier to read Improve bpf_tracing.h's macro definition readability by keeping them single-line and better aligned. This makes it easier to follow all those variadic patterns. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-2-andrii@kernel.org	2022-01-06 16:20:54 -08:00
Andrii Nakryiko	0557ad0a9c	libbpf: Normalize PT_REGS_xxx() macro definitions Refactor PT_REGS macros definitions in bpf_tracing.h to avoid excessive duplication. We currently have classic PT_REGS_xxx() and CO-RE-enabled PT_REGS_xxx_CORE(). We are about to add also _SYSCALL variants, which would require excessive copying of all the per-architecture definitions. Instead, separate architecture-specific field/register names from the final macro that utilize them. That way for upcoming _SYSCALL variants we'll be able to just define x86_64 exception and otherwise have one common set of _SYSCALL macro definitions common for all architectures. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20211222213924.1869758-1-andrii@kernel.org	2022-01-06 16:20:54 -08:00
Kumar Kartikeya Dwivedi	7c382f0df9	ci: Add conntrack kernel config Add necessary kernel config values to test BPF kfunc functionality for netfilter's conntrack subsystem. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>	2022-01-05 12:58:11 -08:00
Chris Tarazi	ceba6a788a	travis-ci/rootfs: Fix mount(8) invocation for Arch Linux Given that the rootfs for Arch Linux uses the busybox variant of mount(8), the `-l` option doesn't exist on that binary and gives the following error msg with version v1.34.1 of busybox when invoking mkrootfs_arch.sh: ``` ... [ 0.781471] random: fast init done starting pid 72, tty '': '/etc/init.d/rcS' + for path in /etc/rcS.d/S* + '[' -x /etc/rcS.d/S10-mount ']' + /etc/rcS.d/S10-mount + /bin/mount proc /proc -t proc ++ /bin/mount -l -t devtmpfs /bin/mount: unrecognized option: l ... ``` This prevented me from generating a rootfs. This is fixed by removing the `-l`, as plainly invoking `mount -t devtmpfs` returns the same output as `mount -l ...` on the non-busybox variant (on Arch Linux, it comes from the `util-linux` package). After this change, I was able to run `./tools/testing/selftests/bpf/vmtest.sh -i` (from the kernel src; with modification to the script to pick up my locally-generated .zstd of the new rootfs) and it worked. Signed-off-by: Chris Tarazi <tarazichris@gmail.com>	2022-01-05 12:57:27 -08:00
grantseltzer	bf7aacea49	Fix comparison operator in API documentation	2022-01-04 21:17:07 -08:00
Andrii Nakryiko	af2da673d8	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e967a20a8fabc6442a78e2e2059e63a4bb6aed08 Checkpoint bpf-next commit: ecf45e60a62dfeb65658abac02f0bdb45b786911 Baseline bpf commit: 819d11507f6637731947836e6308f5966d64cf9d Checkpoint bpf commit: 819d11507f6637731947836e6308f5966d64cf9d Jiri Olsa (1): libbpf: Do not use btf_dump__new() macro in C++ mode src/btf.h \| 6 ++++++ 1 file changed, 6 insertions(+) -- 2.30.2	2021-12-23 20:00:21 -08:00
Jiri Olsa	1321a8bb49	libbpf: Do not use btf_dump__new() macro in C++ mode As reported in here [0], C++ compilers don't support __builtin_types_compatible_p(), so at least don't screw up compilation for them and let C++ users pick btf_dump__new vs btf_dump__new_deprecated explicitly. [0] https://github.com/libbpf/libbpf/issues/283#issuecomment-986100727 Fixes: 6084f5dc928f ("libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211223131736.483956-1-jolsa@kernel.org	2021-12-23 14:17:35 +01:00
grantseltzer	287d0d097b	Add documentation for error checking in API Signed-off-by: grantseltzer <grantseltzer@gmail.com>	2021-12-23 19:59:03 -08:00
Yucong Sun	9fab7c81ec	ci: Add a step to patch kernel with temporary fixes Apply a custom set of patches against bpf-next kernel tree before building vmlinux image.	2021-12-20 20:31:42 -08:00
Andrii Nakryiko	96268bf0c2	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a34efe503bc55c5732e328e5191ad549eb899f31 Checkpoint bpf-next commit: e967a20a8fabc6442a78e2e2059e63a4bb6aed08 Baseline bpf commit: f7abc4c8df8c7930d0b9c56d9abee9a1fca635e9 Checkpoint bpf commit: 819d11507f6637731947836e6308f5966d64cf9d Andrii Nakryiko (2): libbpf: Avoid reading past ELF data section end when copying license libbpf: Rework feature-probing APIs src/libbpf.c \| 5 +- src/libbpf.h \| 52 +++++++++- src/libbpf.map \| 3 + src/libbpf_probes.c \| 235 ++++++++++++++++++++++++++++++++++---------- 4 files changed, 240 insertions(+), 55 deletions(-) -- 2.30.2	2021-12-17 17:13:53 -08:00
Andrii Nakryiko	168cf9b8ae	libbpf: Rework feature-probing APIs Create three extensible alternatives to inconsistently named feature-probing APIs: - libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type(); - libbpf_probe_bpf_map_type() instead of bpf_probe_map_type(); - libbpf_probe_bpf_helper() instead of bpf_probe_helper(). Set up return values such that libbpf can report errors (e.g., if some combination of input arguments isn't possible to validate, etc), in addition to whether the feature is supported (return value 1) or not supported (return value 0). Also schedule deprecation of those three APIs. Also schedule deprecation of bpf_probe_large_insn_limit(). Also fix all the existing detection logic for various program and map types that never worked: - BPF_PROG_TYPE_LIRC_MODE2; - BPF_PROG_TYPE_TRACING; - BPF_PROG_TYPE_LSM; - BPF_PROG_TYPE_EXT; - BPF_PROG_TYPE_SYSCALL; - BPF_PROG_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_STRUCT_OPS; - BPF_MAP_TYPE_BLOOM_FILTER. Above prog/map types needed special setups and detection logic to work. Subsequent patch adds selftests that will make sure that all the detection logic keeps working for all current and future program and map types, avoiding otherwise inevitable bit rot. [0] Closes: https://github.com/libbpf/libbpf/issues/312 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Cc: Julia Kartseva <hex@fb.com> Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org	2021-12-17 17:13:53 -08:00
Andrii Nakryiko	8e706ddc6c	libbpf: Avoid reading past ELF data section end when copying license Fix possible read beyond ELF "license" data section if the license string is not properly zero-terminated. Use the fact that libbpf_strlcpy never accesses the (N-1)st byte of the source string because it's replaced with '\0' anyways. If this happens, it's a violation of contract between libbpf and a user, but not handling this more robustly upsets CIFuzz, so given the fix is trivial, let's fix the potential issue. Fixes: 9fc205b413b3 ("libbpf: Add sane strncpy alternative and use it internally") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211214232054.3458774-1-andrii@kernel.org	2021-12-17 17:13:53 -08:00
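The property the license fix above relies on, a bounded copy that never reads the last byte of an N-byte source window, can be sketched as follows. This is an illustrative helper under assumed semantics, not libbpf's libbpf_strlcpy; the name `bounded_strcpy` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical bounded copy: writes at most sz bytes to dst, always
 * NUL-terminates when sz > 0, and reads at most sz - 1 bytes from src.
 * The byte at src[sz - 1] is never touched, because that position in
 * dst is reserved for the terminator anyway. */
void bounded_strcpy(char *dst, const char *src, size_t sz)
{
	size_t i;

	if (sz == 0)
		return;

	sz--; /* reserve the last destination byte for '\0' */
	for (i = 0; i < sz && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
}
```

Because the loop condition tests `i < sz` before dereferencing `src[i]`, a source buffer of exactly N bytes with no terminator is still copied safely when N is passed as the size: only its first N - 1 bytes are read.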
Andrii Nakryiko	dc49f2d07b	ci: add LIRC kernel config Add necessary kernel config values to make BPF_PROG_TYPE_LIRC_MODE2 programs work. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-12-16 16:44:52 -08:00
Andrii Nakryiko	19656636a9	vmtest: blacklist bpf_loop and get_func_args_test for s390x bpf_loop is using arch-specific sys_nanosleep attach function. get_func_args_test relies on BPF trampoline. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-12-14 17:06:30 -08:00
Andrii Nakryiko	61acde2308	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 229fae38d0fc0d6ff58d57cbeb1432da55e58d4f Checkpoint bpf-next commit: a34efe503bc55c5732e328e5191ad549eb899f31 Baseline bpf commit: 0be2516f865f5a876837184a8385163ff64a5889 Checkpoint bpf commit: f7abc4c8df8c7930d0b9c56d9abee9a1fca635e9 Alexei Starovoitov (1): libbpf: Fix gen_loader assumption on number of programs. Andrii Nakryiko (4): libbpf: Don't validate TYPE_ID relo's original imm value libbpf: Fix potential uninit memory read libbpf: Add sane strncpy alternative and use it internally libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF Grant Seltzer (1): libbpf: Add doc comments for bpf_program__(un)pin() Hangbin Liu (1): Bonding: add arp_missed_max option Hou Tao (1): bpf: Add bpf_strncmp helper Jiri Olsa (1): bpf: Add get_func_[arg\|ret\|arg_cnt] helpers Kui-Feng Lee (1): libbpf: Mark bpf_object__find_program_by_title API deprecated. include/uapi/linux/bpf.h \| 39 +++++++++++++++++ include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 85 +++++++++++++++++++++++++++++++++++- src/bpf.h \| 2 + src/btf_dump.c \| 4 +- src/gen_loader.c \| 12 +++-- src/libbpf.c \| 55 +++++------------------ src/libbpf.h \| 25 +++++++++++ src/libbpf.map \| 1 + src/libbpf_internal.h \| 58 ++++++++++++++++++++++++ src/libbpf_legacy.h \| 12 ++++- src/relo_core.c \| 20 ++++++--- src/xsk.c \| 9 ++-- 13 files changed, 260 insertions(+), 63 deletions(-) -- 2.30.2	2021-12-14 17:06:30 -08:00
Andrii Nakryiko	266e897ad2	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-12-14 17:06:30 -08:00
Kui-Feng Lee	7152ecf163	libbpf: Mark bpf_object__find_program_by_title API deprecated. Deprecate this API since v0.7. All callers should move to bpf_object__find_program_by_name if possible, otherwise use bpf_object__for_each_program to find a program out from a given section. [0] Closes: https://github.com/libbpf/libbpf/issues/292 Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211214035931.1148209-5-kuifeng@fb.com	2021-12-14 17:06:30 -08:00
Andrii Nakryiko	216eaa760e	libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is one of the first extremely frustrating gotchas that all new BPF users go through and in some cases have to learn it a very hard way. Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem dropped the dependency on memlock and uses memcg-based memory accounting instead. Unfortunately, detecting memcg-based BPF memory accounting is far from trivial (as can be evidenced by this patch), so in practice most BPF applications still do unconditional RLIMIT_MEMLOCK increase. As we move towards libbpf 1.0, it would be good to allow users to forget about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment automatically. This patch paves the way forward in this matter. Libbpf will do feature detection of memcg-based accounting, and if detected, will do nothing. But if the kernel is too old, just like BCC, libbpf will automatically increase RLIMIT_MEMLOCK on behalf of user application ([0]). As this is technically a breaking change, during the transition period applications have to opt into libbpf 1.0 mode by setting LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling libbpf_set_strict_mode(). Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be called before the first bpf_prog_load(), bpf_btf_load(), or bpf_object__load() call, otherwise it has no effect and will return -EBUSY. [0] Closes: https://github.com/libbpf/libbpf/issues/369 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211214195904.1785155-2-andrii@kernel.org	2021-12-14 17:06:30 -08:00
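The commit above describes libbpf raising RLIMIT_MEMLOCK on behalf of the application when the kernel lacks memcg-based accounting. The underlying mechanism is an ordinary `getrlimit()`/`setrlimit()` pair. A minimal sketch of that mechanism follows; the helper name `bump_memlock_soft` is hypothetical and this is not libbpf's code. It only raises the soft limit, capped at the hard limit, so that it also works without privileges.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of libbpf): raise the soft RLIMIT_MEMLOCK
 * towards `want`, never exceeding the current hard limit.
 * Returns 0 on success, -1 on failure.
 */
static int bump_memlock_soft(rlim_t want)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_MEMLOCK, &lim) != 0)
		return -1;

	/* Cap the request at the hard limit; raising the hard limit needs privileges. */
	if (lim.rlim_max != RLIM_INFINITY &&
	    (want == RLIM_INFINITY || want > lim.rlim_max))
		want = lim.rlim_max;

	/* Nothing to do if the soft limit already covers the request. */
	if (lim.rlim_cur == RLIM_INFINITY ||
	    (want != RLIM_INFINITY && lim.rlim_cur >= want))
		return 0;

	lim.rlim_cur = want;
	return setrlimit(RLIMIT_MEMLOCK, &lim);
}
```

Calling `bump_memlock_soft(RLIM_INFINITY)` before loading BPF objects approximates the "unconditional increase" that the commit message says most applications perform today.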
Andrii Nakryiko	a4e725f8f5	libbpf: Add sane strncpy alternative and use it internally strncpy() has notoriously error-prone semantics which make GCC complain about it a lot (and quite often completely falsely at that). Instead of pleasing GCC all the time (-Wno-stringop-truncation is unfortunately only supported by GCC, so it's a bit too messy to just enable it in Makefile), add libbpf-internal libbpf_strlcpy() helper which follows what FreeBSD's strlcpy() does and what most people would expect from strncpy(): copies up to N-1 first bytes from source string into destination string and ensures zero-termination afterwards. Replace all the relevant uses of strncpy/strncat/memcpy in libbpf with libbpf_strlcpy(). This also fixes the issue reported by Emmanuel Deloget in xsk.c where memcpy() could access source string beyond its end. Fixes: 2f6324a3937f8 (libbpf: Support shared umems between queues and devices) Reported-by: Emmanuel Deloget <emmanuel.deloget@eho.link> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211211004043.2374068-1-andrii@kernel.org	2021-12-14 17:06:30 -08:00
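The behaviour described above (copy at most N-1 bytes, always zero-terminate) can be sketched as follows. The name `sketch_strlcpy` is illustrative and the body is written from the description in the commit message, not copied from libbpf's internal `libbpf_strlcpy()`.

```c
#include <stddef.h>

/*
 * Illustrative strlcpy-style copy: writes at most sz - 1 bytes from src
 * into dst and always zero-terminates dst when sz > 0.
 * Never reads src past its terminating NUL.
 */
static void sketch_strlcpy(char *dst, const char *src, size_t sz)
{
	size_t i;

	if (sz == 0)
		return;

	for (i = 0; i < sz - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
}
```

Unlike `strncpy()`, the destination is always a valid C string afterwards, and unlike a fixed-size `memcpy()`, the source is never read beyond its terminator.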
Andrii Nakryiko	df5689f1c8	libbpf: Fix potential uninit memory read In case of BPF_CORE_TYPE_ID_LOCAL we fill out target result explicitly. But targ_res itself isn't initialized in such a case, and subsequent call to bpf_core_patch_insn() might read uninitialized field (like fail_memsz_adjust in this case). So ensure that targ_res is zero-initialized for BPF_CORE_TYPE_ID_LOCAL case. This was reported by Coverity static analyzer. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211214010032.3843804-1-andrii@kernel.org	2021-12-14 17:06:30 -08:00
Grant Seltzer	6894f573d2	libbpf: Add doc comments for bpf_program__(un)pin() This adds doc comments for the two bpf_program pinning functions, bpf_program__pin() and bpf_program__unpin() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211209232222.541733-1-grantseltzer@gmail.com	2021-12-14 17:06:30 -08:00
Jiri Olsa	2b0d408764	bpf: Add get_func_[arg\|ret\|arg_cnt] helpers Adding following helpers for tracing programs: Get n-th argument of the traced function: long bpf_get_func_arg(void *ctx, u32 n, u64 *value) Get return value of the traced function: long bpf_get_func_ret(void *ctx, u64 *value) Get arguments count of the traced function: long bpf_get_func_arg_cnt(void *ctx) The trampoline now stores number of arguments on ctx-8 address, so it's easy to verify argument index and find return value argument's position. Moving function ip address on the trampoline stack behind the number of functions arguments, so it's now stored on ctx-16 address if it's needed. All helpers above are inlined by verifier. Also bit unrelated small change - using newly added function bpf_prog_has_trampoline in check_get_func_ip. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211208193245.172141-5-jolsa@kernel.org	2021-12-14 17:06:30 -08:00
Andrii Nakryiko	ac20634cdc	libbpf: Don't validate TYPE_ID relo's original imm value During linking, type IDs in the resulting linked BPF object file can change, and so ldimm64 instructions corresponding to BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL CO-RE relos can get their imm value out of sync with actual CO-RE relocation information that's updated by BPF linker properly during linking process. We could teach BPF linker to adjust such instructions, but it feels a bit too much for linker to re-implement good chunk of bpf_core_patch_insns logic just for this. This is a redundant safety check for TYPE_ID relocations, as the real validation is in matching CO-RE specs, so if that works fine, it's very unlikely that there is something wrong with the instruction itself. So, instead, teach libbpf (and kernel) to ignore insn->imm for BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL relos. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211213010706.100231-1-andrii@kernel.org	2021-12-14 17:06:30 -08:00
Hou Tao	04804b4710	bpf: Add bpf_strncmp helper The helper compares two strings: one string is a null-terminated read-only string, and another string has const max storage size but doesn't need to be null-terminated. It can be used to compare file name in tracing or LSM program. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211210141652.877186-2-houtao1@huawei.com	2021-12-14 17:06:30 -08:00
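The comparison described above takes one size-bounded buffer that need not be null-terminated and one null-terminated string. A user-space model of that shape is sketched below. `model_strncmp` is a hypothetical name, and its strncmp-like return convention (negative, zero, positive) is an assumption rather than something the commit message states.

```c
#include <stddef.h>

/*
 * User-space model only: s1 is a buffer of s1_sz bytes that may lack a
 * terminator; s2 must be null-terminated. Compares at most s1_sz bytes.
 */
static int model_strncmp(const char *s1, size_t s1_sz, const char *s2)
{
	size_t i;

	for (i = 0; i < s1_sz; i++) {
		unsigned char a = (unsigned char)s1[i];
		unsigned char b = (unsigned char)s2[i];

		if (a != b)
			return a < b ? -1 : 1;
		if (a == '\0')	/* both strings ended at the same position */
			return 0;
	}
	return 0;	/* the first s1_sz bytes matched */
}
```

The loop stops as soon as the bytes differ or both strings end, so `s2` is never read past its terminator even when `s1` fills its buffer completely.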
Alexei Starovoitov	16bb788578	libbpf: Fix gen_loader assumption on number of programs. libbpf's obj->nr_programs includes static and global functions. That number could be higher than the actual number of bpf programs going to be loaded by gen_loader. Passing larger nr_programs to bpf_gen__init() doesn't hurt. Those extra stack slots will stay as zero. bpf_gen__finish() needs to check that actual number of progs that gen_loader saw is less than or equal to obj->nr_programs. Fixes: ba05fd36b851 ("libbpf: Perform map fd cleanup for gen_loader in case of error") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-12-14 17:06:30 -08:00
Hangbin Liu	5eb804a2db	Bonding: add arp_missed_max option Currently, we use a hard-coded number to verify if we are in the arp_interval timeslice. But some users may want to reduce/extend the verify timeslice. With the similar team option 'missed_max' the users could change that number based on their own environment. Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-12-14 17:06:30 -08:00
Andrii Nakryiko	bcf58fc7a5	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 43174f0d4597325cb91f1f1f55263eb6e6101036 Checkpoint bpf-next commit: 229fae38d0fc0d6ff58d57cbeb1432da55e58d4f Baseline bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Checkpoint bpf commit: 0be2516f865f5a876837184a8385163ff64a5889 Alexei Starovoitov (8): libbpf: Replace btf__type_by_id() with btf_type_by_id(). bpf: Prepare relo_core.c for kernel duty. bpf: Define enum bpf_core_relo_kind as uapi. bpf: Pass a set of bpf_core_relo-s to prog_load command. libbpf: Use CO-RE in the kernel in light skeleton. libbpf: Support init of inner maps in light skeleton. libbpf: Clean gen_loader's attach kind. libbpf: Reduce bpf_core_apply_relo_insn() stack usage. Andrii Nakryiko (12): libbpf: Cleanup struct bpf_core_cand. libbpf: Use __u32 fields in bpf_map_create_opts libbpf: Add API to get/set log_level at per-program level libbpf: Deprecate bpf_prog_load_xattr() API libbpf: Fix bpf_prog_load() log_buf logic for log_level 0 libbpf: Add OPTS-based bpf_btf_load() API libbpf: Allow passing preallocated log_buf when loading BTF into kernel libbpf: Allow passing user log setting through bpf_object_open_opts libbpf: Improve logging around BPF program loading libbpf: Preserve kernel error code and remove kprobe prog type guessing libbpf: Add per-program log buffer setter and getter libbpf: Deprecate bpf_object__load_xattr() Eric Dumazet (1): tools: sync uapi/linux/if_link.h header Grant Seltzer (1): libbpf: Add doc comments in libbpf.h Joanne Koong (1): bpf: Add bpf_loop helper Kumar Kartikeya Dwivedi (2): libbpf: Avoid double stores for success/failure case of ksym relocations libbpf: Avoid reload of imm for weak, unresolved, repeating ksym Mehrdad Arshad Rad (1): libbpf: Remove duplicate assignments Shuyi Cheng (1): libbpf: Add "bool skipped" to struct bpf_map Vincent Minet (1): libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition 
huangxuesen (1): libbpf: Fix trivial typo include/uapi/linux/bpf.h \| 103 +++++++++- include/uapi/linux/if_link.h \| 293 +++++++++++++++++++++++---- src/bpf.c \| 88 ++++++--- src/bpf.h \| 30 ++- src/bpf_gen_internal.h \| 4 + src/btf.c \| 82 +++++--- src/gen_loader.c \| 114 ++++++++--- src/libbpf.c \| 371 ++++++++++++++++++++++++----------- src/libbpf.h \| 109 +++++++++- src/libbpf.map \| 9 + src/libbpf_common.h \| 5 + src/libbpf_internal.h \| 3 +- src/libbpf_probes.c \| 2 +- src/libbpf_version.h \| 2 +- src/relo_core.c \| 231 ++++++++++++---------- src/relo_core.h \| 103 +++------- 16 files changed, 1138 insertions(+), 411 deletions(-) -- 2.30.2	2021-12-10 16:17:33 -08:00
Andrii Nakryiko	1f83414ea4	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-12-10 14:03:13 -08:00
Eric Dumazet	a0ddf21c92	tools: sync uapi/linux/if_link.h header This file has not been updated for a while. Sync it before BIG TCP patch series. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211122184810.769159-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-10 14:03:13 -08:00
Shuyi Cheng	bb14c6f5b5	libbpf: Add "bool skipped" to struct bpf_map Fix error: "failed to pin map: Bad file descriptor, path: /sys/fs/bpf/_rodata_str1_1." In the old kernel, the global data map will not be created, see [0]. So we should skip the pinning of the global data map to avoid bpf_object__pin_maps returning error. Therefore, when the map is not created, we mark "map->skipped" as true and then check during relocation and during pinning. Fixes: 16e0c35c6f7a ("libbpf: Load global data maps lazily on legacy kernels") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-12-10 14:03:13 -08:00
Vincent Minet	c7dedfe23f	libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition The btf__dedup_deprecated name was misspelled in the definition of the compat symbol for btf__dedup. This leads it to be missing from the shared library. This fixes it. Fixes: 957d350a8b94 ("libbpf: Turn btf_dedup_opts into OPTS-based struct") Signed-off-by: Vincent Minet <vincent@vincent-minet.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211210063112.80047-1-vincent@vincent-minet.net	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	f13e766fa4	libbpf: Deprecate bpf_object__load_xattr() Deprecate non-extensible bpf_object__load_xattr() in v0.8 ([0]). With log_level control through bpf_object_open_opts or bpf_program__set_log_level(), we are finally at the point where bpf_object__load_xattr() doesn't provide any functionality that can't be accessed through other (better) ways. The other feature, target_btf_path, is also controllable through bpf_object_open_opts. [0] Closes: https://github.com/libbpf/libbpf/issues/289 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-9-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	90910812b5	libbpf: Add per-program log buffer setter and getter Allow to set user-provided log buffer on a per-program basis ([0]). This gives great deal of flexibility in terms of which programs are loaded with logging enabled and where corresponding logs go. Log buffer set with bpf_program__set_log_buf() overrides kernel_log_buf and kernel_log_size settings set at bpf_object open time through bpf_object_open_opts, if any. Adjust bpf_object_load_prog_instance() logic to not perform own log buf allocation and load retry if custom log buffer is provided by the user. [0] Closes: https://github.com/libbpf/libbpf/issues/418 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-8-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	eb9d74e7ad	libbpf: Preserve kernel error code and remove kprobe prog type guessing Instead of rewriting error code returned by the kernel of prog load with libbpf-specific variants, pass through the original error. There is now also no need to have a backup generic -LIBBPF_ERRNO__LOAD fallback error as bpf_prog_load() guarantees that errno will be properly set no matter what. Also drop a completely outdated and pretty useless BPF_PROG_TYPE_KPROBE guess logic. It's not necessary and neither it's helpful in modern BPF applications. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-7-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	d9b3fae391	libbpf: Improve logging around BPF program loading Add missing "prog '%s': " prefixes in few places and use consistently markers for beginning and end of program load logs. Here's an example of log output: libbpf: prog 'handler': BPF program load failed: Permission denied libbpf: -- BEGIN PROG LOAD LOG --- arg#0 reference type('UNKNOWN ') size cannot be determined: -22 ; out1 = in1; 0: (18) r1 = 0xffffc9000cdcc000 2: (61) r1 = *(u32 *)(r1 +0) ... 81: (63) *(u32 *)(r4 +0) = r5 R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0) invalid access to map value, value_size=16 off=400 size=4 R4 min value is outside of the allowed memory range processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 -- END PROG LOAD LOG -- libbpf: failed to load program 'handler' libbpf: failed to load object 'test_skeleton' The entire verifier log, including BEGIN and END markers are now always output during a single print callback call. This should make it much easier to post-process or parse it, if necessary. It's not an explicit API guarantee, but it can be reasonably expected to stay like that. Also __bpf_object__open is renamed to bpf_object_open() as it's always an adventure to find the exact function that implements bpf_object's open phase, so drop the double underscored and use internal libbpf naming convention. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	dc1df24314	libbpf: Allow passing user log setting through bpf_object_open_opts Allow users to provide their own custom log_buf, log_size, and log_level at bpf_object level through bpf_object_open_opts. This log_buf will be used during BTF loading. Subsequent patch will use same log_buf during BPF program loading, unless overridden at per-bpf_program level. When such custom log_buf is provided, libbpf won't be attempting retrying loading of BTF to try to provide its own log buffer to capture kernel's error log output. User is responsible to provide big enough buffer, otherwise they run a risk of getting -ENOSPC error from the bpf() syscall. See also comments in bpf_object_open_opts regarding log_level and log_buf interactions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-5-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	0504b7ff22	libbpf: Allow passing preallocated log_buf when loading BTF into kernel Add libbpf-internal btf_load_into_kernel() that allows to pass preallocated log_buf and custom log_level to be passed into kernel during BPF_BTF_LOAD call. When custom log_buf is provided, btf_load_into_kernel() won't attempt a retry with automatically allocated internal temporary buffer to capture BTF validation log. It's important to note the relation between log_buf and log_level, which slightly deviates from stricter kernel logic. From kernel's POV, if log_buf is specified, log_level has to be > 0, and vice versa. While kernel has good reasons to request such "sanity", this, in practice, is a bit inconvenient and restrictive for libbpf's high-level bpf_object APIs. So libbpf will allow to set non-NULL log_buf and log_level == 0. This is fine and means to attempt to load BTF without logging requested, but if it fails, retry the load with custom log_buf and log_level 1. Similar logic will be implemented for program loading. In practice this means that users can provide custom log buffer just in case error happens, but not really request slower verbose logging all the time. This is also consistent with libbpf behavior when custom log_buf is not set: libbpf first tries to load everything with log_level=0, and only if error happens allocates internal log buffer and retries with log_level=1. Also, while at it, make BTF validation log more obvious and follow the log pattern libbpf is using for dumping BPF verifier log during BPF_PROG_LOAD. BTF loading resulting in an error will look like this: libbpf: BTF loading error: -22 libbpf: -- BEGIN BTF LOAD LOG --- magic: 0xeb9f version: 1 flags: 0x0 hdr_len: 24 type_off: 0 type_len: 1040 str_off: 1040 str_len: 2063598257 btf_total_size: 1753 Total section length too long -- END BTF LOAD LOG -- libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.
This makes it much easier to find relevant parts in libbpf log output. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-4-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	3c93f7ddb2	libbpf: Add OPTS-based bpf_btf_load() API Similar to previous bpf_prog_load() and bpf_map_create() APIs, add bpf_btf_load() API which is taking optional OPTS struct. Schedule bpf_load_btf() for deprecation in v0.8 ([0]). This makes naming consistent with BPF_BTF_LOAD command, sets up an API for extensibility in the future, moves options parameters (log-related fields) into optional options, and also allows to pass log_level directly. It also removes log buffer auto-allocation logic from low-level API (consistent with bpf_prog_load() behavior), but preserves a special treatment of log_level == 0 with non-NULL log_buf, which matches low-level bpf_prog_load() and high-level libbpf APIs for BTF and program loading behaviors. [0] Closes: https://github.com/libbpf/libbpf/issues/419 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-3-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	896a3ae0d0	libbpf: Fix bpf_prog_load() log_buf logic for log_level 0 To unify libbpf APIs behavior w.r.t. log_buf and log_level, fix bpf_prog_load() to follow the same logic as bpf_btf_load() and high-level bpf_object__load() API will follow in the subsequent patches: - if log_level is 0 and non-NULL log_buf is provided by a user, attempt load operation initially with no log_buf and log_level set; - if successful, we are done, return new FD; - on error, retry the load operation with log_level bumped to 1 and log_buf set; this way verbose logging will be requested only when we are sure that there is a failure, but will be fast in the common/expected success case. Of course, user can still specify log_level > 0 from the very beginning to force log collection. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211209193840.1248570-2-andrii@kernel.org	2021-12-10 14:03:13 -08:00
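The three-step rule in the commit above (try without logging, return on success, retry with log_level 1 on error) can be sketched independently of libbpf. Everything named here is hypothetical: `load_fn` stands in for the real load operation, `load_with_lazy_log` is the wrapper under discussion, and `fake_load` is a test double that returns a pretend FD or a negative error.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical load operation: returns an FD >= 0, or a negative error. */
typedef int (*load_fn)(void *ctx, char *log_buf, size_t log_size, int log_level);

/* Sketch of the "log only on failure" rule; not libbpf's implementation. */
static int load_with_lazy_log(load_fn load, void *ctx,
			      char *log_buf, size_t log_size, int log_level)
{
	int fd;

	/* Caller forced logging, or gave no buffer: a single attempt as requested. */
	if (log_level != 0 || log_buf == NULL)
		return load(ctx, log_buf, log_buf ? log_size : 0, log_level);

	/* Fast path: no log buffer and no log level. */
	fd = load(ctx, NULL, 0, 0);
	if (fd >= 0)
		return fd;

	/* Failure: retry with the user's buffer and log_level bumped to 1. */
	return load(ctx, log_buf, log_size, 1);
}

/* Test double: counts calls and optionally fails, filling the log when asked. */
struct fake_state { int calls; int fail; };

static int fake_load(void *ctx, char *log_buf, size_t log_size, int log_level)
{
	struct fake_state *st = ctx;

	st->calls++;
	if (!st->fail)
		return 3;	/* pretend file descriptor */
	if (log_buf && log_size > 0 && log_level > 0)
		snprintf(log_buf, log_size, "verifier says no");
	return -22;		/* pretend -EINVAL */
}
```

With this shape the successful case costs one load attempt and never touches the buffer, while the failing case costs two attempts and leaves the log in the caller's buffer.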
Grant Seltzer	728b1721e5	libbpf: Add doc comments in libbpf.h This adds comments above functions in libbpf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org These doc comments are for: - bpf_object__open_file() - bpf_object__open_mem() - bpf_program__attach_uprobe() - bpf_program__attach_uprobe_opts() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211206203709.332530-1-grantseltzer@gmail.com	2021-12-10 14:03:13 -08:00
huangxuesen	04941813a5	libbpf: Fix trivial typo Fix typo in comment from 'bpf_skeleton_map' to 'bpf_map_skeleton' and from 'bpf_skeleton_prog' to 'bpf_prog_skeleton'. Signed-off-by: huangxuesen <huangxuesen@kuaishou.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1638755236-3851199-1-git-send-email-hxseverything@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	c3c540b402	libbpf: Reduce bpf_core_apply_relo_insn() stack usage. Reduce bpf_core_apply_relo_insn() stack usage and bump BPF_CORE_SPEC_MAX_LEN limit back to 64. Fixes: 29db4bea1d10 ("bpf: Prepare relo_core.c for kernel duty.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	b633ace366	libbpf: Deprecate bpf_prog_load_xattr() API bpf_prog_load_xattr() is high-level API that's named as a low-level BPF_PROG_LOAD wrapper APIs, but it actually operates on struct bpf_object. It's badly and confusingly misnamed as it will load all the progs inside bpf_object, returning prog_fd of the very first BPF program. It also has a bunch of ad-hoc things like log_level override, map_ifindex auto-setting, etc. All this can be expressed more explicitly and cleanly through existing libbpf APIs. This patch marks bpf_prog_load_xattr() for deprecation in libbpf v0.8 ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/308 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211201232824.3166325-10-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	d761220e33	libbpf: Add API to get/set log_level at per-program level Add bpf_program__set_log_level() and bpf_program__log_level() to fetch and adjust log_level sent during BPF_PROG_LOAD command. This allows to selectively request more or less verbose output in BPF verifier log. Also bump libbpf version to 0.7 and make these APIs the first in v0.7. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211201232824.3166325-3-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	ac1c007607	libbpf: Use __u32 fields in bpf_map_create_opts Corresponding Linux UAPI struct uses __u32, not int, so keep it consistent. Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211201232824.3166325-2-andrii@kernel.org	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	01b2b45e8d	libbpf: Clean gen_loader's attach kind. The gen_loader has to clear attach_kind otherwise the programs without attach_btf_id will fail load if they follow programs with attach_btf_id. Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-12-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	a7935b996f	libbpf: Support init of inner maps in light skeleton. Add ability to initialize inner maps in light skeleton. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-11-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	5f887b332c	libbpf: Use CO-RE in the kernel in light skeleton. Without lskel the CO-RE relocations are processed by libbpf before any other work is done. Instead, when lskel is needed, remember relocation as RELO_CORE kind. Then when loader prog is generated for a given bpf program pass CO-RE relos of that program to gen loader via bpf_gen__record_relo_core(). The gen loader will remember them as-is and pass it later as-is into the kernel. The normal libbpf flow is to process CO-RE early before call relos happen. In case of gen_loader the core relos have to be added to other relos to be copied together when bpf static function is appended in different places to other main bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for normal relos and for RELO_CORE kind too. When that is done each struct reloc_desc has good relos for specific main prog. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-10-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	20e7ed521a	libbpf: Cleanup struct bpf_core_cand. Remove two redundant fields from struct bpf_core_cand. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-8-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	d785a21c71	bpf: Pass a set of bpf_core_relo-s to prog_load command. struct bpf_core_relo is generated by llvm and processed by libbpf. It's a de-facto uapi. With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure. Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command and let the kernel perform CO-RE relocations. Note the struct bpf_line_info and struct bpf_func_info have the same layout when passed from LLVM to libbpf and from libbpf to the kernel except "insn_off" fields means "byte offset" when LLVM generates it. Then libbpf converts it to "insn index" to pass to the kernel. The struct bpf_core_relo's "insn_off" field is always "byte offset". Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	8e43882e53	bpf: Define enum bpf_core_relo_kind as uapi. enum bpf_core_relo_kind is generated by llvm and processed by libbpf. It's a de-facto uapi. With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure. Also rename them with BPF_CORE_ prefix to distinguish from conflicting names in bpf_core_read.h. The enums bpf_field_info_kind, bpf_type_id_kind, bpf_type_info_kind, bpf_enum_value_kind are passing different values from bpf program into llvm. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-5-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Alexei Starovoitov	2be7e6a830	bpf: Prepare relo_core.c for kernel duty. Make relo_core.c to be compiled for the kernel and for user space libbpf. Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32. This is the maximum number of nested structs and arrays. For example: struct sample { int a; struct { int b[10]; }; }; struct sample *s = ...; int *y = &s->b[5]; This field access is encoded as "0:1:0:5" and spec len is 4. The follow up patch might bump it back to 64. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
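In the "0:1:0:5" example above, the components read as: element 0 of the pointed-to `struct sample`, its member at index 1 (the anonymous struct), that struct's member at index 0 (`b`), and array element 5. The sketch below parses such a colon-separated spec into integers and enforces a maximum length. `parse_access_spec` and `SKETCH_SPEC_MAX_LEN` are illustrative names, not libbpf or kernel identifiers.

```c
#include <limits.h>
#include <stdlib.h>

#define SKETCH_SPEC_MAX_LEN 64	/* illustrative cap, mirroring the limit discussed above */

/*
 * Parse a spec such as "0:1:0:5" into out[]. Returns the number of
 * components, or -1 on malformed input or when more than max_len
 * components are present.
 */
static int parse_access_spec(const char *spec, int *out, int max_len)
{
	int n = 0;

	while (*spec != '\0') {
		char *end;
		long v;

		if (n == max_len)
			return -1;		/* spec longer than allowed */
		if (*spec < '0' || *spec > '9')
			return -1;		/* each component must start with a digit */

		v = strtol(spec, &end, 10);
		if (v < 0 || v > INT_MAX)
			return -1;
		out[n++] = (int)v;

		if (*end == ':') {
			spec = end + 1;
			if (*spec == '\0')
				return -1;	/* trailing ':' */
		} else if (*end == '\0') {
			spec = end;
		} else {
			return -1;		/* unexpected separator */
		}
	}
	return n;
}
```

For "0:1:0:5" this yields four components, matching the "spec len is 4" statement in the commit message.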
Alexei Starovoitov	b9bd1f8682	libbpf: Replace btf__type_by_id() with btf_type_by_id(). To prepare relo_core.c to be compiled in the kernel and the user space replace btf__type_by_id with btf_type_by_id. In libbpf btf__type_by_id and btf_type_by_id have different behavior. bpf_core_apply_relo_insn() needs behavior of uapi btf__type_by_id vs internal btf_type_by_id, but type_id range check is already done in bpf_core_apply_relo(), so it's safe to replace it everywhere. The kernel btf_type_by_id() does the check anyway. It doesn't hurt. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211201181040.23337-2-alexei.starovoitov@gmail.com	2021-12-10 14:03:13 -08:00
Kumar Kartikeya Dwivedi	c4da8092cc	libbpf: Avoid reload of imm for weak, unresolved, repeating ksym Alexei pointed out that we can use BPF_REG_0 which already contains imm from move_blob2blob computation. Note that we now compare the second insn's imm, but this should not matter, since both will be zeroed out for the error case for the insn populated earlier. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211122235733.634914-4-memxor@gmail.com	2021-12-10 14:03:13 -08:00
Kumar Kartikeya Dwivedi	b2b45a3131	libbpf: Avoid double stores for success/failure case of ksym relocations Instead, jump directly to success case stores in case ret >= 0, else do the default 0 value store and jump over the success case. This is better in terms of readability. Readjust the code for kfunc relocation as well to follow a similar pattern, also leads to easier to follow code now. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211122235733.634914-3-memxor@gmail.com	2021-12-10 14:03:13 -08:00
Joanne Koong	73c8768db7	bpf: Add bpf_loop helper This patch adds the kernel-side and API changes for a new helper function, bpf_loop: long bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx, u64 flags); where long (*callback_fn)(u32 index, void *ctx); bpf_loop invokes the "callback_fn" nr_loops times or until the callback_fn returns 1. The callback_fn can only return 0 or 1, and this is enforced by the verifier. The callback_fn index is zero-indexed. A few things to please note: ~ The "u64 flags" parameter is currently unused but is included in case a future use case for it arises. ~ In the kernel-side implementation of bpf_loop (kernel/bpf/bpf_iter.c), bpf_callback_t is used as the callback function cast. ~ A program can have nested bpf_loop calls but the program must still adhere to the verifier constraint of its stack depth (the stack depth cannot exceed MAX_BPF_STACK) ~ Recursive callback_fns do not pass the verifier, due to the call stack for these being too deep. ~ The next patch will include the tests and benchmark Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211130030622.4131246-2-joannekoong@fb.com	2021-12-10 14:03:13 -08:00
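The calling contract described in 73c8768db7 (call the callback up to nr_loops times, stop early when it returns 1) can be modeled in plain userspace C. This is only a sketch of the semantics, not the kernel implementation: `model_bpf_loop`, `loop_cb_t` and `sum_until_ten` are hypothetical names, and returning the number of iterations performed is an assumption of this sketch.

```c
#include <stdint.h>

/* Callback type mirroring the shape quoted in the commit message. */
typedef long (*loop_cb_t)(uint32_t index, void *ctx);

/*
 * Userspace model of the bpf_loop() contract: call cb up to nr_loops
 * times with a zero-based index, stopping early when it returns 1.
 * Returns the number of iterations performed (assumption of this sketch).
 */
static long model_bpf_loop(uint32_t nr_loops, loop_cb_t cb, void *ctx)
{
	uint32_t i;

	for (i = 0; i < nr_loops; i++) {
		if (cb(i, ctx) == 1)
			return (long)i + 1; /* the stopping call counts too */
	}
	return (long)i;
}

/* Example callback: accumulate indices, stop once the sum reaches 10. */
static long sum_until_ten(uint32_t index, void *ctx)
{
	long *sum = ctx;

	*sum += index;
	return *sum >= 10 ? 1 : 0;
}
```

Calling `model_bpf_loop(100, sum_until_ten, &sum)` with `sum` starting at 0 stops on index 4, because 0 + 1 + 2 + 3 + 4 reaches 10.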
Mehrdad Arshad Rad	bafda72319	libbpf: Remove duplicate assignments The same assignment is performed twice when load_attr.attach_btf_id is initialized. Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211128193337.10628-1-arshad.rad@gmail.com	2021-12-10 14:03:13 -08:00
Andrii Nakryiko	33ec2ca026	sync: improve patch application process by using patch command git apply -3 doesn't always leave conflicted files in the working directory. Use patch --merge instead, it seems to work better in more complicated situations. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-12-09 22:43:20 -08:00
Yucong Sun	7e89be4022	Migrate vmtest to modular actions in libbpf/ci	2021-12-09 14:09:04 -08:00
Ilya Leoshkevich	93e89b3474	ci: upgrade s390x runner to v2.285.0 This is needed for composite actions with conditional steps. Fixes #416. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-12-03 10:42:54 -08:00
Quentin Monnet	4884bf3dbd	ci: fix test on /exitstatus existence and size The condition for the test is incorrect, we want to add a default exit status if the file is empty. But [[ -s file ]] returns true if the file exists and has a size greater than zero. Let's reverse the condition. Fixes: `385b2d1738` ("ci: change VM's /exitstatus format to prepare it for several results") Signed-off-by: Quentin Monnet <quentin@isovalent.com>	2021-12-01 15:46:22 -08:00
Andrii Nakryiko	690d0531f9	ci: whitelist legacy_printk tests on 4.9 and 5.5 legacy_printk selftests is specially designed to be runnable on old kernels and validate libbpf's bpf_trace_printk-related macros. Run them in CI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-30 16:39:31 -08:00
Quentin Monnet	7cda69caeb	ci: add folding markers to avoid getting output out of sections Move commands to existing sections, or add markers for new sections when no surrounding existing section is relevant. This is to avoid having logs outside of subsection for the “vmtest” step of the GitHub action.	2021-11-30 14:39:02 -08:00
Quentin Monnet	1f7db672e4	ci: carry on after selftest failure and report test group results Two changes come with this patch: - Test groups report their exit status individually for the final summary, by appending it to /exitstatus in the VM. “Test groups” are test_maps, test_verifier, test_progs, and test_progs-no_alu32. - For these separate reports to make sense, allow the CI action to carry on even after one of the groups fails, by adding "&& true" to the commands in order to neutralise the effect of the "set -e".	2021-11-30 14:39:02 -08:00
Quentin Monnet	385b2d1738	ci: change VM's /exitstatus format to prepare it for several results We recently introduced a summary for the results of the different groups of tests for the CI, displayed after the machine is shut down. There are currently two groups, "bpftool" and "vm_tests". We want to split the latter into different subgroups. For that, we will make each group of tests that runs in the VM print its exit status to the /exitstatus file. In preparation for this, let's update the format of this /exitstatus file. Instead of containing just an integer, it now contains a line with a group name, a colon, and the integer result. This is easy enough to parse on the other end. We also drop the associative array, and iterate on /exitstatus instead to produce the summary: this way, the order of the checks is preserved.	2021-11-30 14:39:02 -08:00
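The /exitstatus line format introduced in 385b2d1738 (a group name, a colon, and the integer result) is simple to parse. The CI itself does this in shell; the C sketch below only illustrates the format, and `parse_exitstatus_line` is a hypothetical helper, not part of the CI scripts.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "group:status" line as described for /exitstatus.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int parse_exitstatus_line(const char *line, char *group,
				 size_t group_sz, int *status)
{
	const char *colon = strchr(line, ':');
	char *end;
	long val;
	size_t len;

	if (!colon || colon == line)
		return -1; /* no colon, or empty group name */
	len = (size_t)(colon - line);
	if (len >= group_sz)
		return -1; /* group name does not fit */

	val = strtol(colon + 1, &end, 10);
	if (end == colon + 1)
		return -1; /* no digits after the colon */
	if (*end != '\0' && *end != '\n')
		return -1; /* trailing garbage */

	memcpy(group, line, len);
	group[len] = '\0';
	*status = (int)val;
	return 0;
}
```

Reading the file line by line and feeding each line to this helper preserves the order of the groups, which is the property the commit message highlights.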
Quentin Monnet	7f11cd48d6	ci: create helpers for formatting errors and notices Create helpers for formatting errors and notices for GitHub actions, instead of directly printing the double colons and attributes.	2021-11-30 14:39:02 -08:00
Andrii Nakryiko	4374bad784	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7 Checkpoint bpf-next commit: 43174f0d4597325cb91f1f1f55263eb6e6101036 Baseline bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Checkpoint bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Alan Maguire (1): libbpf: Silence uninitialized warning/error in btf_dump_dump_type_data Hengqi Chen (1): libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY Tiezhu Yang (1): bpf, mips: Fix build errors about __NR_bpf undeclared src/bpf.c \| 6 ++ src/btf_dump.c \| 2 +- src/libbpf.c \| 154 ++++++++++++++++++++++++++++++++++---------- src/skel_internal.h \| 10 +++ 4 files changed, 138 insertions(+), 34 deletions(-) -- 2.30.2	2021-11-29 11:20:00 -08:00
Alan Maguire	55b057565f	libbpf: Silence uninitialized warning/error in btf_dump_dump_type_data When compiling libbpf with gcc 4.8.5, we see: CC staticobjs/btf_dump.o btf_dump.c: In function ‘btf_dump_dump_type_data.isra.24’: btf_dump.c:2296:5: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized] if (err < 0) ^ cc1: all warnings being treated as errors make: *** [staticobjs/btf_dump.o] Error 1 While gcc 4.8.5 is too old to build the upstream kernel, it's possible it could be used to build standalone libbpf which suffers from the same problem. Silence the error by initializing 'err' to 0. The warning/error seems to be a false positive since err is set early in the function. Regardless we shouldn't prevent libbpf from building for this. Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1638180040-8037-1-git-send-email-alan.maguire@oracle.com	2021-11-29 11:20:00 -08:00
Hengqi Chen	472c0726e8	libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY Support static initialization of BPF_MAP_TYPE_PROG_ARRAY with a syntax similar to map-in-map initialization ([0]): SEC("socket") int tailcall_1(void *ctx) { return 0; } struct { __uint(type, BPF_MAP_TYPE_PROG_ARRAY); __uint(max_entries, 2); __uint(key_size, sizeof(__u32)); __array(values, int (void *)); } prog_array_init SEC(".maps") = { .values = { [1] = (void *)&tailcall_1, }, }; Here's the relevant part of libbpf debug log showing what's going on with prog-array initialization: libbpf: sec '.relsocket': collecting relocation for section(3) 'socket' libbpf: sec '.relsocket': relo #0: insn #2 against 'prog_array_init' libbpf: prog 'entry': found map 0 (prog_array_init, sec 4, off 0) for insn #0 libbpf: .maps relo #0: for 3 value 0 rel->r_offset 32 name 53 ('tailcall_1') libbpf: .maps relo #0: map 'prog_array_init' slot [1] points to prog 'tailcall_1' libbpf: map 'prog_array_init': created successfully, fd=5 libbpf: map 'prog_array_init': slot [1] set to prog 'tailcall_1' fd=6 [0] Closes: https://github.com/libbpf/libbpf/issues/354 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211128141633.502339-2-hengqi.chen@gmail.com	2021-11-29 11:20:00 -08:00
Tiezhu Yang	c86cb27d5b	bpf, mips: Fix build errors about __NR_bpf undeclared Add the __NR_bpf definitions to fix the following build errors for mips: $ cd tools/bpf/bpftool $ make [...] bpf.c:54:4: error: #error __NR_bpf not defined. libbpf does not support your arch. # error __NR_bpf not defined. libbpf does not support your arch. ^~~~~ bpf.c: In function ‘sys_bpf’: bpf.c:66:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’? return syscall(__NR_bpf, cmd, attr, size); ^~~~~~~~ __NR_brk [...] In file included from gen_loader.c:15:0: skel_internal.h: In function ‘skel_sys_bpf’: skel_internal.h:53:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’? return syscall(__NR_bpf, cmd, attr, size); ^~~~~~~~ __NR_brk We can see the following generated definitions: $ grep -r "#define __NR_bpf" arch/mips arch/mips/include/generated/uapi/asm/unistd_o32.h:#define __NR_bpf (__NR_Linux + 355) arch/mips/include/generated/uapi/asm/unistd_n64.h:#define __NR_bpf (__NR_Linux + 315) arch/mips/include/generated/uapi/asm/unistd_n32.h:#define __NR_bpf (__NR_Linux + 319) The __NR_Linux is defined in arch/mips/include/uapi/asm/unistd.h: $ grep -r "#define __NR_Linux" arch/mips arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux 4000 arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux 5000 arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux 6000 That is to say, __NR_bpf is: 4000 + 355 = 4355 for mips o32, 6000 + 319 = 6319 for mips n32, 5000 + 315 = 5315 for mips n64. So use the GCC pre-defined macro _ABIO32, _ABIN32 and _ABI64 [1] to define the corresponding __NR_bpf. This patch is similar with commit bad1926dd2f6 ("bpf, s390: fix build for libbpf and selftest suite"). 
[1] https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/mips/mips.h#l549 Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1637804167-8323-1-git-send-email-yangtiezhu@loongson.cn	2021-11-29 11:20:00 -08:00
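The per-ABI arithmetic in c86cb27d5b can be written out as a small sketch. `DEMO_NR_BPF`, `enum mips_abi` and `mips_nr_bpf` are hypothetical names chosen for illustration; the numbers are the ones quoted in the commit message, and the preprocessor part only shows the general shape of selecting a value from the GCC-defined _ABIO32, _ABIN32 and _ABI64 macros.

```c
/* Shape of the compile-time selection; expands to nothing on non-MIPS hosts. */
#if defined(__mips__) && defined(_ABIO32)
# define DEMO_NR_BPF 4355
#elif defined(__mips__) && defined(_ABIN32)
# define DEMO_NR_BPF 6319
#elif defined(__mips__) && defined(_ABI64)
# define DEMO_NR_BPF 5315
#endif

/* The same mapping as a runtime table, so the arithmetic can be checked anywhere. */
enum mips_abi { MIPS_ABI_O32, MIPS_ABI_N64, MIPS_ABI_N32 };

static int mips_nr_bpf(enum mips_abi abi)
{
	switch (abi) {
	case MIPS_ABI_O32:
		return 4000 + 355; /* __NR_Linux for o32 plus offset */
	case MIPS_ABI_N64:
		return 5000 + 315; /* __NR_Linux for n64 plus offset */
	case MIPS_ABI_N32:
		return 6000 + 319; /* __NR_Linux for n32 plus offset */
	}
	return -1; /* unknown ABI */
}
```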
Andrii Nakryiko	3ef05a585e	sync: try harder when git am -3 fails `git am -3` will give up frequently even in cases when patch can be auto-merged with: ``` Applying: libbpf: Unify low-level map creation APIs w/ new bpf_map_create() error: sha1 information is lacking or useless (src/libbpf.c). error: could not build fake ancestor Patch failed at 0001 libbpf: Unify low-level map creation APIs w/ new bpf_map_create() ``` But `git apply -3` in the same situation will succeed with three-way merge just fine: ``` error: patch failed: src/bpf_gen_internal.h:51 Falling back to three-way merge... Applied patch to 'src/bpf_gen_internal.h' cleanly. ``` So if git am fails, try git apply and if that succeeds, automatically `git am --continue`. If not, fallback to user actions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	493bfa8a59	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: fa721d4f0b91f525339996f4faef7bb072d70162 Checkpoint bpf-next commit: 8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7 Baseline bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Checkpoint bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Andrii Nakryiko (8): libbpf: Load global data maps lazily on legacy kernels libbpf: Unify low-level map creation APIs w/ new bpf_map_create() libbpf: Use bpf_map_create() consistently internally libbpf: Prevent deprecation warnings in xsk.c libbpf: Fix potential misaligned memory access in btf_ext__new() libbpf: Don't call libc APIs with NULL pointers libbpf: Fix glob_syms memory leak in bpf_linker libbpf: Fix using invalidated memory in bpf_linker src/bpf.c \| 140 +++++++++++++++++------------------------ src/bpf.h \| 33 +++++++++- src/bpf_gen_internal.h \| 5 +- src/btf.c \| 10 +-- src/btf.h \| 2 +- src/gen_loader.c \| 46 +++++--------- src/libbpf.c \| 107 +++++++++++++++++-------------- src/libbpf.map \| 1 + src/libbpf_internal.h \| 21 ------- src/libbpf_probes.c \| 30 ++++----- src/linker.c \| 6 +- src/skel_internal.h \| 3 +- src/xsk.c \| 18 +++--- 13 files changed, 204 insertions(+), 218 deletions(-) -- 2.30.2	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	9f006f1ed6	libbpf: Fix using invalidated memory in bpf_linker add_dst_sec() can invalidate bpf_linker's section index making dst_symtab pointer pointing into unallocated memory. Reinitialize dst_symtab pointer on each iteration to make sure it's always valid. Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-7-andrii@kernel.org	2021-11-26 13:51:29 -08:00
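The bug class fixed in 9f006f1ed6 (a pointer into a growable array going stale after the array is reallocated) can be reproduced in miniature. This is a generic sketch, not the bpf_linker code: `struct sec_table`, `add_sec` and `append_scaled` are hypothetical, and the fix shown is simply re-deriving the pointer on each iteration, as the commit message describes.

```c
#include <stdlib.h>

struct sec_table {
	int *secs;
	size_t cnt;
};

/* Append a value; realloc() may move the array, invalidating old pointers. */
static int add_sec(struct sec_table *t, int val)
{
	int *tmp = realloc(t->secs, (t->cnt + 1) * sizeof(*tmp));

	if (!tmp)
		return -1;
	t->secs = tmp;
	t->secs[t->cnt++] = val;
	return 0;
}

/*
 * Append n multiples of the first entry. The table must already hold
 * at least one entry. The pointer to that entry is re-derived on every
 * iteration instead of being cached across add_sec() calls.
 */
static int append_scaled(struct sec_table *t, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const int *base = &t->secs[0]; /* fresh pointer each time */

		if (add_sec(t, *base * (int)(i + 2)))
			return -1;
	}
	return 0;
}
```

Hoisting `base` out of the loop would compile and often appear to work, which is what makes this kind of bug easy to miss.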
Andrii Nakryiko	5fc0d66cad	libbpf: Fix glob_syms memory leak in bpf_linker glob_syms array wasn't freed on bpf_linker__free(). Fix that. Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-6-andrii@kernel.org	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	37c3e92657	libbpf: Don't call libc APIs with NULL pointers Sanitizer complains about qsort(), bsearch(), and memcpy() being called with NULL pointer. This can only happen when the associated number of elements is zero, so no harm should be done. But still prevent this from happening to keep sanitizer runs clean from extra noise. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-5-andrii@kernel.org	2021-11-26 13:51:29 -08:00
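The guard described in 37c3e92657 amounts to skipping the libc call when the element count is zero. A minimal sketch follows, with hypothetical wrapper names (`sort_ints`, `copy_ints`); the actual call sites in libbpf are not shown in this log.

```c
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Sort only when there is something to sort, so qsort() never sees NULL. */
static void sort_ints(int *arr, size_t n)
{
	if (n == 0)
		return;
	qsort(arr, n, sizeof(*arr), cmp_int);
}

/* Copy only when there is something to copy, so memcpy() never sees NULL. */
static void copy_ints(int *dst, const int *src, size_t n)
{
	if (n == 0)
		return;
	memcpy(dst, src, n * sizeof(*src));
}
```

With these wrappers, an empty array represented as a NULL pointer plus a zero count never reaches libc, which keeps sanitizer output free of this class of report.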
Andrii Nakryiko	25eb5c4e02	libbpf: Fix potential misaligned memory access in btf_ext__new() Perform a memory copy before we do the sanity checks of btf_ext_hdr. This prevents misaligned memory access if raw btf_ext data is not 4-byte aligned ([0]). While at it, also add missing const qualifier. [0] Closes: https://github.com/libbpf/libbpf/issues/391 Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-3-andrii@kernel.org	2021-11-26 13:51:29 -08:00
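The copy-before-check idea from 25eb5c4e02 can be sketched on a made-up header. `struct demo_hdr`, `DEMO_MAGIC` and `demo_hdr_check` are hypothetical and deliberately not the real btf_ext_header layout; this sketch also copies only the header, which is a simplification.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout for illustration; not the real btf_ext_header. */
struct demo_hdr {
	uint16_t magic;
	uint8_t version;
	uint8_t flags;
	uint32_t hdr_len;
};

#define DEMO_MAGIC 0xeB9F

/*
 * Validate a header found at an arbitrary (possibly unaligned) address.
 * The bytes are copied into a properly aligned local struct first, and
 * only the copy is inspected. Returns 0 if valid, -1 otherwise.
 */
static int demo_hdr_check(const void *data, size_t size, struct demo_hdr *out)
{
	struct demo_hdr hdr;

	if (size < sizeof(hdr))
		return -1;
	memcpy(&hdr, data, sizeof(hdr)); /* safe for any alignment */
	if (hdr.magic != DEMO_MAGIC)
		return -1;
	if (hdr.hdr_len < sizeof(hdr) || hdr.hdr_len > size)
		return -1;
	*out = hdr;
	return 0;
}
```

Casting `data` to `const struct demo_hdr *` and reading `hdr_len` directly would be the misaligned access the commit avoids when the buffer is not 4-byte aligned.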
Andrii Nakryiko	07e4e0cb04	libbpf: Prevent deprecation warnings in xsk.c xsk.c is using own APIs that are marked for deprecation internally. Given xsk.c and xsk.h will be gone in libbpf 1.0, there is no reason to do public vs internal function split just to avoid deprecation warnings. So just add a pragma to silence deprecation warnings (until the code is removed completely). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-4-andrii@kernel.org	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	316b60fa89	libbpf: Use bpf_map_create() consistently internally Remove all the remaining uses of to-be-deprecated bpf_create_map*() APIs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-3-andrii@kernel.org	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	6cfb97c561	libbpf: Unify low-level map creation APIs w/ new bpf_map_create() Mark the entire zoo of low-level map creation APIs for deprecation in libbpf 0.7 ([0]) and introduce a new bpf_map_create() API that is OPTS-based (and thus future-proof) and matches the BPF_MAP_CREATE command name. While at it, ensure that gen_loader sends map_extra field. Also remove now unneeded btf_key_type_id/btf_value_type_id logic that libbpf is doing anyways. [0] Closes: https://github.com/libbpf/libbpf/issues/282 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124193233.3115996-2-andrii@kernel.org	2021-11-26 13:51:29 -08:00
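Why an OPTS-based API such as the one introduced in 6cfb97c561 is future-proof can be shown with a self-contained sketch. Everything here is hypothetical (`struct demo_create_opts`, `DEMO_OPTS_HAS`, `DEMO_OPTS_GET`, `demo_effective_extra`) and is not libbpf's real definition; the assumed idea is that the caller records the size of the struct it was compiled with, and the library reads a field only if that size covers it.

```c
#include <stddef.h>

/* Hypothetical options struct; the first field records the caller's size. */
struct demo_create_opts {
	size_t sz;
	unsigned int map_flags;
	unsigned long long map_extra; /* imagine this was added in a later release */
};

/* True if the caller's struct is large enough to contain the given field. */
#define DEMO_OPTS_HAS(opts, field) \
	((opts) && (opts)->sz >= offsetof(struct demo_create_opts, field) + \
				 sizeof((opts)->field))

/* Read a field, or fall back to a default if the caller's struct lacks it. */
#define DEMO_OPTS_GET(opts, field, dflt) \
	(DEMO_OPTS_HAS(opts, field) ? (opts)->field : (dflt))

/* Library-side accessor: tolerates NULL opts and older, smaller structs. */
static unsigned long long demo_effective_extra(const struct demo_create_opts *opts)
{
	return DEMO_OPTS_GET(opts, map_extra, 0ULL);
}
```

A caller built against an older header sets `sz` to the smaller size, so the library falls back to the default for `map_extra` instead of reading past what the caller initialized. New fields can therefore be appended without changing the function signature.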
Andrii Nakryiko	5c31bcf220	libbpf: Load global data maps lazily on legacy kernels Load global data maps lazily, if kernel is too old to support global data. Make sure that programs are still correct by detecting if any of the to-be-loaded programs have relocation against any of such maps. This allows to solve the issue ([0]) with bpf_printk() and Clang generating unnecessary and unreferenced .rodata.strX.Y sections, but it also goes further along the CO-RE lines, allowing to have a BPF object in which some code can work on very old kernels and relies only on BPF maps explicitly, while other BPF programs might enjoy global variable support. If such programs are correctly set to not load at runtime on old kernels, bpf_object will load and function correctly now. [0] https://lore.kernel.org/bpf/CAK-59YFPU3qO+_pXWOH+c1LSA=8WA1yabJZfREjOEXNHAqgXNg@mail.gmail.com/ Fixes: aed659170a31 ("libbpf: Support multiple .rodata.* and .data.* BPF maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211123200105.387855-1-andrii@kernel.org	2021-11-26 13:51:29 -08:00
Andrii Nakryiko	5b4dbd8141	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: d41bc48bfab2076f7db88d079a3a3203dd9c4a54 Checkpoint bpf-next commit: fa721d4f0b91f525339996f4faef7bb072d70162 Baseline bpf commit: 099f896f498a2b26d84f4ddae039b2c542c18b48 Checkpoint bpf commit: c0d95d3380ee099d735e08618c0d599e72f6c8b0 Andrii Nakryiko (2): libbpf: Add runtime APIs to query libbpf version libbpf: Accommodate DWARF/compiler bug with duplicated structs Dave Tucker (1): bpf, docs: Fix ordering of bpf documentation Florent Revest (1): libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags docs/index.rst \| 4 ++-- src/btf.c \| 45 +++++++++++++++++++++++++++++++++++++++++---- src/libbpf.c \| 23 +++++++++++++++++++++-- src/libbpf.h \| 6 +++++- src/libbpf.map \| 5 ++++- 5 files changed, 73 insertions(+), 10 deletions(-) -- 2.30.2	2021-11-23 23:04:18 -08:00
Florent Revest	14e12f4290	libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags bpf_program__set_extra_flags has just been introduced so we can still change it without breaking users. This new interface is a bit more flexible (for example if someone wants to clear a flag). Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org	2021-11-23 23:04:18 -08:00
Andrii Nakryiko	60ce9af668	libbpf: Accommodate DWARF/compiler bug with duplicated structs According to [0], compilers sometimes might produce duplicate DWARF definitions for exactly the same struct/union within the same compilation unit (CU). We've had similar issues with identical arrays and handled them with a similar workaround in 6b6e6b1d09aa ("libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays"). Do the same for struct/union by ensuring that two structs/unions are exactly the same, down to the integer values of field referenced type IDs. Solving this more generically (allowing referenced types to be equivalent, but using different type IDs, all within a single CU) requires a huge complexity increase to handle many-to-many mappings between canonical and candidate type graphs. Before we invest in that, let's see if this approach handles all the instances of this issue in practice. Thankfully it's pretty rare, it seems. [0] https://lore.kernel.org/bpf/YXr2NFlJTAhHdZqq@krava/ Reported-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211117194114.347675-1-andrii@kernel.org	2021-11-23 23:04:18 -08:00
Andrii Nakryiko	d29571725a	libbpf: Add runtime APIs to query libbpf version Libbpf provided LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros to check libbpf version at compilation time. This doesn't cover all the needs, though, because version of libbpf that application is compiled against doesn't necessarily match the version of libbpf at runtime, especially if libbpf is used as a shared library. Add libbpf_major_version() and libbpf_minor_version() returning major and minor versions, respectively, as integers. Also add a convenience libbpf_version_string() for various tooling using libbpf to print out libbpf version in a human-readable form. Currently it will return "v0.6", but in the future it can contain some extra information, so the format itself is not part of a stable API and shouldn't be relied upon. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211118174054.2699477-1-andrii@kernel.org	2021-11-23 23:04:18 -08:00
Dave Tucker	842c5b7bff	bpf, docs: Fix ordering of bpf documentation This commit fixes the display of the BPF documentation in the sidebar when rendered as HTML. Before this patch, the sidebar would render as follows for some sections: \| BPF Documentation \|- BPF Type Format (BTF) \|- BPF Type Format (BTF) This was due to creating a heading in index.rst followed by a sphinx toctree, where the file referenced carries the same title as the section heading. To fix this I applied a pattern that has been established in other subfolders of Documentation: 1. Re-wrote index.rst to have a single toctree 2. Split the sections out in to their own files Additionally maps.rst and programs.rst make use of a glob pattern to include map_* or prog_* rst files in their toctree, meaning future map or program type documentation will be automatically included. Signed-off-by: Dave Tucker <dave@dtucker.co.uk> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1a1eed800e7b9dc13b458de113a489641519b0cc.1636749493.git.dave@dtucker.co.uk	2021-11-23 23:04:18 -08:00
Quentin Monnet	9109d6a4b4	ci: create summary for tests and account for bpftool checks result The bpftool checks work as expected when the CI runs, except that they do not set any error status code for the script on error, which means that the failures are lost among the logs and not reported in a clear way to the reviewers. This commit aims at fixing the issue. We could simply exit with a non-zero error status when the bpftool checks fail, but that would prevent the other tests from running. Instead, we propose to store the result of the bpftool checks in a bash array. This array can later be reused to print a summary of the different groups of tests, at the end of the CI run, to help the reviewers understand where the failure happened without having to manually unfold all the sections on the GitHub interface. Currently, there are only two groups: the bpftool checks and the "VM tests". The latter may later be split into test_maps, test_progs, test_progs-no_alu32, etc. by teaching each of them to append their exit status code to the "exitstatus" file. Fixes: `88649fe655` ("ci: run script to test bpftool types/options sync")	2021-11-18 11:19:36 -08:00
Quentin Monnet	eab19ffead	ci: pass shutdown fold description to fold command For displaying a coloured title for the shutdown section in the logs, instead of having the colour control codes directly written in run.sh, we can pass the section title as an argument to "travis_fold()" and have it format and print it for us. This is cleaner, and slightly more in-line with what we do in the CI files of the vmtest repository.	2021-11-18 11:19:36 -08:00
Andrii Nakryiko	94a49850c5	Makefile: enforce gnu89 standard libbpf conforms to kernel style and uses the same -std=gnu89 standard for compilation. So enforce it on Github projection as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-16 13:17:03 -08:00
Andrii Nakryiko	d71409b508	Makefile: don't hide relevant parts of file path File path has shared vs static path, it's useful to see. So preserve it in pretty output. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-16 13:17:03 -08:00
Andrii Nakryiko	f0ecdeed3a	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9faaffbe85edcbdc54096f7f87baa3bc4842a7e2 Checkpoint bpf-next commit: d41bc48bfab2076f7db88d079a3a3203dd9c4a54 Baseline bpf commit: 5833291ab6de9c3e2374336b51c814e515e8f3a5 Checkpoint bpf commit: 099f896f498a2b26d84f4ddae039b2c542c18b48 Kumar Kartikeya Dwivedi (1): libbpf: Perform map fd cleanup for gen_loader in case of error Tiezhu Yang (1): bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33 Yonghong Song (1): libbpf: Fix a couple of missed btf_type_tag handling in btf.c include/uapi/linux/bpf.h \| 2 +- src/bpf_gen_internal.h \| 4 ++-- src/btf.c \| 2 ++ src/gen_loader.c \| 47 +++++++++++++++++++++++++--------------- src/libbpf.c \| 4 ++-- 5 files changed, 37 insertions(+), 22 deletions(-) -- 2.30.2	2021-11-16 13:16:07 -08:00
Andrii Nakryiko	d924fa62ee	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-11-16 13:16:07 -08:00
Kumar Kartikeya Dwivedi	0f5a62b2d8	libbpf: Perform map fd cleanup for gen_loader in case of error Alexei reported a fd leak issue in gen loader (when invoked from bpftool) [0]. When adding ksym support, map fd allocation was moved from stack to loader map, however I missed closing these fds (relevant when cleanup label is jumped to on error). For the success case, the allocated fd is returned in loader ctx, hence this problem is not noticed. Make three changes. First, use MAX_USED_MAPS in MAX_FD_ARRAY_SZ instead of MAX_USED_PROGS; the braino was not a problem until now for this case as we didn't try to close map fds (otherwise use of it would have tried closing 32 additional fds in ksym btf fd range). Then, do a cleanup for all nr_maps fds in cleanup label code, so that in case of error all temporary map fds from bpf_gen__map_create are closed. Then, adjust the cleanup label to only generate code for the required number of program and map fds. To trim code for remaining program fds, lay out prog_fd array in stack in the end, so that we can directly skip the remaining instances. Still stack size remains same, since changing that would require changes in a lot of places (including adjustment of stack_off macro), so nr_progs_sz variable is only used to track required number of iterations (and jump over cleanup size calculated from that), stack offset calculation remains unaffected. The difference for test_ksyms_module.o is as follows: libbpf: //prog cleanup iterations: before = 34, after = 5 libbpf: //maps cleanup iterations: before = 64, after = 2 Also, move allocation of gen->fd_array offset to bpf_gen__init. Since offset can now be 0, and we already continue even if add_data returns 0 in case of failure, we do not need to distinguish between 0 offset and failure case 0, as we rely on bpf_gen__finish to check errors. We can also skip check for gen->fd_array in add_*_fd functions, since bpf_gen__init will take care of it. 
[0]: https://lore.kernel.org/bpf/CAADnVQJ6jSitKSNKyxOrUzwY2qDRX0sPkJ=VLGHuCLVJ=qOt9g@mail.gmail.com Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211112232022.899074-1-memxor@gmail.com	2021-11-16 13:16:07 -08:00
Tiezhu Yang	5ca49d2b32	bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33 In the current code, the actual max tail call count is 33 which is greater than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance. We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs") and commit f9dabe016b63 ("bpf: Undo off-by-one in interpreter tail call count limit"). In order to avoid changing existing behavior, the actual limit is 33 now, this is reasonable. After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can see there exists failed testcase. On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set: # echo 0 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf # dmesg \| grep -w FAIL Tail call error path, max count reached jited:0 ret 34 != 33 FAIL On some archs: # echo 1 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf # dmesg \| grep -w FAIL Tail call error path, max count reached jited:1 ret 34 != 33 FAIL Although the above failed testcase has been fixed in commit 18935a72eb25 ("bpf/tests: Fix error in tail call limit tests"), it would still be good to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code more readable. The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh. For the riscv JIT, use RV_REG_TCC directly to save one register move as suggested by Björn Töpel. For the other implementations, no function changes, it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT can reflect the actual max tail call count, the related tail call testcases in test_bpf module and selftests can work well for the interpreter and the JIT. 
Here are the test results on x86_64: # uname -m x86_64 # echo 0 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf test_suite=test_tail_calls # dmesg \| tail -1 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed] # rmmod test_bpf # echo 1 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf test_suite=test_tail_calls # dmesg \| tail -1 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed] # rmmod test_bpf # ./test_progs -t tailcalls #142 tailcalls:OK Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn	2021-11-16 13:16:07 -08:00
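The off-by-one discussed in 5ca49d2b32 can be checked with a small counting model. This is a simplified sketch, not the kernel interpreter: `count_tail_calls` is hypothetical, and the two comparison styles (`>` against 32 before the change, `>=` against 33 after it) are an assumption about how the check was expressed.

```c
/*
 * Count how many tail calls succeed before the limit check trips.
 * old_style_check != 0 models "cnt > max_cnt", otherwise "cnt >= max_cnt".
 */
static int count_tail_calls(unsigned int max_cnt, int old_style_check)
{
	unsigned int tail_call_cnt = 0;
	int performed = 0;

	for (;;) {
		int over = old_style_check ? tail_call_cnt > max_cnt
					   : tail_call_cnt >= max_cnt;

		if (over)
			break; /* limit reached, tail call is refused */
		tail_call_cnt++;
		performed++;
	}
	return performed;
}
```

Under this model both `count_tail_calls(32, 1)` and `count_tail_calls(33, 0)` allow 33 tail calls, which matches the commit's point that the behavior stays the same while the constant now states the real limit.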
Yonghong Song	219c8e11e0	libbpf: Fix a couple of missed btf_type_tag handling in btf.c Commit 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG") added the BTF_KIND_TYPE_TAG support. But to test vmlinux build with ... #define __user __attribute__((btf_type_tag("user"))) ... I needed to sync libbpf repo and manually copy libbpf sources to pahole. To simplify process, I used BTF_KIND_RESTRICT to simulate the BTF_KIND_TYPE_TAG with vmlinux build as "restrict" modifier is barely used in kernel. But this approach missed one case in dedup with structures where BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in btf_dedup_is_equiv(), and this will result in a pahole dedup failure. This patch fixed this issue and a selftest is added in the subsequent patch to test this scenario. The other missed handling is in btf__resolve_size(). Currently the compiler always emits like PTR->TYPE_TAG->... so in practice we don't hit the missing BTF_KIND_TYPE_TAG handling issue with compiler generated code. But let's add case BTF_KIND_TYPE_TAG in the switch statement to be future proof. Fixes: 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com	2021-11-16 13:16:07 -08:00
Ilya Leoshkevich	140b902274	ci: add s390x vmtest Run it on the self-hosted builder with tag "z15". Also add the infrastructure code for the self-hosted builder. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	1987a34fc9	vmtest: use libguestfs for disk image manipulations Running vmtest inside a container removes the ability to use certain root powers, among other things - mounting arbitrary images. Use libguestfs in order to avoid having to mount anything. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	3b1714aa92	vmtest: add s390x blacklist A lot of tests in test_progs fail due to the missing trampoline implementation on s390x (and a handful for other reasons). Yet, a lot more pass, so disabling test_progs altogether is too heavy-handed. So add a mechanism for arch-specific blacklists (as discussed in [1]) and introduce a s390x blacklist, that simply reflects the status quo. [1] https://github.com/libbpf/libbpf/pull/204#discussion_r601768628 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	554054d876	vmtest: tweak qemu invocation for s390x We need a different binary and console. Also use a fixed number of cores in order to avoid OOM in case a builder has too many of them. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	26e196d449	vmtest: add s390x image Generated by simply running mkrootfs_debian.sh. Also use $ARCH as an image name prefix. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	3fac0b3d08	vmtest: add s390x config Select the current config based on $ARCH value and thus rename the existing config to config-latest.$ARCH. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	ac4a0fa400	vmtest: add debootstrap-based mkrootfs script The existing mkrootfs.sh is based on Arch Linux, which supports only x86_64. Add mkrootfs_debian.sh: a debootstrap-based script. Debian was chosen, because it supports more architectures than other mainstream distros. Move init setup to mkrootfs_tweak.sh, rename the existing Arch script to mkrootfs_arch.sh. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	6ad73f5083	vmtest: do not install lld s390x LLVM does not have it, and it is not needed for the libbpf CI. So drop it. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Ilya Leoshkevich	8a52e49575	vmtest: use python3-docutils instead of python-docutils There is no python-docutils on Debian Bullseye, but python3-docutils exists everywhere. Since Python 2 is EOL anyway, use the Python 3 version. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-11-15 22:39:49 -08:00
Andrii Nakryiko	5b9d079c7f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b8b5cb55f5d3f03cc1479a3768d68173a10359ad Checkpoint bpf-next commit: 9faaffbe85edcbdc54096f7f87baa3bc4842a7e2 Baseline bpf commit: 47b3708c6088a60e7dc3b809dbb0d4c46590b32f Checkpoint bpf commit: 5833291ab6de9c3e2374336b51c814e515e8f3a5 Andrii Nakryiko (11): libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS libbpf: Pass number of prog load attempts explicitly libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load() libbpf: Remove internal use of deprecated bpf_prog_load() variants libbpf: Stop using to-be-deprecated APIs libbpf: Remove deprecation attribute from struct bpf_prog_prep_result libbpf: Free up resources used by inner map definition libbpf: Add ability to get/set per-program load flags libbpf: Turn btf_dedup_opts into OPTS-based struct libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof libbpf: Make perf_buffer__new() use OPTS-based interface Mark Pashmfouroush (1): bpf: Add ingress_ifindex to bpf_sk_lookup Song Liu (1): bpf: Introduce helper bpf_find_vma Yonghong Song (2): bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes libbpf: Support BTF_KIND_TYPE_TAG include/uapi/linux/bpf.h \| 21 +++ include/uapi/linux/btf.h \| 3 +- src/bpf.c \| 166 +++++++++++++--------- src/bpf.h \| 74 +++++++++- src/bpf_gen_internal.h \| 8 +- src/btf.c \| 69 ++++++--- src/btf.h \| 80 +++++++++-- src/btf_dump.c \| 40 ++++-- src/gen_loader.c \| 30 ++-- src/libbpf.c \| 297 +++++++++++++++++++++++---------------- src/libbpf.h \| 95 ++++++++++--- src/libbpf.map \| 13 ++ src/libbpf_common.h \| 14 +- src/libbpf_internal.h \| 33 +---- src/libbpf_legacy.h \| 1 + src/libbpf_probes.c \| 20 ++- src/linker.c \| 4 +- src/xsk.c \| 34 ++--- 18 files changed, 670 insertions(+), 332 deletions(-) -- 2.30.2	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	bc66d28b68	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-11-12 23:46:09 -08:00
Yonghong Song	98181e0546	libbpf: Support BTF_KIND_TYPE_TAG Add libbpf support for BTF_KIND_TYPE_TAG. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com	2021-11-12 23:46:09 -08:00
Yonghong Song	d932a1a46b	bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes LLVM patches ([1] for clang, [2] and [3] for BPF backend) added support for btf_type_tag attributes. This patch adds support for the kernel. The main motivation for btf_type_tag is to bring kernel annotations __user, __rcu etc. to btf. With such information available in btf, the bpf verifier can detect misuse and reject the program. For example, for a __user tagged pointer, developers can then use a proper helper like bpf_probe_read_user() etc. to read the data. BTF_KIND_TYPE_TAG may also be useful for other tracing facilities where, instead of requiring the user to specify the kernel/user address type, the kernel can detect it by itself with btf. [1] https://reviews.llvm.org/D111199 [2] https://reviews.llvm.org/D113222 [3] https://reviews.llvm.org/D113496 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	011a01594c	libbpf: Make perf_buffer__new() use OPTS-based interface Add new variants of perf_buffer__new() and perf_buffer__new_raw() that use OPTS-based options for future extensibility ([0]). Given all the currently used API names are best fits, re-use them and use the ___libbpf_overload() approach and symbol versioning to preserve ABI and source code compatibility. struct perf_buffer_opts and struct perf_buffer_raw_opts are kept as well, but they are restructured such that they are OPTS-based when used with new APIs. For struct perf_buffer_raw_opts we keep a few fields intact, so we also have to preserve their memory location both when used as OPTS and for legacy API variants. This is achieved with anonymous padding for the OPTS "incarnation" of the struct. These pads can eventually be used for new options. [0] Closes: https://github.com/libbpf/libbpf/issues/311 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-6-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	b9a88a4533	libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof Change btf_dump__new() and the corresponding struct btf_dump_opts to be extensible by using the OPTS "framework" ([0]). Given we don't change the names, we use a similar approach as with bpf_prog_load(), but this time we ended up with two APIs with the same name and same number of arguments, so overloading based on number of arguments with ___libbpf_overload() doesn't work. Instead, use "overloading" based on types. In this particular case, the print callback has to be specified, so we detect which argument is a callback. If it's the 4th (last) argument, the old implementation of the API is used by user code. If not, it must be the 2nd, and thus the new implementation is selected. The rest is handled by the same symbol versioning approach. The btf_ext argument is dropped as it was never used and isn't necessary either. If in the future we need btf_ext, it will be added into the OPTS-based struct btf_dump_opts. struct btf_dump_opts is reused for both old and new APIs. The ctx field is marked deprecated in v0.7+ and it's put at the same memory location as OPTS's sz field. Any user of new-style btf_dump__new() will have to set the sz field and doesn't/shouldn't use ctx, as ctx is now passed along with the callback as a mandatory input argument, consistent with the other APIs in libbpf that accept callbacks. Again, this is quite ugly in implementation, but is done in the name of backwards compatibility and uniform and extensible future APIs (at the same time, sigh). And it will be gone in libbpf 1.0. [0] Closes: https://github.com/libbpf/libbpf/issues/283 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-5-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	969018545d	libbpf: Turn btf_dedup_opts into OPTS-based struct btf__dedup() and struct btf_dedup_opts were added before we figured out the OPTS mechanism. As such, btf_dedup_opts is non-extensible without breaking ABI and potentially crashing user applications. Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct names that would be great to preserve and use going forward. So we use the ___libbpf_overload() macro approach, used previously for the bpf_prog_load() API, to define a new btf__dedup() variant that accepts only struct btf * and struct btf_dedup_opts * arguments, and rename the old btf__dedup() implementation into btf__dedup_deprecated(). This keeps both source and binary compatibility with old and new applications. The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based, and as such doesn't have `size_t sz;` as a first field. But btf__dedup() is a pretty rarely used API and I believe that the only currently known users (besides selftests) are libbpf's own bpf_linker and pahole. Neither use case actually uses options; both just pass NULL. So instead of doing extra hacks, just rewrite struct btf_dedup_opts into an OPTS-based one, move the btf_ext argument into those opts (only bpf_linker needs to dedup btf_ext, so it's not a typical thing to specify), and drop the never used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, and it makes BTF dedup much less useful and efficient). Just in case, for the old implementation, btf__dedup_deprecated(), detect non-NULL options and error out with a helpful message, to help users migrate, if there are any users playing with btf__dedup(). The last remaining piece is dedup_table_size, which is another anachronism from the very early days of BTF dedup. Since then it has been reduced to the only valid value, 1, to request forced hash collisions. This is only used during testing. So instead introduce a bool flag to force collisions explicitly. 
This patch also adapts selftests to new btf__dedup() and btf_dedup_opts use to avoid selftests breakage. [0] Closes: https://github.com/libbpf/libbpf/issues/281 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org	2021-11-12 23:46:09 -08:00
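The OPTS convention this commit relies on (a `size_t sz;` first field that records how large the struct was when the caller was compiled) can be sketched in a few lines. This is a minimal illustration, not libbpf's code: `struct demo_opts`, `DEMO_OPTS_HAS`, `DEMO_OPTS_GET` and `demo_effective_verbosity()` are hypothetical names, and the `verbosity` field is invented to stand in for an option added in a later release.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical OPTS-style struct; sz must stay the first field. */
struct demo_opts {
	size_t sz;             /* size of the struct as the caller compiled it */
	bool force_collisions; /* present since the "first release" */
	int verbosity;         /* assumed to be added in a later release */
};

/* A field is usable only if the caller's struct is large enough to hold it. */
#define DEMO_OPTS_HAS(opts, field) \
	((opts) && (opts)->sz >= offsetof(struct demo_opts, field) + sizeof((opts)->field))

/* Read a field, falling back to a default for NULL or older, smaller structs. */
#define DEMO_OPTS_GET(opts, field, fallback) \
	(DEMO_OPTS_HAS(opts, field) ? (opts)->field : (fallback))

int demo_effective_verbosity(const struct demo_opts *opts)
{
	return DEMO_OPTS_GET(opts, verbosity, 0);
}
```

A caller built against an older header passes a smaller `sz`, so the library never reads past what that caller actually allocated; this is why the struct can grow without breaking ABI.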
Andrii Nakryiko	0e80b7dc3f	libbpf: Add ability to get/set per-program load flags Add bpf_program__flags() API to retrieve prog_flags that will be (or were) supplied to BPF_PROG_LOAD command. Also add bpf_program__set_extra_flags() API to allow to set extra flags, in addition to those determined by program's SEC() definition. Such flags are logically OR'ed with libbpf-derived flags. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211111051758.92283-2-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Mark Pashmfouroush	932800b20b	bpf: Add ingress_ifindex to bpf_sk_lookup It may be helpful to have access to the ifindex during bpf socket lookup. An example may be to scope certain socket lookup logic to specific interfaces, i.e. an interface may be made exempt from custom lookup code. Add the ifindex of the arriving connection to the bpf_sk_lookup API. Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211110111016.5670-2-markpash@cloudflare.com	2021-11-12 23:46:09 -08:00
Song Liu	cfc69268e5	bpf: Introduce helper bpf_find_vma In some profiler use cases, it is necessary to map an address to the backing file, e.g., a shared library. The bpf_find_vma helper provides a flexible way to achieve this. bpf_find_vma maps an address of a task to the vma (vm_area_struct) for this address, and feeds the vma to a callback BPF function. The callback function is necessary here, as we need to ensure mmap_sem is unlocked. It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem safely when irqs are disabled, we use the same mechanism as stackmap with build_id. Specifically, when irqs are disabled, the unlock is postponed to an irq_work. Refactor stackmap.c so that the irq_work is shared among bpf_find_vma and stackmap helpers. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Hengqi Chen <hengqi.chen@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211105232330.1936330-2-songliubraving@fb.com	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	9b2bbdefd5	libbpf: Free up resources used by inner map definition It's not enough to just free(map->inner_map), as inner_map itself can have extra memory allocated, like map name. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20211107165521.9240-3-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	c7236a5342	libbpf: Remove deprecation attribute from struct bpf_prog_prep_result This deprecation annotation has no effect because, for a struct, the deprecation attribute has to be declared after the struct definition. But instead of moving it to the end of the struct definition, remove it. When deprecation goes into effect in libbpf v0.7, this deprecation attribute would cause libbpf's own source code compilation to trigger deprecation warnings, which is unavoidable because libbpf still has to support that API. So keep deprecation of APIs, but don't mark structs used in the API as deprecated. Fixes: e21d585cb3db ("libbpf: Deprecate multi-instance bpf_program APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-8-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	a611094604	libbpf: Stop using to-be-deprecated APIs Remove all the internal uses of libbpf APIs that are slated to be deprecated in v0.7. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-6-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	9b422137af	libbpf: Remove internal use of deprecated bpf_prog_load() variants Remove all the internal uses of bpf_load_program_xattr(), which is slated for deprecation in v0.7. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-5-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	65cdd0c73d	libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load() Add a new unified OPTS-based low-level API for program loading, bpf_prog_load() ([0]). bpf_prog_load() accepts a few "mandatory" parameters as input arguments (program type, name, license, instructions) and moves all the other optional (as in not required to specify for all types of BPF programs) fields into struct bpf_prog_load_opts. This makes all the other non-extensible API variants for BPF_PROG_LOAD obsolete and they are slated for deprecation in libbpf v0.7: - bpf_load_program(); - bpf_load_program_xattr(); - bpf_verify_program(). Implementation-wise, the internal helper libbpf__bpf_prog_load is refactored to become the public bpf_prog_load() API. struct bpf_prog_load_params used internally is replaced by the public struct bpf_prog_load_opts. Unfortunately, while conceptually all this is pretty straightforward, the biggest complication comes from the already existing bpf_prog_load() high-level API, which has nothing to do with the BPF_PROG_LOAD command. We try really hard to have a new API named bpf_prog_load(), though, because it maps naturally to the BPF_PROG_LOAD command. For that, we rename the old bpf_prog_load() into bpf_prog_load_deprecated() and mark it as COMPAT_VERSION() for shared library users compiled against an old version of libbpf. Statically linked users and shared lib users compiled against the new version of libbpf headers will get "rerouted" to bpf_prog_load_deprecated() through a macro helper that decides whether to use the new or old bpf_prog_load() based on the number of input arguments (see ___libbpf_overload in libbpf_common.h). To test that existing bpf_prog_load()-using code compiles and works as expected, I've compiled and run selftests as is. I had to remove (locally) the selftest/bpf/Makefile -Dbpf_prog_load=bpf_prog_test_load hack because it was conflicting with the macro-based overload approach. I don't expect anyone else to do something like this in practice, though. 
This is a testing-specific way to replace bpf_prog_load() calls with a special testing variant of it, which adds an extra prog_flags value. After testing I kept this selftests hack, but ensured that we use the new bpf_prog_load_deprecated name for this. This patch also marks bpf_prog_load() and bpf_prog_load_xattr() as deprecated. The bpf_object interface has to be used for working with struct bpf_program. Libbpf doesn't support loading just a bpf_program. The silver lining is that when we get to libbpf 1.0 all these complications will be gone and we'll have one clean bpf_prog_load() low-level API with no backwards compatibility hackery surrounding it. [0] Closes: https://github.com/libbpf/libbpf/issues/284 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103220845.2676888-4-andrii@kernel.org	2021-11-12 23:46:09 -08:00
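The argument-count dispatch described in this commit can be sketched with standard variadic macros. The snippet below is an illustration only: every `DEMO_`/`demo_` name is hypothetical, the two functions merely report which implementation was selected, and the real helper macros live in libbpf_common.h.

```c
/* Which implementation handled the call (demo-only return values). */
enum demo_impl { DEMO_IMPL_LEGACY = 1, DEMO_IMPL_OPTS = 2 };

/* Hypothetical old entry point: two arguments. */
int demo_load_legacy(const char *path, int type)
{
	(void)path; (void)type;
	return DEMO_IMPL_LEGACY;
}

/* Hypothetical new entry point: four arguments. */
int demo_load_opts(int type, const char *name, const void *insns, const void *opts)
{
	(void)type; (void)name; (void)insns; (void)opts;
	return DEMO_IMPL_OPTS;
}

/* Count the arguments (1 to 6), then paste the count onto a name prefix. */
#define DEMO_CAT(a, b) a ## b
#define DEMO_SELECT(name, num) DEMO_CAT(name, num)
#define DEMO_NTH(_1, _2, _3, _4, _5, _6, N, ...) N
#define DEMO_CNT(...) DEMO_NTH(__VA_ARGS__, 6, 5, 4, 3, 2, 1)
#define DEMO_OVERLOAD(name, ...) DEMO_SELECT(name, DEMO_CNT(__VA_ARGS__))(__VA_ARGS__)

/* One public name; the argument count decides which function is called. */
#define demo_load(...) DEMO_OVERLOAD(demo_load_, __VA_ARGS__)
#define demo_load_2(path, type) demo_load_legacy(path, type)
#define demo_load_4(type, name, insns, opts) demo_load_opts(type, name, insns, opts)
```

With these definitions, `demo_load("prog.o", 1)` expands to the two-argument legacy function and a four-argument call expands to the new one. The selection happens entirely in the preprocessor, which is why it only helps code that is recompiled against the new headers; already-built binaries depend on symbol versioning instead.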
Andrii Nakryiko	6b2db898cc	libbpf: Pass number of prog load attempts explicitly Allow to control number of BPF_PROG_LOAD attempts from outside the sys_bpf_prog_load() helper. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-3-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	ea6c242fc6	libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS It's confusing that libbpf-provided helper macro doesn't start with LIBBPF. Also "declare" vs "define" is confusing terminology, I can never remember and always have to look up previous examples. Bypass both issues by renaming DECLARE_LIBBPF_OPTS into a short and clean LIBBPF_OPTS. To avoid breaking existing code, provide: #define DECLARE_LIBBPF_OPTS LIBBPF_OPTS in libbpf_legacy.h. We can decide later if we ever want to remove it or we'll keep it forever because it doesn't add any maintainability burden. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20211103220845.2676888-2-andrii@kernel.org	2021-11-12 23:46:09 -08:00
Andrii Nakryiko	26e768783c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 8388092b2551f7ae34dad800ce828779f7c948c9 Checkpoint bpf-next commit: b8b5cb55f5d3f03cc1479a3768d68173a10359ad Baseline bpf commit: c08455dec5acf4668f5d1eb099f7fedb29f2de5f Checkpoint bpf commit: 47b3708c6088a60e7dc3b809dbb0d4c46590b32f Andrii Nakryiko (7): libbpf: Detect corrupted ELF symbols section libbpf: Improve sanity checking during BTF fix up libbpf: Validate that .BTF and .BTF.ext sections contain data libbpf: Fix section counting logic libbpf: Improve ELF relo sanitization libbpf: Deprecate bpf_program__load() API libbpf: Fix non-C89 loop variable declaration in gen_loader.c Mehrdad Arshad Rad (1): libbpf: Fix lookup_and_delete_elem_flags error reporting src/bpf.c \| 4 ++- src/gen_loader.c \| 3 +- src/libbpf.c \| 79 +++++++++++++++++++++++++++++++----------------- src/libbpf.h \| 4 +-- 4 files changed, 59 insertions(+), 31 deletions(-) -- 2.30.2	2021-11-06 19:33:03 -07:00
Mehrdad Arshad Rad	88209a3c44	libbpf: Fix lookup_and_delete_elem_flags error reporting Fix bpf_map_lookup_and_delete_elem_flags() to pass the return code through libbpf_err_errno() as we do similarly in bpf_map_lookup_and_delete_elem(). Fixes: f12b65432728 ("libbpf: Streamline error reporting for low-level APIs") Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211104171354.11072-1-arshad.rad@gmail.com	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	2ab2615926	libbpf: Fix non-C89 loop variable declaration in gen_loader.c Fix the `int i` declaration inside the for statement. This is non-C89 compliant. See [0] for user report breaking BCC build. [0] https://github.com/libbpf/libbpf/issues/403 Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20211105191055.3324874-1-andrii@kernel.org	2021-11-06 19:33:03 -07:00
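For readers unfamiliar with the C89 restriction, a small sketch of the compliant form follows. `demo_sum()` is a hypothetical function, not code from gen_loader.c; the point is only where the loop variable is declared.

```c
/* C89-compatible: the loop variable is declared at the top of the block,
 * not inside the for statement (for (int i = 0; ...) requires C99). */
int demo_sum(const int *vals, int n)
{
	int i, sum = 0;

	for (i = 0; i < n; i++)
		sum += vals[i];
	return sum;
}
```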
Andrii Nakryiko	c03b183a6e	libbpf: Deprecate bpf_program__load() API Mark bpf_program__load() as deprecated ([0]) since v0.6. Also rename few internal program loading bpf_object helper functions to have more consistent naming. [0] Closes: https://github.com/libbpf/libbpf/issues/301 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211103051449.1884903-1-andrii@kernel.org	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	36cc591ac8	libbpf: Improve ELF relo sanitization Add few sanity checks for relocations to prevent div-by-zero and out-of-bounds array accesses in libbpf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-6-andrii@kernel.org	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	3acf7c289a	libbpf: Fix section counting logic e_shnum does include section #0 and as such is exactly the number of ELF sections that we need to allocate memory for to use section indices as array indices. Fix the off-by-one error. This is purely an accounting fix: previously we were allocating one array item too many. There were no correctness errors otherwise. Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-5-andrii@kernel.org	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	a383b3e200	libbpf: Validate that .BTF and .BTF.ext sections contain data .BTF and .BTF.ext ELF sections should have SHT_PROGBITS type and contain data. If they are not, ELF is invalid or corrupted, so bail out. Otherwise this can lead to data->d_buf being NULL and SIGSEGV later on. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-4-andrii@kernel.org	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	2f52e2afc0	libbpf: Improve sanity checking during BTF fix up If BTF is corrupted DATASEC's variable type ID might be incorrect. Prevent this easy to detect situation with extra NULL check. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-3-andrii@kernel.org	2021-11-06 19:33:03 -07:00
Andrii Nakryiko	738277b773	libbpf: Detect corrupted ELF symbols section Prevent divide-by-zero if ELF is corrupted and has zero sh_entsize. Reported by oss-fuzz project. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211103173213.1376990-2-andrii@kernel.org	2021-11-06 19:33:03 -07:00
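The kind of guard this commit describes can be sketched as follows. This is an assumption-laden illustration rather than the actual libbpf check: `demo_section_entry_count()` is a hypothetical helper, and returning `-EINVAL` is a choice made for the demo.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: number of fixed-size entries in an ELF section.
 * sh_size and sh_entsize are taken straight from an untrusted header. */
int demo_section_entry_count(uint64_t sh_size, uint64_t sh_entsize, uint64_t *count)
{
	if (sh_entsize == 0)
		return -EINVAL; /* corrupted header: refuse instead of dividing by zero */

	*count = sh_size / sh_entsize;
	return 0;
}
```

On failure the output parameter is left untouched, so a caller cannot accidentally consume a stale or partial count.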
Andrii Nakryiko	16dfb4ffe4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 36e70b9b06bf14a0fac87315f0e73d6e17e80aad Checkpoint bpf-next commit: 8388092b2551f7ae34dad800ce828779f7c948c9 Baseline bpf commit: 72f898ca0ab85fde6facf78b14d9f67a4a7b32d1 Checkpoint bpf commit: c08455dec5acf4668f5d1eb099f7fedb29f2de5f Dave Marchevsky (1): libbpf: Deprecate bpf_program__get_prog_info_linear Joanne Koong (1): bpf: Add alignment padding for "map_extra" + consolidate holes Magnus Karlsson (1): libbpf: Deprecate AF_XDP support include/uapi/linux/bpf.h \| 1 + src/libbpf.h \| 3 ++ src/xsk.h \| 90 +++++++++++++++++++++++----------------- 3 files changed, 56 insertions(+), 38 deletions(-) -- 2.30.2	2021-11-03 16:00:04 -07:00
Dave Marchevsky	826770613d	libbpf: Deprecate bpf_program__get_prog_info_linear As part of the road to libbpf 1.0, and discussed in libbpf issue tracker [0], bpf_program__get_prog_info_linear and its associated structs and helper functions should be deprecated. The functionality is too specific to the needs of 'perf', and there's little/no out-of-tree usage to preclude introduction of a more general helper in the future. [0] Closes: https://github.com/libbpf/libbpf/issues/313 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211101224357.2651181-5-davemarchevsky@fb.com	2021-11-03 16:00:04 -07:00
Magnus Karlsson	277846bc6c	libbpf: Deprecate AF_XDP support Deprecate AF_XDP support in libbpf ([0]). This has been moved to libxdp as it is a better fit for that library. The AF_XDP support only uses the public libbpf functions and can therefore just use libbpf as a library from libxdp. The libxdp APIs are exactly the same so it should just be linking with libxdp instead of libbpf for the AF_XDP functionality. If not, please submit a bug report. Linking with both libraries is supported but make sure you link in the correct order so that the new functions in libxdp are used instead of the deprecated ones in libbpf. Libxdp can be found at https://github.com/xdp-project/xdp-tools. [0] Closes: https://github.com/libbpf/libbpf/issues/270 Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20211029090111.4733-1-magnus.karlsson@gmail.com	2021-11-03 16:00:04 -07:00
Joanne Koong	1e97e84c86	bpf: Add alignment padding for "map_extra" + consolidate holes This patch makes 2 changes regarding alignment padding for the "map_extra" field. 1) In the kernel header, "map_extra" and "btf_value_type_id" are rearranged to consolidate the hole. Before: struct bpf_map { ... u32 max_entries; /* 36 4 */ u32 map_flags; /* 40 4 */ /* XXX 4 bytes hole, try to pack */ u64 map_extra; /* 48 8 */ int spin_lock_off; /* 56 4 */ int timer_off; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ u32 id; /* 64 4 */ int numa_node; /* 68 4 */ ... bool frozen; /* 117 1 */ /* XXX 10 bytes hole, try to pack */ /* --- cacheline 2 boundary (128 bytes) --- */ ... struct work_struct work; /* 144 72 */ /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */ struct mutex freeze_mutex; /* 216 144 */ /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */ u64 writecnt; /* 360 8 */ /* size: 384, cachelines: 6, members: 26 */ /* sum members: 354, holes: 2, sum holes: 14 */ /* padding: 16 */ /* forced alignments: 2, forced holes: 1, sum forced holes: 10 */ } __attribute__((__aligned__(64))); After: struct bpf_map { ... u32 max_entries; /* 36 4 */ u64 map_extra; /* 40 8 */ u32 map_flags; /* 48 4 */ int spin_lock_off; /* 52 4 */ int timer_off; /* 56 4 */ u32 id; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ int numa_node; /* 64 4 */ ... bool frozen; /* 113 1 */ /* XXX 14 bytes hole, try to pack */ /* --- cacheline 2 boundary (128 bytes) --- */ ... 
struct work_struct work; /* 144 72 */ /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */ struct mutex freeze_mutex; /* 216 144 */ /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */ u64 writecnt; /* 360 8 */ /* size: 384, cachelines: 6, members: 26 */ /* sum members: 354, holes: 1, sum holes: 14 */ /* padding: 16 */ /* forced alignments: 2, forced holes: 1, sum forced holes: 14 */ } __attribute__((__aligned__(64))); 2) Add alignment padding to the bpf_map_info struct More details can be found in commit 36f9814a494a ("bpf: fix uapi hole for 32 bit compat applications") Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com	2021-11-03 16:00:04 -07:00
Andrii Nakryiko	17d7f04e7c	include: add BPF_ALU32_IMM macro implementation BPF_ALU32_IMM is now used in gen_loader.c. Add its definition to include/linux/filter.h header. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-11-01 15:10:25 -07:00
Andrii Nakryiko	c4f9ee9fbb	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c825f5fee19caf301d9821cd79abaa734322de26 Checkpoint bpf-next commit: 36e70b9b06bf14a0fac87315f0e73d6e17e80aad Baseline bpf commit: 04f8ef5643bcd8bcde25dfdebef998aea480b2ba Checkpoint bpf commit: 72f898ca0ab85fde6facf78b14d9f67a4a7b32d1 Andrii Nakryiko (4): libbpf: Fix off-by-one bug in bpf_core_apply_relo() libbpf: Add ability to fetch bpf_program's underlying instructions libbpf: Deprecate multi-instance bpf_program APIs libbpf: Deprecate ambiguously-named bpf_program__size() API Björn Töpel (1): riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h Ilya Leoshkevich (2): libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED() libbpf: Use __BYTE_ORDER__ Joanne Koong (2): bpf: Add bloom filter map implementation libbpf: Add "map_extra" as a per-map-type extra flag Joe Burton (1): libbpf: Deprecate bpf_objects_list Kumar Kartikeya Dwivedi (5): bpf: Add bpf_kallsyms_lookup_name helper libbpf: Add typeless ksym support to gen_loader libbpf: Add weak ksym support to gen_loader libbpf: Ensure that BPF syscall fds are never 0, 1, or 2 libbpf: Use O_CLOEXEC uniformly when opening fds include/uapi/linux/bpf.h \| 25 ++++++++ src/bpf.c \| 62 ++++++++++++++---- src/bpf_core_read.h \| 2 +- src/bpf_gen_internal.h \| 14 ++-- src/bpf_tracing.h \| 32 ++++++++++ src/btf.c \| 6 +- src/btf_dump.c \| 8 +-- src/gen_loader.c \| 135 ++++++++++++++++++++++++++++++++++----- src/libbpf.c \| 105 +++++++++++++++++++++--------- src/libbpf.h \| 49 ++++++++++++-- src/libbpf.map \| 4 ++ src/libbpf_internal.h \| 49 +++++++++++++- src/libbpf_legacy.h \| 6 ++ src/libbpf_probes.c \| 2 +- src/linker.c \| 16 ++--- src/relo_core.c \| 2 +- src/xsk.c \| 6 +- 17 files changed, 432 insertions(+), 91 deletions(-) -- 2.30.2	2021-11-01 15:10:25 -07:00
Andrii Nakryiko	6fd2ee5486	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-11-01 15:10:25 -07:00
Björn Töpel	7beaa2ef90	riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h. Signed-off-by: Björn Töpel <bjorn@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211028161057.520552-4-bjorn@kernel.org	2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi	a0195b3078	libbpf: Use O_CLOEXEC uniformly when opening fds There are some instances where we don't use O_CLOEXEC when opening an fd, fix these up. Otherwise, it is possible that a parallel fork causes these fds to leak into a child process on execve. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-6-memxor@gmail.com	2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi	bedab00b50	libbpf: Ensure that BPF syscall fds are never 0, 1, or 2 Add a simple wrapper for passing an fd and getting a new one >= 3 if it is one of 0, 1, or 2. There are two primary reasons to make this change: First, libbpf relies on the assumption a certain BPF fd is never 0 (e.g. most recently noticed in [0]). Second, Alexei pointed out in [1] that some environments reset stdin, stdout, and stderr if they notice an invalid fd at these numbers. To protect against both these cases, switch all internal BPF syscall wrappers in libbpf to always return an fd >= 3. We only need to modify the syscall wrappers and not other code that assumes a valid fd by doing >= 0, to avoid pointless churn, and because it is still a valid assumption. The cost paid is two additional syscalls if fd is in range [0, 2]. [0]: e31eec77e4ab ("bpf: selftests: Fix fd cleanup in get_branch_snapshot") [1]: https://lore.kernel.org/bpf/CAADnVQKVKY8o_3aU8Gzke443+uHa-eGoM0h7W4srChMXU1S4Bg@mail.gmail.com Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-5-memxor@gmail.com	2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi	c95bf6714d	libbpf: Add weak ksym support to gen_loader This extends existing ksym relocation code to also support relocating weak ksyms. Care needs to be taken to zero out the src_reg (currently BPF_PSEUDO_BTF_ID, always set for gen_loader by bpf_object__relocate_data) when the BTF ID lookup fails at runtime. This is not a problem for libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only proceeds in case of failure if ext->is_weak, leading to src_reg remaining as 0 for weak unresolved ksym). A pattern similar to emit_relo_kfunc_btf is followed of first storing the default values and then jumping over actual stores in case of an error. For src_reg adjustment, we also need to perform it when copying the populated instruction, so depending on if copied insn[0].imm is 0 or not, we decide to jump over the adjustment. We cannot reach that point unless the ksym was weak and resolved and zeroed out, as the emit_check_err will cause us to jump to cleanup label, so we do not need to recheck whether the ksym is weak before doing the adjustment after copying BTF ID and BTF FD. This is consistent with how libbpf relocates weak ksym. Logging statements are added to show the relocation result and aid debugging. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com	2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi	8e697cf9fd	libbpf: Add typeless ksym support to gen_loader This uses the bpf_kallsyms_lookup_name helper added in previous patches to relocate typeless ksyms. The return value ENOENT can be ignored, and the value written to 'res' can be directly stored to the insn, as it is overwritten to 0 on lookup failure. For repeating symbols, we can simply copy the previously populated bpf_insn. Also, we need to take care to not close fds for typeless ksym_desc, so reuse the 'off' member's space to add a marker for typeless ksym and use that to skip them in cleanup_relos. We add a emit_ksym_relo_log helper that avoids duplicating common logging instructions between typeless and weak ksym (for future commit). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com	2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi	1dd20d7144	bpf: Add bpf_kallsyms_lookup_name helper This helper allows us to get the address of a kernel symbol from inside a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can relocate typeless ksym vars. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com	2021-11-01 15:10:25 -07:00
Joanne Koong	28c8a2c179	libbpf: Add "map_extra" as a per-map-type extra flag This patch adds the libbpf infrastructure for supporting a per-map-type "map_extra" field, whose definition will be idiosyncratic depending on map type. For example, for the bloom filter map, the lower 4 bits of map_extra is used to denote the number of hash functions. Please note that until libbpf 1.0 is here, the "bpf_create_map_params" struct is used as a temporary means for propagating the map_extra field to the kernel. Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com	2021-11-01 15:10:25 -07:00
Joanne Koong	11f873fd5b	bpf: Add bloom filter map implementation This patch adds the kernel-side changes for the implementation of a bpf bloom filter map. The bloom filter map supports peek (determining whether an element is present in the map) and push (adding an element to the map) operations. These operations are exposed to userspace applications through the already existing syscalls in the following way: BPF_MAP_LOOKUP_ELEM -> peek BPF_MAP_UPDATE_ELEM -> push The bloom filter map does not have keys, only values. In light of this, the bloom filter map's API matches that of queue stack maps: user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM which correspond internally to bpf_map_peek_elem/bpf_map_push_elem, and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem APIs to query or add an element to the bloom filter map. When the bloom filter map is created, it must be created with a key_size of 0. For updates, the user will pass in the element to add to the map as the value, with a NULL key. For lookups, the user will pass in the element to query in the map as the value, with a NULL key. In the verifier layer, this requires us to modify the argument type of a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE; as well, in the syscall layer, we need to copy over the user value so that in bpf_map_peek_elem, we know which specific value to query. A few things to please take note of: * If there are any concurrent lookups + updates, the user is responsible for synchronizing this to ensure no false negative lookups occur. * The number of hashes to use for the bloom filter is configurable from userspace. If no number is specified, the default used will be 5 hash functions. The benchmarks later in this patchset can help compare the performance of using different number of hashes on different entry sizes. In general, using more hashes decreases both the false positive rate and the speed of a lookup. * Deleting an element in the bloom filter map is not supported. * The bloom filter map may be used as an inner map. * The "max_entries" size that is specified at map creation time is used to approximate a reasonable bitmap size for the bloom filter, and is not otherwise strictly enforced. If the user wishes to insert more entries into the bloom filter than "max_entries", they may do so but they should be aware that this may lead to a higher false positive rate. Signed-off-by: Joanne Koong <joannekoong@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com	2021-11-01 15:10:25 -07:00
Joe Burton	50041f432d	libbpf: Deprecate bpf_objects_list Add a flag to `enum libbpf_strict_mode' to disable the global `bpf_objects_list', preventing race conditions when concurrent threads call bpf_object__open() or bpf_object__close(). bpf_object__next() will return NULL if this option is set. Callers may achieve the same workflow by tracking bpf_objects in application code. [0] Closes: https://github.com/libbpf/libbpf/issues/293 Signed-off-by: Joe Burton <jevburton@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com	2021-11-01 15:10:25 -07:00
Ilya Leoshkevich	87a9622982	libbpf: Use __BYTE_ORDER__ Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined __BYTE_ORDER for consistency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026010831.748682-3-iii@linux.ibm.com	2021-11-01 15:10:25 -07:00
Ilya Leoshkevich	5b732fc1d8	libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED() __BYTE_ORDER is supposed to be defined by a libc, and __BYTE_ORDER__ - by a compiler. bpf_core_read.h checks __BYTE_ORDER == __LITTLE_ENDIAN, which is true if neither are defined, leading to incorrect behavior on big-endian hosts if libc headers are not included, which is often the case. Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211026010831.748682-2-iii@linux.ibm.com	2021-11-01 15:10:25 -07:00
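The failure mode in 5b732fc1d8 comes from a preprocessor rule: inside `#if`, an identifier that is not defined as a macro evaluates to 0, so `__BYTE_ORDER == __LITTLE_ENDIAN` becomes `0 == 0` when no libc header has defined either name. A minimal sketch of the compiler-macro approach follows, assuming GCC or Clang (which predefine `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__`); the function names are hypothetical and this is not the code from bpf_core_read.h.

```c
#include <stdint.h>

/* Compile-time detection using compiler-provided macros, which exist
 * even when no libc header has been included. */
static int is_little_endian_compile_time(void)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	return 1;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	return 0;
#else
#error "unknown byte order (this sketch assumes GCC or Clang)"
#endif
}

/* Runtime cross-check: look at the first byte in memory of the value 1. */
static int is_little_endian_runtime(void)
{
	uint32_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}
```

On any one target both functions should agree; the compile-time variant is the one that stays correct when libc headers are absent.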
Andrii Nakryiko	ffc5139acd	libbpf: Deprecate ambiguously-named bpf_program__size() API The name of the API doesn't convey clearly that this size is in number of bytes (there needed to be a separate comment to make this clear in libbpf.h). Further, measuring the size of BPF program in bytes is not exactly the best fit, because BPF programs always consist of 8-byte instructions. As such, bpf_program__insn_cnt() is a better alternative in pretty much any imaginable case. So schedule bpf_program__size() deprecation starting from v0.7 and it will be removed in libbpf 1.0. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-5-andrii@kernel.org	2021-11-01 15:10:25 -07:00
Andrii Nakryiko	cfbdceb99d	libbpf: Deprecate multi-instance bpf_program APIs Schedule deprecation of a set of APIs that are related to multi-instance bpf_programs: - bpf_program__set_prep() ([0]); - bpf_program__{set,unset}_instance() ([1]); - bpf_program__nth_fd(). These APIs are obscure, very niche, and don't seem to be used much in practice. bpf_program__set_prep() is pretty useless for anything but the simplest BPF programs, as it doesn't allow to adjust BPF program load attributes, among other things. In short, it already bitrotted and will bitrot some more if not removed. With bpf_program__insns() API, which gives access to post-processed BPF program instructions of any given entry-point BPF program, it's now possible to do whatever necessary adjustments were possible with set_prep() API before, but also more. Given any such use case is automatically an advanced use case, requiring users to stick to low-level bpf_prog_load() APIs and managing their own prog FDs is reasonable. [0] Closes: https://github.com/libbpf/libbpf/issues/299 [1] Closes: https://github.com/libbpf/libbpf/issues/300 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-4-andrii@kernel.org	2021-11-01 15:10:25 -07:00
Andrii Nakryiko	9871f15dd6	libbpf: Add ability to fetch bpf_program's underlying instructions Add APIs providing read-only access to bpf_program BPF instructions ([0]). This is useful for diagnostics purposes, but it also allows a cleaner support for cloning BPF programs after libbpf did all the FD resolution and CO-RE relocations, subprog instructions appending, etc. Currently, cloning BPF program is possible only through hijacking a half-broken bpf_program__set_prep() API, which doesn't really work well for anything but most primitive programs. For instance, set_prep() API doesn't allow adjusting BPF program load parameters which are necessary for loading fentry/fexit BPF programs (the case where BPF program cloning is a necessity if doing some sort of mass-attachment functionality). Given bpf_program__set_prep() API is set to be deprecated, having a cleaner alternative is a must. libbpf internally already keeps track of linear array of struct bpf_insn, so it's not hard to expose it. The only gotcha is that libbpf previously freed instructions array during bpf_object load time, which would make this API much less useful overall, because in between bpf_object__open() and bpf_object__load() a lot of changes to instructions are done by libbpf. So this patch makes libbpf hold onto prog->insns array even after BPF program loading. I think this is a small price for added functionality and improved introspection of BPF program code. See retsnoop PR ([1]) for how it can be used in practice and code savings compared to relying on bpf_program__set_prep(). [0] Closes: https://github.com/libbpf/libbpf/issues/298 [1] https://github.com/anakryiko/retsnoop/pull/1 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-3-andrii@kernel.org	2021-11-01 15:10:25 -07:00
Andrii Nakryiko	93c109c9ee	libbpf: Fix off-by-one bug in bpf_core_apply_relo() Fix instruction index validity check which has off-by-one error. Fixes: 3ee4f5335511 ("libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211025224531.1088894-2-andrii@kernel.org	2021-11-01 15:10:25 -07:00
Quentin Monnet	eaea2bce02	sync: remove redundant test on $BPF_BRANCH The sync-kernel.sh script has two consecutive tests for $BPF_BRANCH being provided by the user (and so the second one can currently never fail). Looking at the error message displayed in each case, we want to keep the second one. Let's remove the first check. Signed-off-by: Quentin Monnet <quentin@isovalent.com>	2021-10-26 14:39:07 -07:00
Quentin Monnet	f05791d8cf	sync: fix comment for commit_signature() (subject instead of hash) The commit_signature() function does not use the hash of the commit, which typically differs between the kernel repo and the mirrored version, but the subject for this commit. Fix the comment accordingly. Signed-off-by: Quentin Monnet <quentin@isovalent.com>	2021-10-25 13:29:16 -07:00
Andrii Nakryiko	2bb8f041b0	README: add links to BPF CO-RE reference guide Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-10-24 16:19:38 -07:00
Evgeny Vereshchagin	50ae3acfe9	[ci] turn on CIFuzz https://google.github.io/oss-fuzz/getting-started/continuous-integration/ Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2021-10-22 18:41:23 -07:00
Andrii Nakryiko	07ba0eeb8e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5319255b8df9271474bc9027cabf82253934f28d Checkpoint bpf-next commit: c825f5fee19caf301d9821cd79abaa734322de26 Baseline bpf commit: 8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a Checkpoint bpf commit: 04f8ef5643bcd8bcde25dfdebef998aea480b2ba Andrii Nakryiko (9): libbpf: Deprecate btf__finalize_data() and move it into libbpf.c libbpf: Extract ELF processing state into separate struct libbpf: Use Elf64-specific types explicitly for dealing with ELF libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps libbpf: Support multiple .rodata.* and .data.* BPF maps libbpf: Simplify look up by name of internal maps libbpf: Fix the use of aligned attribute libbpf: Fix overflow in BTF sanity checks libbpf: Fix BTF header parsing checks Dave Marchevsky (2): libbpf: Migrate internal use of bpf_program__get_prog_info_linear bpf: Add verified_insns to bpf_prog_info and fdinfo Hengqi Chen (2): bpf: Add bpf_skc_to_unix_sock() helper libbpf: Add btf__type_cnt() and btf__raw_data() APIs Ilya Leoshkevich (3): libbpf: Fix dumping big-endian bitfields libbpf: Fix dumping non-aligned __int128 libbpf: Fix ptr_is_aligned() usages Mauricio Vásquez (1): libbpf: Fix memory leak in btf__dedup() Stanislav Fomichev (1): libbpf: Use func name when pinning programs with LIBBPF_STRICT_SEC_NAME Yonghong Song (1): bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG include/uapi/linux/bpf.h \| 8 + include/uapi/linux/btf.h \| 8 +- src/btf.c \| 187 +++----- src/btf.h \| 17 +- src/btf_dump.c \| 56 ++- src/libbpf.c \| 984 ++++++++++++++++++++++++--------------- src/libbpf.map \| 4 +- src/libbpf_internal.h \| 12 +- src/libbpf_legacy.h \| 3 + src/linker.c \| 29 +- 10 files changed, 744 insertions(+), 564 deletions(-) -- 2.30.2	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	b15d479ef7	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	d374094d8c	libbpf: Fix BTF header parsing checks Original code assumed fixed and correct BTF header length. That's not always the case, though, so fix this bug with a proper additional check. And use actual header length instead of sizeof(struct btf_header) in sanity checks. Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211023003157.726961-2-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	19d260d144	libbpf: Fix overflow in BTF sanity checks btf_header's str_off+str_len or type_off+type_len can overflow as they are u32s. This will lead to bypassing the sanity checks during BTF parsing, resulting in crashes afterwards. Fix by using 64-bit signed integers for comparison. Fixes: d8123624506c ("libbpf: Fix BTF data layout checks and allow empty BTF") Reported-by: Evgeny Vereshchagin <evvers@ya.ru> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211023003157.726961-1-andrii@kernel.org	2021-10-22 18:38:55 -07:00
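The overflow fixed in 19d260d144 can be reproduced in isolation. The sketch below is illustrative only: `struct section_bounds` and both function names are hypothetical stand-ins for the `btf_header` offset/length pairs, and the check is simplified to a single comparison against a data size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an (offset, length) pair such as
 * str_off/str_len or type_off/type_len. */
struct section_bounds {
	uint32_t off;
	uint32_t len;
};

/* Buggy variant: the sum is computed in 32 bits and can wrap around,
 * so an out-of-range section may pass the check. */
static bool fits_u32(struct section_bounds s, uint32_t data_size)
{
	return s.off + s.len <= data_size;
}

/* Fixed variant: widen to signed 64-bit before adding, so the sum of
 * two u32 values can no longer wrap. */
static bool fits_s64(struct section_bounds s, uint32_t data_size)
{
	return (int64_t)s.off + (int64_t)s.len <= (int64_t)data_size;
}
```

With `off = 0xFFFFFFF0` and `len = 0x20`, the 32-bit sum wraps to `0x10`, so `fits_u32()` accepts the section for a 100-byte buffer while `fits_s64()` rejects it.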
Stanislav Fomichev	f1558d7a23	libbpf: Use func name when pinning programs with LIBBPF_STRICT_SEC_NAME We can't use section name anymore because they are not unique and pinning objects with multiple programs with the same progtype/secname will fail. [0] Closes: https://github.com/libbpf/libbpf/issues/273 Fixes: 33a2c75c55e2 ("libbpf: add internal pin_name") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20211021214814.1236114-2-sdf@google.com	2021-10-22 18:38:55 -07:00
Hengqi Chen	596c9a2d77	libbpf: Add btf__type_cnt() and btf__raw_data() APIs Add btf__type_cnt() and btf__raw_data() APIs and deprecate btf__get_nr_type() and btf__get_raw_data() since the old APIs don't follow the libbpf naming convention for getters which omit 'get' in the name (see [0]). btf__raw_data() is just an alias to the existing btf__get_raw_data(). btf__type_cnt() now returns the number of all types of the BTF object including 'void'. [0] Closes: https://github.com/libbpf/libbpf/issues/279 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211022130623.1548429-2-hengqi.chen@gmail.com	2021-10-22 18:38:55 -07:00
Mauricio Vásquez	eb10610a3b	libbpf: Fix memory leak in btf__dedup() Free btf_dedup if btf_ensure_modifiable() returns error. Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211022202035.48868-1-mauricio@kinvolk.io	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	d7982f3948	libbpf: Fix the use of aligned attribute Building libbpf sources out of kernel tree (in Github repo) we run into compilation error due to unknown __aligned attribute. It must be coming from some kernel header, which is not available to Github sources. Use explicit __attribute__((aligned(16))) instead. Fixes: 961632d54163 ("libbpf: Fix dumping non-aligned __int128") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211022192502.2975553-1-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	76b4bf4295	libbpf: Simplify look up by name of internal maps Map name that's assigned to internal maps (.rodata, .data, .bss, etc) consist of a small prefix of bpf_object's name and ELF section name as a suffix. This makes it hard for users to "guess" the name to use for looking up by name with bpf_object__find_map_by_name() API. One proposal was to drop object name prefix from the map name and just use ".rodata", ".data", etc, names. One downside called out was that when multiple BPF applications are active on the host, it will be hard to distinguish between multiple instances of .rodata and know which BPF object (app) they belong to. Having few first characters, while quite limiting, still can give a bit of a clue, in general. Note, though, that btf_value_type_id for such global data maps (ARRAY) points to DATASEC type, which encodes full ELF name, so tools like bpftool can take advantage of this fact to "recover" full original name of the map. This is also the reason why for custom .data.* and .rodata.* maps libbpf uses only their ELF names and doesn't prepend object name at all. Another downside of such approach is that it is not backwards compatible and, among direct use of bpf_object__find_map_by_name() API, will break any BPF skeleton generated using bpftool that was compiled with older libbpf version. Instead of causing all this pain, libbpf will still generate map name using a combination of object name and ELF section name, but it will allow looking such maps up by their natural names, which correspond to their respective ELF section names. This means non-truncated ELF section names longer than 15 characters are going to be expected and supported. With such set up, we get the best of both worlds: leave small bits of a clue about BPF application that instantiated such maps, as well as making it easy for user apps to lookup such maps at runtime. In this sense it closes corresponding libbpf 1.0 issue ([0]). BPF skeletons will continue using full names for lookups. [0] Closes: https://github.com/libbpf/libbpf/issues/275 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	5bf62459b1	libbpf: Support multiple .rodata.* and .data.* BPF maps Add support for having multiple .rodata and .data data sections ([0]). .rodata/.data are supported like the usual, but now also .rodata.<whatever> and .data.<whatever> are also supported. Each such section will get its own backing BPF_MAP_TYPE_ARRAY, just like .rodata and .data. Multiple .bss maps are not supported, as the whole '.bss' name is confusing and might be deprecated soon, as well as user would need to specify custom ELF section with SEC() attribute anyway, so might as well stick to just .data.* and .rodata.* convention. User-visible map name for such new maps is going to be just their ELF section names. [0] https://github.com/libbpf/libbpf/issues/274 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	421213a052	libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps Remove internal libbpf assumption that there can be only one .rodata, .data, and .bss map per BPF object. To achieve that, extend and generalize the scheme that was used for keeping track of relocation ELF sections. Now each ELF section has a temporary extra index that keeps track of logical type of ELF section (relocations, data, read-only data, BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss handling. We don't yet allow multiple .rodata, .data, and .bss sections, but no libbpf internal code makes an assumption that there can be only one of each and thus they can be explicitly referenced by a single index. Next patches will actually allow multiple .rodata and .data sections. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	4cafbf7527	libbpf: Use Elf64-specific types explicitly for dealing with ELF Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These APIs require copying ELF data structures into local GElf_xxx structs and have a more cumbersome API. BPF ELF file is defined to be always 64-bit ELF object, even when intended to be run on 32-bit host architectures, so there is no need to do class-agnostic conversions everywhere. BPF static linker implementation within libbpf has been using Elf64-specific types since initial implementation. Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more succinct direct access to ELF symbol and relocation records within ELF data itself and switch all the GElf_xxx usage into Elf64_xxx equivalents. The only remaining place within libbpf.c that's still using gelf API is gelf_getclass(), as there doesn't seem to be a direct way to get underlying ELF bitness. No functional changes intended. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	f687443178	libbpf: Extract ELF processing state into separate struct Name currently anonymous internal struct that keeps ELF-related state for bpf_object. Just a bit of clean up, no functional changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Andrii Nakryiko	38fb8cfc0c	libbpf: Deprecate btf__finalize_data() and move it into libbpf.c There isn't a good use case where anyone but libbpf itself needs to call btf__finalize_data(). It was implemented for internal use and it's not clear why it was made into public API in the first place. To function, it requires active ELF data, which is stored inside bpf_object for the duration of opening phase only. But the only BTF that needs bpf_object's ELF is that bpf_object's BTF itself, which libbpf fixes up automatically during bpf_object__open() operation anyways. There is no need for any additional fix up and no reasonable scenario where it's useful and appropriate. Thus, btf__finalize_data() is just an API atavism and is better removed. So this patch marks it as deprecated immediately (v0.6+) and moves the code from btf.c into libbpf.c where it's used in the context of bpf_object opening phase. Such code co-location allows to make code structure more straightforward and remove bpf_object__section_size() and bpf_object__variable_offset() internal helpers from libbpf_internal.h, making them static. Their naming is also adjusted to not create a wrong illusion that they are some sort of method of bpf_object. They are internal helpers and are called appropriately. This is part of libbpf 1.0 effort ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/276 Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org	2021-10-22 18:38:55 -07:00
Dave Marchevsky	45115706b6	bpf: Add verified_insns to bpf_prog_info and fdinfo This stat is currently printed in the verifier log and not stored anywhere. To ease consumption of this data, add a field to bpf_prog_aux so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com	2021-10-22 18:38:55 -07:00
Ilya Leoshkevich	bde69b0ee0	libbpf: Fix ptr_is_aligned() usages Currently ptr_is_aligned() takes size, and not alignment, as a parameter, which may be overly pessimistic e.g. for __i128 on s390, which must be only 8-byte aligned. Fix by using btf__align_of(). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211021104658.624944-2-iii@linux.ibm.com	2021-10-22 18:38:55 -07:00
Hengqi Chen	19c6144c09	bpf: Add bpf_skc_to_unix_sock() helper The helper is used in tracing programs to cast a socket pointer to a unix_sock pointer. The return value could be NULL if the casting is illegal. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211021134752.1223426-2-hengqi.chen@gmail.com	2021-10-22 18:38:55 -07:00
Ilya Leoshkevich	760c39208c	libbpf: Fix dumping non-aligned __int128 Non-aligned integers are dumped as bitfields, which is supported for at most 64-bit integers. Fix by using the same trick as btf_dump_float_data(): copy non-aligned values to the local buffer. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com	2021-10-22 18:38:55 -07:00
Ilya Leoshkevich	fa93001e85	libbpf: Fix dumping big-endian bitfields On big-endian arches not only bytes, but also bits are numbered in reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true for other big-endian arches as well). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com	2021-10-22 18:38:55 -07:00
Dave Marchevsky	50028712c4	libbpf: Migrate internal use of bpf_program__get_prog_info_linear In preparation for bpf_program__get_prog_info_linear deprecation, move the single use in libbpf to call bpf_obj_get_info_by_fd directly. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com	2021-10-22 18:38:55 -07:00
Yonghong Song	0f3ba10651	bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG Patch set [1] introduced BTF_KIND_TAG to allow tagging declarations for struct/union, struct/union field, var, func and func arguments and these tags will be encoded into dwarf. They are also encoded to btf by llvm for the bpf target. After BTF_KIND_TAG is introduced, we intended to use it for kernel __user attributes. But kernel __user is actually a type attribute. Upstream and internal discussion showed it is not a good idea to mix declaration attribute and type attribute. So we proposed to introduce btf_type_tag as a type attribute and existing btf_tag renamed to btf_decl_tag ([2]). This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some other declarations with _tag to _decl_tag to make it clear the tag is for declaration. In the future, BTF_KIND_TYPE_TAG might be introduced per [3]. [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ [2] https://reviews.llvm.org/D111588 [3] https://reviews.llvm.org/D111199 Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG") Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG") Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com	2021-10-22 18:38:55 -07:00
Evgeny Vereshchagin	ebf17ac628	README: add OSS-Fuzz badge Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2021-10-22 09:20:25 -07:00
Evgeny Vereshchagin	06390e2371	[coverity] skip forks to stop GitHub from sending notifications about failed attempts to run the Coverity workflow without `COVERITY_SCAN_TOKEN` like https://github.com/evverx/libbpf/actions/runs/1364759947 Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>	2021-10-21 16:13:30 -07:00
Andrii Nakryiko	92c1e61a60	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0e545dbaa2797133f57bf8387e8f74cd245cedea Checkpoint bpf-next commit: 5319255b8df9271474bc9027cabf82253934f28d Baseline bpf commit: d0c6416bd7091647f6041599f396bfa19ae30368 Checkpoint bpf commit: 8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a Hou Tao (1): libbpf: Support detecting and attaching of writable tracepoint program src/libbpf.c \| 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-) -- 2.30.2	2021-10-11 12:30:36 -07:00
Hou Tao	133e3603ec	libbpf: Support detecting and attaching of writable tracepoint program Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT. Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com	2021-10-11 12:30:36 -07:00
Andrii Nakryiko	f9f6e92458	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 38261f369fb905552ebdd3feb9699c0788fd3371 Checkpoint bpf-next commit: 0e545dbaa2797133f57bf8387e8f74cd245cedea Baseline bpf commit: 571fa247ab411f3233eeaaf837c6e646a513b9f8 Checkpoint bpf commit: d0c6416bd7091647f6041599f396bfa19ae30368 Alexei Starovoitov (1): libbpf: Make gen_loader data aligned. Andrii Nakryiko (2): libbpf: Add API that copies all BTF types from one BTF object to another libbpf: Fix memory leak in strset Grant Seltzer (1): libbpf: Add API documentation convention guidelines Hengqi Chen (3): libbpf: Support uniform BTF-defined key/value specification across all BPF maps libbpf: Deprecate bpf_object__unload() API since v0.6 libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7 Kumar Kartikeya Dwivedi (5): libbpf: Fix skel_internal.h to set errno on loader retval < 0 libbpf: Support kernel module function calls libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0 libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations libbpf: Fix segfault in light skeleton for objects without BTF Toke Høiland-Jørgensen (1): libbpf: Properly ignore STT_SECTION symbols in legacy map definitions docs/libbpf_naming_convention.rst \| 40 ++++ src/bpf.c \| 1 + src/bpf_gen_internal.h \| 16 +- src/btf.c \| 132 ++++++++++++- src/btf.h \| 22 +++ src/gen_loader.c \| 317 +++++++++++++++++++++++++----- src/libbpf.c \| 165 ++++++++++++---- src/libbpf.h \| 36 ++-- src/libbpf.map \| 5 + src/libbpf_internal.h \| 3 + src/skel_internal.h \| 6 +- src/strset.c \| 1 + 12 files changed, 633 insertions(+), 111 deletions(-) -- 2.30.2	2021-10-06 14:40:18 -07:00
Andrii Nakryiko	b05bace770	libbpf: Fix memory leak in strset Free struct strset itself, not just its internal parts. Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org	2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi	7022527a7b	libbpf: Fix segfault in light skeleton for objects without BTF When fed an empty BPF object, bpftool gen skeleton -L crashes at btf__set_fd() since it assumes presence of obj->btf, however for the sequence below clang adds no .BTF section (hence no BTF). Reproducer: $ touch a.bpf.c $ clang -O2 -g -target bpf -c a.bpf.c $ bpftool gen skeleton -L a.bpf.o /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */ /* THIS FILE IS AUTOGENERATED! */ struct a_bpf { struct bpf_loader_ctx ctx; Segmentation fault (core dumped) The same occurs for files compiled without BTF info, i.e. without clang's -g flag. Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com	2021-10-06 14:40:18 -07:00
Hengqi Chen	d3169df794	libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7 Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with a new set of APIs named bpf_object__{prev,next}_{program,map} which follow the libbpf API naming convention ([0]). No functionality changes. [0] Closes: https://github.com/libbpf/libbpf/issues/296 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com	2021-10-06 14:40:18 -07:00
Hengqi Chen	d665ca0bb0	libbpf: Deprecate bpf_object__unload() API since v0.6 BPF objects are not reloadable after unload. Users are expected to use bpf_object__close() to unload and free up resources in one operation. No need to expose bpf_object__unload() as a public API, deprecate it ([0]). Add bpf_object__unload() as an alias to internal bpf_object_unload() and replace all bpf_object__unload() uses to avoid compilation errors. [0] Closes: https://github.com/libbpf/libbpf/issues/290 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com	2021-10-06 14:40:18 -07:00
Grant Seltzer	744fd961c7	libbpf: Add API documentation convention guidelines This adds a section to the documentation for libbpf naming convention which describes how to document API features in libbpf, specifically the format of which API doc comments need to conform to. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20211004215644.497327-1-grantseltzer@gmail.com	2021-10-06 14:40:18 -07:00
Andrii Nakryiko	13ebb60ab6	libbpf: Add API that copies all BTF types from one BTF object to another Add a bulk copying api, btf__add_btf(), that speeds up and simplifies appending entire contents of one BTF object to another one, taking care of copying BTF type data, adjusting resulting BTF type IDs according to their new locations in the destination BTF object, as well as copying and deduplicating all the referenced strings and updating all the string offsets in new BTF types as appropriate. This API is intended to be used from tools that are generating and otherwise manipulating BTFs generically, such as pahole. In pahole's case, this API is useful for speeding up parallelized BTF encoding, as it allows pahole to offload all the intricacies of BTF type copying to libbpf and handle the parallelization aspects of the process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org	2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi	3298393748	libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations This change updates the BPF syscall loader to relocate BTF_KIND_FUNC relocations, with support for weak kfunc relocations. The general idea is to move map_fds to loader map, and also use the data for storing kfunc BTF fds. Since both reuse the fd_array parameter, they need to be kept together. For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc, we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more chances of being <= INT16_MAX than treating data map as a sparse array and adding fd as needed. When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse array model, so that as long as it does remain <= INT16_MAX, we pass an index relative to the start of fd_array. We store all ksyms in an array where we try to avoid calling the bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was already stored. This also speeds up the loading process compared to emitting calls in all cases, in later tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com	2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi	0141d9dded	libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0 Preserve these calls as it allows verifier to succeed in loading the program if they are determined to be unreachable after dead code elimination during program load. If not, the verifier will fail at runtime. This is done for ext->is_weak symbols similar to the case for variable ksyms. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com	2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi	7dde8f8f8d	libbpf: Support kernel module function calls This patch adds libbpf support for kernel module function call support. The fd_array parameter is used during BPF program load to pass module BTFs referenced by the program. insn->off is set to index into this array, but starts from 1, because insn->off as 0 is reserved for btf_vmlinux. We try to use existing insn->off for a module, since the kernel limits the maximum distinct module BTFs for kfuncs to 256, and also because index must never exceed the maximum allowed value that can fit in insn->off (INT16_MAX). In the future, if kernel interprets signed offset as unsigned for kfunc calls, this limit can be increased to UINT16_MAX. Also introduce a btf__find_by_name_kind_own helper to start searching from module BTF's start id when we know that the BTF ID is not present in vmlinux BTF (in find_ksym_btf_id). Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com	2021-10-06 14:40:18 -07:00
Hengqi Chen	962f241379	libbpf: Support uniform BTF-defined key/value specification across all BPF maps A bunch of BPF maps do not support specifying BTF types for key and value. This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry logic which removes BTF type IDs when BPF map creation failed. Instead of retrying, this commit recognizes those specialized maps and removes BTF type IDs when creating BPF map. [0] Closes: https://github.com/libbpf/libbpf/issues/355 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com	2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi	8c2d905ff4	libbpf: Fix skel_internal.h to set errno on loader retval < 0 When the loader indicates an internal error (result of a checked bpf system call), it returns the result in attr.test.retval. However, tests that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may miss that NULL denotes an error if errno is set to 0. This would result in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to a SEGV on dereference of skel, because libbpf_get_error relies on the assumption that errno is always set in case of error for ptr == NULL. In particular, this was observed for the ksyms_module test. When executed using `./test_progs -t ksyms`, prior tests manipulated errno and the test didn't crash when it failed at ksyms_module load, while using `./test_progs -t ksyms_module` crashed due to errno being untouched. Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com	2021-10-06 14:40:18 -07:00
Toke Høiland-Jørgensen	531be4d879	libbpf: Properly ignore STT_SECTION symbols in legacy map definitions The previous patch to ignore STT_SECTION symbols only added the ignore condition in one of them. This fails if there's more than one map definition in the 'maps' section, because the subsequent modulus check will fail, resulting in error messages like: libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o Fix this by also ignoring STT_SECTION in the first loop. Fixes: c3e8c44a9063 ("libbpf: Ignore STT_SECTION symbols in 'maps' section") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com	2021-10-06 14:40:18 -07:00
Alexei Starovoitov	f49c65d216	libbpf: Make gen_loader data aligned. Align gen_loader data to 8 byte boundary to make sure union bpf_attr, bpf_insns and other structs are aligned. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com	2021-10-06 14:40:18 -07:00
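Aligning an offset to an 8-byte boundary, as the commit above describes for gen_loader data, is usually a round-up to the next multiple of 8. A minimal sketch follows; the helper name `align_up_8` is hypothetical and not taken from libbpf.

```c
#include <stddef.h>

/* Hypothetical helper: round off up to the next multiple of 8, so that
 * whatever is placed at the returned offset is 8-byte aligned. */
static size_t align_up_8(size_t off)
{
	return (off + 7) & ~(size_t)7;
}
```

An offset that is already a multiple of 8 is returned unchanged, so the helper can be applied unconditionally before each structure is appended.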
Sergei Iudin	b0feb9b4d5	Remove caching of pahole test result The initial idea was to run it hourly, but when it runs hourly it produces a lot of useless noise, so we had to switch it to daily. When we run it daily, caching makes much less sense and only makes debugging more complex.	2021-10-04 11:20:32 -07:00
Andrii Nakryiko	e671a47bc2	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 69cd823956ba8ce266a901170b1060db8073bddd Checkpoint bpf-next commit: 38261f369fb905552ebdd3feb9699c0788fd3371 Baseline bpf commit: bc23f724481759d0fac61dfb5ce979af2190bbe0 Checkpoint bpf commit: 571fa247ab411f3233eeaaf837c6e646a513b9f8 Andrii Nakryiko (15): libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id() libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target() libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7 libbpf: Constify all high-level program attach APIs libbpf: Fix memory leak in legacy kprobe attach logic libbpf: Refactor and simplify legacy kprobe code libbpf: Add legacy uprobe attaching support libbpf: Add "tc" SEC_DEF which is a better name for "classifier" libbpf: Refactor internal sec_def handling to enable pluggability libbpf: Reduce reliance of attach_fns on sec_def internals libbpf: Refactor ELF section handler definitions libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC libbpf: Add opt-in strict BPF program section name handling logic selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use Dave Marchevsky (4): bpf: Add bpf_trace_vprintk helper libbpf: Modify bpf_printk to choose helper based on arg count libbpf: Use static const fmt string in __bpf_printk bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments Grant Seltzer (1): libbpf: Add doc comments in libbpf.h Kumar Kartikeya Dwivedi (1): libbpf: Fix segfault in static linker for objects without BTF Matteo Croce (1): bpf: Update bpf_get_smp_processor_id() documentation Toke Høiland-Jørgensen (1): libbpf: Ignore STT_SECTION symbols in 'maps' section include/uapi/linux/bpf.h \| 18 +- src/bpf_helpers.h \| 51 ++- src/libbpf.c \| 882 +++++++++++++++++++++++---------------- src/libbpf.h \| 106 +++-- src/libbpf_common.h \| 5 + src/libbpf_internal.h \| 7 + src/libbpf_legacy.h \| 9 + src/linker.c \| 8 +- 8 files changed, 679 insertions(+), 407 deletions(-) -- 2.30.2	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	151e3cb314	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-09-29 12:05:46 -07:00
Kumar Kartikeya Dwivedi	2cfeea135c	libbpf: Fix segfault in static linker for objects without BTF When a BPF object is compiled without BTF info (without -g), trying to link such objects using bpftool causes a SIGSEGV due to btf__get_nr_types accessing obj->btf which is NULL. Fix this by checking for the NULL pointer, and return error. Reproducer: $ cat a.bpf.c extern int foo(void); int bar(void) { return foo(); } $ cat b.bpf.c int foo(void) { return 0; } $ clang -O2 -target bpf -c a.bpf.c $ clang -O2 -target bpf -c b.bpf.c $ bpftool gen obj out a.bpf.o b.bpf.o Segmentation fault (core dumped) After fix: $ bpftool gen obj out a.bpf.o b.bpf.o libbpf: failed to find BTF info for object 'a.bpf.o' Error: failed to link 'a.bpf.o': Unknown error -22 (-22) Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	3fe8ee2edb	selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use Update "sk_lookup/" definition to be a stand-alone type specifier, with backwards-compatible prefix match logic in non-libbpf-1.0 mode. Currently in selftests all the "sk_lookup/<whatever>" uses just use <whatever> for duplicated unique name encoding, which is redundant as BPF program's name (C function name) uniquely and descriptively identifies the intended use for such BPF programs. With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing sk_lookup programs to use "unqualified" SEC("sk_lookup") section names, with no random text after it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	823881b4f6	libbpf: Add opt-in strict BPF program section name handling logic Implement strict ELF section name handling for BPF programs. It utilizes `libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME. If this flag is set, libbpf will enforce exact section name matching for a lot of program types that previously allowed just partial prefix match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now in strict mode only SEC("xdp") will be accepted, which makes SEC("") definitions cleaner and more structured. SEC() now won't be used as yet another way to uniquely encode BPF program identifier (for that C function name is better and is guaranteed to be unique within bpf_object). Now SEC() is strictly BPF program type and, depending on program type, extra load/attach parameter specification. Libbpf completely supports multiple BPF programs in the same ELF section, so multiple BPF programs of the same type/specification easily co-exist together within the same bpf_object scope. Additionally, a new (for now internal) convention is introduced: section name that can be a stand-alone exact BPF program type specificator, but also could have extra parameters after '/' delimiter. An example of such section is "struct_ops", which can be specified by itself, but also allows to specify the intended operation to be attached to, e.g., "struct_ops/dctcp_init". Note, that "struct_ops_some_op" is not allowed. Such section definition is specified as "struct_ops+". This change is part of libbpf 1.0 effort ([0], [1]). [0] Closes: https://github.com/libbpf/libbpf/issues/271 [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org	2021-09-29 12:05:46 -07:00
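The matching rules described in the commit above (exact match in strict mode, prefix match otherwise, and the "+" convention for an optional "/extras" suffix) can be sketched roughly as follows. The function `sec_name_matches` and its parameters are hypothetical, written only to mirror the commit text; libbpf's internal matcher may differ in detail.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical matcher following the conventions in the commit message:
 *   "type"  - exact match in strict mode, prefix match otherwise
 *   "type+" - exact "type" or "type/<extras>", in either mode
 */
static bool sec_name_matches(const char *pattern, const char *sec_name,
			     bool strict)
{
	size_t len = strlen(pattern);

	if (len > 0 && pattern[len - 1] == '+') {
		len--; /* ignore the trailing '+' marker */
		if (strncmp(sec_name, pattern, len) != 0)
			return false;
		/* Either nothing follows, or a well-formed '/' separator. */
		return sec_name[len] == '\0' || sec_name[len] == '/';
	}
	/* Non-strict mode still accepts a plain prefix match. */
	if (!strict && strncmp(sec_name, pattern, len) == 0)
		return true;
	return strcmp(sec_name, pattern) == 0;
}
```

With this sketch, "xdp_whatever_i_want" matches the pattern "xdp" only when `strict` is false, while "struct_ops_some_op" never matches "struct_ops+".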
Andrii Nakryiko	20c1aa11b1	libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC Complete SEC() table refactoring towards unified form by rewriting BPF_APROG_SEC and BPF_EAPROG_SEC definitions with SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively. Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after that, leaving SEC_DEF() macro as the only one used. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	62ea715f71	libbpf: Refactor ELF section handler definitions Refactor ELF section handler definitions table to use a set of flags and unified SEC_DEF() macro. This allows for more succinct and table-like set of definitions, and allows to more easily extend the logic without adding more verbosity (this is utilized in later patches in the series). This approach is also making libbpf-internal program pre-load callback not rely on bpf_sec_def definition, which demonstrates that future pluggable ELF section handlers will be able to achieve similar level of integration without libbpf having to expose extra types and APIs. For starters, update SEC_DEF() definitions and make them more succinct. Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to a common SEC_DEF() use. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	3c5c62097e	libbpf: Reduce reliance of attach_fns on sec_def internals Move closer to not relying on bpf_sec_def internals that won't be part of public API, when pluggable SEC() handlers will be allowed. Drop pre-calculated prefix length, and in various helpers don't rely on this prefix length availability. Also minimize reliance on knowing bpf_sec_def's prefix for few places where section prefix shortcuts are supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint). Given checking some string for having a given string-constant prefix is such a common operation and so annoying to be done with pure C code, add a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c where prefix comparison is performed. With __builtin_constant_p() it's possible to have a convenient helper that checks some string for having a given prefix, where prefix is either string literal (or compile-time known string due to compiler optimization) or just a runtime string pointer, which is quite convenient and saves a lot of typing and string literal duplication. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org	2021-09-29 12:05:46 -07:00
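A prefix-check macro of the kind described in the commit above might look like the sketch below. It relies on the GCC/Clang builtin `__builtin_constant_p()`; the body is written from the description and should be read as an approximation rather than a verbatim copy of libbpf's definition.

```c
#include <string.h>

/* Sketch of a prefix-check macro (GCC/Clang only). When the compiler
 * reports pfx as a compile-time constant, the prefix length comes from
 * sizeof(), which assumes pfx is a string literal or char array;
 * otherwise the length is computed with strlen() at run time. */
#define str_has_pfx(str, pfx) \
	(strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0)
```

A call such as `str_has_pfx(sec_name, "tracepoint/")` then avoids repeating the literal or hand-counting its length.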
Andrii Nakryiko	fba5f02401	libbpf: Refactor internal sec_def handling to enable pluggability Refactor internals of libbpf to allow adding custom SEC() handling logic easily from outside of libbpf. To that effect, each SEC()-handling registration sets mandatory program type/expected attach type for a given prefix and can provide three callbacks called at different points of BPF program lifetime: - init callback for right after bpf_program is initialized and prog_type/expected_attach_type is set. This happens during bpf_object__open() step, close to the very end of constructing bpf_object, so all the libbpf APIs for querying and updating bpf_program properties should be available; - pre-load callback is called right before BPF_PROG_LOAD command is called in the kernel. This callback has the ability to set both bpf_program properties, as well as program load attributes, overriding and augmenting the standard libbpf handling of them; - optional auto-attach callback, which makes a given SEC() handler support auto-attachment of a BPF program through bpf_program__attach() API and/or BPF skeletons <skel>__attach() method. Each callback gets a `long cookie` parameter passed in, which is specified during SEC() handling. This can be used by callbacks to lookup whatever additional information is necessary. This is not yet completely ready to be exposed to the outside world, mainly due to non-public nature of struct bpf_prog_load_params. Instead of making it part of public API, we'll wait until the planned low-level libbpf API improvements for BPF_PROG_LOAD and other typical bpf() syscall APIs, at which point we'll have a public, probably OPTS-based, way to fully specify BPF program load parameters, which will be used as an interface for custom pre-load callbacks. But this change itself is already a good first step to unify the BPF program handling logic even within the libbpf itself. As one example, all the extra per-program type handling (sleepable bit, attach_btf_id resolution, unsetting optional expected attach type) is now more obvious and is gathered in one place. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	bb833f8129	libbpf: Add "tc" SEC_DEF which is a better name for "classifier" As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF program type. "classifier" is a misleading terminology and should be migrated away from. [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Toke Høiland-Jørgensen	cb12e83136	libbpf: Ignore STT_SECTION symbols in 'maps' section When parsing legacy map definitions, libbpf would error out when encountering an STT_SECTION symbol. This becomes a problem because some versions of binutils will produce SECTION symbols for every section when processing an ELF file, so BPF files run through 'strip' will end up with such symbols, making libbpf refuse to load them. There's not really any reason why erroring out is strictly necessary, so change libbpf to just ignore SECTION symbols when parsing the ELF. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	09b528e847	libbpf: Add legacy uprobe attaching support Similarly to recently added legacy kprobe attach interface support through tracefs, support attaching uprobes using the legacy interface if host kernel doesn't support newer FD-based interface. For uprobes event name consists of "libbpf_" prefix, PID, sanitized binary path and offset within that binary. Structurally the code is aligned with kprobe logic refactoring in previous patch. struct bpf_link_perf is re-used and all the same legacy_probe_name and legacy_is_retprobe fields are used to ensure proper cleanup on bpf_link__destroy(). Users should be aware, though, that on old kernels which don't support FD-based interface for kprobe/uprobe attachment, if the application crashes before bpf_link__destroy() is called, uprobe legacy events will be left in tracefs. This is the same limitation as with legacy kprobe interfaces. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org	2021-09-29 12:05:46 -07:00
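The event naming scheme mentioned in the commit above ("libbpf_" prefix, PID, sanitized binary path, offset) could be built along these lines. The function name, the exact format string, and the rule of mapping every non-alphanumeric character to '_' are assumptions made for illustration; the commit message does not spell them out.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: compose "libbpf_<pid>_<path>_0x<offset>" and then
 * replace every character that is not alphanumeric with '_', so that the
 * path separators do not end up in the event name. Returns the length of
 * the resulting name. */
static size_t gen_uprobe_event_name(char *buf, size_t buf_sz, int pid,
				    const char *binary_path, size_t offset)
{
	size_t i;

	if (buf_sz == 0)
		return 0;

	snprintf(buf, buf_sz, "libbpf_%d_%s_0x%zx", pid, binary_path, offset);
	for (i = 0; buf[i]; i++) {
		if (!isalnum((unsigned char)buf[i]))
			buf[i] = '_';
	}
	return strlen(buf);
}
```

Including the PID and offset keeps names from different processes and probe sites apart, which matters because, as the commit notes, stale legacy events can remain in tracefs after a crash.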
Andrii Nakryiko	9ecc0dcc19	libbpf: Refactor and simplify legacy kprobe code Refactor legacy kprobe handling code to follow the same logic as uprobe legacy logic added in the next patches: - add append_to_file() helper that makes it simpler to work with tracefs file-based interface for creating and deleting probes; - move out probe/event name generation outside of the code that adds/removes it, which simplifies bookkeeping significantly; - change the probe name format to start with "libbpf_" prefix and include offset within kernel function; - switch 'unsigned long' to 'size_t' for specifying kprobe offsets, which is consistent with how uprobes define that, simplifies printf()-ing internally, and also avoids unnecessary complications on architectures where sizeof(long) != sizeof(void *). This patch also implicitly fixes the problem with invalid open() error handling present in poke_kprobe_events(), which (the function) this patch removes. Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	246b61780b	libbpf: Fix memory leak in legacy kprobe attach logic In some error scenarios legacy_probe string won't be free()'d. Fix this. This was reported by Coverity static analysis. Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Grant Seltzer	cdf14ff22b	libbpf: Add doc comments in libbpf.h This adds comments above functions in libbpf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org These doc comments are for: - bpf_object__find_map_by_name() - bpf_map__fd() - bpf_map__is_internal() - libbpf_get_error() - libbpf_num_possible_cpus() Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com	2021-09-29 12:05:46 -07:00
Dave Marchevsky	e1966114cc	bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments Since the data_len in these two functions is a byte len of the preceding u64 *data array, it must always be a multiple of 8. If this isn't the case both helpers error out, so let's make the requirement explicit so users don't need to infer it. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com	2021-09-29 12:05:46 -07:00
Dave Marchevsky	989d7189cd	libbpf: Use static const fmt string in __bpf_printk The __bpf_printk convenience macro was using a 'char' fmt string holder as it predates support for globals in libbpf. Move to more efficient 'static const char', but provide a fallback to the old way via BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com	2021-09-29 12:05:46 -07:00
Dave Marchevsky	18922504c3	libbpf: Modify bpf_printk to choose helper based on arg count Instead of being a thin wrapper which calls into bpf_trace_printk, libbpf's bpf_printk convenience macro now chooses between bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding format string) is >3, use bpf_trace_vprintk, otherwise use the older helper. The motivation behind this added complexity - instead of migrating entirely to bpf_trace_vprintk - is to maintain good developer experience for users compiling against new libbpf but running on older kernels. Users who are passing <=3 args to bpf_printk will see no change in their bytecode. __bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF macros elsewhere in the file - it allows use of bpf_trace_vprintk without manual conversion of varargs to u64 array. Previous implementation of bpf_printk macro is moved to __bpf_printk for use by the new implementation. This does change behavior of bpf_printk calls with >3 args in the "new libbpf, old kernels" scenario. Before this patch, attempting to use 4 args to bpf_printk results in a compile-time error. After this patch, using bpf_printk with 4 args results in a trace_vprintk helper call being emitted and a load-time failure on older kernels. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com	2021-09-29 12:05:46 -07:00
Dave Marchevsky	029273039d	bpf: Add bpf_trace_vprintk helper This helper is meant to be "bpf_trace_printk, but with proper vararg support". Follow bpf_snprintf's example and take a u64 pseudo-vararg array. Write to /sys/kernel/debug/tracing/trace_pipe using the same mechanism as bpf_trace_printk. The functionality of this helper was requested in the libbpf issue tracker [0]. [0] Closes: https://github.com/libbpf/libbpf/issues/315 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	b177fac2e2	libbpf: Constify all high-level program attach APIs Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so change all struct bpf_program and struct bpf_map pointers to const pointers. This is completely backwards compatible with no functional change. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	3e7c04669e	libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7 bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption that bpf_object contains either only single freplace BPF program or all of BPF programs in BPF object are freplaces intended to replace different subprograms of the same target BPF program. This seems both a bit confusing, too assuming, and limiting. We've had bpf_program__set_attach_target() API which allows more fine-grained control over this, on a per-program level. As such, mark open_opts.attach_prog_fd as deprecated starting from v0.7, so that we have one more universal way of setting freplace targets. With previous change to allow NULL attach_func_name argument, and especially combined with BPF skeleton, arguably bpf_program__set_attach_target() is a more convenient and explicit API as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	599b999f5d	libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target() Allow using bpf_program__set_attach_target() to set only the target attach program FD, while letting libbpf use the target attach function name from the SEC() definition. This might be useful for some scenarios where bpf_object contains multiple related freplace BPF programs intended to replace different sub-programs in target BPF program. In such case all programs will have the same attach_prog_fd, but different attach_func_name. It's convenient to specify such target function names declaratively in SEC() definitions, but attach_prog_fd is a dynamic runtime setting. To simplify such scenario, allow bpf_program__set_attach_target() to delay BTF ID resolution till the BPF program load time by providing NULL attach_func_name. In that case the behavior will be similar to using bpf_object_open_opts.attach_prog_fd (which is marked deprecated since v0.7), but has the benefit of allowing more control by user in what is attached to what. Such setup allows having BPF programs attached to different target attach_prog_fd with target functions still declaratively recorded in BPF source code in SEC() definitions. Selftests changes in the next patch should make this more obvious. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	e856a7d560	libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs It's irrelevant and hasn't been doing anything for a long while now. Deprecate it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	0a76ce1395	libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id() Don't perform another search for sec_def inside libbpf_find_attach_btf_id(), as each recognized bpf_program already has prog->sec_def set. Also remove unnecessary NULL check for prog->sec_name, as it can never be NULL. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org	2021-09-29 12:05:46 -07:00
Matteo Croce	1eba3a6470	bpf: Update bpf_get_smp_processor_id() documentation BPF programs run with migration disabled regardless of preemption, as they are protected by migrate_disable(). Update the uapi documentation accordingly. Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com	2021-09-29 12:05:46 -07:00
Andrii Nakryiko	7d365e49f3	ci: use -a/-d with exact string match for black/white-listing Doing substring matches allows accidental new tests to be enabled, when they are not supposed to be. E.g., whitelisting "xdp" allows new "xdpwall" test on 5.5.0, which wasn't supposed to happen. Cc: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-09-29 11:15:38 -07:00
Andrii Nakryiko	980777cc16	vmtest: temporarily blacklist spammy selftests There is a problem in bpf-next tree which causes get_stack_raw_tp and few other selftests to produce tons of kernel warnings, timing out and failing CI test runs. Blacklist until bpf tree, which has a fix, is merged into bpf-next. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	24dbcb3a30	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 47bb27a20d6ea22cd092c1fc2bb4fcecac374838 Checkpoint bpf-next commit: 69cd823956ba8ce266a901170b1060db8073bddd Baseline bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Checkpoint bpf commit: bc23f724481759d0fac61dfb5ce979af2190bbe0 Andrii Nakryiko (4): libbpf: Make libbpf_version.h non-auto-generated libbpf: Ensure BPF prog types are set before relocations libbpf: Simplify BPF program auto-attach code libbpf: Minimize explicit iterator of section definition array Grant Seltzer (1): libbpf: Add sphinx code documentation comments Quentin Monnet (1): libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations Rafael David Tinoco (1): libbpf: Introduce legacy kprobe events support Rocco Yue (1): ipv6: add IFLA_INET6_RA_MTU to expose mtu value Song Liu (1): bpf: Introduce helper bpf_get_branch_snapshot Vadim Fedorenko (1): bpf: Add hardware timestamp field to __sk_buff Yonghong Song (4): btf: Change BTF_KIND_* macros to enums bpf: Support for new btf kind BTF_KIND_TAG libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag libbpf: Add support for BTF_KIND_TAG include/uapi/linux/bpf.h \| 24 +++ include/uapi/linux/btf.h \| 55 ++++-- include/uapi/linux/if_link.h \| 1 + src/btf.c \| 84 +++++++- src/btf.h \| 87 +++++++++ src/btf_dump.c \| 3 + src/libbpf.c \| 359 ++++++++++++++++++++++++----------- src/libbpf.map \| 5 + src/libbpf_common.h \| 19 ++ src/libbpf_internal.h \| 2 + src/libbpf_version.h \| 9 + 11 files changed, 505 insertions(+), 143 deletions(-) create mode 100644 src/libbpf_version.h -- 2.30.2	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	42da89eb16	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-09-17 14:03:39 -07:00
Grant Seltzer	627cbb395b	libbpf: Add sphinx code documentation comments This adds comments above five functions in btf.h which document their uses. These comments are of a format that doxygen and sphinx can pick up and render. These are rendered by libbpf.readthedocs.org Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com	2021-09-17 14:03:39 -07:00
Yonghong Song	7e7f59d658	libbpf: Add support for BTF_KIND_TAG Add BTF_KIND_TAG support for parsing and dedup. Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not supported in the kernel, sanitize it to INTs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com	2021-09-17 14:03:39 -07:00
Yonghong Song	966ba8918d	libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag This patch renames functions btf_{hash,equal}_int() to btf_{hash,equal}_int_tag() so they can be reused for BTF_KIND_TAG support. There is no functionality change for this patch. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com	2021-09-17 14:03:39 -07:00
Yonghong Song	fad7357469	bpf: Support for new btf kind BTF_KIND_TAG LLVM14 added support for a new C attribute ([1]) __attribute__((btf_tag("arbitrary_str"))) This attribute will be emitted to dwarf ([2]) and pahole will convert it to BTF. Or for bpf target, this attribute will be emitted to BTF directly ([3], [4]). The attribute is intended to provide additional information for - struct/union type or struct/union member - static/global variables - static/global function or function parameter. For linux kernel, the btf_tag can be applied in various places to specify user pointer, function pre- or post- condition, function allow/deny in certain context, etc. Such information will be encoded in vmlinux BTF and can be used by verifier. The btf_tag can also be applied to bpf programs to help global verifiable functions, e.g., specifying preconditions, etc. This patch added basic parsing and checking support in kernel for new BTF_KIND_TAG kind. [1] https://reviews.llvm.org/D106614 [2] https://reviews.llvm.org/D106621 [3] https://reviews.llvm.org/D106622 [4] https://reviews.llvm.org/D109560 Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com	2021-09-17 14:03:39 -07:00
Yonghong Song	a6188fc5b4	btf: Change BTF_KIND_* macros to enums Change BTF_KIND_* macros to enums so they are encoded in dwarf and appear in vmlinux.h. This will make it easier for bpf programs to use these constants without macro definitions. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	d328ba7768	libbpf: Minimize explicit iterator of section definition array Remove almost all the code that explicitly iterated BPF program section definitions in favor of using find_sec_def(). The only remaining user of section_defs is libbpf_get_type_names that has to iterate all of them to construct its result. Having one internal API entry point for section definitions will simplify further refactorings around libbpf's program section definitions parsing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	27d14b6e3b	libbpf: Simplify BPF program auto-attach code Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF programs, as it is already recorded at bpf_object__open() time for all recognized type of BPF programs. This further reduces number of explicit calls to find_sec_def(), simplifying further refactorings. No functional changes are done by this patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	f3c744997f	libbpf: Ensure BPF prog types are set before relocations Refactor bpf_object__open() sequencing to perform BPF program type detection based on SEC() definitions before we get to relocations collection. This allows to have more information about BPF program by the time we get to, say, struct_ops relocation gathering. This, subsequently, simplifies struct_ops logic and removes the need to perform extra find_sec_def() resolution. With this patch libbpf will require all struct_ops BPF programs to be marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations. Real-world applications are already doing that through something like selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's internal handling of SEC() definitions and is in the spirit of upcoming libbpf-1.0 section strictness changes ([0]). [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org	2021-09-17 14:03:39 -07:00
Rafael David Tinoco	749b3942a0	libbpf: Introduce legacy kprobe events support Allow kprobe tracepoint events creation through legacy interface, as the kprobe dynamic PMUs support, used by default, was only created in v4.17. Store legacy kprobe name in struct bpf_perf_link, instead of creating a new "subclass" off of bpf_perf_link. This is ok as it's just two new fields, which are also going to be reused for legacy uprobe support in follow up patches. Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	8ade99a6f8	libbpf: Make libbpf_version.h non-auto-generated Turn previously auto-generated libbpf_version.h header into a normal header file. This prevents various tricky Makefile integration issues, simplifies the overall build process, but also allows to further extend it with some more versioning-related APIs in the future. To prevent accidental out-of-sync versions as defined by libbpf.map and libbpf_version.h, Makefile checks their consistency at build time. Simultaneously with this change bump libbpf.map to v0.6. Also undo adding libbpf's output directory into include path for kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary because libbpf_version.h is just a normal header like any other. Fixes: 0b46b7550560 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org	2021-09-17 14:03:39 -07:00
Song Liu	0c0f4a57da	bpf: Introduce helper bpf_get_branch_snapshot Introduce bpf_get_branch_snapshot(), which allows tracing program to get branch trace from hardware (e.g. Intel LBR). To use the feature, the user needs to create perf_event with proper branch_record filtering on each cpu, and then call bpf_get_branch_snapshot in the bpf function. On Intel CPUs, VLBR event (raw event 0x1b00) can be used for this. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com	2021-09-17 14:03:39 -07:00
Vadim Fedorenko	03f31a6aed	bpf: Add hardware timestamp field to __sk_buff BPF programs may want to know hardware timestamps if NIC supports such timestamping. Expose this data as hwtstamp field of __sk_buff the same way as gso_segs/gso_size. This field could be accessed from the same programs as tstamp field, but it's read-only field. Explicit test to deny access to padding data is added to bpf_skb_is_valid_access. Also update BPF_PROG_TEST_RUN tests of the feature. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru	2021-09-17 14:03:39 -07:00
Quentin Monnet	f8983f7fb0	libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare the deprecation of two API functions. This macro marks functions as deprecated when libbpf's version reaches the values passed as an argument. As part of this change libbpf_version.h header is added with recorded major (LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros. They are now part of libbpf public API and can be relied upon by user code. libbpf_version.h is installed system-wide along other libbpf public headers. Due to this new build-time auto-generated header, in-kernel applications relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to include libbpf's output directory as part of a list of include search paths. Better fix would be to use libbpf's make_install target to install public API headers, but that clean up is left out as a future improvement. The build changes were tested by building kernel (with KBUILD_OUTPUT and O= specified explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No problems were detected. Note that because of the constraints of the C preprocessor we have to write a few lines of macro magic for each version used to prepare deprecation (0.6 for now). Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of btf__get_from_id() and btf__load(), which are replaced by btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively, starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/278 Co-developed-by: Quentin Monnet <quentin@isovalent.com> Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org	2021-09-17 14:03:39 -07:00
Rocco Yue	514bb47ac5	ipv6: add IFLA_INET6_RA_MTU to expose mtu value The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu" file, which can temporarily record the mtu value of the last received RA message when the RA mtu value is lower than the interface mtu, but this proc file has the following limitations: (1) when the interface mtu (/sys/class/net/<iface>/mtu) is updated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will be updated to the value of interface mtu; (2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affects ipv6 connections, and does not affect ipv4. Therefore, when the mtu option is carried in the RA message, there will be a problem that the user sometimes cannot obtain RA mtu value correctly by reading mtu6. After this patch set, if a RA message carries the mtu option, you can send a netlink msg whose nlmsg_type is RTM_GETLINK, and then by parsing the attribute of IFLA_INET6_RA_MTU to get the mtu value carried in the RA message received on the inet6 device. In addition, you can also get a link notification when ra_mtu is updated so it doesn't have to poll. In this way, if the MTU values that the device receives from the network in the PCO IPv4 and the RA IPv6 procedures are different, the user can obtain the correct ipv6 ra_mtu value and compare the value of ra_mtu and ipv4 mtu, then the device can use the lower MTU value for both IPv4 and IPv6. Signed-off-by: Rocco Yue <rocco.yue@mediatek.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	05d95ef6fa	makefile: add libbpf_version.h to the list of installed headers libbpf_version.h is a new header that is installed along other public libbpf API headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-09-17 14:03:39 -07:00
Andrii Nakryiko	eda1ebe520	ci: use test_progs -l option to generate the list of whitelisted tests Use this list of enabled tests as a whitelist, so that we don't have to keep updating BLACKLIST-5.5.0 anymore. I'll keep BLACKLIST-5.5.0 for now, because it serves as a nice historic log of which tests depend on which kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-09-17 14:03:39 -07:00
grantseltzer	7c26fe30f3	Add enum to be displayed in documentation Signed-off-by: grantseltzer <grantseltzer@gmail.com>	2021-09-16 11:39:22 -07:00
Andrii Nakryiko	5579664205	libbpf: Fix build with latest gcc/binutils with LTO After updating to binutils 2.35, the build began to fail with an assembler error. A bug was opened on the Red Hat Bugzilla a few days later for the same issue. Work around the problem by using the new `symver` attribute (introduced in GCC 10) as needed instead of assembler directives. This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf issue ([2]). [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059 [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749 [2]: Closes: https://github.com/libbpf/libbpf/issues/338 Co-developed-by: Patrick McCarty <patrick.mccarty@intel.com> Co-developed-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com> Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org	2021-09-08 16:01:51 -07:00
Matt Smith	860b201cd0	libbpf: Change bpf_object_skeleton data field to const pointer This change was necessary to enforce the implied contract that bpf_object_skeleton->data should not be mutated. The data will be cast to `void *` during assignment to handle the case where a user is compiling with older libbpf headers to avoid a compiler warning of `const void *` data being cast to `void *` Signed-off-by: Matt Smith <alastorze@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com	2021-09-08 16:01:51 -07:00
Toke Høiland-Jørgensen	50e13993a1	libbpf: Don't crash on object files with no symbol tables If libbpf encounters an ELF file that has been stripped of its symbol table, it will crash in bpf_object__add_programs() when trying to dereference the obj->efile.symbols pointer. Fix this by erroring out of bpf_object__elf_collect() if it is not able to find the symbol table. v2: - Move check into bpf_object__elf_collect() and add nice error message Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com	2021-09-08 16:01:51 -07:00
Andrii Nakryiko	72fd44da53	ci: blacklist few new selftests for 5.5 Blacklist netns_cookie, sockopt_qos_to_cc, task_pt_regs for 5.5. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-31 14:21:44 -07:00
Andrii Nakryiko	d6e9681b0d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: d20b41115ad53293201cc07ee429a38740cb056b Checkpoint bpf-next commit: 47bb27a20d6ea22cd092c1fc2bb4fcecac374838 Baseline bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Checkpoint bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Daniel Xu (1): bpf: Add bpf_task_pt_regs() helper Dave Marchevsky (1): bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum include/uapi/linux/bpf.h \| 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) -- 2.30.2	2021-08-31 14:21:44 -07:00
Andrii Nakryiko	517762deca	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-08-31 14:21:44 -07:00
Daniel Xu	1670e6100b	bpf: Add bpf_task_pt_regs() helper The motivation behind this helper is to access userspace pt_regs in a kprobe handler. uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a kprobe handler. The final case (kernelspace pt_regs in uprobe) is pretty rare (usermode helper) so I think that can be solved later if necessary. More concretely, this helper is useful in doing BPF-based DWARF stack unwinding. Currently the kernel can only do framepointer based stack unwinds for userspace code. This is because the DWARF state machines are too fragile to be computed in kernelspace [0]. The idea behind DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace stack (while in prog context) and send it up to userspace for unwinding (probably with libunwind) [1]. This would effectively enable profiling applications with -fomit-frame-pointer using kprobes and uprobes. [0]: https://lkml.org/lkml/2012/2/10/356 [1]: https://github.com/danobi/bpf-dwarf-walk Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/e2718ced2d51ef4268590ab8562962438ab82815.1629772842.git.dxu@dxuuu.xyz	2021-08-31 14:21:44 -07:00
Dave Marchevsky	ed529685db	bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf attach types and a function to map bpf_attach_type values to the new enum. Inspired by netns_bpf_attach_type. Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever possible. Functionality is unchanged as attach_type_to_prog_type switches in bpf/syscall.c were preventing non-cgroup programs from making use of the invalid cgroup_bpf array slots. As a result struct cgroup_bpf uses 504 fewer bytes relative to when its arrays were sized using MAX_BPF_ATTACH_TYPE. bpf_cgroup_storage is notably not migrated as struct bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type member which is not meant to be opaque. Similarly, bpf_cgroup_link continues to report its bpf_attach_type member to userspace via fdinfo and bpf_link_info. To ease disambiguation, bpf_attach_type variables are renamed from 'type' to 'atype' when changed to cgroup_bpf_attach_type. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com	2021-08-31 14:21:44 -07:00
Andrii Nakryiko	c92a5d043e	ci: don't hang on kernel crashes in qemu Backport of kernel-patches PR ([0]), done by @thefallentree. [0] https://github.com/kernel-patches/vmtest/pull/29 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-30 12:52:51 -07:00
grantseltzer	8bdc267e7b	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3c3bd542ffbb2ac09631313ede46ae66660ae550 Checkpoint bpf-next commit: d20b41115ad53293201cc07ee429a38740cb056b Baseline bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Checkpoint bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Grant Seltzer (1): libbpf: Rename libbpf documentation index file docs/{libbpf.rst => index.rst} \| 8 ++++++++ docs/libbpf_naming_convention.rst \| 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) rename docs/{libbpf.rst => index.rst} (75%) -- 2.31.1	2021-08-18 15:22:43 -07:00
Grant Seltzer	d0c398be4f	libbpf: Rename libbpf documentation index file This patch renames the documentation file libbpf.rst to index.rst so that readthedocs.org can pick this file up and properly build the documentation site. It also changes the title type of the ABI subsection in the naming convention doc. This is so that readthedocs.org doesn't treat this section as a separate document. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210818151313.49992-1-grantseltzer@gmail.com	2021-08-18 15:22:43 -07:00
grantseltzer	7d9cc837ef	Fix path to Doxygen source code input Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>	2021-08-18 12:28:09 -07:00
Andrii Nakryiko	a3c0cc19d4	ci: blacklist new selftests on 5.5 Blacklist bpf_cookie, perf_link, and xdp_bonding selftests on 5.5 kernel. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	7c6d34a2c9	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776 Checkpoint bpf-next commit: 3c3bd542ffbb2ac09631313ede46ae66660ae550 Baseline bpf commit: 7c4a22339e7ce7b6ed473a8e682da622c3a774ee Checkpoint bpf commit: 3776f3517ed94d40ff0e3851d7ce2ce17b63099f Andrii Nakryiko (8): bpf: Implement minimal BPF perf link bpf: Allow to specify user-provided bpf_cookie for BPF perf links bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value libbpf: Remove unused bpf_link's destroy operation, but add dealloc libbpf: Use BPF perf link when supported by kernel libbpf: Add bpf_cookie support to bpf_link_create() API libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach APIs libbpf: Add uprobe ref counter offset support for USDT semaphores Hangbin Liu (1): bonding: add new option lacp_active Hao Luo (1): libbpf: Support weak typed ksyms. Randy Dunlap (1): libbpf, doc: Eliminate warnings in libbpf_naming_convention Robin Gögge (1): libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT grantseltzer (1): bpf: Reconfigure libbpf docs to remove unversioned API docs/libbpf_api.rst \| 27 ---- docs/libbpf_naming_convention.rst \| 4 +- include/uapi/linux/bpf.h \| 25 ++++ include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 32 ++++- src/bpf.h \| 8 +- src/libbpf.c \| 229 +++++++++++++++++++++++------- src/libbpf.h \| 75 ++++++++-- src/libbpf.map \| 3 + src/libbpf_internal.h \| 32 +++-- src/libbpf_probes.c \| 4 +- 11 files changed, 332 insertions(+), 108 deletions(-) delete mode 100644 docs/libbpf_api.rst -- 2.30.2	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	a69c52bb11	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	6d67d53143	libbpf: Add uprobe ref counter offset support for USDT semaphores When attaching to uprobes through perf subsystem, it's possible to specify offset of a so-called USDT semaphore, which is just a reference counted u16, used by kernel to keep track of how many tracers are attached to a given location. Support for this feature was added in [0], so just wire this through uprobe_opts. This is important to enable implementing USDT attachment and tracing through libbpf's bpf_program__attach_uprobe_opts() API. [0] a6ca88b241d5 ("trace_uprobe: support reference counter in fd-based uprobe") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-16-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	91259bc676	libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach APIs Wire through bpf_cookie for all attach APIs that use perf_event_open under the hood: - for kprobes, extend existing bpf_kprobe_opts with bpf_cookie field; - for perf_event, uprobe, and tracepoint APIs, add their _opts variants and pass bpf_cookie through opts. For kernels that don't support BPF_LINK_CREATE for perf_events, and thus don't support bpf_cookie either, return error and log warning for user. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-12-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	a3f8c5a306	libbpf: Add bpf_cookie support to bpf_link_create() API Add ability to specify bpf_cookie value when creating BPF perf link with bpf_link_create() low-level API. Given BPF_LINK_CREATE command is growing and keeps getting new fields that are specific to the type of BPF_LINK, extend libbpf side of bpf_link_create() API and corresponding OPTS struct to accommodate such changes. Add extra checks to prevent using incompatible/unexpected combinations of fields. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-11-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	d23679b415	libbpf: Use BPF perf link when supported by kernel Detect kernel support for BPF perf link and prefer it when attaching to perf_event, tracepoint, kprobe/uprobe. Underlying perf_event FD will be kept open until BPF link is destroyed, at which point both perf_event FD and BPF link FD will be closed. This preserves current behavior in which perf_event FD is open for the duration of bpf_link's lifetime and user is able to "disconnect" bpf_link from underlying FD (with bpf_link__disconnect()), so that bpf_link__destroy() doesn't close underlying perf_event FD. When BPF perf link is used, disconnect will keep both perf_event and bpf_link FDs open, so it will be up to (advanced) user to close them. This approach is demonstrated in bpf_cookie.c selftests, added in this patch set. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-10-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	9923f25600	libbpf: Remove unused bpf_link's destroy operation, but add dealloc bpf_link->destroy() isn't used by any code, so remove it. Instead, add ability to override deallocation procedure, with default doing plain free(link). This is necessary for cases when we want to "subclass" struct bpf_link to keep extra information, as is the case in the next patch adding struct bpf_link_perf. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210815070609.987780-9-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	40160ed4d4	bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value Add new BPF helper, bpf_get_attach_cookie(), which can be used by BPF programs to get access to a user-provided bpf_cookie value, specified during BPF program attachment (BPF link creation) time. Naming is hard, though. With the concept being named "BPF cookie", I've considered calling the helper: - bpf_get_cookie() -- seems too unspecific and easily mistaken with socket cookie; - bpf_get_bpf_cookie() -- too much tautology; - bpf_get_link_cookie() -- would be ok, but while we create a BPF link to attach BPF program to BPF hook, it's still an "attachment" and the bpf_cookie is associated with BPF program attachment to a hook, not a BPF link itself. Technically, we could support bpf_cookie with old-style cgroup programs. So I ultimately rejected it in favor of bpf_get_attach_cookie(). Currently all perf_event-backed BPF program types support bpf_get_attach_cookie() helper. Follow-up patches will add support for fentry/fexit programs as well. While at it, mark bpf_tracing_func_proto() as static to make it obvious that it's only used from within the kernel/trace/bpf_trace.c. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210815070609.987780-7-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	7b22fc4cdb	bpf: Allow to specify user-provided bpf_cookie for BPF perf links Add ability for users to specify custom u64 value (bpf_cookie) when creating BPF link for perf_event-backed BPF programs (kprobe/uprobe, perf_event, tracepoints). This is useful for cases when the same BPF program is used for attaching and processing invocation of different tracepoints/kprobes/uprobes in a generic fashion, but such that each invocation is distinguished from each other (e.g., BPF program can look up additional information associated with a specific kernel function without having to rely on function IP lookups). This enables new use cases to be implemented simply and efficiently that previously were possible only through code generation (and thus multiple instances of almost identical BPF program) or compilation at runtime (BCC-style) on target hosts (even more expensive resource-wise). For uprobes it is not even possible in some cases to know function IP before hand (e.g., when attaching to shared library without PID filtering, in which case base load address is not known for a library). This is done by storing u64 bpf_cookie in struct bpf_prog_array_item, corresponding to each attached and run BPF program. Given cgroup BPF programs already use two 8-byte pointers for their needs and cgroup BPF programs don't have (yet?) support for bpf_cookie, reuse that space through union of cgroup_storage and new bpf_cookie field. Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx. This is set by BPF_PROG_RUN_ARRAY, used by kprobe/uprobe/tracepoint BPF program execution code, which luckily is now also split from BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper giving access to this user-provided cookie value from inside a BPF program. Generic perf_event BPF programs will access this value from perf_event itself through passed in BPF program context. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20210815070609.987780-6-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	152882e17a	bpf: Implement minimal BPF perf link Introduce a new type of BPF link - BPF perf link. This brings perf_event-based BPF program attachments (perf_event, tracepoints, kprobes, and uprobes) into the common BPF link infrastructure, allowing to list all active perf_event based attachments, auto-detaching BPF program from perf_event when link's FD is closed, get generic BPF link fdinfo/get_info functionality. BPF_LINK_CREATE command expects perf_event's FD as target_fd. No extra flags are currently supported. Force-detaching and atomic BPF program updates are not yet implemented, but with perf_event-based BPF links we now have common framework for this without the need to extend ioctl()-based perf_event interface. One interesting consideration is a new value for bpf_attach_type, which BPF_LINK_CREATE command expects. Generally, it's either 1-to-1 mapping from bpf_attach_type to bpf_prog_type, or many-to-1 mapping from a subset of bpf_attach_types to one bpf_prog_type (e.g., see BPF_PROG_TYPE_SK_SKB or BPF_PROG_TYPE_CGROUP_SOCK). In this case, though, we have three different program types (KPROBE, TRACEPOINT, PERF_EVENT) using the same perf_event-based mechanism, so it's many bpf_prog_types to one bpf_attach_type. I chose to define a single BPF_PERF_EVENT attach type for all of them and adjust link_create()'s logic for checking correspondence between attach type and program type. The alternative would be to define three new attach types (e.g., BPF_KPROBE, BPF_TRACEPOINT, and BPF_PERF_EVENT), but that seemed like unnecessary overkill and BPF_KPROBE will cause naming conflicts with BPF_KPROBE() macro, defined by libbpf. I chose to not do this to avoid unnecessary proliferation of bpf_attach_type enum values and not have to deal with naming conflicts. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20210815070609.987780-5-andrii@kernel.org	2021-08-17 08:18:57 -07:00
Hao Luo	0e7520949e	libbpf: Support weak typed ksyms. Currently weak typeless ksyms have default value zero, when they don't exist in the kernel. However, weak typed ksyms are rejected by libbpf if they can not be resolved. This means that if a bpf object contains the declaration of a nonexistent weak typed ksym, it will be rejected even if there is no program that references the symbol. Nonexistent weak typed ksyms can also default to zero just like typeless ones. This allows programs that access weak typed ksyms to be accepted by verifier, if the accesses are guarded. For example, extern const int bpf_link_fops3 __ksym __weak; /* then in BPF program */ if (&bpf_link_fops3) { /* use bpf_link_fops3 */ } If actual use of nonexistent typed ksym is not guarded properly, verifier would see that register is not PTR_TO_BTF_ID and wouldn't allow to use it for direct memory reads or passing it to BPF helpers. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com	2021-08-17 08:18:57 -07:00
Randy Dunlap	c3f7daaab5	libbpf, doc: Eliminate warnings in libbpf_naming_convention Use "code-block: none" instead of "c" for non-C-language code blocks. Removes these warnings: lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:111: WARNING: Could not lex literal_block as "c". Highlighting skipped. lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:124: WARNING: Could not lex literal_block as "c". Highlighting skipped. Fixes: f42cfb469f9b ("bpf: Add documentation for libbpf including API autogen") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210802015037.787-1-rdunlap@infradead.org	2021-08-17 08:18:57 -07:00
Hangbin Liu	1a1e7a0612	bonding: add new option lacp_active Add an option lacp_active, which is similar to team's runner.active. This option specifies whether to send LACPDU frames periodically. If set on, the LACPDU frames are sent along with the configured lacp_rate setting. If set off, the LACPDU frames act as "speak when spoken to". Note, the LACPDU state frames still will be sent when init or unbind port. v2: remove module parameter Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-08-17 08:18:57 -07:00
Andrii Nakryiko	827963ffb3	sync: fix up docs sync path mapping Kernel docs from Documentation/bpf/libbpf go straight to docs/ under libbpf. Also ignore libbpf-only parts of docs subdir. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-16 22:37:57 -07:00
Andrii Nakryiko	4ab24e7d62	docs: initial set of libbpf docs Add libbpf-related .rst files before they started being synced automatically. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-16 22:37:57 -07:00
grantseltzer	b2a63c974d	docs: reconfigure libbpf documentation syncing This adds documentation files, including ones for autogenerating API documentation based on code comments in the source code that's pulled in via the mirror. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>	2021-08-16 22:30:23 -07:00
Quentin Monnet	88649fe655	ci: run script to test bpftool types/options sync When new eBPF program, map, or attach types are added to the kernel, bpftool needs to be updated in order to support the related features. These updates should add the new types to the code itself, but also to the help messages, documentation, and bash completion. Given that it is easy to omit one of those, a script has been created to attempt to validate that all parts have been consistently updated. This new script for bpftool is hosted in the kernel repository, amongst the BPF selftests. But it is not called from the Makefile, and not run along with the other selftests. If it was, all patches updating the BPF UAPI would require the relevant changes in bpftool at the same time, _in the same patches_, which is not desirable. To ensure that bpftool's parts remain in sync, let's run this script from the CI. This patch adds a new section to the run.sh script, focused on bpftool, and calling the new test_bpftool_synctypes.py.	2021-08-16 15:16:08 -07:00
Sergei Iudin	1778e0b1bd	Make CI tests compatible with vanilla kernel tree This is required to migrate kernel-patches CI to use this code instead of fork	2021-08-11 16:06:23 -07:00
Andrii Nakryiko	64f027efda	ci: restore all temporary disabled tests Upstream bpf-next should be good, so no temporary blocked tests should remain. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-09 13:54:27 -07:00
Yucong Sun	6bf8babb33	Add a test step to produce a minimal binary using libbpf. This patch adds a test step to link a minimal program to libbpf library produced, making sure that the library works.	2021-08-09 13:54:19 -07:00
Rafael David Tinoco	70ad3e8314	makefile: fix missing object for static compilation Makefile needs relo_core object added to objects list to avoid static linking errors when doing static compilation: /bin/ld: .../libbpf.a(libbpf.o): in function `bpf_core_apply_relo': .../libbpf/src/libbpf.c:5134: undefined reference to `bpf_core_apply_relo_insn' Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>	2021-08-09 13:54:19 -07:00
Andrii Nakryiko	dbdd8f3b34	ci: make CI build log less verbose Only keep stderr output in case of errors for kernel and selftests builds. Having a multi-thousand-line output isn't useful and slows down Github Actions' log view UI. Also quiet down wget's "progress bar" output. While at the same time see some totals from tar, just for the fun of it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-09 13:54:19 -07:00
Andrii Nakryiko	52e96052a2	ci: blacklist newly migrated netcnt selftest Seems like netcnt uses some map operations not supported by 5.5. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-09 13:54:19 -07:00
Andrii Nakryiko	41db5534d8	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 807b8f0e24e6004984094e1bcbbd2b297011a085 Checkpoint bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776 Baseline bpf commit: d6371c76e20d7d3f61b05fd67b596af4d14a8886 Checkpoint bpf commit: a02215ce72a37a19a690803b23b091186ee4f7b2 Alexei Starovoitov (4): libbpf: Cleanup the layering between CORE and bpf_program. libbpf: Split bpf_core_apply_relo() into bpf_program independent helper. libbpf: Move CO-RE types into relo_core.h. libbpf: Split CO-RE logic into relo_core.c. Daniel Xu (1): libbpf: Do not close un-owned FD 0 on errors Evgeniy Litvinenko (1): libbpf: Add bpf_map__pin_path function Hengqi Chen (1): libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf Jason Wang (1): libbpf: Fix comment typo Jiri Olsa (3): libbpf: Fix func leak in attach_kprobe libbpf: Allow decimal offset for kprobes libbpf: Export bpf_program__attach_kprobe_opts function Martynas Pumputis (1): libbpf: Fix race when pinning maps in parallel Quentin Monnet (4): libbpf: Return non-null error on failures in libbpf_find_prog_btf_id() libbpf: Rename btf__load() as btf__load_into_kernel() libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id() libbpf: Add split BTF support for btf__load_from_kernel_by_id() Robin Gögge (1): libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT src/btf.c \| 50 +- src/btf.h \| 12 +- src/libbpf.c \| 1419 +++-------------------------------------- src/libbpf.h \| 16 + src/libbpf.map \| 7 + src/libbpf_internal.h \| 81 +-- src/libbpf_probes.c \| 4 +- src/relo_core.c \| 1295 +++++++++++++++++++++++++++++++++++++ src/relo_core.h \| 100 +++ 9 files changed, 1561 insertions(+), 1423 deletions(-) create mode 100644 src/relo_core.c create mode 100644 src/relo_core.h -- 2.30.2	2021-08-09 13:54:14 -07:00
Daniel Xu	02efadd0b0	libbpf: Do not close un-owned FD 0 on errors Before this patch, btf_new() was liable to close an arbitrary FD 0 if BTF parsing failed. This was because: * btf->fd was initialized to 0 through the calloc() * btf__free() (in the `done` label) closed any FDs >= 0 * btf->fd is left at 0 if parsing fails This issue was discovered on a system using libbpf v0.3 (without BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types in BTF. Thus, parsing fails. While this patch technically doesn't fix any issues b/c upstream libbpf has BTF_KIND_FLOAT support, it'll help prevent issues in the future if more BTF types are added. It also allows the fix to be backported to older libbpf's. Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513670.git.dxu@dxuuu.xyz	2021-08-04 18:27:12 -07:00
Robin Gögge	2805c2a4ca	libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT This patch fixes the probe for BPF_PROG_TYPE_CGROUP_SOCKOPT, so the probe reports accurate results when used by e.g. bpftool. Fixes: 4cdbfb59c44a ("libbpf: support sockopt hooks") Signed-off-by: Robin Gögge <r.goegge@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20210728225825.2357586-1-r.goegge@gmail.com	2021-08-04 18:27:12 -07:00
Hengqi Chen	e65d128903	libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf Add two new APIs: btf__load_vmlinux_btf and btf__load_module_btf. btf__load_vmlinux_btf is just an alias to the existing API named libbpf_find_kernel_btf, renamed to be more precise and consistent with existing BTF APIs. btf__load_module_btf can be used to load module BTF; it is added for completeness. These two APIs are useful for implementing tracing tools and introspection tools. This is part of the effort towards libbpf 1.0 ([0]). [0] Closes: https://github.com/libbpf/libbpf/issues/280 Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210730114012.494408-1-hengqi.chen@gmail.com	2021-08-04 18:27:12 -07:00
Quentin Monnet	512b472d97	libbpf: Add split BTF support for btf__load_from_kernel_by_id() Add a new API function btf__load_from_kernel_by_id_split(), which takes a pointer to a base BTF object in order to support split BTF objects when retrieving BTF information from the kernel. Reference: https://github.com/libbpf/libbpf/issues/314 Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-8-quentin@isovalent.com	2021-08-04 18:27:12 -07:00
Quentin Monnet	73788dd22f	libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id() Rename function btf__get_from_id() as btf__load_from_kernel_by_id() to better indicate what the function does. Change the new function so that, instead of requiring a pointer to the pointer to update and returning with an error code, it takes a single argument (the id of the BTF object) and returns the corresponding pointer. This is more in line with the existing constructors. The other tools calling the (soon-to-be) deprecated btf__get_from_id() function will be updated in a future commit. References: - https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-4-quentin@isovalent.com	2021-08-04 18:27:12 -07:00
Quentin Monnet	a180eb551e	libbpf: Rename btf__load() as btf__load_into_kernel() As part of the effort to move towards a v1.0 for libbpf, rename btf__load() function, used to "upload" BTF information into the kernel, as btf__load_into_kernel(). This new name better reflects what the function does. References: - https://github.com/libbpf/libbpf/issues/278 - https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210729162028.29512-3-quentin@isovalent.com	2021-08-04 18:27:12 -07:00
Quentin Monnet	9d2b7e471b	libbpf: Return non-null error on failures in libbpf_find_prog_btf_id() Variable "err" is initialised to -EINVAL so that this error code is returned when something goes wrong in libbpf_find_prog_btf_id(). However, a recent change in the function made use of the variable in such a way that it is set to 0 if retrieving linear information on the program is successful, and this 0 value remains if we error out on failures at later stages. Let's fix this by setting err to -EINVAL later in the function. Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com	2021-08-04 18:27:12 -07:00
Martynas Pumputis	3a0fc666ef	libbpf: Fix race when pinning maps in parallel When loading in parallel multiple programs which use the same to-be pinned map, it is possible that two instances of the loader will call bpf_object__create_maps() at the same time. If the map doesn't exist when both instances call bpf_object__reuse_map(), then one of the instances will fail with EEXIST when calling bpf_map__pin(). Fix the race by retrying reusing a map if bpf_map__pin() returns EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf: Fix race condition with map pinning"). Before retrying the pinning, we don't do any special cleaning of an internal map state. Closer code inspection revealed that it's not required: - bpf_object__create_map(): map->inner_map is destroyed after a successful call, map->fd is closed if pinning fails. - bpf_object__populate_internal_map(): created map elements are destroyed upon close(map->fd). - init_map_slots(): slots are freed after their initialization. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210726152001.34845-1-m@lambda.lt	2021-08-04 18:27:12 -07:00
Jason Wang	7c25b1d569	libbpf: Fix comment typo Remove the repeated word 'the' in line 48. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210727115928.74600-1-wangborong@cdjrlc.com	2021-08-04 18:27:12 -07:00
Alexei Starovoitov	d41e821ccf	libbpf: Split CO-RE logic into relo_core.c. Move CO-RE logic into separate file. The internal interface between libbpf and CO-RE is through bpf_core_apply_relo_insn() function and few structs defined in relo_core.h. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-5-alexei.starovoitov@gmail.com	2021-08-04 18:27:12 -07:00
Alexei Starovoitov	2fe57e40ac	libbpf: Move CO-RE types into relo_core.h. In order to make a clean split of CO-RE logic move its types into independent header file. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-4-alexei.starovoitov@gmail.com	2021-08-04 18:27:12 -07:00
Alexei Starovoitov	f81dbd3475	libbpf: Split bpf_core_apply_relo() into bpf_program independent helper. bpf_core_apply_relo() doesn't need to know bpf_program internals and hashmap details. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-3-alexei.starovoitov@gmail.com	2021-08-04 18:27:12 -07:00
Alexei Starovoitov	035fd6aca0	libbpf: Cleanup the layering between CORE and bpf_program. CO-RE processing functions don't need to know 'struct bpf_program' details. Cleanup the layering to eventually be able to move CO-RE logic into a separate file. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.com	2021-08-04 18:27:12 -07:00
Evgeniy Litvinenko	e44c8486c6	libbpf: Add bpf_map__pin_path function Add bpf_map__pin_path, so that the inconsistently named bpf_map__get_pin_path can be deprecated later. This is part of the effort towards libbpf v1.0: https://github.com/libbpf/libbpf/issues/307 Also, add a selftest for the new function. Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210723221511.803683-1-evgeniyl@fb.com	2021-08-04 18:27:12 -07:00
Jiri Olsa	14f5433b2e	libbpf: Export bpf_program__attach_kprobe_opts function Export bpf_program__attach_kprobe_opts as a public API. Rename bpf_program_attach_kprobe_opts to bpf_kprobe_opts and turn it into OPTS struct. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-4-jolsa@kernel.org	2021-08-04 18:27:12 -07:00
Jiri Olsa	d7a2de020b	libbpf: Allow decimal offset for kprobes Allow to specify decimal offset in SEC macro, like: SEC("kprobe/bpf_fentry_test7+5") Add selftest for that. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-3-jolsa@kernel.org	2021-08-04 18:27:12 -07:00
Jiri Olsa	bb92e7ab4d	libbpf: Fix func leak in attach_kprobe Add missing free() for func pointer in attach_kprobe function. Fixes: a2488b5f483f ("libbpf: Allow specification of "kprobe/function+offset"") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Tested-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20210721215810.889975-2-jolsa@kernel.org	2021-08-04 18:27:12 -07:00
Yucong Sun	3f22535d56	Fix text grouping issue on github actions github action grouping is broken because we were outputting "::endgroup" where it needs "::endgroup::". This patch also added some additional grouping around container setup phase, making output easier to read.	2021-08-04 12:18:41 -07:00
Andrii Nakryiko	f8ab8bde8e	ci: bump nightly Clang/LLVM version to 14 CI started to fail with missing clang-13 warning. Bump the version to 14. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-08-03 12:44:00 -07:00
Sergei Iudin	506a544834	ci: improve CI UX a little by setting names and hiding debug	2021-07-29 17:35:23 -07:00
Andrii Nakryiko	ec2c78c034	README: remove Travis CI build badge We stop using Travis CI from now on.	2021-07-29 14:22:49 -07:00
Andrii Nakryiko	030ff87857	ci: fix log folding in Github Actions Backport kernel-patches fix for the same issue ([0] and [1]). [0] `17c596fe8b` [1] `a38de685c7` Cc: Sergei Iudin <siudin@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-27 19:26:06 -07:00
Andrii Nakryiko	0db006d28e	ci: add cancel-in-progress behavior for main test CI workflow Make sure that only the latest enqueued workflow is running for any given PR and/or branch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-22 23:22:42 -07:00
Andrii Nakryiko	6e6f18ac5d	ci: add Github Actions status badge Add badge to show the Github Actions test.yml workflow status.	2021-07-21 11:20:34 -07:00
Andrii Nakryiko	deca7932c3	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015 Checkpoint bpf-next commit: 807b8f0e24e6004984094e1bcbbd2b297011a085 Baseline bpf commit: d6371c76e20d7d3f61b05fd67b596af4d14a8886 Checkpoint bpf commit: d6371c76e20d7d3f61b05fd67b596af4d14a8886 Alan Maguire (2): libbpf: Avoid use of __int128 in typed dump display libbpf: Propagate errors when retrieving enum value for typed data display src/btf_dump.c \| 103 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 68 insertions(+), 35 deletions(-) -- 2.30.2	2021-07-21 11:16:01 -07:00
Alan Maguire	ebcae72279	libbpf: Propagate errors when retrieving enum value for typed data display When retrieving the enum value associated with typed data during "is data zero?" checking in btf_dump_type_data_check_zero(), the return value of btf_dump_get_enum_value() is not passed to the caller if the function returns a non-zero (error) value. Currently, 0 is returned if the function returns an error. We should instead propagate the error to the caller. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com	2021-07-21 11:16:01 -07:00
Alan Maguire	64362b8896	libbpf: Avoid use of __int128 in typed dump display __int128 is not supported for some 32-bit platforms (arm and i386). __int128 was used in carrying out computations on bitfields which aid display, but the same calculations could be done with __u64 with the small effect of not supporting 128-bit bitfields. With these changes, a big-endian issue with casting 128-bit integers to 64-bit for enum bitfields is solved also, as we now use 64-bit integers for bitfield calculations. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com	2021-07-21 11:16:01 -07:00
Michal Suchanek	df01b246df	README: State the source origin more prominently. Signed-off-by: Michal Suchanek <msuchanek@suse.de>	2021-07-20 14:45:49 -07:00
Michal Suchanek	6eb5e25905	Makefile: Default LIBSUBDIR to lib64 on 64bit architectures. commit `a82a66e` ("Extend build and add install rules to Makefile") adds special handling for LIBSUBDIR on x86_64. Expand this to all architectures with 64 in name which suggests a 32bit variant exists, and s390x which is 64bit extension of s390. Fixes: #337 Fixes: `a82a66e` ("Extend build and add install rules to Makefile") Signed-off-by: Michal Suchanek <msuchanek@suse.de>	2021-07-20 14:45:49 -07:00
Andrii Nakryiko	a603965dad	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34 Checkpoint bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015 Baseline bpf commit: a6c39de76d709f30982d4b80a9b9537e1d388858 Checkpoint bpf commit: d6371c76e20d7d3f61b05fd67b596af4d14a8886 Alan Maguire (3): libbpf: Clarify/fix unaligned data issues for btf typed dump libbpf: Fix compilation errors on ppc64le for btf dump typed data libbpf: Btf typed dump does not need to allocate dump data Martynas Pumputis (1): libbpf: Fix removal of inner map in bpf_object__create_map src/btf_dump.c \| 41 ++++++++++++++++++++++++++++++----------- src/libbpf.c \| 10 ++++------ 2 files changed, 34 insertions(+), 17 deletions(-) -- 2.30.2	2021-07-19 17:45:10 -07:00
Martynas Pumputis	f61c3b318b	libbpf: Fix removal of inner map in bpf_object__create_map If creating an outer map of a BTF-defined map-in-map fails (via bpf_object__create_map()), then its previously created inner map won't be destroyed. Fix this by ensuring that the destroy routines are not bypassed in the case of a failure. Fixes: 646f02ffdd49c ("libbpf: Add BTF-defined map-in-map support") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt	2021-07-19 17:45:10 -07:00
Alan Maguire	8235032464	libbpf: Btf typed dump does not need to allocate dump data By using the stack for this small structure, we avoid the need for freeing memory in error paths. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
Alan Maguire	dc2c53b7f6	libbpf: Fix compilation errors on ppc64le for btf dump typed data __s64 can be defined as either long or long long, depending on the architecture. On ppc64le it's defined as long, giving this error: In file included from btf_dump.c:22: btf_dump.c: In function 'btf_dump_type_data_check_overflow': libbpf_internal.h:111:22: error: format '%lld' expects argument of type 'long long int', but argument 3 has type '__s64' {aka 'long int'} [-Werror=format=] 111 \| libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \ \| ^~~~~~~~~~ libbpf_internal.h:114:27: note: in expansion of macro '__pr' 114 \| #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__) \| ^~~~ btf_dump.c:1992:3: note: in expansion of macro 'pr_warn' 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ^~~~~~~ btf_dump.c:1992:32: note: format string is defined here 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ~~~^ \| \| \| long long int \| %ld Cast to size_t and use %zu instead. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
Alan Maguire	fb3809e940	libbpf: Clarify/fix unaligned data issues for btf typed dump If data is packed, data structures can store it outside of usual boundaries. For example a 4-byte int can be stored on an unaligned boundary in a case like this: struct s { char f1; int f2; } __attribute((packed)); ...the int is stored at an offset of one byte. Some platforms have problems dereferencing data that is not aligned with its size, and code exists to handle most cases of this for BTF typed data display. However, pointer display was missed, and a simple function to test if "ptr_is_aligned(data, data_sz)" would help clarify this code. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
Fejes Ferenc	74d3571880	Update README.md	2021-07-19 16:43:05 -07:00
Fejes Ferenc	be570b29c1	Update README.md Manjaro is a popular and friendly Arch based distro. Recently they also enabled the BTF support: https://forum.manjaro.org/t/co-re-support-in-kernel/46134/19 I can confirm that: [user@pc ~]$ uname -a Linux pc 5.12.16-1-MANJARO #1 SMP PREEMPT Sun Jul 11 13:23:34 UTC 2021 x86_64 GNU/Linux [user@pc ~]$ ls -la /sys/kernel/btf/vmlinux -r--r--r-- 1 root root 4226769 jul 17 15.27 /sys/kernel/btf/vmlinux	2021-07-19 16:43:05 -07:00
Sergei Iudin	9aa71e1040	Run apt-get update as a first step for GH actions; otherwise the container may contain stale cached repo metadata	2021-07-19 14:57:35 -07:00
Andrii Nakryiko	b3ffd258fc	vmtest: blacklist 5.5 selftests Add a few new selftests to the blacklist. They can't succeed on 5.5. Also temporarily remove btf_dump for 4.9 due to newly added data dumping subtests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	4447ac82d4	ci: temporary work-around to get green CI builds back Temporarily disable tc_bpf tests that seem to have regressed. Temporarily and artificially bump pahole version from 1.21 to 1.22 to get per-CPU BTF data built. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	8fa229c455	ci: disable -Wstringop-truncation for GCC10 configurations as well We used to have it disabled for GCC8, but now GCC10 falsely reports the same warnings, so disable stringop-truncation warnings for GCC10 as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	8a670b7422	vmtest: regenerate latest vmlinux.h This is necessary to make runqslower compile with task->__state field on old kernels, for which we don't have an actual vmlinux.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	21f90f61b0	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f42cfb469f9b4a1c002a03cce3d9329376800a6f Checkpoint bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34 Baseline bpf commit: 61e8aeda9398925f8c6fc290585bdd9727d154c4 Checkpoint bpf commit: a6c39de76d709f30982d4b80a9b9537e1d388858 Alan Maguire (2): libbpf: Allow specification of "kprobe/function+offset" libbpf: BTF dumper support for typed data Alexei Starovoitov (2): bpf: Sync tools/include/uapi/linux/bpf.h bpf: Introduce bpf timers. Jiri Olsa (3): bpf: Add bpf_get_func_ip helper for tracing programs bpf: Add bpf_get_func_ip helper for kprobe programs libbpf: Add bpf_program__attach_kprobe_opts function Jonathan Edwards (1): libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading Kumar Kartikeya Dwivedi (2): libbpf: Add request buffer type for netlink messages libbpf: Switch to void * casting in netlink helpers Kuniyuki Iwashima (1): bpf: Fix a typo of reuseport map in bpf.h. Martynas Pumputis (1): libbpf: Fix reuse of pinned map on older kernel Shuyi Cheng (2): libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' libbpf: Fix the possible memory leak on error Toke Høiland-Jørgensen (1): libbpf: Restore errno return for functions that were already returning it include/uapi/linux/bpf.h \| 85 +++- src/btf.h \| 19 + src/btf_dump.c \| 819 ++++++++++++++++++++++++++++++++++++++- src/libbpf.c \| 146 ++++++- src/libbpf.h \| 9 +- src/libbpf.map \| 1 + src/netlink.c \| 115 +++--- src/nlattr.c \| 2 +- src/nlattr.h \| 38 +- 9 files changed, 1117 insertions(+), 117 deletions(-) -- 2.30.2	2021-07-16 17:05:44 -07:00
Andrii Nakryiko	c8b1d14b03	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-07-16 17:05:44 -07:00
Alan Maguire	c0b2ceba1d	libbpf: BTF dumper support for typed data Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided. The API is int btf_dump__dump_type_data(struct btf_dump *d, __u32 id, void *data, size_t data_sz, const struct btf_dump_type_data_opts *opts); ...where the id is the BTF id of the data pointed to by the "void *" argument; for example the BTF id of "struct sk_buff" for a "struct skb *" data pointer. Options supported are - a starting indent level (indent_lvl) - a user-specified indent string which will be printed once per indent level; if NULL, tab is chosen but any string <= 32 chars can be provided. - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - skip_names: omit member names - emit_zeroes: show zero-value members Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this: (struct sk_buff){ (union){ (struct){ .next = (struct sk_buff *)0xffffffffffffffff, .prev = (struct sk_buff *)0xffffffffffffffff, (union){ .dev = (struct net_device *)0xffffffffffffffff, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ... If the data structure is larger than the data_sz number of bytes that are available in data, as much of the data as possible will be dumped and -E2BIG will be returned. This is useful as tracers will sometimes not be able to capture all of the data associated with a type; for example a "struct task_struct" is ~16k. Being able to specify that only a subset is available is important for such cases. On success, the amount of data dumped is returned. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com	2021-07-16 17:05:44 -07:00
Shuyi Cheng	bd25fc7df1	libbpf: Fix the possible memory leak on error If the strdup() fails then we need to call bpf_object__close(obj) to avoid a resource leak. Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 17:05:44 -07:00
Shuyi Cheng	4920031c88	libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' btf_custom_path allows developers to load custom BTF which libbpf will subsequently use for CO-RE relocation instead of vmlinux BTF. Having btf_custom_path in bpf_object_open_opts one can directly use the skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path parameter, as opposed to using bpf_object__load_xattr() which is slated to be deprecated ([0]). This work continues previous work started by another developer ([1]). [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/ Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 17:05:44 -07:00
Alan Maguire	8fa50e86c1	libbpf: Allow specification of "kprobe/function+offset" kprobes can be placed on most instructions in a function, not just entry, and ftrace and bpftrace support the function+offset notation for probe placement. Adding parsing of func_name into func+offset to bpf_program__attach_kprobe() allows the user to specify SEC("kprobe/bpf_fentry_test5+0x6") ...for example, and the offset can be passed to perf_event_open_probe() to support kprobe attachment. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Jiri Olsa	330a158982	libbpf: Add bpf_program__attach_kprobe_opts function Adding bpf_program__attach_kprobe_opts that does the same as bpf_program__attach_kprobe, but takes opts argument. Currently opts struct holds just retprobe bool, but we will add new field in following patch. The function is not exported, so there's no need to add size to the struct bpf_program_attach_kprobe_opts for now. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Jiri Olsa	a524ae0bbf	bpf: Add bpf_get_func_ip helper for kprobe programs Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs, so it's now possible to call bpf_get_func_ip from both kprobe and kretprobe programs. Taking the caller's address from 'struct kprobe::addr', which is defined for both kprobe and kretprobe. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Jiri Olsa	97e2a9c9a1	bpf: Add bpf_get_func_ip helper for tracing programs Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs, specifically for all trampoline attach types. The trampoline's caller IP address is stored in (ctx - 8) address. so there's no reason to actually call the helper, but rather fixup the call instruction and return [ctx - 8] value directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Alexei Starovoitov	bef77595ca	bpf: Introduce bpf timers. Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded in hash/array/lru maps as a regular field and helpers to operate on it: // Initialize the timer. // First 4 bits of 'flags' specify clockid. // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags); // Configure the timer to call 'callback_fn' static function. long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn); // Arm the timer to expire 'nsec' nanoseconds from the current time. long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags); // Cancel the timer and wait for callback_fn to finish if it was running. long bpf_timer_cancel(struct bpf_timer *timer); Here is how BPF program might look like: struct map_elem { int counter; struct bpf_timer timer; }; struct { __uint(type, BPF_MAP_TYPE_HASH); __uint(max_entries, 1000); __type(key, int); __type(value, struct map_elem); } hmap SEC(".maps"); static int timer_cb(void *map, int *key, struct map_elem *val); /* val points to particular map element that contains bpf_timer. */ SEC("fentry/bpf_fentry_test1") int BPF_PROG(test1, int a) { struct map_elem *val; int key = 0; val = bpf_map_lookup_elem(&hmap, &key); if (val) { bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME); bpf_timer_set_callback(&val->timer, timer_cb); bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 usec */, 0); } } This patch adds helper implementations that rely on hrtimers to call bpf functions as timers expire. The following patches add necessary safety checks. Only programs with CAP_BPF are allowed to use bpf_timer. The amount of timers used by the program is constrained by the memcg recorded at map creation time. The bpf_timer_init() helper needs explicit 'map' argument because inner maps are dynamic and not known at load time.
While the bpf_timer_set_callback() is receiving hidden 'aux->prog' argument supplied by the verifier. The prog pointer is needed to do refcnting of bpf program to make sure that program doesn't get freed while the timer is armed. This approach relies on "user refcnt" scheme used in prog_array that stores bpf programs for bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is paired with bpf_timer_cancel() that will drop the prog refcnt. The ops->map_release_uref is responsible for cancelling the timers and dropping prog refcnt when user space reference to a map reaches zero. This uref approach is done to make sure that Ctrl-C of user space process will not leave timers running forever unless the user space explicitly pinned a map that contained timers in bpffs. bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't have user references (is not held by open file descriptor from user space and not pinned in bpffs). The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel and free the timer if given map element had it allocated. "bpftool map update" command can be used to cancel timers. The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because '__u64 :64' has 1 byte alignment of 8 byte padding. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com	2021-07-16 17:05:44 -07:00
Kuniyuki Iwashima	6f7839f477	bpf: Fix a typo of reuseport map in bpf.h. Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo in bpf.h. Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp	2021-07-16 17:05:44 -07:00
Alexei Starovoitov	90aba5e582	bpf: Sync tools/include/uapi/linux/bpf.h Commit 47316f4a3053 missed updating tools/.../bpf.h. Sync it. Fixes: 47316f4a3053 ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-07-16 17:05:44 -07:00
Martynas Pumputis	4dc3aeb072	libbpf: Fix reuse of pinned map on older kernel When loading a BPF program with a pinned map, the loader checks whether the pinned map can be reused, i.e. whether its properties match. To derive the properties of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and then does the comparison. Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not available, so loading the program fails with the following error: libbpf: failed to get map info for map FD 5: Invalid argument libbpf: couldn't reuse pinned map at '/sys/fs/bpf/tc/globals/cilium_call_policy': parameter mismatch libbpf: map 'cilium_call_policy': error reusing pinned map libbpf: map 'cilium_call_policy': failed to create: Invalid argument(-22) libbpf: failed to load object 'bpf_overlay.o' To fix this, fall back to deriving the map properties via /proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL, which can be used as an indicator that the kernel doesn't support the latter. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt	2021-07-16 17:05:44 -07:00
Toke Høiland-Jørgensen	4ce0551ee5	libbpf: Restore errno return for functions that were already returning it The update to streamline libbpf error reporting intended to change all functions to return the errno as a negative return value if LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is not set, the return value changes for the two functions that were already returning a negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll(). This is a user-visible API change that breaks applications; so let's revert these two functions back to unconditionally returning a negative errno value. Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com	2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi	f8411901c4	libbpf: Switch to void * casting in netlink helpers Netlink helpers I added in 8bbb77b7c7a2 ("libbpf: Add various netlink helpers") used char * casts everywhere, and there were a few more that existed from before. Convert all of them to void * cast, as it is treated equivalently by clang/gcc for the purposes of pointer arithmetic and to follow the convention elsewhere in the kernel/libbpf. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com	2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi	9ff2b76693	libbpf: Add request buffer type for netlink messages Coverity complains about OOB writes to nlmsghdr. There is no OOB as we write to the trailing buffer, but static analyzers and compilers may rightfully be confused as the nlmsghdr pointer has subobject provenance (and hence subobject bounds). Fix this by using an explicit request structure containing the nlmsghdr, struct tcmsg/ifinfomsg, and attribute buffer. Also switch nh_tail (renamed to req_tail) to cast req * to char * so that it can be understood as arithmetic on pointer to the representation array (hence having same bound as request structure), which should further appease analyzers. As a bonus, callers don't have to pass sizeof(req) all the time now, as size is implicitly obtained using the pointer. While at it, also reduce the size of attribute buffer to 128 bytes (132 for ifinfomsg using functions due to the padding). Summary of problem: Even though C standard allows interconvertibility of pointer to first member and pointer to struct, for the purposes of alias analysis it would still consider the first as having pointer value "pointer to T" where T is type of first member hence having subobject bounds, allowing analyzers within reason to complain when object is accessed beyond the size of pointed to object. The only exception to this rule may be when a char * is formed to a member subobject. It is not possible for the compiler to be able to tell the intent of the programmer that it is a pointer to member object or the underlying representation array of the containing object, so such diagnosis is suppressed. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com	2021-07-16 17:05:44 -07:00
Jonathan Edwards	df023f5cfc	libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only the following program types are supported [1]: BPF_PROG_TYPE_KPROBE BPF_PROG_TYPE_TRACEPOINT BPF_PROG_TYPE_PERF_EVENT For libbpf this causes an EINVAL return during the bpf_object__probe_loading call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER can load. The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some kernels it requires knowledge of the LINUX_VERSION_CODE. [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7 [1] https://access.redhat.com/articles/3550581 Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com	2021-07-16 17:05:44 -07:00
Andrii Nakryiko	ae62c159ec	include: initial sync of pkt_cls.h and pkt_sched.h Add pkt_cls.h and pkt_sched.h to include/uapi/linux. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-16 14:22:07 -07:00
Yonghong Song	8bf016110e	sync uapi headers linux/pkt_cls.h and linux/pkt_sched.h Let us sync linux/{pkt_cls.h,pkt_sched.h} to libbpf repo. Otherwise, on ubuntu 16.04, system headers will be picked up and this will result in compilation error like: .../netlink.c:416:23: error: ‘TC_H_CLSACT’ undeclared (first use in this function) *parent = TC_H_MAKE(TC_H_CLSACT, ^ .../netlink.c:418:9: error: ‘TC_H_MIN_INGRESS’ undeclared (first use in this function) TC_H_MIN_INGRESS : TC_H_MIN_EGRESS); ^ .../netlink.c:418:28: error: ‘TC_H_MIN_EGRESS’ undeclared (first use in this function) TC_H_MIN_INGRESS : TC_H_MIN_EGRESS); ^ .../netlink.c: In function ‘__get_tc_info’: .../netlink.c:522:11: error: ‘TCA_BPF_ID’ undeclared (first use in this function) if (!tbb[TCA_BPF_ID]) ^ Signed-off-by: Yonghong Song <yhs@fb.com>	2021-07-12 14:01:21 -07:00
Yucong Sun	d3e4039a0a	create ondemand vmtest workflow	2021-07-09 14:09:51 -07:00
Jussi Mäki	dd34504b43	vmtest: Set CONFIG_BONDING=y in latest.config This is preparation for the XDP bonding patch set [1] to avoid having to mangle the kernel configuration from vmtest.sh. [1]: https://lore.kernel.org/bpf/202106221509.kwNvAAZg-lkp@intel.com/T/#m4635dc0003944f38a54059b11147ab46abeffa13 Signed-off-by: Jussi Maki <joamaki@gmail.com>	2021-07-08 15:18:32 -07:00
Andrii Nakryiko	bec2ae0c6e	sync: update rewritten bpf-next SHA Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-06 13:31:12 -07:00
Andrii Nakryiko	1d6106cf45	ci: blacklist few new tests on 5.5 tc_redirect and migrate_reuseport use new functionality not present on 5.5 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-18 13:05:10 -07:00
Andrii Nakryiko	95e51c1dbe	ci: disable fail-fast for Github Actions tests Make sure we run all of the tests even if some of them fail. This allows testing all of them independently, especially the slow kernel LATEST test. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-18 13:05:10 -07:00
Andrii Nakryiko	db132757c9	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: cf68fa431d5da7ef0b5ea142dd603611696cbd44 Checkpoint bpf-next commit: f540a7d2c37f9ae0867de0a14bf06cf50b63d65e Baseline bpf commit: 11fc79fc9f2e395aa39fa5baccae62767c5d8280 Checkpoint bpf commit: 61e8aeda9398925f8c6fc290585bdd9727d154c4 Kumar Kartikeya Dwivedi (2): libbpf: Remove unneeded check for flags during tc detach libbpf: Set NLM_F_EXCL when creating qdisc Kuniyuki Iwashima (3): bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT. bpf: Support socket migration by eBPF. libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. Lorenz Bauer (1): libbpf: Fail compilation if target arch is missing Wang Hai (1): libbpf: Simplify the return expression of bpf_object__init_maps function grantseltzer (1): Add documentation for libbpf including API autogen include/uapi/linux/bpf.h \| 16 ++++ src/README.rst \| 168 --------------------------------------- src/bpf_tracing.h \| 46 ++++++++++- src/libbpf.c \| 9 ++- src/netlink.c \| 4 +- 5 files changed, 64 insertions(+), 179 deletions(-) delete mode 100644 src/README.rst -- 2.30.2	2021-06-18 13:05:10 -07:00
grantseltzer	41cddf18f4	Add documentation for libbpf including API autogen This patch is meant to start the initiative to document libbpf. It includes .rst files which are text documentation describing building, API naming convention, as well as an index to generated API documentation. In this approach the generated API documentation is enabled by the kernel's existing documentation system, which uses sphinx. The resulting docs would then be synced to kernel.org/doc. You can test this by running `make htmldocs` and serving the html in Documentation/output. Since libbpf does not yet have comments in kernel doc format, see kernel.org/doc/html/latest/doc-guide/kernel-doc.html for an example so you can test this. The advantage of this approach is to use the existing sphinx infrastructure that the kernel has, and have libbpf docs in the same place as everything else. The current plan is to have the libbpf mirror sync the generated docs and version them based on the libbpf releases which are cut on github. This patch includes the addition of libbpf_api.rst which pulls comment documentation from header files in libbpf under tools/lib/bpf/. The comment docs would be of the standard kernel doc format. Signed-off-by: grantseltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210618140459.9887-2-grantseltzer@gmail.com	2021-06-18 13:05:10 -07:00
Lorenz Bauer	f883bbf3f4	libbpf: Fail compilation if target arch is missing bpf2go is the Go equivalent of libbpf skeleton. The convention is that the compiled BPF is checked into the repository to facilitate distributing BPF as part of Go packages. To make this portable, bpf2go by default generates both bpfel and bpfeb variants of the C. Using bpf_tracing.h is inherently non-portable since the fields of struct pt_regs differ between platforms, so CO-RE can't help us here. The only way of working around this is to compile for each target platform independently. bpf2go can't do this by default since there are too many platforms. Define the various PT_... macros when no target can be determined and turn them into compilation failures. This works because bpf2go always compiles for bpf targets, so the compiler fallback doesn't kick in. Conditionally define __BPF_MISSING_TARGET so that we can inject a more appropriate error message at build time. The user can then choose which platform to target explicitly. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210616083635.11434-1-lmb@cloudflare.com	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	db8982bcaa	libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. This commit introduces a new section (sk_reuseport/migrate) and sets expected_attach_type for each of the two sections of a BPF_PROG_TYPE_SK_REUSEPORT program. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	d1571ab5ce	bpf: Support socket migration by eBPF. This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT to check if the attached eBPF program is capable of migrating sockets. When the eBPF program is attached, we run it for socket migration if the expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or net.ipv4.tcp_migrate_req is enabled. Currently, the expected_attach_type is not enforced for the BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the earlier idea in the commit aac3fc320d94 ("bpf: Post-hooks for sys_bind") to fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type(). Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to select a new listener based on the child socket. migrating_sk varies depending on whether it is migrating a request in the accept queue or during 3WHS. - accept_queue : sock (ESTABLISHED/SYN_RECV) - 3WHS : request_sock (NEW_SYN_RECV) In the eBPF program, we can select a new listener by BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning SK_DROP. This feature is useful when listeners have different settings at the socket API level or when we want to free resources as soon as possible. - SK_PASS with selected_sk, select it as a new listener - SK_PASS with selected_sk NULL, fall back to random selection - SK_DROP, cancel the migration. There is a noteworthy point. We select a listening socket in three places, but we do not have struct skb when closing a listener or retransmitting a SYN+ACK. On the other hand, some helper functions do not expect skb to be NULL (e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer() in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb temporarily before running the eBPF program.
Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	03b0787342	bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT. We will call sock_reuseport.prog for socket migration in the next commit, so the eBPF program has to know which listener is closing to select a new listener. We can currently get a unique ID of each listener in the userspace by calling bpf_map_lookup_elem() for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map. This patch makes the pointer of sk available in sk_reuseport_md so that we can get the ID by BPF_FUNC_get_socket_cookie() in the eBPF program. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201119001154.kapwihc2plp4f7zc@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-9-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi	a1bd8104a9	libbpf: Set NLM_F_EXCL when creating qdisc This got lost during the refactoring across versions. We always use NLM_F_EXCL when creating some TC object, so reflect what the function says and set the flag. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com	2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi	ccead28901	libbpf: Remove unneeded check for flags during tc detach Coverity complained about this being unreachable code. It is right because we already enforce flags to be unset, so a check validating the flag value is redundant. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com	2021-06-18 13:05:10 -07:00
Wang Hai	0b59d75ecd	libbpf: Simplify the return expression of bpf_object__init_maps function There is no need for special treatment of the 'ret == 0' case. This patch simplifies the return expression. Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com	2021-06-18 13:05:10 -07:00
Sergei Iudin	a5ee05d505	Run pahole staging once a day	2021-06-17 17:49:56 -07:00
Sergei Iudin	42ebbbce7d	test pahole	2021-06-17 13:38:13 -07:00
Sergei Iudin	26497b9a88	Add coverity workflow	2021-06-15 16:13:44 -07:00
Sergei Iudin	5d5af3f07e	Migrate libbpf ci to GH actions changes to docker command require to run it in non-interactive mode	2021-06-15 14:13:57 -07:00
Andrii Nakryiko	899c45baa2	travis-ci: extend 5.5 blacklist Blacklist few more recent selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-08 13:04:27 -07:00
Andrii Nakryiko	95008d47dd	Makefile: sync Makefile with upstream Add gen_loader.o to list of built object files. Complete the list of installed headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-08 13:04:27 -07:00
Andrii Nakryiko	13acc0af00	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f18ba26da88a89db9b50cb4ff47fadb159f2810b Checkpoint bpf-next commit: cf68fa431d5da7ef0b5ea142dd603611696cbd44 Baseline bpf commit: d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4 Checkpoint bpf commit: 11fc79fc9f2e395aa39fa5baccae62767c5d8280 Alexei Starovoitov (12): bpf: Introduce bpf_sys_bpf() helper and program type. libbpf: Support for syscall program type bpf: Introduce fd_idx bpf: Add bpf_btf_find_by_name_kind() helper. bpf: Add bpf_sys_close() helper. libbpf: Change the order of data and text relocations. libbpf: Add bpf_object pointer to kernel_supports(). libbpf: Preliminary support for fd_idx libbpf: Generate loader program out of BPF ELF file. libbpf: Cleanup temp FDs when intermediate sys_bpf fails. libbpf: Introduce bpf_map__initial_value(). bpf: Add cmd alias BPF_PROG_RUN Andrii Nakryiko (4): libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors libbpf: Streamline error reporting for low-level APIs libbpf: Streamline error reporting for high-level APIs libbpf: Move few APIs from 0.4 to 0.5 version Denis Salopek (2): bpf: Add lookup_and_delete_elem support to hashtab bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags Florent Revest (1): libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h Hangbin Liu (1): xdp: Extend xdp_redirect_map with broadcast support Kev Jackson (1): libbpf: Fixes incorrect rx_ring_setup_done Michal Suchanek (1): libbpf: Fix pr_warn type warnings on 32bit Stanislav Fomichev (1): libbpf: Skip bpf_object__probe_loading for light skeleton include/uapi/linux/bpf.h \| 66 ++- src/bpf.c \| 179 +++++--- src/bpf.h \| 2 + src/bpf_gen_internal.h \| 41 ++ src/bpf_helpers.h \| 66 +++ src/bpf_prog_linfo.c \| 18 +- src/bpf_tracing.h \| 62 +-- src/btf.c \| 302 ++++++------- src/btf_dump.c \| 14 +- src/gen_loader.c \| 729 +++++++++++++++++++++++++++++++ src/libbpf.c \| 
909 +++++++++++++++++++++++++-------------- src/libbpf.h \| 14 + src/libbpf.map \| 8 + src/libbpf_errno.c \| 7 +- src/libbpf_internal.h \| 55 +++ src/libbpf_legacy.h \| 59 +++ src/linker.c \| 22 +- src/netlink.c \| 81 ++-- src/ringbuf.c \| 26 +- src/skel_internal.h \| 123 ++++++ src/xsk.c \| 2 +- 21 files changed, 2135 insertions(+), 650 deletions(-) create mode 100644 src/bpf_gen_internal.h create mode 100644 src/gen_loader.c create mode 100644 src/libbpf_legacy.h create mode 100644 src/skel_internal.h -- 2.30.2	2021-06-08 13:04:27 -07:00
Andrii Nakryiko	1b9138e452	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-06-08 13:04:27 -07:00
Kev Jackson	2da7f66d3f	libbpf: Fixes incorrect rx_ring_setup_done When calling xsk_socket__create_shared(), the logic at line 1097 marks a boolean flag true within the xsk_umem structure to track setup progress in order to support multiple calls to the function. However, instead of marking umem->tx_ring_setup_done, the code incorrectly sets umem->rx_ring_setup_done. This leads to improper behaviour when creating and destroying xsk and umem structures. Multiple calls to this function are documented as supported. Fixes: ca7a83e2487a ("libbpf: Only create rx and tx XDP rings when necessary") Signed-off-by: Kev Jackson <foamdino@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev	2021-06-08 13:04:27 -07:00
Michal Suchanek	9d5ac4931d	libbpf: Fix pr_warn type warnings on 32bit The printed value is ptrdiff_t and is formatted with %ld. This works on 64bit but produces a warning on 32bit. Fix the format specifier to %td. Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210604112448.32297-1-msuchanek@suse.de	2021-06-08 13:04:27 -07:00
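The format-specifier fix in 9d5ac4931d can be illustrated outside libbpf. The following is a minimal sketch, not code from the patch: `format_offset` is a hypothetical helper that prints a pointer difference with `%td`, the length modifier that matches `ptrdiff_t` on both 32-bit and 64-bit targets.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of libbpf): writes the distance between
 * two pointers into buf and returns snprintf()'s result.
 * %td is the printf conversion for ptrdiff_t, so no cast is needed and
 * no format warning is expected regardless of the width of long.
 */
static int format_offset(char *buf, size_t len, const char *base, const char *cur)
{
    ptrdiff_t off = cur - base;

    return snprintf(buf, len, "offset %td", off);
}
```

Using `%ld` here would only be correct on targets where `ptrdiff_t` happens to be `long`, which is the situation the commit message describes.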
Andrii Nakryiko	5bfbb36440	libbpf: Move few APIs from 0.4 to 0.5 version Official libbpf 0.4 release doesn't include three APIs that were tentatively put into 0.4 section. Fix libbpf.map and move these three APIs: - bpf_map__initial_value; - bpf_map_lookup_and_delete_elem_flags; - bpf_object__gen_loader. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210603004026.2698513-2-andrii@kernel.org	2021-06-08 13:04:27 -07:00
Florent Revest	343f63e245	libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h These macros are convenient wrappers around the bpf_seq_printf and bpf_snprintf helpers. They are currently provided by bpf_tracing.h which targets low level tracing primitives. bpf_helpers.h is a better fit. The __bpf_narg and __bpf_apply are needed in both files and provided twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210526164643.2881368-1-revest@chromium.org	2021-06-08 13:04:27 -07:00
Hangbin Liu	0dccb885a3	xdp: Extend xdp_redirect_map with broadcast support This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to extend xdp_redirect_map for broadcast support. With BPF_F_BROADCAST the packet will be broadcast to all the interfaces in the map. With BPF_F_EXCLUDE_INGRESS the ingress interface will be excluded when broadcasting. When getting the devices in dev hash map via dev_map_hash_get_next_key(), there is a possibility that we fall back to the first key when a device was removed. This will duplicate packets on some interfaces. So just walk the whole buckets to avoid this issue. For dev array map, we also walk the whole map to find valid interfaces. Function bpf_clear_redirect_map() was removed in commit ee75aef23afe ("bpf, xdp: Restructure redirect actions"). Add it back as we need to use ri->map again. With test topology: [Topology diagram: Host A (i40e 10G) connects to eno1 (i40e 10G) on Host B; Host C (i40e 10G) connects to eno2 (i40e 10G) on Host B; on Host B, veth0, veth1 and veth2 connect to a peer NS.] On Host A: # pktgen/pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -s 64 On Host B (Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, 128G Memory): Use xdp_redirect_map and xdp_redirect_map_multi in samples/bpf for testing. All the veth peers in the NS have an XDP_DROP program loaded. The forward_map max_entries in xdp_redirect_map_multi is modified to 4. 
Testing the performance impact on the regular xdp_redirect path with and without patch (to check impact of additional check for broadcast mode): 5.12 rc4 \| redirect_map i40e->i40e \| 2.0M \| 9.7M 5.12 rc4 \| redirect_map i40e->veth \| 1.7M \| 11.8M 5.12 rc4 + patch \| redirect_map i40e->i40e \| 2.0M \| 9.6M 5.12 rc4 + patch \| redirect_map i40e->veth \| 1.7M \| 11.7M Testing the performance when cloning packets with the redirect_map_multi test, using a redirect map size of 4, filled with 1-3 devices: 5.12 rc4 + patch \| redirect_map multi i40e->veth (x1) \| 1.7M \| 11.4M 5.12 rc4 + patch \| redirect_map multi i40e->veth (x2) \| 1.1M \| 4.3M 5.12 rc4 + patch \| redirect_map multi i40e->veth (x3) \| 0.8M \| 2.6M Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20210519090747.1655268-3-liuhangbin@gmail.com	2021-06-08 13:04:27 -07:00
Andrii Nakryiko	8e3a63ea48	libbpf: Streamline error reporting for high-level APIs Implement changes to error reporting for high-level libbpf APIs to make them less surprising and less error-prone to users: - in all the cases when error happens, errno is set to an appropriate error value; - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and error code is communicated through errno; this applies both to APIs that already returned NULL before (so now they communicate more detailed error codes), as well as for many APIs that used ERR_PTR() macro and encoded error numbers as fake pointers. - in legacy (default) mode, those APIs that were returning ERR_PTR(err), continue doing so, but still set errno. With these changes, errno can always be used to extract actual error, regardless of legacy or libbpf 1.0 modes. This is utilized internally in libbpf in places where libbpf uses its own high-level APIs. libbpf_get_error() is adapted to handle both cases completely transparently to end-users (and is used by libbpf consistently as well). More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-5-andrii@kernel.org	2021-06-08 13:04:27 -07:00
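The two pointer-returning conventions described in 8e3a63ea48 can be modelled in plain C. This is a minimal sketch under stated assumptions, not libbpf source: every `sketch_*` name and `SKETCH_MAX_ERRNO` is hypothetical, and `sketch_get_error` only mirrors the behaviour the commit message attributes to libbpf_get_error(), namely that the error is recoverable in either mode.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not libbpf symbols. */
#define SKETCH_MAX_ERRNO 4095

/* Legacy convention: a negative error code disguised as a pointer. */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True when p lies in the top SKETCH_MAX_ERRNO addresses, i.e. encodes an error. */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/*
 * Pointer-returning API: errno is set on every failure. In strict mode
 * the result is NULL; in legacy mode it is an encoded error pointer.
 */
static void *sketch_open(int fail_with, int strict)
{
    static int object;

    if (fail_with == 0)
        return &object;
    errno = fail_with;
    return strict ? NULL : sketch_err_ptr(-fail_with);
}

/* Mode-agnostic error extraction: works for both conventions. */
static long sketch_get_error(const void *p)
{
    if (sketch_is_err(p))
        return (long)(intptr_t)p;
    if (p == NULL)
        return -errno;
    return 0;
}
```

A caller that checks the result with `sketch_get_error()` immediately after the call does not need to know which mode is active, which is the migration property the commit message emphasizes.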
Andrii Nakryiko	7c7ba067fc	libbpf: Streamline error reporting for low-level APIs Ensure that low-level APIs behave uniformly across the libbpf as follows: - in case of an error, errno is always set to the correct error code; - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1; - by default, until libbpf 1.0 is released, keep returning -1 directly. More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-4-andrii@kernel.org	2021-06-08 13:04:27 -07:00
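The low-level convention from 7c7ba067fc can be sketched in a few lines. This is an illustration only: `sketch_low_level_ret` is a hypothetical wrapper, and the `strict_direct_errs` parameter stands in for the LIBBPF_STRICT_DIRECT_ERRS mode named in the commit message.

```c
#include <errno.h>

/*
 * Hypothetical wrapper (not a libbpf function). sys_ret is a raw
 * syscall-style result: -1 with errno set on failure, >= 0 on success.
 * In strict mode the negative errno value is returned directly;
 * otherwise the legacy -1 is passed through and errno stays valid.
 */
static int sketch_low_level_ret(int sys_ret, int strict_direct_errs)
{
    if (sys_ret < 0 && strict_direct_errs)
        return -errno;
    return sys_ret;
}
```

Because a bare -1 is numerically equal to -EPERM, a caller in legacy mode should read errno rather than interpret the return value as an error code.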
Andrii Nakryiko	12eb2666d9	libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors Add libbpf_set_strict_mode() API that allows application to simulate libbpf 1.0 breaking changes before libbpf 1.0 is released. This will help users migrate gradually and with confidence. For now only ALL or NONE options are available, subsequent patches will add more flags. This patch is preliminary for selftests/bpf changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210525035935.1461796-2-andrii@kernel.org	2021-06-08 13:04:27 -07:00
Denis Salopek	234dea015b	bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags Add bpf_map_lookup_and_delete_elem_flags() libbpf API in order to use the BPF_F_LOCK flag with the map_lookup_and_delete_elem() function. Signed-off-by: Denis Salopek <denis.salopek@sartura.hr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/15b05dafe46c7e0750d110f233977372029d1f62.1620763117.git.denis.salopek@sartura.hr	2021-06-08 13:04:27 -07:00
Denis Salopek	c3c2e52201	bpf: Add lookup_and_delete_elem support to hashtab Extend the existing bpf_map_lookup_and_delete_elem() functionality to hashtab map types, in addition to stacks and queues. Create a new hashtab bpf_map_ops function that does lookup and deletion of the element under the same bucket lock and add the created map_ops to bpf.h. Signed-off-by: Denis Salopek <denis.salopek@sartura.hr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/4d18480a3e990ffbf14751ddef0325eed3be2966.1620763117.git.denis.salopek@sartura.hr	2021-06-08 13:04:27 -07:00
Stanislav Fomichev	b79c698300	libbpf: Skip bpf_object__probe_loading for light skeleton I'm getting the following error when running 'gen skeleton -L' as regular user: libbpf: Error in bpf_object__probe_loading():Operation not permitted(1). Couldn't load trivial BPF program. Make sure your kernel supports BPF (CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough value. Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.") Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210521030653.2626513-1-sdf@google.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	546199a723	bpf: Add cmd alias BPF_PROG_RUN Add BPF_PROG_RUN command as an alias to BPF_PROG_TEST_RUN to better indicate the full range of use cases done by the command. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210519014032.20908-1-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	b44566c71b	libbpf: Introduce bpf_map__initial_value(). Introduce bpf_map__initial_value() to read initial contents of mmaped data/rodata/bss maps. Note that bpf_map__set_initial_value() doesn't allow modifying kconfig map while bpf_map__initial_value() allows reading its values. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-17-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	594960b3db	libbpf: Cleanup temp FDs when intermediate sys_bpf fails. Fix loader program to close temporary FDs when intermediate sys_bpf command fails. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-16-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	694a70c522	libbpf: Generate loader program out of BPF ELF file. The BPF program loading process performed by libbpf is quite complex and consists of the following steps: "open" phase: - parse elf file and remember relocations, sections - collect externs and ksyms including their btf_ids in prog's BTF - patch BTF datasec (since llvm couldn't do it) - init maps (old style map_def, BTF based, global data map, kconfig map) - collect relocations against progs and maps "load" phase: - probe kernel features - load vmlinux BTF - resolve externs (kconfig and ksym) - load program BTF - init struct_ops - create maps - apply CO-RE relocations - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID - reposition subprograms and adjust call insns - sanitize and load progs During this process libbpf does sys_bpf() calls to load BTF, create maps, populate maps and finally load programs. Instead of actually doing the syscalls generate a trace of what libbpf would have done and represent it as the "loader program". The "loader program" consists of single map with: - union bpf_attr(s) - BTF bytes - map value bytes - insns bytes and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper. Executing such "loader program" via bpf_prog_test_run() command will replay the sequence of syscalls that libbpf would have done which will result the same maps created and programs loaded as specified in the elf file. The "loader program" removes libelf and majority of libbpf dependency from program loading process. kconfig, typeless ksym, struct_ops and CO-RE are not supported yet. The order of relocate_data and relocate_calls had to change, so that bpf_gen__prog_load() can see all relocations for a given program with correct insn_idx-es. 
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-15-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	c96f2f1b29	libbpf: Preliminary support for fd_idx Prep libbpf to use FD_IDX kernel feature when generating loader program. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-14-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	ac2095783a	libbpf: Add bpf_object pointer to kernel_supports(). Add a pointer to 'struct bpf_object' to kernel_supports() helper. It will be used in the next patch. No functional changes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-13-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	fecf2cf6dd	libbpf: Change the order of data and text relocations. In order to be able to generate loader program in the later patches change the order of data and text relocations. Also improve the test to include data relos. If the kernel supports "FD array" the map_fd relocations can be processed before text relos since generated loader program won't need to manually patch ld_imm64 insns with map_fd. But ksym and kfunc relocations can only be processed after all calls are relocated, since loader program will consist of a sequence of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id and btf_obj_fd into corresponding ld_imm64 insns. The locations of those ld_imm64 insns are specified in relocations. Hence process all data relocations (maps, ksym, kfunc) together after call relos. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-12-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	c1f36fb3e3	bpf: Add bpf_sys_close() helper. Add bpf_sys_close() helper to be used by the syscall/loader program to close intermediate FDs and other cleanup. Note this helper must never be allowed inside fdget/fdput bracketing. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-11-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	6eac86910c	bpf: Add bpf_btf_find_by_name_kind() helper. Add new helper: long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags) Description Find BTF type with given name and kind in vmlinux BTF or in module's BTFs. Return Returns btf_id and btf_obj_fd in lower and upper 32 bits. It will be used by loader program to find btf_id to attach the program to and to find btf_ids of ksyms. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
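The packed return value described in 6eac86910c (btf_id in the lower 32 bits, btf_obj_fd in the upper 32 bits) can be unpacked as follows. This is a minimal sketch: `struct btf_lookup` and `split_btf_lookup` are hypothetical names, and the sketch assumes the caller has already checked that the helper's return value is not a negative error.

```c
#include <stdint.h>

/* Hypothetical container for the two halves; not a kernel or libbpf type. */
struct btf_lookup {
    uint32_t btf_id;     /* lower 32 bits of the returned value */
    uint32_t btf_obj_fd; /* upper 32 bits of the returned value */
};

/* Splits a non-negative 64-bit result into its two 32-bit halves. */
static struct btf_lookup split_btf_lookup(uint64_t ret)
{
    struct btf_lookup r = {
        .btf_id = (uint32_t)(ret & 0xffffffffu),
        .btf_obj_fd = (uint32_t)(ret >> 32),
    };

    return r;
}
```

For example, a result of `0x0000000500000123` would split into `btf_id` 0x123 and `btf_obj_fd` 5.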
Alexei Starovoitov	64a654f398	bpf: Introduce fd_idx Typical program loading sequence involves creating bpf maps and applying map FDs into bpf instructions in various places in the bpf program. This job is done by libbpf that is using compiler generated ELF relocations to patch certain instruction after maps are created and BTFs are loaded. The goal of fd_idx is to allow bpf instructions to stay immutable after compilation. At load time the libbpf would still create maps as usual, but it wouldn't need to patch instructions. It would store map_fds into __u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD). Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-9-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	34eb4fb3f1	libbpf: Support for syscall program type Trivial support for syscall program type. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-5-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Alexei Starovoitov	007709011e	bpf: Introduce bpf_sys_bpf() helper and program type. Add placeholders for bpf_sys_bpf() helper and new program type. Make sure to check that expected_attach_type is zero for future extensibility. Allow tracing helper functions to be used in this program type, since they will only execute from user context via bpf_prog_test_run. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.com	2021-06-08 13:04:27 -07:00
Yonghong Song	db9614b6bd	libbpf: Add support for new llvm bpf relocations LLVM patch https://reviews.llvm.org/D102712 narrowed the scope of existing R_BPF_64_64 and R_BPF_64_32 relocations, and added three new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32 and R_BPF_64_NODYLD32. The main motivation is to make relocations linker friendly. This change, unfortunately, breaks libbpf build, and we will see errors like below: libbpf: ELF relo #0 in section #6 has unexpected type 2 in /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o Error: failed to link '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o': Unknown error -22 (-22) The new relocation R_BPF_64_ABS64 is generated and libbpf linker sanity check doesn't understand it. Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries: Offset Info Type Symbol's Value Symbol's Name 0000000000000018 0000000700000002 R_BPF_64_ABS64 0000000000000000 nogpltcp_init Look at the selftests/bpf/bpf_tcp_nogpl.c, void BPF_STRUCT_OPS(nogpltcp_init, struct sock *sk) { } SEC(".struct_ops") struct tcp_congestion_ops bpf_nogpltcp = { .init = (void *)nogpltcp_init, .name = "bpf_nogpltcp", }; The new llvm relocation scheme categorizes 'nogpltcp_init' reference as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify ld_imm64 relocation in the new scheme. Let us fix the linker sanity checking by including R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210522162341.3687617-1-yhs@fb.com	2021-05-24 21:24:56 -07:00
Andrii Nakryiko	57375504c6	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c1cccec9c63637c4c5ee0aa2da2850d983c19e88 Checkpoint bpf-next commit: f18ba26da88a89db9b50cb4ff47fadb159f2810b Baseline bpf commit: c9a7c013569d73881199a7a3011c03336f592cc8 Checkpoint bpf commit: d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4 Andrii Nakryiko (1): libbpf: Reject static entry-point BPF programs Kumar Kartikeya Dwivedi (2): libbpf: Add various netlink helpers libbpf: Add low level TC-BPF management API src/libbpf.c \| 5 + src/libbpf.h \| 44 ++++ src/libbpf.map \| 5 + src/netlink.c \| 568 +++++++++++++++++++++++++++++++++++++++++-------- src/nlattr.h \| 48 +++++ 5 files changed, 587 insertions(+), 83 deletions(-) -- 2.30.2	2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi	d71ff87a2d	libbpf: Add low level TC-BPF management API This adds functions that wrap the netlink API used for adding, manipulating, and removing traffic control filters. The API summary: A bpf_tc_hook represents a location where a TC-BPF filter can be attached. This means that creating a hook leads to creation of the backing qdisc, while destruction either removes all filters attached to a hook, or destroys qdisc if requested explicitly (as discussed below). The TC-BPF API functions operate on this bpf_tc_hook to attach, replace, query, and detach tc filters. All functions return 0 on success, and a negative error code on failure. bpf_tc_hook_create - Create a hook Parameters: @hook - Cannot be NULL, ifindex > 0, attach_point must be set to proper enum constant. Note that parent must be unset when attach_point is one of BPF_TC_INGRESS or BPF_TC_EGRESS. Note that as an exception BPF_TC_INGRESS\|BPF_TC_EGRESS is also a valid value for attach_point. Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM. bpf_tc_hook_destroy - Destroy a hook Parameters: @hook - Cannot be NULL. The behaviour depends on value of attach_point. If BPF_TC_INGRESS, all filters attached to the ingress hook will be detached. If BPF_TC_EGRESS, all filters attached to the egress hook will be detached. If BPF_TC_INGRESS\|BPF_TC_EGRESS, the clsact qdisc will be deleted, also detaching all filters. As before, parent must be unset for these attach_points, and set for BPF_TC_CUSTOM. It is advised that if the qdisc is operated on by many programs, then the program at least checks that there are no other existing filters before deleting the clsact qdisc. An example is shown below: DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* set opts as NULL, as we're not really interested in * getting any info for a particular filter, but just * detecting its presence.
*/ r = bpf_tc_query(&hook, NULL); if (r == -ENOENT) { /* no filters */ hook.attach_point = BPF_TC_INGRESS\|BPF_TC_EGRESS; return bpf_tc_hook_destroy(&hook); } else { /* failed or r == 0, the latter means filters do exist */ return r; } Note that there is a small race between checking for no filters and deleting the qdisc. This is currently unavoidable. Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM. bpf_tc_attach - Attach a filter to a hook Parameters: @hook - Cannot be NULL. Represents the hook the filter will be attached to. Requirements for ifindex and attach_point are same as described in bpf_tc_hook_create, but BPF_TC_CUSTOM is also supported. In that case, parent must be set to the handle where the filter will be attached (using BPF_TC_PARENT). E.g. to set parent to 1:16 like in tc command line, the equivalent would be BPF_TC_PARENT(1, 16). @opts - Cannot be NULL. The following opts are optional: * handle - The handle of the filter * priority - The priority of the filter Must be >= 0 and <= UINT16_MAX Note that when left unset, they will be auto-allocated by the kernel. The following opts must be set: * prog_fd - The fd of the loaded SCHED_CLS prog The following opts must be unset: * prog_id - The ID of the BPF prog The following opts are optional: * flags - Currently only BPF_TC_F_REPLACE is allowed. It allows replacing an existing filter instead of failing with -EEXIST. The following opts will be filled by bpf_tc_attach on a successful attach operation if they are unset: * handle - The handle of the attached filter * priority - The priority of the attached filter * prog_id - The ID of the attached SCHED_CLS prog This way, the user can know what the auto allocated values for optional opts like handle and priority are for the newly attached filter, if they were unset. 
Note that some other attributes are set to fixed default values listed below (this holds for all bpf_tc_* APIs): protocol as ETH_P_ALL, direct action mode, chain index of 0, and class ID of 0 (this can be set by writing to the skb->tc_classid field from the BPF program). bpf_tc_detach Parameters: @hook - Cannot be NULL. Represents the hook the filter will be detached from. Requirements are same as described above in bpf_tc_attach. @opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags bpf_tc_query Parameters: @hook - Cannot be NULL. Represents the hook where the filter lookup will be performed. Requirements are same as described above in bpf_tc_attach(). @opts - Cannot be NULL. The following opts must be set: * handle, priority The following opts must be unset: * prog_fd, prog_id, flags The following fields will be filled by bpf_tc_query upon a successful lookup: * prog_id Some usage examples (using BPF skeleton infrastructure): BPF program (test_tc_bpf.c): #include <linux/bpf.h> #include <bpf/bpf_helpers.h> SEC("classifier") int cls(struct __sk_buff *skb) { return 0; } Userspace loader: struct test_tc_bpf *skel = NULL; int fd, r; skel = test_tc_bpf__open_and_load(); if (!skel) return -ENOMEM; fd = bpf_program__fd(skel->progs.cls); DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex = if_nametoindex("lo"), .attach_point = BPF_TC_INGRESS); /* Create clsact qdisc */ r = bpf_tc_hook_create(&hook); if (r < 0) goto end; DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .prog_fd = fd); r = bpf_tc_attach(&hook, &opts); if (r < 0) goto end; /* Print the auto allocated handle and priority */ printf("Handle=%u", opts.handle); printf("Priority=%u", opts.priority); opts.prog_fd = opts.prog_id = 0; bpf_tc_detach(&hook, &opts); end: test_tc_bpf__destroy(skel); This is equivalent to doing the following using tc command line: # tc qdisc add dev lo clsact # tc filter add dev lo ingress bpf obj foo.o sec classifier da # tc 
filter del dev lo ingress handle <h> prio <p> bpf ... where the handle and priority can be found using: # tc filter show dev lo ingress Another example replacing a filter (extending prior example): /* We can also choose both (or one), let's try replacing an existing filter. */ DECLARE_LIBBPF_OPTS(bpf_tc_opts, replace_opts, .handle = opts.handle, .priority = opts.priority, .prog_fd = fd); r = bpf_tc_attach(&hook, &replace_opts); if (r == -EEXIST) { /* Expected, now use BPF_TC_F_REPLACE to replace it */ replace_opts.flags = BPF_TC_F_REPLACE; return bpf_tc_attach(&hook, &replace_opts); } else if (r < 0) { return r; } /* There must be no existing filter with these attributes, so cleanup and return an error. */ replace_opts.prog_fd = replace_opts.prog_id = 0; bpf_tc_detach(&hook, &replace_opts); return -1; To obtain info of a particular filter: /* Find info for filter with handle 1 and priority 50 */ DECLARE_LIBBPF_OPTS(bpf_tc_opts, info_opts, .handle = 1, .priority = 50); r = bpf_tc_query(&hook, &info_opts); if (r == -ENOENT) printf("Filter not found"); else if (r < 0) return r; printf("Prog ID: %u", info_opts.prog_id); return 0; Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> # libbpf API design [ Daniel: also did major patch cleanup ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210512103451.989420-3-memxor@gmail.com	2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi	01515b8f05	libbpf: Add various netlink helpers This change introduces a few helpers to wrap open coded attribute preparation in netlink.c. It also adds a libbpf_netlink_send_recv() that is useful to wrap send + recv handling in a generic way. Subsequent patch will also use this function for sending and receiving a netlink response. The libbpf_nl_get_link() helper has been removed instead, moving socket creation into the newly named libbpf_netlink_send_recv(). Every nested attribute's closure must happen using the helper nlattr_end_nested(), which sets its length properly. NLA_F_NESTED is enforced using nlattr_begin_nested() helper. Other simple attributes can be added directly. The maxsz parameter corresponds to the size of the request structure which is being filled in, so for instance with req being: struct { struct nlmsghdr nh; struct tcmsg t; char buf[4096]; } req; Then, maxsz should be sizeof(req). This change also converts the open coded attribute preparation with these helpers. Note that the only failure the internal call to nlattr_add() could produce in the nested helper is -EMSGSIZE, hence that is what we return to our caller. The libbpf_netlink_send_recv() call takes care of opening the socket, sending the netlink message, receiving the response, potentially invoking callbacks, returning errors if any, and finally closing the socket. This allows users to avoid identical socket setup code in different places. The only user of libbpf_nl_get_link() has been converted to make use of it. __bpf_set_link_xdp_fd_replace() has also been refactored to use it. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> [ Daniel: major patch cleanup ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210512103451.989420-2-memxor@gmail.com	2021-05-17 17:12:06 -07:00
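The nested-attribute bookkeeping described in the commit above can be sketched in standalone C. This is only an illustration under stated assumptions: `sketch_nla_add()`, `sketch_nla_begin_nested()`, `sketch_nla_end_nested()` and `sketch_build()` are hypothetical names, not libbpf's internal helpers, and the attribute types 1 and 2 are arbitrary. Only the layout macros and structs from `<linux/netlink.h>` and `<linux/rtnetlink.h>` are real.

```c
#include <errno.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Request layout quoted in the commit message; maxsz is sizeof(struct sketch_req). */
struct sketch_req {
	struct nlmsghdr nh;
	struct tcmsg t;
	char buf[4096];
};

/* First free byte after the (aligned) message built so far. */
static struct nlattr *sketch_tail(struct nlmsghdr *nh)
{
	return (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
}

/* Hypothetical helper: append one attribute, or fail with -EMSGSIZE if it would not fit in maxsz. */
int sketch_nla_add(struct nlmsghdr *nh, size_t maxsz, int type, const void *data, int len)
{
	struct nlattr *nla;

	if (NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(NLA_HDRLEN + len) > maxsz)
		return -EMSGSIZE;
	nla = sketch_tail(nh);
	nla->nla_type = type;
	nla->nla_len = NLA_HDRLEN + len;
	if (data)
		memcpy((char *)nla + NLA_HDRLEN, data, len);
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(nla->nla_len);
	return 0;
}

/* Hypothetical helper: open a nested attribute; NLA_F_NESTED is always set here. */
struct nlattr *sketch_nla_begin_nested(struct nlmsghdr *nh, size_t maxsz, int type)
{
	struct nlattr *start = sketch_tail(nh);

	if (sketch_nla_add(nh, maxsz, type | NLA_F_NESTED, NULL, 0))
		return NULL;
	return start;
}

/* Hypothetical helper: close a nested attribute by fixing up its total length. */
void sketch_nla_end_nested(struct nlmsghdr *nh, struct nlattr *start)
{
	start->nla_len = (char *)sketch_tail(nh) - (char *)start;
}

/* Demo: nested attribute (type 2) holding one 32-bit attribute (type 1).
 * Returns the nested attribute's final length, or a negative errno value. */
int sketch_build(struct sketch_req *req)
{
	struct nlattr *nest;
	__u32 val = 42;
	int err;

	memset(req, 0, sizeof(*req));
	req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));

	nest = sketch_nla_begin_nested(&req->nh, sizeof(*req), 2);
	if (!nest)
		return -EMSGSIZE;
	err = sketch_nla_add(&req->nh, sizeof(*req), 1, &val, sizeof(val));
	if (err)
		return err;
	sketch_nla_end_nested(&req->nh, nest);
	return nest->nla_len;
}
```

The point of the begin/end pair is that the nested header's length is unknown until all children have been appended, so it is patched when the nest is closed.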
Andrii Nakryiko	6028cec50c	libbpf: Reject static entry-point BPF programs Detect use of static entry-point BPF programs (those with SEC() markings) and emit error message. This is similar to c1cccec9c636 ("libbpf: Reject static maps") but for BPF programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210514195534.1440970-1-andrii@kernel.org	2021-05-17 17:12:06 -07:00
Andrii Nakryiko	68695d0173	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9d31d2338950293ec19d9b095fbaa9030899dcb4 Checkpoint bpf-next commit: c1cccec9c63637c4c5ee0aa2da2850d983c19e88 Baseline bpf commit: 9683e5775c75097c46bd24e65411b16ac6c6cbb3 Checkpoint bpf commit: c9a7c013569d73881199a7a3011c03336f592cc8 Andrii Nakryiko (4): libbpf: Add per-file linker opts libbpf: Fix ELF symbol visibility update logic libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions libbpf: Reject static maps Arnaldo Carvalho de Melo (1): libbpf: Provide GELF_ST_VISIBILITY() define for older libelf src/libbpf.c \| 35 +++++++++++++++++++++++++---------- src/libbpf.h \| 10 +++++++++- src/libbpf_internal.h \| 5 +++++ src/linker.c \| 18 +++++++++++++----- 4 files changed, 52 insertions(+), 16 deletions(-) -- 2.30.2	2021-05-17 14:36:22 -07:00
Arnaldo Carvalho de Melo	72cdd6ed42	libbpf: Provide GELF_ST_VISIBILITY() define for older libelf Where that macro isn't available. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org	2021-05-17 14:36:22 -07:00
Andrii Nakryiko	b5bfbab488	libbpf: Reject static maps Static maps never really worked with libbpf, because all such maps were always silently resolved to the very first map. Detect static maps (both legacy and BTF-defined) and report user-friendly error. Tested locally by switching few maps (legacy and BTF-defined) in selftests to static ones and verifying that now libbpf rejects them loudly. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210513233643.194711-2-andrii@kernel.org	2021-05-17 14:36:22 -07:00
Andrii Nakryiko	1cf1c245d1	libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions Do the same global -> static BTF update for global functions with STV_INTERNAL visibility to turn on static BPF verification mode. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-7-andrii@kernel.org	2021-05-17 14:36:22 -07:00
Andrii Nakryiko	076dd5dadb	libbpf: Fix ELF symbol visibility update logic Fix silly bug in updating ELF symbol's visibility. Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-6-andrii@kernel.org	2021-05-17 14:36:22 -07:00
Andrii Nakryiko	6f585ab88f	libbpf: Add per-file linker opts For better future extensibility add per-file linker options. Currently the set of available options is empty. This changes the bpf_linker__add_file() API, but it's not a breaking change as the bpf_linker APIs haven't been released yet. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210507054119.270888-3-andrii@kernel.org	2021-05-17 14:36:22 -07:00
Andrii Nakryiko	c5389a965b	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 87bd9e602e39585c5280556a2b6a6363bb334257 Checkpoint bpf-next commit: 9d31d2338950293ec19d9b095fbaa9030899dcb4 Baseline bpf commit: b02265429681c9c827c45978a61a9f00be5ea9aa Checkpoint bpf commit: 9683e5775c75097c46bd24e65411b16ac6c6cbb3 Andrii Nakryiko (2): libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro Brendan Jackman (1): libbpf: Fix signed overflow in ringbuf_process_ring Ian Rogers (1): libbpf: Add NULL check to add_dummy_ksym_var src/bpf_core_read.h \| 16 ++++++++++++---- src/libbpf.c \| 9 +++++++-- src/ringbuf.c \| 30 +++++++++++++++++++++--------- 3 files changed, 40 insertions(+), 15 deletions(-) -- 2.30.2	2021-05-05 16:39:05 -07:00
Ian Rogers	a58b8ca93e	libbpf: Add NULL check to add_dummy_ksym_var Avoids a segv if btf isn't present. Seen on the call path __bpf_object__open calling bpf_object__collect_externs. Fixes: 5bd022ec01f0 (libbpf: Support extern kernel function) Suggested-by: Stanislav Fomichev <sdf@google.com> Suggested-by: Petar Penkov <ppenkov@google.com> Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com	2021-05-05 16:39:05 -07:00
Brendan Jackman	1691c37b39	libbpf: Fix signed overflow in ringbuf_process_ring One of our benchmarks running in (Google-internal) CI pushes data through the ringbuf faster than userspace is able to consume it. In this case it seems we're actually able to get >INT_MAX entries in a single ring_buffer__consume() call. ASAN detected that cnt overflows in this case. Fix by using 64-bit counter internally and then capping the result to INT_MAX before converting to the int return type. Do the same for ring_buffer__poll(). Fixes: bf99c936f947 (libbpf: Add BPF ring buffer support) Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com	2021-05-05 16:39:05 -07:00
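The fix described above (count in 64 bits, clamp before returning an int) can be illustrated with a minimal sketch. `sketch_consume()` and its `per_ring` input are hypothetical; the real code counts records while walking each ring rather than summing precomputed totals.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Sum per-ring record counts in a 64-bit accumulator, then clamp the result
 * to INT_MAX so that the conversion to the int return type cannot overflow. */
int sketch_consume(const int64_t *per_ring, size_t n_rings)
{
	int64_t res = 0;
	size_t i;

	for (i = 0; i < n_rings; i++)
		res += per_ring[i];
	if (res > INT_MAX)
		return INT_MAX;
	return (int)res;
}
```

With a plain `int` accumulator the addition itself would be undefined behavior once the total passes INT_MAX, which is the condition ASAN reported.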
Andrii Nakryiko	242842b34c	selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch always caused 8-byte reads. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong. After fixing that, we run into another problem, which is quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without that asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times, once in each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *), *(u32 *), and *(u64 *) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which value of p is used without relocation in each of switch case arms, doing appropriately-sized memory load. Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for test_core_reloc_bitfields_direct selftests. 
BEFORE ===== #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 157: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 159: 7b 12 20 01 00 00 00 00 (u64 )(r2 + 288) = r1 160: b7 02 00 00 04 00 00 00 r2 = 4 ; BYTE_SIZE relocation here ^^^ 161: 66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63> 162: 16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65> 163: 16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66> 164: 05 00 12 00 00 00 00 00 goto +18 <LBB0_69> 0000000000000528 <LBB0_66>: 165: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 167: 69 11 08 00 00 00 00 00 r1 = (u16 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 168: 05 00 0e 00 00 00 00 00 goto +14 <LBB0_69> 0000000000000548 <LBB0_63>: 169: 16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67> 170: 16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68> 171: 05 00 0b 00 00 00 00 00 goto +11 <LBB0_69> 0000000000000560 <LBB0_68>: 172: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 174: 79 11 08 00 00 00 00 00 r1 = (u64 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 175: 05 00 07 00 00 00 00 00 goto +7 <LBB0_69> 0000000000000580 <LBB0_65>: 176: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 178: 71 11 08 00 00 00 00 00 r1 = (u8 )(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 179: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 00000000000005a0 <LBB0_67>: 180: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 182: 61 11 08 00 00 00 00 00 r1 = (u32 )(r1 + 8) ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^ 00000000000005b8 <LBB0_69>: 183: 
67 01 00 00 20 00 00 00 r1 <<= 32 184: b7 02 00 00 00 00 00 00 r2 = 0 185: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 186: c7 01 00 00 20 00 00 00 r1 s>>= 32 187: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000005e0 <LBB0_71>: 188: 77 01 00 00 20 00 00 00 r1 >>= 32 AFTER ===== #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 129: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 131: 7b 12 20 01 00 00 00 00 (u64 )(r2 + 288) = r1 132: b7 01 00 00 08 00 00 00 r1 = 8 ; BYTE_OFFSET relo here ^^^ ; no size check for non-memory dereferencing instructions 133: 0f 12 00 00 00 00 00 00 r2 += r1 134: b7 03 00 00 04 00 00 00 r3 = 4 ; BYTE_SIZE relocation here ^^^ 135: 66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63> 136: 16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65> 137: 16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66> 138: 05 00 0a 00 00 00 00 00 goto +10 <LBB0_69> 0000000000000458 <LBB0_66>: 139: 69 21 00 00 00 00 00 00 r1 = (u16 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 140: 05 00 08 00 00 00 00 00 goto +8 <LBB0_69> 0000000000000468 <LBB0_63>: 141: 16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67> 142: 16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68> 143: 05 00 05 00 00 00 00 00 goto +5 <LBB0_69> 0000000000000480 <LBB0_68>: 144: 79 21 00 00 00 00 00 00 r1 = (u64 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 145: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 0000000000000490 <LBB0_65>: 146: 71 21 00 00 00 00 00 00 r1 = (u8 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 147: 05 00 01 00 00 00 00 00 goto +1 <LBB0_69> 00000000000004a0 <LBB0_67>: 148: 61 21 00 00 00 00 00 00 r1 = (u32 )(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 00000000000004a8 <LBB0_69>: 149: 67 01 00 00 20 00 00 00 r1 <<= 32 150: b7 02 00 00 00 00 00 00 r2 = 0 151: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 
<LBB0_71> 152: c7 01 00 00 20 00 00 00 r1 s>>= 32 153: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000004d0 <LBB0_71>: 154: 77 01 00 00 20 00 00 00 r1 >>= 32 Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org	2021-05-05 16:39:05 -07:00
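The two ingredients of the BPF_CORE_READ_BITFIELD() fix, a `break` in every switch arm and a compiler barrier on the pointer computed before the switch, can be sketched in plain userspace C. This is not the macro from bpf_core_read.h: `sketch_read_field()` and `sketch_barrier_var()` are hypothetical names, there are no CO-RE relocations involved here, and the sign-extension shifts of the real macro are omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same idea as barrier_var(): an empty asm statement that makes the
 * compiler treat var as modified at this point. */
#define sketch_barrier_var(var) __asm__ volatile("" : "+r"(var))

/* Load an unsigned value of byte_size bytes located byte_off bytes into base.
 * Unsupported sizes yield 0. */
uint64_t sketch_read_field(const void *base, size_t byte_off, size_t byte_size)
{
	const unsigned char *p = (const unsigned char *)base + byte_off;
	uint8_t v8;
	uint16_t v16;
	uint32_t v32;
	uint64_t val = 0;

	/* Compute p once, up front, so every switch arm reuses the same pointer. */
	sketch_barrier_var(p);

	switch (byte_size) {
	case 1:
		memcpy(&v8, p, sizeof(v8));
		val = v8;
		break; /* without the breaks, every size falls through to the 8-byte load */
	case 2:
		memcpy(&v16, p, sizeof(v16));
		val = v16;
		break;
	case 4:
		memcpy(&v32, p, sizeof(v32));
		val = v32;
		break;
	case 8:
		memcpy(&val, p, sizeof(val));
		break;
	}
	return val;
}
```

The sketch uses `memcpy()` so that unaligned offsets stay well-defined in userspace; the disassembly in the commit message shows direct sized loads instead.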
Andrii Nakryiko	6b14cfa56e	libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check. Without this, relocations against float/double fields will fail. Also adjust one error message to emit instruction index instead of less convenient instruction byte offset. Fixes: 22541a9eeb0d ("libbpf: Add BTF_KIND_FLOAT support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org	2021-05-05 16:39:05 -07:00
Andrii Nakryiko	b8b1faa3d4	travis: fix libelf-dev build-dep issues by using aptitude instead of apt-get Use aptitude to actually see what's wrong with the dependencies. And it actually magically resolves whatever minor version conflicts there are. The big surprise came from the apparent difference in build-dep command behavior. Aptitude's build-dep doesn't seem to install the libelf-dev package itself. Adding explicit `aptitude install libelf-dev` after build-dep solves the issue for now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-05-05 13:57:07 -07:00
Andrii Nakryiko	9e123fa5d2	vmtests: fix libc6 dependency and remove explicit libelf-dev install Force libc6 dependency version. Drop explicit libelf-dev install command, as it should be pre-installed by Travis CI already, according to .travis.yaml. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-04-30 09:39:10 -07:00
Andrii Nakryiko	4ccc1f0b9f	vmtest: blacklist 2 tests on 5.5 Blacklist linked_vars, which expects typed ksym support. Blacklist snprintf, which expects the new bpf_snprintf() helper. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	b9278634aa	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 1e1032b0c4afaed7739a6681ff6b4cb120b82994 Checkpoint bpf-next commit: 87bd9e602e39585c5280556a2b6a6363bb334257 Baseline bpf commit: a14d273ba15968495896a38b7b3399dba66d0270 Checkpoint bpf commit: b02265429681c9c827c45978a61a9f00be5ea9aa Alexei Starovoitov (1): libbpf: Remove unused field. Andrii Nakryiko (11): libbpf: Add bpf_map__inner_map API libbpf: Suppress compiler warning when using SEC() macro with externs libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier libbpf: Allow gaps in BPF program sections to support overriden weak functions libbpf: Refactor BTF map definition parsing libbpf: Factor out symtab and relos sanity checks libbpf: Make few internal helpers available outside of libbpf.c libbpf: Extend sanity checking ELF symbols with externs validation libbpf: Tighten BTF type ID rewriting with error checking libbpf: Add linker extern resolution support for functions and global variables libbpf: Support extern resolution for BTF-defined maps in .maps section Ciara Loftus (1): libbpf: Fix potential NULL pointer dereference Daniel Borkmann (1): bpf: Sync bpf headers in tooling infrastucture Florent Revest (3): bpf: Add a bpf_snprintf helper libbpf: Initialize the bpf_seq_printf parameters array field by field libbpf: Introduce a BPF_SNPRINTF helper macro Pedro Tammela (1): libbpf: Clarify flags in ringbuf helpers Toke Høiland-Jørgensen (1): bpf: Return target info when a tracing bpf_link is queried include/uapi/linux/bpf.h \| 83 ++- src/bpf_helpers.h \| 19 +- src/bpf_tracing.h \| 58 +- src/btf.c \| 5 - src/libbpf.c \| 396 +++++++----- src/libbpf.h \| 1 + src/libbpf.map \| 1 + src/libbpf_internal.h \| 45 ++ src/linker.c \| 1270 ++++++++++++++++++++++++++++++++------ src/xsk.c \| 3 + 10 files changed, 1512 insertions(+), 369 deletions(-) -- 2.30.2	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	af47e6c199	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	b2c06aec99	libbpf: Support extern resolution for BTF-defined maps in .maps section Add extra logic to handle map externs (only BTF-defined maps are supported for linking). Re-use the map parsing logic used during bpf_object__open(). Map externs are currently restricted to always match complete map definition. So all the specified attributes will be compared (down to pinning, map_flags, numa_node, etc). In the future this restriction might be relaxed with no backwards compatibility issues. If any attribute is mismatched between extern and actual map definition, linker will report an error, pointing out which one mismatches. The original intent was to allow for extern to specify only the attributes that matter (to the user) to enforce. E.g., if you specify just key information and omit value, then any value fits. Similarly, it should have been possible to enforce map_flags, pinning, and any other possible map attribute. Unfortunately, that means that multiple externs can be only partially overlapping with each other, which means linker would need to combine their type definitions to end up with the most restrictive and fullest map definition. This requires an extra amount of BTF manipulation which at this time was deemed unnecessary and would require further extending generic BTF writer APIs. So that is left for future follow ups, if there will be demand for that. But the idea seems interesting and useful, so I want to document it here. Weak definitions are also supported, but are pretty strict as well, just like externs: all weak map definitions have to match exactly. In the follow up patches this most probably will be relaxed, with __weak map definitions being able to differ between each other (with non-weak definition always winning, of course). 
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	29e4840915	libbpf: Add linker extern resolution support for functions and global variables Add BPF static linker logic to resolve extern variables and functions across multiple linked together BPF object files. For that, linker maintains a separate list of struct glob_sym structures, which keeps track of few pieces of metadata (is it extern or resolved global, is it a weak symbol, which ELF section it belongs to, etc) and ties together BTF type info and ELF symbol information and keeps them in sync. With adding support for extern variables/funcs, it's now possible for some sections to contain both extern and non-extern definitions. This means that some sections may start out as ephemeral (if only externs are present and thus there is no corresponding ELF section), but will be "upgraded" to actual ELF section as symbols are resolved or new non-extern definitions are appended. Additional care is taken to not duplicate extern entries in sections like .kconfig and .ksyms. Given libbpf requires BTF type to always be present for .kconfig/.ksym externs, linker extends this requirement to all the externs, even those that are supposed to be resolved during static linking and which won't be visible to libbpf. With BTF information always present, static linker will check not just ELF symbol matches, but entire BTF type signature match as well. That logic is stricter than BPF CO-RE checks. It probably should be re-used by .ksym resolution logic in libbpf as well, but that's left for follow up patches. To make it unnecessary to rewrite ELF symbols and minimize BTF type rewriting/removal, ELF symbols that correspond to externs initially will be updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but types they point to might get replaced when extern is resolved. 
This might leave some left-over types (even though we try to minimize this for common cases of having extern funcs with no argument names vs concrete function with names properly specified). That can be addressed later with a generic BTF garbage collection. That's left for a follow up as well. Given BTF type appending phase is separate from ELF symbol appending/resolution, special struct glob_sym->underlying_btf_id variable is used to communicate resolution and rewrite decisions. 0 means underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0 values are used for temporary storage of source BTF type ID (not yet rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by the end of linker_append_btf() phase, that underlying_btf_id will be remapped and will always be > 0. This is the ugliest part of the whole process, but keeps the other parts much simpler due to stability of sec_var and VAR/FUNC types, as well as ELF symbol, so please keep that in mind while reviewing. BTF-defined maps require some extra custom logic and are addressed separately in the next patch, so as to keep this one smaller and easier to review. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org	2021-04-26 16:30:18 -07:00
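The three-state encoding of underlying_btf_id described above (0, negative, positive) can be sketched as follows. The struct is a trimmed-down, hypothetical stand-in for struct glob_sym, and the helper names and the `id_map` array (source type ID to final type ID) are assumptions for illustration, not the linker's actual code.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down version of the linker's per-symbol record. */
struct sketch_glob_sym {
	int underlying_btf_id; /* 0, negative, or positive; see helpers below */
};

/* 0: the underlying type is not in the final BTF yet and must be appended. */
bool sketch_needs_append(const struct sketch_glob_sym *s)
{
	return s->underlying_btf_id == 0;
}

/* < 0: temporarily store the source object's type ID, negated. */
void sketch_stash_src_id(struct sketch_glob_sym *s, int src_id)
{
	s->underlying_btf_id = -src_id;
}

/* Rewrite a stashed source ID through id_map (source ID -> final ID).
 * After this step the stored value is a final type ID and therefore > 0. */
int sketch_remap(struct sketch_glob_sym *s, const int *id_map)
{
	if (s->underlying_btf_id < 0)
		s->underlying_btf_id = id_map[-s->underlying_btf_id];
	return s->underlying_btf_id;
}
```

Packing the state into the sign of one integer avoids an extra field, at the cost of the readability concern the commit message itself acknowledges.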
Andrii Nakryiko	7078c5eae4	libbpf: Tighten BTF type ID rewriting with error checking It should never fail, but if it does, it's better to know about this rather than end up with nonsensical type IDs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	692ae888bc	libbpf: Extend sanity checking ELF symbols with externs validation Add logic to validate extern symbols, plus some other minor extra checks, like ELF symbol #0 validation, general symbol visibility and binding validations. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	24b5d82967	libbpf: Make few internal helpers available outside of libbpf.c Make skip_mods_and_typedefs(), btf_kind_str(), and btf_func_linkage() helpers available outside of libbpf.c, to be used by static linker code. Also do few cleanups (error code fixes, comment clean up, etc) that don't deserve their own commit. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-9-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	4dcf439178	libbpf: Factor out symtab and relos sanity checks Factor out logic for sanity checking SHT_SYMTAB and SHT_REL sections into separate functions. They are already quite extensive and are suffering from too deep indentation. Subsequent changes will extend SYMTAB sanity checking further, so it's better to factor each into a separate function. No functional changes are intended. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-8-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	d9b9d4a43a	libbpf: Refactor BTF map definition parsing Refactor BTF-defined maps parsing logic to allow it to be nicely reused by BPF static linker. Further, at least for BPF static linker, it's important to know which attributes of a BPF map were defined explicitly, so provide a bit set for each known portion of BTF map definition. This allows BPF static linker to do a simple check when dealing with extern map declarations. The same capabilities allow to distinguish attributes explicitly set to zero (e.g., __uint(max_entries, 0)) vs the case of not specifying it at all (no max_entries attribute at all). Libbpf is currently not utilizing that, but it could be useful for backwards compatibility reasons later. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-7-andrii@kernel.org	2021-04-26 16:30:18 -07:00
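The "bit set for each known portion" idea from the commit above can be sketched like this. All names below (`sketch_map_def`, `SKETCH_DEF_*`, the helpers) are hypothetical and cover only a few attributes; they are meant to show how an explicit `max_entries` of 0 stays distinguishable from an omitted one, not to mirror libbpf's internal structures.

```c
#include <stdbool.h>

/* One bit per attribute that was written out explicitly in the definition. */
enum sketch_map_def_parts {
	SKETCH_DEF_MAP_TYPE    = 1 << 0,
	SKETCH_DEF_KEY_SIZE    = 1 << 1,
	SKETCH_DEF_VALUE_SIZE  = 1 << 2,
	SKETCH_DEF_MAX_ENTRIES = 1 << 3,
};

struct sketch_map_def {
	unsigned int parts; /* bit set of enum sketch_map_def_parts */
	unsigned int map_type, key_size, value_size, max_entries;
};

/* Record both the value and the fact that it was specified. */
void sketch_set_max_entries(struct sketch_map_def *def, unsigned int n)
{
	def->max_entries = n;
	def->parts |= SKETCH_DEF_MAX_ENTRIES;
}

/* True if the given attribute was explicitly present in the definition. */
bool sketch_has(const struct sketch_map_def *def, unsigned int part)
{
	return (def->parts & part) != 0;
}
```

A linker comparing an extern declaration against a definition could then restrict its checks to the bits set in `parts`, although the commit series above currently requires a complete match.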
Andrii Nakryiko	7b106ea4b1	libbpf: Allow gaps in BPF program sections to support overriden weak functions Currently libbpf is very strict about parsing BPF program instruction sections. No gaps are allowed between sequential BPF programs within a given ELF section. Libbpf enforced that by keeping track of the next section offset that should start a new BPF (sub)program and cross-checks that by searching for a corresponding STT_FUNC ELF symbol. But this is too restrictive once we allow to have weak BPF programs and link together two or more BPF object files. In such case, some weak BPF programs might be "overridden" by either non-weak BPF program with the same name and signature, or even by another weak BPF program that just happened to be linked first. That, in turn, leaves BPF instructions of the "lost" BPF (sub)program intact, but there is no corresponding ELF symbol, because no one is going to be referencing it. Libbpf already correctly handles such cases in the sense that it won't append such dead code to actual BPF programs loaded into kernel. So the only change that needs to be done is to relax the logic of parsing BPF instruction sections. Instead of assuming next BPF (sub)program section offset, iterate available STT_FUNC ELF symbols to discover all available BPF subprograms and programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-6-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	3319982d34	libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier Define __hidden helper macro in bpf_helpers.h, which is a short-hand for __attribute__((visibility("hidden"))). Add libbpf support to mark BPF subprograms marked with __hidden as static in BTF information to enforce BPF verifier's static function validation algorithm, which takes more information (caller's context) into account during a subprogram validation. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-5-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	2e430712f5	libbpf: Suppress compiler warning when using SEC() macro with externs When used on externs SEC() macro will trigger compilation warning about inapplicable `__attribute__((used))`. That's expected for extern declarations, so suppress it with the corresponding _Pragma. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210423181348.1801389-4-andrii@kernel.org	2021-04-26 16:30:18 -07:00
Florent Revest	120a21852b	libbpf: Introduce a BPF_SNPRINTF helper macro Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an array of u64, making it more natural to call the bpf_snprintf helper. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org	2021-04-26 16:30:18 -07:00
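The "variadic arguments into an array of u64" technique can be sketched in userspace C as below. This is not the BPF_SNPRINTF definition from bpf_tracing.h: the `SK_*` macros and `sk_sum()` are hypothetical, support only one to three arguments, and stand in for a helper that takes an array plus its byte length. The array is filled element by element, and the wrapper relies on GNU statement expressions, so it assumes GCC or Clang.

```c
#include <stddef.h>

/* Count 1 to 3 variadic arguments. */
#define SK_NARG_(_1, _2, _3, N, ...) N
#define SK_NARG(...) SK_NARG_(__VA_ARGS__, 3, 2, 1, 0)

/* Two-level concatenation so that SK_NARG() is expanded before pasting. */
#define SK_CAT_(a, b) a##b
#define SK_CAT(a, b) SK_CAT_(a, b)

/* Fill the array one element at a time, casting each argument to u64. */
#define SK_FILL1(arr, a)       arr[0] = (unsigned long long)(a)
#define SK_FILL2(arr, a, b)    SK_FILL1(arr, a); arr[1] = (unsigned long long)(b)
#define SK_FILL3(arr, a, b, c) SK_FILL2(arr, a, b); arr[2] = (unsigned long long)(c)
#define SK_FILL(arr, ...) SK_CAT(SK_FILL, SK_NARG(__VA_ARGS__))(arr, __VA_ARGS__)

/* Stand-in for a helper that receives (data, data_len in bytes). */
unsigned long long sk_sum(const unsigned long long *data, size_t data_len)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < data_len / sizeof(*data); i++)
		total += data[i];
	return total;
}

/* Pack the arguments into a local array, then pass the array and its size. */
#define SK_SUM(...)                                               \
	({                                                        \
		unsigned long long sk_args[SK_NARG(__VA_ARGS__)]; \
		SK_FILL(sk_args, __VA_ARGS__);                    \
		sk_sum(sk_args, sizeof(sk_args));                 \
	})
```

A call such as `SK_SUM(40, 2)` declares a two-element array, assigns both elements individually, and passes the array with a length of 16 bytes.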
Florent Revest	f4da689d90	libbpf: Initialize the bpf_seq_printf parameters array field by field When initializing the __param array with a one liner, if all args are const, the initial array value will be placed in the rodata section but because libbpf does not support relocation in the rodata section, any pointer in this array will stay NULL. Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support") Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org	2021-04-26 16:30:18 -07:00
Florent Revest	30755d3a1c	bpf: Add a bpf_snprintf helper The implementation takes inspiration from the existing bpf_trace_printk helper but there are a few differences: To allow for a large number of format-specifiers, parameters are provided in an array, like in bpf_seq_printf. Because the output string takes two arguments and the array of parameters also takes two arguments, the format string needs to fit in one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to a zero-terminated read-only map so we don't need a format string length arg. Because the format-string is known at verification time, we also do a first pass of format string validation in the verifier logic. This makes debugging easier. Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org	2021-04-26 16:30:18 -07:00
Alexei Starovoitov	dda0dd6a87	libbpf: Remove unused field. relo->processed is set, but not used. Remove it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210415141817.53136-1-alexei.starovoitov@gmail.com	2021-04-26 16:30:18 -07:00
Toke Høiland-Jørgensen	c21f91bd35	bpf: Return target info when a tracing bpf_link is queried There is currently no way to discover the target of a tracing program attachment after the fact. Add this information to bpf_link_info and return it when querying the bpf_link fd. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210413091607.58945-1-toke@redhat.com	2021-04-26 16:30:18 -07:00
Pedro Tammela	552dec12dc	libbpf: Clarify flags in ringbuf helpers In 'bpf_ringbuf_reserve()' we require the flag to be '0' at the moment. For 'bpf_ringbuf_{discard,submit,output}' a flag of '0' might send a notification to the process if needed. Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210412192434.944343-1-pctammela@mojatatu.com	2021-04-26 16:30:18 -07:00
Daniel Borkmann	678e8c8e49	bpf: Sync bpf headers in tooling infrastructure Synchronize tools/include/uapi/linux/bpf.h which was missing changes from various commits: - f3c45326ee71 ("bpf: Document PROG_TEST_RUN limitations") - e5e35e754c28 ("bpf: BPF-helper for MTU checking add length input") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	f6de59dc3e	libbpf: Add bpf_map__inner_map API The API gives access to inner map for map in map types (array or hash of map). It will be used to dynamically set max_entries in it. Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.com	2021-04-26 16:30:18 -07:00
Ciara Loftus	823648416c	libbpf: Fix potential NULL pointer dereference Wait until after the UMEM is checked for null to dereference it. Fixes: 43f1bc1efff1 ("libbpf: Restore umem state after socket create failure") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210408052009.7844-1-ciara.loftus@intel.com	2021-04-26 16:30:18 -07:00
Andrii Nakryiko	02dbcbea28	travis-ci: use default docker from Focal Stop trying to update Docker version. Focal seems to have recent enough Docker. This fixes Debian builds. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-04-26 08:30:13 -07:00
Ilya Leoshkevich	a7502f2707	vmtest: fix error reporting S50-run-tests uses -e, which means that it immediately exits on test failures without writing /exitcode. Fix by temporarily turning -e off. Another issue is that $? in S50-run-tests is not quoted, which causes the random value from the host to be taken (in practice always 0), so fix that as well. Finally, this fix has a positive side effect - QEMU no longer hangs when tests fail. This is because rcS (generated by mkrootfs.sh) also uses -e and immediately exits, if one of the scripts that it calls fails, without calling S99-poweroff. Example output after the fix: Summary: 53/184 PASSED, 5 SKIPPED, 1 FAILED + exitstatus=1 + set -e + echo 1 + chmod 644 /exitstatus + for path in /etc/rcS.d/S* + '[' -x /etc/rcS.d/S99-poweroff ']' + /etc/rcS.d/S99-poweroff travis_fold:start:shutdown Shutdown starting pid 232, tty '': '/sbin/swapoff -a' starting pid 233, tty '': '/bin/umount -a -r' [ 45.909033] EXT4-fs (vda): re-mounted. Opts: (null) The system is going down NOW! Sent SIGTERM to all processes Sent SIGKILL to all processes Requesting system poweroff [ 48.932007] ACPI: Preparing to enter system sleep state S5 [ 48.932785] reboot: Power down Tests exit status: 1 travis_fold:end:shutdown Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-04-06 21:57:01 -07:00
Andrii Nakryiko	bab780e6f9	travis-ci: update to Ubuntu Focal Update to Ubuntu Focal 20.04. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-04-06 21:18:31 -07:00
Ilya Leoshkevich	915f3abe94	travis-ci: prohibit uninitialized variables in managers/ The scripts in this directory rely on certain environment variables, so fail if they are not set in order to improve the debugging experience. The vmtest/ scripts already do it. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-04-06 21:18:31 -07:00
Ilya Leoshkevich	0c248143d4	travis-ci: disable GCC's -Wstringop-truncation noisy error on Ubuntu This is the same as commit `4d86cae4f0` ("ci: disable GCC's -Wstringop-truncation noisy error"), but for Ubuntu. Without this, there are false positives in bpf_object__new() on Ubuntu 20.04: this function calls strncpy() with the correct bounds, but still triggers the warning. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-04-06 21:18:31 -07:00
Ilya Leoshkevich	9f0e42b512	vmtest: blacklist stacktrace_build_id selftest It requires v5.9+ kernel when the test code is built with a newer toolchain. The support was added by commit b33164f2bd1c ("bpf: Iterate through all PT_NOTE sections when looking for build id"). Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-04-06 21:18:31 -07:00
Andrii Nakryiko	174d0b7b49	vmtest: blacklist kfunc_call selftests kfunc_call requires 5.13+ kernel. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-04-05 11:30:05 -07:00
Andrii Nakryiko	95f83b8b0c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 155f556d64b1a48710f01305e14bb860734ed1e3 Checkpoint bpf-next commit: 1e1032b0c4afaed7739a6681ff6b4cb120b82994 Baseline bpf commit: 002322402dafd846c424ffa9240a937f49b48c42 Checkpoint bpf commit: a14d273ba15968495896a38b7b3399dba66d0270 Andrii Nakryiko (2): libbpf: Preserve empty DATASEC BTFs during static linking libbpf: Fix memory leak when emitting final btf_ext Ciara Loftus (3): libbpf: Ensure umem pointer is non-NULL before dereferencing libbpf: Restore umem state after socket create failure libbpf: Only create rx and tx XDP rings when necessary Cong Wang (1): sock_map: Introduce BPF_SK_SKB_VERDICT Hengqi Chen (1): libbpf: Fix KERNEL_VERSION macro Maciej Fijalkowski (1): libbpf: xsk: Use bpf_link Martin KaFai Lau (6): bpf: Support bpf program calling kernel function libbpf: Refactor bpf_object__resolve_ksyms_btf_id libbpf: Refactor codes for finding btf id of a kernel symbol libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR libbpf: Record extern sym relocation first libbpf: Support extern kernel function Pedro Tammela (1): libbpf: Fix bail out from 'ringbuf_process_ring()' on error Yang Yingliang (1): libbpf: Remove redundant semi-colon include/uapi/linux/bpf.h \| 5 + src/bpf_helpers.h \| 2 +- src/libbpf.c \| 389 +++++++++++++++++++++++++++++---------- src/linker.c \| 39 +++- src/ringbuf.c \| 2 +- src/xsk.c \| 315 ++++++++++++++++++++++++------- 6 files changed, 574 insertions(+), 178 deletions(-) -- 2.30.2	2021-04-05 11:30:05 -07:00
Ciara Loftus	416343d95c	libbpf: Only create rx and tx XDP rings when necessary Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com	2021-04-05 11:30:05 -07:00
Ciara Loftus	4f932c1ee9	libbpf: Restore umem state after socket create failure If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain intact even in the event of failure. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com	2021-04-05 11:30:05 -07:00
Ciara Loftus	37e838f959	libbpf: Ensure umem pointer is non-NULL before dereferencing Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com	2021-04-05 11:30:05 -07:00
Pedro Tammela	d98e968707	libbpf: Fix bail out from 'ringbuf_process_ring()' on error The current code bails out with negative and positive returns. If the callback returns a positive return code, 'ring_buffer__consume()' and 'ring_buffer__poll()' will return a spurious number of records consumed but, more importantly, will continue the processing loop. This patch makes positive returns from the callback a no-op. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com	2021-04-05 11:30:05 -07:00
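The behaviour the ring buffer fix above aims for (only a negative callback return aborts processing, a positive one is a no-op) can be modelled with a small loop. This is a simplified sketch, not the code of 'ringbuf_process_ring()': the names `sample_fn`, `process_samples()` and the two callbacks are hypothetical, and the samples are plain integers rather than ring buffer records.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: invoked once per sample. */
typedef int (*sample_fn)(void *ctx, const int *sample);

/* Walk over n samples. A negative callback return aborts the loop and
 * is passed back to the caller; zero and positive returns are treated
 * alike and the sample counts as consumed. */
static int process_samples(const int *samples, size_t n,
			   sample_fn cb, void *ctx)
{
	int consumed = 0;
	size_t i;

	assert(cb != NULL);
	for (i = 0; i < n; i++) {
		int err = cb(ctx, &samples[i]);

		if (err < 0)
			return err;	/* real error: bail out */
		consumed++;		/* 0 or positive: keep going */
	}
	return consumed;
}

/* Callback that always returns a positive value. */
static int cb_positive(void *ctx, const int *sample)
{
	(void)ctx;
	(void)sample;
	return 7;
}

/* Callback that reports an error for negative samples. */
static int cb_reject_negative(void *ctx, const int *sample)
{
	(void)ctx;
	return *sample < 0 ? -1 : 0;
}
```

In this model the return value is either the first negative error or the number of samples consumed, so a positive callback result can no longer distort the count.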
Hengqi Chen	d4beac571a	libbpf: Fix KERNEL_VERSION macro Add missing ')' for KERNEL_VERSION macro. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com	2021-04-05 11:30:05 -07:00
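To illustrate why balanced parentheses matter for a version macro like the one fixed above, here is a sketch of the conventional packed encoding (major in bits 16 and up, minor in bits 8 to 15, patch in the low byte). The macro is deliberately named `DEMO_KERNEL_VERSION` and the helper `demo_version_split()` is hypothetical; the exact definition shipped in bpf_helpers.h may differ, for example in how it treats large patch levels.

```c
#include <assert.h>

/* Fully parenthesized so the macro can be embedded in any expression. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Split a packed version code back into its three components. */
static void demo_version_split(unsigned int code, unsigned int *major,
			       unsigned int *minor, unsigned int *patch)
{
	assert(major && minor && patch);
	*major = code >> 16;
	*minor = (code >> 8) & 0xff;
	*patch = code & 0xff;
}
```

Because the whole expansion is wrapped in one pair of parentheses, comparisons such as `DEMO_KERNEL_VERSION(5, 12, 0) < DEMO_KERNEL_VERSION(5, 13, 0)` behave as expected.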
Yang Yingliang	bea42d49f8	libbpf: Remove redundant semi-colon Remove redundant semi-colon in finalize_btf_ext(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210402012634.1965453-1-yangyingliang@huawei.com	2021-04-05 11:30:05 -07:00
Cong Wang	4e8d8d5cd2	sock_map: Introduce BPF_SK_SKB_VERDICT Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to attach stream_verdict and skb_verdict programs to the same map. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com	2021-04-05 11:30:05 -07:00
Maciej Fijalkowski	8628610c32	libbpf: xsk: Use bpf_link Currently, if there are multiple xdpsock instances running on a single interface and in case one of the instances is terminated, the rest of them are left in an inoperable state due to the fact of unloaded XDP prog from interface. Consider the scenario below: // load xdp prog and xskmap and add entry to xskmap at idx 10 $ sudo ./xdpsock -i ens801f0 -t -q 10 // add entry to xskmap at idx 11 $ sudo ./xdpsock -i ens801f0 -t -q 11 terminate one of the processes and another one is unable to work due to the fact that the XDP prog was unloaded from interface. To address that, step away from setting bpf prog in favour of bpf_link. This means that refcounting of BPF resources will be done automatically by bpf_link itself. Provide backward compatibility by checking if underlying system is bpf_link capable. Do this by looking up/creating bpf_link on loopback device. If it failed in any way, stick with netlink-based XDP prog. Otherwise, use bpf_link-based logic. When setting up BPF resources during xsk socket creation, check whether bpf_link for a given ifindex already exists via set of calls to bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd and comparing the ifindexes from bpf_link and xsk socket. For case where resources exist but they are not AF_XDP related, bail out and ask user to remove existing prog and then retry. Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out existing code branches based on prog_id value onto separate functions that are responsible for resource initialization if prog_id was 0 and for lookup existing resources for non-zero prog_id as that implies that XDP program is present on the underlying net device. This in turn makes it easier to follow, especially the teardown part of both branches. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com	2021-04-05 11:30:05 -07:00
Andrii Nakryiko	2e51adc9bc	libbpf: Fix memory leak when emitting final btf_ext Free temporary allocated memory used to construct finalized .BTF.ext data. Found by Coverity static analysis on libbpf's Github repo. Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	1ccb9d99d6	libbpf: Support extern kernel function This patch is to make libbpf able to handle the following extern kernel function declaration and do the needed relocations before loading the bpf program to the kernel. extern int foo(struct sock *) __attribute__((section(".ksyms"))) In the collect extern phase, the needed changes are made to bpf_object__collect_externs() and find_extern_btf_id() to collect extern function in ".ksyms" section. The func in the BTF datasec also needs to be replaced by an int var. The idea is similar to the existing handling in extern var. In case the BTF may not have a var, a dummy ksym var is added at the beginning of bpf_object__collect_externs() if there is func under ksyms datasec. It will also change the func linkage from extern to global which the kernel can support. It also assigns a param name if it does not have one. In the collect relo phase, it will record the kernel function call as RELO_EXTERN_FUNC. bpf_object__resolve_ksym_func_btf_id() is added to find the func btf_id of the running kernel. During actual relocation, it will patch the BPF_CALL instruction with src_reg = BPF_PSEUDO_KFUNC_CALL and insn->imm set to the running kernel func's btf_id. The required LLVM patch: https://reviews.llvm.org/D93563 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	90e052e6dd	libbpf: Record extern sym relocation first This patch records the extern sym relocs first before recording subprog relocs. The later patch will have relocs for extern kernel function call which is also using BPF_JMP \| BPF_CALL. It will be easier to handle the extern symbols first in the later patch. is_call_insn() helper is added. The existing is_ldimm64() helper is renamed to is_ldimm64_insn() for consistency. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	7ef7ed2a5d	libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR This patch renames RELO_EXTERN to RELO_EXTERN_VAR. It is to avoid the confusion with a later patch adding RELO_EXTERN_FUNC. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	7036f3356e	libbpf: Refactor codes for finding btf id of a kernel symbol This patch refactors code, that finds kernel btf_id by kind and symbol name, to a new function find_ksym_btf_id(). It also adds a new helper __btf_kind_str() to return a string by the numeric kind value. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	e5d7cbe15a	libbpf: Refactor bpf_object__resolve_ksyms_btf_id This patch refactors most of the logic from bpf_object__resolve_ksyms_btf_id() into a new function bpf_object__resolve_ksym_var_btf_id(). It is to get ready for a later patch adding bpf_object__resolve_ksym_func_btf_id() which resolves a kernel function to the running kernel btf_id. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
Martin KaFai Lau	e35afcb289	bpf: Support bpf program calling kernel function This patch adds support to BPF verifier to allow bpf program calling kernel function directly. The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations. This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation. For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them. The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly. This patch is to make the required changes in the bpf verifier. First change is in btf.c, it adds a case in "btf_check_func_arg_match()". When the passed in "btf->kernel_btf == true", it means matching the verifier regs' states with a kernel function. This will handle the PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to its kernel's btf_id. In the later libbpf patch, the insn calling a kernel function will look like: insn->code == (BPF_JMP \| BPF_CALL) insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */ insn->imm == func_btf_id /* btf_id of the running kernel */ [ For the future calling function-in-kernel-module support, an array of module btf_fds can be passed at the load time and insn->off can be used to index into this array.
] At the early stage of verifier, the verifier will collect all kernel function calls into "struct bpf_kfunc_desc". Those descriptors are stored in "prog->aux->kfunc_tab" and will be available to the JIT. Since this "add" operation is similar to the current "add_subprog()" and looking for the same insn->code, they are done together in the new "add_subprog_and_kfunc()". In the "do_check()" stage, the new "check_kfunc_call()" is added to verify the kernel function call instruction: 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. A new bpf_verifier_ops "check_kfunc_call" is added to do that. The bpf-tcp-cc struct_ops program will implement this function in a later patch. 2. Call "btf_check_kfunc_args_match()" to ensure the regs can be used as the args of a kernel function. 3. Mark the regs' type, subreg_def, and zext_dst. At the later do_misc_fixups() stage, the new fixup_kfunc_call() will replace the insn->imm with the function address (relative to __bpf_call_base). If needed, the jit can find the btf_func_model by calling the new bpf_jit_find_kfunc_model(prog, insn). With the imm set to the function address, "bpftool prog dump xlated" will be able to display the kernel function calls the same way as it displays other bpf helper calls. gpl_compatible program is required to call kernel function. This feature currently requires JIT. The verifier selftests are adjusted because of the changes in the verbose log in add_subprog_and_kfunc(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com	2021-04-05 11:30:05 -07:00
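The instruction shape described in the commit above (a call opcode, a pseudo source register marking a kernel function call, and the BTF ID in the immediate) can be sketched as follows. This is a self-contained illustration: `struct demo_insn` is a local mirror of the instruction layout and the `DEMO_*` constants are local copies assumed to match the values in linux/bpf.h; neither the struct nor the two helper functions are part of the kernel or libbpf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of the BPF instruction layout (assumed to correspond to
 * struct bpf_insn from linux/bpf.h). */
struct demo_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

/* Local copies of the relevant constants (assumed values). */
#define DEMO_BPF_JMP			0x05
#define DEMO_BPF_CALL			0x80
#define DEMO_BPF_PSEUDO_KFUNC_CALL	2

/* Build a call instruction that targets a kernel function by BTF ID. */
static struct demo_insn demo_make_kfunc_call(int32_t func_btf_id)
{
	struct demo_insn insn = { 0 };

	assert(func_btf_id > 0);
	insn.code = DEMO_BPF_JMP | DEMO_BPF_CALL;
	insn.src_reg = DEMO_BPF_PSEUDO_KFUNC_CALL;
	insn.imm = func_btf_id;
	return insn;
}

/* Recognize a kernel function call: the opcode alone is not enough,
 * because helper calls and subprogram calls use the same opcode. */
static bool demo_is_kfunc_call(const struct demo_insn *insn)
{
	return insn->code == (DEMO_BPF_JMP | DEMO_BPF_CALL) &&
	       insn->src_reg == DEMO_BPF_PSEUDO_KFUNC_CALL;
}
```

The check on `src_reg` is what separates a kernel function call from other call instructions that share the same opcode.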
Andrii Nakryiko	a18b72b920	libbpf: Preserve empty DATASEC BTFs during static linking Ensure that BPF static linker preserves all DATASEC BTF types, even if some of them might not have any variable information at all. This may happen if the compiler promotes local initialized variable contents into .rodata section and there are no global or static functions in the program. For example, $ cat t.c struct t { char a; char b; char c; }; void bar(struct t*); void find() { struct t tmp = {1, 2, 3}; bar(&tmp); } $ clang -target bpf -O2 -g -S t.c .long 104 # BTF_KIND_DATASEC(id = 8) .long 251658240 # 0xf000000 .long 0 .ascii ".rodata" # string offset=104 $ clang -target bpf -O2 -g -c t.c $ readelf -S t.o \| grep data [ 4] .rodata PROGBITS 0000000000000000 00000090 Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org	2021-04-05 11:30:05 -07:00
Adam Jensen	2bd682d23e	README: Mention Alpine in list of packaging distros	2021-04-04 17:50:11 -07:00
Andrii Nakryiko	99bc176337	README: update links to more up-to-date CO-RE articles Update links to point to blog posts that have some new updates and are generally kept more up-to-date. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-28 13:36:49 -07:00
Andrii Nakryiko	ea5752c641	Makefile: add strset.o and linker.o to build Fix libbpf build by linking strset.o and linker.o into libbpf.a. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-25 23:40:47 -07:00
Andrii Nakryiko	1d2d2d0034	vmtests: blacklist fexit_sleep on 5.5 Blacklist fexit_sleep selftest that relies on bpf_trampoline fix in 5.12. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	582b8fe21b	sync: remove libbpf_util.h from the list of headers libbpf_util.h was removed in 7e8bbe24cb8b ("libbpf: xsk: Move barriers from libbpf_util.h to xsk.h") upstream, so remove it from the list of installable headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	3ea10e46cb	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 22541a9eeb0d968c133aaebd95fa59da3208e705 Checkpoint bpf-next commit: 155f556d64b1a48710f01305e14bb860734ed1e3 Baseline bpf commit: 6185266c5a853bb0f2a459e3ff594546f277609b Checkpoint bpf commit: 002322402dafd846c424ffa9240a937f49b48c42 Andrii Nakryiko (11): libbpf: Add explicit padding to bpf_xdp_set_link_opts libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h libbpf: Expose btf_type_by_id() internally libbpf: Generalize BTF and BTF.ext type ID and strings iteration libbpf: Rename internal memory-management helpers libbpf: Extract internal set-of-strings datastructure APIs libbpf: Add generic BTF type shallow copy API libbpf: Add BPF static linker APIs libbpf: Add BPF static linker BTF and BTF.ext support libbpf: Skip BTF fixup if object file has no BTF libbpf: Constify few bpf_program getters Björn Töpel (3): libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire libbpf: xsk: Remove linux/compiler.h header libbpf: xsk: Move barriers from libbpf_util.h to xsk.h Jean-Philippe Brucker (2): libbpf: Fix arm64 build libbpf: Fix BTF dump of pointer-to-array-of-struct Joe Stringer (2): scripts/bpf: Abstract eBPF API target parameter tools: Sync uapi bpf.h header with latest changes KP Singh (1): libbpf: Add explicit padding to btf_dump_emit_type_decl_opts Kumar Kartikeya Dwivedi (1): libbpf: Use SOCK_CLOEXEC when opening the netlink socket Lorenz Bauer (1): bpf: Add PROG_TEST_RUN support for sk_lookup programs Maciej Fijalkowski (1): libbpf: Clear map_info before each bpf_obj_get_info_by_fd Namhyung Kim (1): libbpf: Fix error path in bpf_object__elf_init() Pedro Tammela (1): libbpf: Avoid inline hint definition from 'linux/stddef.h' Rafael David Tinoco (1): libbpf: Add bpf object kern_version attribute setter Xuesen Huang (1): bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH 
include/uapi/linux/bpf.h \| 724 +++++++++++++- src/bpf_helpers.h \| 21 +- src/btf.c \| 714 +++++++------- src/btf.h \| 3 + src/btf_dump.c \| 10 +- src/libbpf.c \| 32 +- src/libbpf.h \| 19 +- src/libbpf.map \| 6 + src/libbpf_internal.h \| 38 +- src/libbpf_util.h \| 47 - src/linker.c \| 1944 ++++++++++++++++++++++++++++++++++++++ src/netlink.c \| 2 +- src/strset.c \| 176 ++++ src/strset.h \| 21 + src/xsk.c \| 5 +- src/xsk.h \| 87 +- 16 files changed, 3379 insertions(+), 470 deletions(-) delete mode 100644 src/libbpf_util.h create mode 100644 src/linker.c create mode 100644 src/strset.c create mode 100644 src/strset.h -- 2.30.2	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	3118d38a2e	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-03-25 23:31:23 -07:00
Rafael David Tinoco	b09a4999d9	libbpf: Add bpf object kern_version attribute setter Unfortunately some distros don't have their kernel version defined accurately in <linux/version.h> due to different long term support reasons. It is important to have a way to override the bpf kern_version attribute during runtime: some old kernels might still check for kern_version attribute during bpf_prog_load(). Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.com	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	53f0e7d8ec	libbpf: Constify few bpf_program getters bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't modify given bpf_program, so mark input parameter as const struct bpf_program. This eliminates unnecessary compilation warnings or explicit casts in user programs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	d4d3a88b5a	libbpf: Skip BTF fixup if object file has no BTF Skip BTF fixup step when input object file is missing BTF altogether. Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support") Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org	2021-03-25 23:31:23 -07:00
KP Singh	7b1f3e310b	libbpf: Add explicit padding to btf_dump_emit_type_decl_opts Similar to https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/ When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g: DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts, .field_name = var_ident, .indent_level = 2, .strip_mods = strip_mods, ); and compiled in debug mode, the compiler generates code which leaves the padding uninitialized and triggers errors within libbpf APIs which require strict zero initialization of OPTS structs. Adding anonymous padding field fixes the issue. Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	b156979d19	libbpf: Add BPF static linker BTF and BTF.ext support Add .BTF and .BTF.ext static linking logic. When multiple BPF object files are linked together, their respective .BTF and .BTF.ext sections are merged together. BTF types are not just concatenated, but also deduplicated. .BTF.ext data is grouped by type (func info, line info, core_relos) and target section names, and then all the records are concatenated together, preserving their relative order. All the BTF type ID references and string offsets are updated as necessary, to take into account possibly deduplicated strings and types. BTF DATASEC types are handled specially. Their respective var_secinfos are accumulated separately in special per-section data and then final DATASEC types are emitted at the very end during bpf_linker__finalize() operation, just before emitting final ELF output file. BTF data can also provide "section annotations" for some extern variables. Such concept is missing in ELF, but BTF will have DATASEC types for such special extern datasections (e.g., .kconfig, .ksyms). Such sections are called "ephemeral" internally. Internally linker will keep metadata for each such section, collecting variables information, but those sections won't be emitted into the final ELF file. Also, given LLVM/Clang during compilation emits BTF DATASECS that are incomplete, missing section size and variable offsets for static variables, BPF static linker will initially fix up such DATASECs, using ELF symbols data. The final DATASECs will preserve section sizes and all variable offsets. This is handled correctly by libbpf already, so won't cause any new issues. On the other hand, it's actually a nice property to have a complete BTF data without runtime adjustments done during bpf_object__open() by libbpf. In that sense, BPF static linker is also a BTF normalizer. 
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	bd81770e10	libbpf: Add BPF static linker APIs Introduce BPF static linker APIs to libbpf. BPF static linker allows static linking of multiple BPF object files into a single combined resulting object file, preserving all the BPF programs, maps, global variables, etc. Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are concatenated together. Similarly, code sections are also concatenated. All the symbols and ELF relocations are also concatenated in their respective ELF sections and are adjusted accordingly to the new object file layout. Static variables and functions are handled correctly as well, adjusting BPF instructions offsets to reflect new variable/function offset within the combined ELF section. Such relocations are referencing STT_SECTION symbols and that stays intact. Data sections in different files can have different alignment requirements, so that is taken care of as well, adjusting sizes and offsets as necessary to satisfy both old and new alignment requirements. DWARF data sections are currently stripped out, as is the LLVM_ADDRSIG section, which is ignored by libbpf in bpf_object__open() anyway. So, in a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty nice property, especially if resulting .o file is then used to generate BPF skeleton. Original string sections are ignored and instead we construct our own set of unique strings using libbpf-internal `struct strset` API. To reduce the size of the patch, all the .BTF and .BTF.ext processing was moved into a separate patch. The high-level API consists of just 4 functions: - bpf_linker__new() creates an instance of BPF static linker. 
It accepts output filename and (currently empty) options struct; - bpf_linker__add_file() takes input filename and appends it to the already processed ELF data; it can be called multiple times, one for each BPF ELF object file that needs to be linked in; - bpf_linker__finalize() needs to be called to dump final ELF contents into the output file, specified when bpf_linker was created; after bpf_linker__finalize() is called, no more bpf_linker__add_file() and bpf_linker__finalize() calls are allowed, they will return error; - regardless of whether bpf_linker__finalize() was called or not, bpf_linker__free() will free up all the used resources. Currently, BPF static linker doesn't resolve cross-object file references (extern variables and/or functions). This will be added in the follow up patch set. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	4fdc36418d	libbpf: Add generic BTF type shallow copy API Add btf__add_type() API that performs shallow copy of a given BTF type from the source BTF into the destination BTF. All the information and type IDs are preserved, but all the strings encountered are added into the destination BTF and corresponding offsets are rewritten. BTF type IDs are assumed to be correct or such that will be (somehow) modified afterwards. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	861ad35ceb	libbpf: Extract internal set-of-strings datastructure APIs Extract BTF logic for maintaining a set of strings data structure, used for BTF strings section construction in writable mode, into separate re-usable API. This data structure is going to be used by bpf_linker to maintain the ELF STRTAB section, which has the same layout as BTF strings section. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	7fc514acf1	libbpf: Rename internal memory-management helpers Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details of dynamically resizable memory to use libbpf_ prefix, as they are not BTF-specific. No functional changes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	74e94c40fe	libbpf: Generalize BTF and BTF.ext type ID and strings iteration Extract and generalize the logic to iterate BTF type ID and string offset fields within BTF types and .BTF.ext data. Expose this internally in libbpf for re-use by bpf_linker. Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE access strings), which was previously missing. There previously was no case of deduplicating .BTF.ext data, but bpf_linker is going to use it. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	082a5c6020	libbpf: Expose btf_type_by_id() internally btf_type_by_id() is internal-only convenience API returning non-const pointer to struct btf_type. Expose it outside of btf.c for re-use. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	75a2e3bda8	libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h Given that vmlinux.h is not compatible with headers like stddef.h, NULL poses an annoying problem: it is defined as #define, so is not captured in BTF, so is not emitted into vmlinux.h. This leads to users either sticking to explicit 0, or defining their own NULL (as progs/skb_pkt_end.c does). But it's easy for bpf_helpers.h to provide (conditionally) NULL definition. Similarly, KERNEL_VERSION is another commonly missed macro that came up multiple times. So this patch adds both of them, along with offsetof(), that also is typically defined in stddef.h, just like NULL. This might cause a compilation warning for existing BPF applications defining their own NULL and/or KERNEL_VERSION already: progs/skb_pkt_end.c:7:9: warning: 'NULL' macro redefined [-Wmacro-redefined] #define NULL 0 ^ /tmp/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h:4:9: note: previous definition is here #define NULL ((void *)0) ^ It is trivial to fix, though, so long-term benefits outweigh temporary inconveniences. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20210317200510.1354627-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	c8bfeae778	libbpf: Add explicit padding to bpf_xdp_set_link_opts Adding such anonymous padding fixes the issue with uninitialized portions of bpf_xdp_set_link_opts when using DECLARE_LIBBPF_OPTS macro with inline field initialization: DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = -1); When such code is compiled in debug mode, compiler is generating code that leaves padding bytes uninitialized, which triggers error inside libbpf APIs that do strict zero initialization checks for OPTS structs. Adding anonymous padding field fixes the issue. Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org	2021-03-25 23:31:23 -07:00
Pedro Tammela	a0ad81d9c4	libbpf: Avoid inline hint definition from 'linux/stddef.h' Linux headers might pull 'linux/stddef.h' which defines '__always_inline' as the following: #ifndef __always_inline #define __always_inline inline #endif This becomes an issue if the program picks up the 'linux/stddef.h' definition as the macro now just hints inline to clang. This change now enforces the proper definition for BPF programs regardless of the include order. Signed-off-by: Pedro Tammela <pctammela@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210314173839.457768-1-pctammela@gmail.com	2021-03-25 23:31:23 -07:00
Björn Töpel	b727e2deca	libbpf: xsk: Move barriers from libbpf_util.h to xsk.h The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h, and remove libbpf_util.h. The barriers are used as an implementation detail, and should not be considered part of the stable API. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com	2021-03-25 23:31:23 -07:00
Björn Töpel	27db7104d5	libbpf: xsk: Remove linux/compiler.h header In commit 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire") linux/compiler.h was added as a dependency to xsk.h, which is the user-facing API. This makes it harder for userspace application to consume the library. Here the header inclusion is removed, and instead {READ,WRITE}_ONCE() is added explicitly. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com	2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker	c14f7e5dcf	libbpf: Fix BTF dump of pointer-to-array-of-struct The vmlinux.h generated from BTF is invalid when building drivers/phy/ti/phy-gmii-sel.c with clang: vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’ 61702 \| const struct reg_field (regfields)[3]; \| ^~~~~~~~~ bpftool generates a forward declaration for this struct regfield, which compilers aren't happy about. Here's a simplified reproducer: struct inner { int val; }; struct outer { struct inner (ptr_to_array)[2]; } A; After build with clang -> bpftool btf dump c -> clang/gcc: ./def-clang.h:11:23: error: array has incomplete element type 'struct inner' struct inner (ptr_to_array)[2]; Member ptr_to_array of struct outer is a pointer to an array of struct inner. In the DWARF generated by clang, struct outer appears before struct inner, so when converting BTF of struct outer into C, bpftool issues a forward declaration to struct inner. With GCC the DWARF info is reversed so struct inner gets fully defined. That forward declaration is not sufficient when compilers handle an array of the struct, even when it's only used through a pointer. Note that we can trigger the same issue with an intermediate typedef: struct inner { int val; }; typedef struct inner inner2_t[2]; struct outer { inner2_t ptr_to_array; } A; Becomes: struct inner; typedef struct inner inner2_t[2]; And causes: ./def-clang.h:10:30: error: array has incomplete element type 'struct inner' typedef struct inner inner2_t[2]; To fix this, clear through_ptr whenever we encounter an intermediate array, to make the inner struct part of a strong link and force full declaration. Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org	2021-03-25 23:31:23 -07:00
Kumar Kartikeya Dwivedi	bbc65156d7	libbpf: Use SOCK_CLOEXEC when opening the netlink socket Otherwise, there exists a small window between the opening and closing of the socket fd where it may leak into processes launched by some other thread. Fixes: 949abbe88436 ("libbpf: add function to setup XDP") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210317115857.6536-1-memxor@gmail.com	2021-03-25 23:31:23 -07:00
Namhyung Kim	c903b3ab70	libbpf: Fix error path in bpf_object__elf_init() When it failed to get section names, it should call into bpf_object__elf_finish() like others. Fixes: 88a82120282b ("libbpf: Factor out common ELF operations and improve logging") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210317145414.884817-1-namhyung@kernel.org	2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker	186ffbe0b5	libbpf: Fix arm64 build The macro for libbpf_smp_store_release() doesn't build on arm64, fix it. Fixes: 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210308182521.155536-1-jean-philippe@linaro.org	2021-03-25 23:31:23 -07:00
Björn Töpel	1d483b45fc	libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire Now that the AF_XDP rings have load-acquire/store-release semantics, move libbpf to that as well. The library-internal libbpf_smp_{load_acquire,store_release} are only valid for 32-bit words on ARM64. Also, remove the barriers that are no longer in use. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210305094113.413544-3-bjorn.topel@gmail.com	2021-03-25 23:31:23 -07:00
Xuesen Huang	d64f8d3207	bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets encapsulation. But that is not appropriate when pushing Ethernet header. Add an option to further specify encap L2 type and set the inner_protocol as ETH_P_TEB. Suggested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Xuesen Huang <huangxuesen@kuaishou.com> Signed-off-by: Zhiyong Cheng <chengzhiyong@kuaishou.com> Signed-off-by: Li Wang <wangli09@kuaishou.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/bpf/20210304064046.6232-1-hxseverything@gmail.com	2021-03-25 23:31:23 -07:00
Lorenz Bauer	4f2e1ecbd9	bpf: Add PROG_TEST_RUN support for sk_lookup programs Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer. We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com	2021-03-25 23:31:23 -07:00
Joe Stringer	21f523f235	tools: Sync uapi bpf.h header with latest changes Synchronize the header after all of the recent changes. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-16-joe@cilium.io	2021-03-25 23:31:23 -07:00
Joe Stringer	18c0f03e2d	scripts/bpf: Abstract eBPF API target parameter Abstract out the target parameter so that, in upcoming commits, more than just the existing "helpers" target can be called to generate specific portions of docs from the eBPF UAPI headers. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-10-joe@cilium.io	2021-03-25 23:31:23 -07:00
Maciej Fijalkowski	fade1c32e6	libbpf: Clear map_info before each bpf_obj_get_info_by_fd xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a reference to XSKMAP. BPF prog can include insns that work on various BPF maps and this is covered by iterating through map_ids. The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling needs to be cleared at each iteration, so that it doesn't contain any outdated fields and that is currently missing in the function of interest. To fix that, zero-init map_info via memset before each bpf_obj_get_info_by_fd call. Also, since the area of this code is touched, in general strcmp is considered harmful, so let's convert it to strncmp and provide the size of the array name for current map_info. While at it, do s/continue/break/ once we have found the xsks_map to terminate the search. Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20210303185636.18070-4-maciej.fijalkowski@intel.com	2021-03-25 23:31:23 -07:00
Ilya Leoshkevich	8c2c7e5bcf	sync: use bpf_doc.py In the latest bpf-next bpf_helpers_doc.py has been renamed to bpf_doc.py. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>	2021-03-25 23:31:23 -07:00
Andrii Nakryiko	092a606856	Makefile: fix install flags order Having -m flag between source and destination breaks install on MacOS, as reported in [0]. Fix it by moving the flag to the front. [0] https://github.com/openwrt/openwrt/pull/3959 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-05 17:48:04 -08:00
Ilya Leoshkevich	986962fade	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 303dcc25b5c782547eb13b9f29426de843dd6f34 Checkpoint bpf-next commit: 22541a9eeb0d968c133aaebd95fa59da3208e705 Baseline bpf commit: 6185266c5a853bb0f2a459e3ff594546f277609b Checkpoint bpf commit: 22541a9eeb0d968c133aaebd95fa59da3208e705 Ilya Leoshkevich (3): bpf: Add BTF_KIND_FLOAT to uapi libbpf: Fix whitespace in btf_add_composite() comment libbpf: Add BTF_KIND_FLOAT support include/uapi/linux/btf.h \| 5 ++-- src/btf.c \| 51 +++++++++++++++++++++++++++++++++++++++- src/btf.h \| 6 +++++ src/btf_dump.c \| 4 ++++ src/libbpf.c \| 29 ++++++++++++++++++++++- src/libbpf.map \| 5 ++++ src/libbpf_internal.h \| 2 ++ 7 files changed, 98 insertions(+), 4 deletions(-) -- 2.25.1	2021-03-05 14:15:26 -08:00
Ilya Leoshkevich	471e7c241d	libbpf: Add BTF_KIND_FLOAT support The logic follows that of BTF_KIND_INT most of the time. Sanitization replaces BTF_KIND_FLOATs with equally-sized empty BTF_KIND_STRUCTs on older kernels, for example, the following: [4] FLOAT 'float' size=4 becomes the following: [4] STRUCT '(anon)' size=4 vlen=0 With dwarves patch [1] and this patch, the older kernels, which were failing with the floating-point-related errors, will now start working correctly. [1] https://github.com/iii-i/dwarves/commit/btf-kind-float-v2 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226202256.116518-4-iii@linux.ibm.com	2021-03-05 14:15:26 -08:00
Ilya Leoshkevich	617f781804	libbpf: Fix whitespace in btf_add_composite() comment Remove trailing space. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-3-iii@linux.ibm.com	2021-03-05 14:15:26 -08:00
Ilya Leoshkevich	473899d4f7	bpf: Add BTF_KIND_FLOAT to uapi Add a new kind value and expand the kind bitfield. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-2-iii@linux.ibm.com	2021-03-05 14:15:26 -08:00
Andrii Nakryiko	7e03685b8d	vmtests: blacklist new tests on 5.5 kernel Extend 5.5 blacklist. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-03-03 08:35:13 -08:00
Andrii Nakryiko	7065a809fc	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 86ce322d21eb032ed8fdd294d0fb095d2debb430 Checkpoint bpf-next commit: 303dcc25b5c782547eb13b9f29426de843dd6f34 Baseline bpf commit: 78031381ae9c88f4f914d66154f4745122149c58 Checkpoint bpf commit: 6185266c5a853bb0f2a459e3ff594546f277609b Alexei Starovoitov (1): bpf: Count the number of times recursion was prevented Florent Revest (2): bpf: Be less specific about socket cookies guarantees bpf: Expose bpf_get_socket_cookie to tracing programs Hangbin Liu (1): bpf: Remove blank line in bpf helper description comment Jesper Dangaard Brouer (2): bpf: bpf_fib_lookup return MTU value as output when looked up bpf: Add BPF-helper for MTU checking Jonas Bonn (1): Revert "GTP: add support for flow based tunneling API" Martin KaFai Lau (1): libbpf: Ignore non function pointer member in struct_ops Stanislav Fomichev (1): libbpf: Use AF_LOCAL instead of AF_INET in xsk.c Yonghong Song (3): bpf: Add bpf_for_each_map_elem() helper libbpf: Move function is_ldimm64() earlier in libbpf.c libbpf: Support subprog address relocation include/uapi/linux/bpf.h \| 140 +++++++++++++++++++++++++++++++++-- include/uapi/linux/if_link.h \| 1 - src/libbpf.c \| 98 +++++++++++++++++++----- src/xsk.c \| 2 +- 4 files changed, 213 insertions(+), 28 deletions(-) -- 2.24.1	2021-03-03 08:35:13 -08:00
Andrii Nakryiko	18b55bc136	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-03-03 08:35:13 -08:00
Yonghong Song	712f6587c9	libbpf: Support subprog address relocation A new relocation RELO_SUBPROG_ADDR is added to capture subprog addresses loaded with ld_imm64 insns. Such ld_imm64 insns are marked with BPF_PSEUDO_FUNC and will be passed to kernel. For bpf_for_each_map_elem() case, kernel will check that the to-be-used subprog address must be a static function and replace it with proper actual jited func address. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204930.3885367-1-yhs@fb.com	2021-03-03 08:35:13 -08:00
Yonghong Song	60aa32b17a	libbpf: Move function is_ldimm64() earlier in libbpf.c Move function is_ldimm64() close to the beginning of libbpf.c, so it can be reused by later code and the next patch as well. There is no functionality change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204929.3885295-1-yhs@fb.com	2021-03-03 08:35:13 -08:00
Yonghong Song	f3612e4117	bpf: Add bpf_for_each_map_elem() helper The bpf_for_each_map_elem() helper is introduced which iterates all map elements with a callback function. The helper signature looks like long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags) and for each map element, the callback_fn will be called. For example, like hashmap, the callback signature may look like long callback_fn(map, key, val, callback_ctx) There are two known use cases for this. One is from upstream ([1]) where a for_each_map_elem helper may help implement a timeout mechanism in a more generic way. Another is from our internal discussion for a firewall use case where a map contains all the rules. The packet data can be compared to all these rules to decide allow or deny the packet. For array maps, users can already use a bounded loop to traverse elements. Using this helper can avoid using bounded loop. For other type of maps (e.g., hash maps) where bounded loop is hard or impossible to use, this helper provides a convenient way to operate on all elements. For callback_fn, besides map and map element, a callback_ctx, allocated on caller stack, is also passed to the callback function. This callback_ctx argument can provide additional input and allow to write to caller stack for output. If the callback_fn returns 0, the helper will iterate through next element if available. If the callback_fn returns 1, the helper will stop iterating and return to the bpf program. Other return values are not used for now. Currently, this helper is only available with jit. It is possible to make it work with interpreter with some effort, but I leave it as future work. [1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/ Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com	2021-03-03 08:35:13 -08:00
Hangbin Liu	587d2ab628	bpf: Remove blank line in bpf helper description comment Commit 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") added an extra blank line in bpf helper description. This will make bpf_helpers_doc.py stop building bpf_helper_defs.h immediately after bpf_check_mtu(), which will affect future added functions. Fixes: 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20210223131457.1378978-1-liuhangbin@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer	6cc16d6401	bpf: Add BPF-helper for MTU checking This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs. The SKB object is complex and the skb->len value (accessible from BPF-prog) also includes the length of any extra GRO/GSO segments, but without taking into account that these GRO/GSO segments get added transport (L4) and network (L3) headers before being transmitted. Thus, this BPF-helper is created such that the BPF-programmer doesn't need to handle these details in the BPF-prog. The API is designed to help the BPF-programmer who wants to do packet context size changes, which involves other helpers. These other helpers usually do a delta size adjustment. This helper also supports a delta size (len_diff), which allows the BPF-programmer to reuse arguments needed by these other helpers, and perform the MTU check prior to doing any actual size adjustment of the packet context. It is on purpose that we allow the len adjustment to become a negative result, which will pass the MTU check. This might seem weird, but it's not this helper's responsibility to "catch" wrong len_diff adjustments. Other helpers will take care of these checks, if the BPF-programmer chooses to do actual size adjustment. V14: - Improve man-page desc of len_diff. V13: - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff. V12: - Simplify segment check that calls skb_gso_validate_network_len. 
- Helpers should return long V9: - Use dev->hard_header_len (instead of ETH_HLEN) - Annotate with unlikely req from Daniel - Fix logic error using skb_gso_validate_network_len from Daniel V6: - Took John's advice and dropped BPF_MTU_CHK_RELAX - Returned MTU is kept at L3-level (like fib_lookup) V4: Lot of changes - ifindex 0 now use current netdev for MTU lookup - rename helper from bpf_mtu_check to bpf_check_mtu - fix bug for GSO pkt length (as skb->len is total len) - remove __bpf_len_adj_positive, simply allow negative len adj Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul	2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer	f0753b5259	bpf: bpf_fib_lookup return MTU value as output when looked up The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup) can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog doesn't know the MTU value that caused this rejection. If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it needs to know this MTU value for the ICMP packet. This patch changes the lookup and result struct bpf_fib_lookup to contain this MTU value as output via a union with 'tot_len', as this is the value used for the MTU lookup. V5: - Fixed uninit value spotted by Dan Carpenter. - Name struct output member mtu_result Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul	2021-03-03 08:35:13 -08:00
Martin KaFai Lau	642655629b	libbpf: Ignore non function pointer member in struct_ops When libbpf initializes the kernel's struct_ops in "bpf_map__init_kern_struct_ops()", it enforces all pointer types must be a function pointer and rejects others. It turns out to be too strict. For example, when directly using "struct tcp_congestion_ops" from vmlinux.h, it has a "struct module *owner" member and it is set to NULL in a bpf_tcp_cc.o. Instead, it only needs to ensure the member is a function pointer if it has been set (relocated) to a bpf-prog. This patch moves the "btf_is_func_proto(kern_mtype)" check after the existing "if (!prog) { continue; }". The original debug message in "if (!prog) { continue; }" is also removed since it is no longer valid. Besides, there is a later debug message to tell which function pointer is set. The "btf_is_func_proto(mtype)" has already been guaranteed in "bpf_object__collect_st_ops_relos()" which has been run before "bpf_map__init_kern_struct_ops()". Thus, this check is removed. v2: - Remove outdated debug message (Andrii) Remove because there is a later debug message to tell which function pointer is set. - Following mtype->type is no longer needed. Remove: "skip_mods_and_typedefs(btf, mtype->type, &mtype_id)" - Do "if (!prog)" test before skip_mods_and_typedefs. Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210212021030.266932-1-kafai@fb.com	2021-03-03 08:35:13 -08:00
Stanislav Fomichev	14a61e86f0	libbpf: Use AF_LOCAL instead of AF_INET in xsk.c We have the environments where usage of AF_INET is prohibited (cgroup/sock_create returns EPERM for AF_INET). Let's use AF_LOCAL instead of AF_INET, it should perfectly work with SIOCETHTOOL. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20210209221826.922940-1-sdf@google.com	2021-03-03 08:35:13 -08:00
Florent Revest	d142d4a382	bpf: Expose bpf_get_socket_cookie to tracing programs This needs a new helper that: - can work in a sleepable context (using sock_gen_cookie) - takes a struct sock pointer and checks that it's not NULL Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org	2021-03-03 08:35:13 -08:00
Florent Revest	99e6a464b8	bpf: Be less specific about socket cookies guarantees Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one" socket cookies are not guaranteed to be non-decreasing. The bpf_get_socket_cookie helper descriptions are currently specifying that cookies are non-decreasing but we don't want users to rely on that. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org	2021-03-03 08:35:13 -08:00
Alexei Starovoitov	1015d47c2b	bpf: Count the number of times recursion was prevented Add per-program counter for number of times recursion prevention mechanism was triggered and expose it via show_fdinfo and bpf_prog_info. Teach bpftool to print it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com	2021-03-03 08:35:13 -08:00
Jonas Bonn	06ee116fb1	Revert "GTP: add support for flow based tunneling API" This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a. This patch was committed without maintainer approval and despite a number of unaddressed concerns from review. There are several issues that impede the acceptance of this patch and that make a reversion of this particular instance of these changes the best way forward: i) the patch contains several logically separate changes that would be better served as smaller patches (for review purposes) ii) functionality like the handling of end markers has been introduced without further explanation iii) symmetry between the handling of GTPv0 and GTPv1 has been unnecessarily broken iv) the patchset produces 'broken' packets when extension headers are included v) there are no available userspace tools to allow for testing this functionality vi) there is an unaddressed Coverity report against the patch concerning memory leakage vii) most importantly, the patch contains a large amount of superfluous churn that impedes other ongoing work with this driver This patch will be reworked into a series that aligns with other ongoing work and facilitates review. Signed-off-by: Jonas Bonn <jonas@norrbonn.se> Acked-by: Harald Welte <laforge@gnumonks.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-03-03 08:35:13 -08:00
Andrii Nakryiko	e1a90f3768	travis-ci: switch from GCC8 to GCC10 GCC 8 builds started failing due to missing gcc-8 package. Let's switch to GCC 10 instead. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-02-22 14:00:53 -08:00
Andrii Nakryiko	f2a926ba46	Revert "vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed" Clang 13 now has the fix for the original regression, time to get back to using nightly versions. This reverts commit `adaf538bca`.	2021-02-22 11:36:12 -08:00
Matteo Croce	b0b5ec0006	fix typo in license name LGPL was incorrectly named LPGL Signed-off-by: Matteo Croce <mcroce@microsoft.com>	2021-02-22 11:35:49 -08:00
Andrii Nakryiko	adaf538bca	vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed Clang 13 regressed BPF code generation causing some of BPF selftests to fail. Until that is mitigated, stick to version 12. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-02-08 20:37:11 -08:00
Andrii Nakryiko	f35e87ddc4	vmtest: switch to Clang/LLVM 13 Clang 13 became the new nightly version, so switch to it. Also do vmlinux compilation with a bit more parallelism. And account python-docutils installation as part of selftests build. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-02-03 14:53:19 -08:00
Matteo Croce	767d82caab	install: don't preserve file owner 'cp -p' preserves file ownership; this may leave files owned by the current user in /lib. Signed-off-by: Matteo Croce <mcroce@microsoft.com>	2021-01-26 19:33:48 -08:00
Andrii Nakryiko	a199b85415	vmtest: blacklist atomics selftest for 5.5 5.5 doesn't support atomics. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	649f9dc746	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9 Checkpoint bpf-next commit: 86ce322d21eb032ed8fdd294d0fb095d2debb430 Baseline bpf commit: 1a3449c19407a28f7019a887cdf0d6ba2444751a Checkpoint bpf commit: 78031381ae9c88f4f914d66154f4745122149c58 Andrii Nakryiko (5): libbpf: Add user-space variants of BPF_CORE_READ() family of macros libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family libbpf: Clarify kernel type use with USER variants of CORE reading macros libbpf: Support kernel module ksym externs libbpf: Allow loading empty BTFs Björn Töpel (1): libbpf, xsk: Select AF_XDP BPF program based on kernel version Brendan Jackman (4): bpf: Clarify return value of probe str helpers bpf: Rename BPF_XADD and prepare to encode other atomics in .imm bpf: Add BPF_FETCH field / create atomic_fetch_add instruction bpf: Add instructions for atomic_[cmp]xchg Ian Rogers (1): bpf, libbpf: Avoid unused function warning on bpf_tail_call_static Jiri Olsa (1): libbpf: Use string table index from index table if needed Pravin B Shelar (1): GTP: add support for flow based tunneling API include/uapi/linux/bpf.h \| 20 +++-- include/uapi/linux/if_link.h \| 1 + src/bpf_core_read.h \| 169 +++++++++++++++++++++++++++-------- src/bpf_helpers.h \| 2 +- src/btf.c \| 17 ++-- src/libbpf.c \| 50 +++++++---- src/xsk.c \| 81 ++++++++++++++++- 7 files changed, 265 insertions(+), 75 deletions(-) -- 2.24.1	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	eb56f8fb12	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-01-26 19:32:13 -08:00
Björn Töpel	6e01a23cf6	libbpf, xsk: Select AF_XDP BPF program based on kernel version Add detection for kernel version, and adapt the BPF program based on kernel support. This way, users will get the best possible performance from the BPF program. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210122105351.11751-4-bjorn.topel@gmail.com	2021-01-26 19:32:13 -08:00
Jiri Olsa	16d7f413e2	libbpf: Use string table index from index table if needed For very large ELF objects (with many sections), we could get special value SHN_XINDEX (65535) for elf object's string table index - e_shstrndx. Call elf_getshdrstrndx to get the proper string table index, instead of reading it directly from ELF header. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210121202203.9346-4-jolsa@kernel.org	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	f037b92465	libbpf: Allow loading empty BTFs Empty BTFs do come up (e.g., simple kernel modules with no new types and strings, compared to the vmlinux BTF) and there is nothing technically wrong with them. So remove unnecessary check preventing loading empty BTFs. Fixes: d8123624506c ("libbpf: Fix BTF data layout checks and allow empty BTF") Reported-by: Christopher William Snowhill <chris@kode54.net> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210110070341.1380086-2-andrii@kernel.org	2021-01-26 19:32:13 -08:00
Pravin B Shelar	4adbb7b2c7	GTP: add support for flow based tunneling API The following patch adds support for the flow based tunneling API to send and receive GTP tunnel packets over the tunnel metadata API. This would allow integration of this device with OVS or eBPF using flow based tunneling APIs. Signed-off-by: Pravin B Shelar <pbshelar@fb.com> Link: https://lore.kernel.org/r/20210110070021.26822-1-pbshelar@fb.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-26 19:32:13 -08:00
Brendan Jackman	d2b784d370	bpf: Add instructions for atomic_[cmp]xchg This adds two atomic opcodes, both of which include the BPF_FETCH flag. XCHG without the BPF_FETCH flag would naturally encode atomic_set. This is not supported because it would be of limited value to userspace (it doesn't imply any barriers). CMPXCHG without BPF_FETCH would be an atomic compare-and-write. We don't have such an operation in the kernel so it isn't provided to BPF either. There are two significant design decisions made for the CMPXCHG instruction: - This operation fundamentally has 3 operands, but we only have two register fields. Therefore the operand we compare against (the kernel's API calls it 'old') is hard-coded to be R0. x86 has a similar design (and A64 doesn't have this problem). A potential alternative might be to encode the other operand's register number in the immediate field. - The kernel's atomic_cmpxchg returns the old value, while the C11 userspace APIs return a boolean indicating the comparison result. Which should BPF do? A64 returns the old value. x86 returns the old value in the hard-coded register (and also sets a flag). That means return-old-value is easier to JIT, so that's what we use. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com	2021-01-26 19:32:13 -08:00
Brendan Jackman	ac86f42e4a	bpf: Add BPF_FETCH field / create atomic_fetch_add instruction The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC instructions, in order to have the previous value of the atomically-modified memory location loaded into the src register after an atomic op is carried out. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com	2021-01-26 19:32:13 -08:00
Brendan Jackman	03fbe22a59	bpf: Rename BPF_XADD and prepare to encode other atomics in .imm A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations. In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD. This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero. All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com	2021-01-26 19:32:13 -08:00
Ian Rogers	f15814c93a	bpf, libbpf: Avoid unused function warning on bpf_tail_call_static Add inline to __always_inline making it match the linux/compiler.h. Adding this avoids an unused function warning on bpf_tail_call_static when compiling with -Wall. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210113223609.3358812-1-irogers@google.com	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	0db7da9a4a	libbpf: Support kernel module ksym externs Add support for searching for ksym externs not just in vmlinux BTF, but across all module BTFs, similarly to how it's done for CO-RE relocations. Kernels that expose module BTFs through sysfs are assumed to support new ldimm64 instruction extension with BTF FD provided in insn[1].imm field, so no extra feature detection is performed. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20210112075520.4103414-7-andrii@kernel.org	2021-01-26 19:32:13 -08:00
Brendan Jackman	0de8b9a906	bpf: Clarify return value of probe str helpers When the buffer is too small to contain the input string, these helpers return the length of the buffer, not the length of the original string. This tries to make the docs totally clear about that, since "the length of the [copied ]string" could also refer to the length of the input. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	d52e5f5f88	libbpf: Clarify kernel type use with USER variants of CORE reading macros Add comments clarifying that USER variants of CO-RE reading macros are still only going to work with kernel types, defined in kernel or kernel module BTF. This should help prevent invalid use of those macros to read user-defined types (which doesn't work with CO-RE). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210108194408.3468860-1-andrii@kernel.org	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	a26ae1b254	libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family BPF_CORE_READ(), in addition to handling CO-RE relocations, also allows much nicer way to read data structures with nested pointers. Instead of writing a sequence of bpf_probe_read() calls to follow links, one can just write BPF_CORE_READ(a, b, c, d) to effectively do a->b->c->d read. This is a welcome ability when porting BCC code, which (in most cases) allows exactly the intuitive a->b->c->d variant. This patch adds non-CO-RE variants of BPF_CORE_READ() family of macros for cases where CO-RE is not supported (e.g., old kernels). In such cases, the property of shortening a sequence of bpf_probe_read()s to a simple BPF_PROBE_READ(a, b, c, d) invocation is still desirable, especially when porting BCC code to libbpf. Yet, no CO-RE relocation is going to be emitted. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201218235614.2284956-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-26 19:32:13 -08:00
Andrii Nakryiko	c1d4bbb8c7	libbpf: Add user-space variants of BPF_CORE_READ() family of macros Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO() variations to allow reading CO-RE-relocatable kernel data structures from the user-space. One of such cases is reading input arguments of syscalls, while reaping the benefits of CO-RE relocations w.r.t. handling 32/64 bit conversions and handling missing/new fields in UAPI data structs. Suggested-by: Gilad Reti <gilad.reti@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201218235614.2284956-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-01-26 19:32:13 -08:00
Luca Boccassi	7c2a94f4f8	README: mention that Debian 11 ships with BTF support	2021-01-22 12:27:35 -08:00
Luca Boccassi	051a4009f9	pkgconfig: use literal ${prefix} to allow override Various workflows (--define-prefix, --define-variable=prefix) require variables in the pc file to use a literal so that it is overridden. Change the Makefile so that, by default and unless is specified, it is set as expected. Signed-off-by: Luca Boccassi <bluca@debian.org>	2021-01-03 10:41:31 -08:00
Luca Boccassi	a3a5e9688a	README: point to Debian source package rather than binary For consistency with other links Signed-off-by: Luca Boccassi <bluca@debian.org>	2021-01-03 10:41:31 -08:00
Luca Boccassi	5569404346	README: note that Debian 11 (will) ship LLVM 11 Signed-off-by: Luca Boccassi <bluca@debian.org>	2021-01-03 10:41:31 -08:00
Andrii Nakryiko	e05f9be4f4	vmtests: temporarily disable test_maps Disable test_maps test until it's debugged why they started failing. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	4d3535ff7b	vmtests: test_maps needs more memory, so bump to 4G Memory is cheap. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	c66a9770e3	vmtests: fix up bpf_testmod.ko generation for 5.5 and 4.9 Selftests makefile deletes local bpf_testmod.ko, so that invalidates current approach of faking bpf_testmod.ko "generation". Instead, generate a fake Makefile that will create an empty bpf_testmod/bpf_testmod.ko. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	8262be6034	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5c667dca71095abec90420eb09503f35f66c9585 Checkpoint bpf-next commit: 3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9 Baseline bpf commit: 12c8a8ca117f3d734babc3fba131fdaa329d2163 Checkpoint bpf commit: 1a3449c19407a28f7019a887cdf0d6ba2444751a Andrii Nakryiko (2): bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers libbpf: Support modules in bpf_program__set_attach_target() API Brendan Jackman (1): libbpf: Expose libbpf ring_buffer epoll_fd Florent Revest (1): bpf: Add a bpf_sock_from_file helper include/uapi/linux/bpf.h \| 13 ++++++-- src/libbpf.c \| 64 +++++++++++++++++++++++++--------------- src/libbpf.h \| 1 + src/libbpf.map \| 1 + src/ringbuf.c \| 6 ++++ 5 files changed, 59 insertions(+), 26 deletions(-) -- 2.24.1	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	182e9dde0d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-12-20 17:00:58 -08:00
Brendan Jackman	30e2c16571	libbpf: Expose libbpf ring_buffer epoll_fd This provides a convenient perf ringbuf -> libbpf ringbuf migration path for users of external polling systems. It is analogous to perf_buffer__epoll_fd. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201214113812.305274-1-jackmanb@google.com	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	ebcae62e7e	libbpf: Support modules in bpf_program__set_attach_target() API Support finding kernel targets in kernel modules when using bpf_program__set_attach_target() API. This brings it up to par with what libbpf supports when doing declarative SEC()-based target determination. Some minor internal refactoring was needed to make sure vmlinux BTF can be loaded before bpf_object's load phase. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201211215825.3646154-2-andrii@kernel.org	2020-12-20 17:00:58 -08:00
Florent Revest	252ad1f3eb	bpf: Add a bpf_sock_from_file helper While eBPF programs can check whether a file is a socket by file->f_op == &socket_file_ops, they cannot convert the void *private_data pointer to a struct socket BTF pointer. In order to do this a new helper wrapping sock_from_file is added. This is useful to tracing programs but also other program types inheriting this set of helpers such as iterators or LSM programs. Signed-off-by: Florent Revest <revest@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201204113609.1850150-2-revest@google.com	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	3e68c60659	bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers Remove bpf_ prefix, which causes these helpers to be reported in verifier dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Let's fix it as long as it is still possible before UAPI freezes on these helpers. Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-20 17:00:58 -08:00
Andrii Nakryiko	42baefba71	vmtests: update blacklist for 5.5 Blacklist selftests relying on newer kernel's features. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	46ecf7aef3	vmtest: omit building bpf_testmod.ko on non-latest kernels Non-latest kernel versions don't build kernel from sources, so module building fails, despite using `make prepare`. For now, just make sure no module is built by overwriting bpf_testmod/Makefile. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	2981bb8d26	vmtests: update vmlinux.h to latest version Update vmlinux.h to get some of BPF UAPI constants needed for the compilation of new selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	2042df2fed	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad Checkpoint bpf-next commit: 5c667dca71095abec90420eb09503f35f66c9585 Baseline bpf commit: d3bec0138bfbe58606fc1d6f57a4cdc1a20218db Checkpoint bpf commit: 12c8a8ca117f3d734babc3fba131fdaa329d2163 Alan Maguire (1): libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types() Andrei Matei (1): libbpf: Fail early when loading programs with unspecified type Andrii Nakryiko (11): bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO libbpf: Don't attempt to load unused subprog as an entry-point BPF program libbpf: Add base BTF accessor libbpf: Add internal helper to load BTF data by FD libbpf: Refactor CO-RE relocs to not assume a single BTF object libbpf: Add kernel module BTF support for CO-RE relocations bpf: Allow to specify kernel module BTFs when attaching BPF programs libbpf: Factor out low-level BPF program loading helper libbpf: Support attachment of BPF tracing programs to kernel modules libbpf: Use memcpy instead of strncpy to please GCC libbpf: Fix ring_buffer__poll() to return number of consumed samples Dmitrii Banshchikov (1): bpf: Add bpf_ktime_get_coarse_ns helper KP Singh (5): bpf: Implement task local storage libbpf: Add support for task local storage bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID bpf: Add bpf_bprm_opts_set helper bpf: Add a BPF helper for getting the IMA hash of an inode Li RongQing (1): libbpf: Add support for canceling cached_cons advance Magnus Karlsson (1): libbpf: Replace size_t with __u32 in xsk interfaces Mariusz Dudek (1): libbpf: Separate XDP program load with xsk socket creation Stanislav Fomichev (1): libbpf: Cap retries in sys_bpf_prog_load Thomas Karlsson (1): macvlan: Support for high multicast packet rate Toke Høiland-Jørgensen (1): libbpf: Sanitise map names before pinning include/uapi/linux/bpf.h \| 96 +++++- include/uapi/linux/if_link.h \| 2 + src/bpf.c \| 104 +++++-- src/btf.c \| 74 +++-- src/btf.h \| 1 + src/libbpf.c \| 550 +++++++++++++++++++++++++++-------- src/libbpf.map \| 3 + src/libbpf_internal.h \| 31 ++ src/libbpf_probes.c \| 1 + src/ringbuf.c \| 2 +- src/xsk.c \| 92 +++++- src/xsk.h \| 22 +- 12 files changed, 771 insertions(+), 207 deletions(-) -- 2.24.1	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	8c2c4c3451	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	21ae7bb113	libbpf: Fix ring_buffer__poll() to return number of consumed samples Fix ring_buffer__poll() to return the number of non-discarded records consumed, just like its documentation states. It's also consistent with ring_buffer__consume() return. Fix up selftests with wrong expected results. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Fixes: cb1c9ddd5525 ("selftests/bpf: Add BPF ringbuf selftests") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201130223336.904192-1-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	b2a34784b2	libbpf: Use memcpy instead of strncpy to please GCC Some versions of GCC are really nit-picky about strncpy() use. Use memcpy(), as they are pretty much equivalent for the case of fixed length strings. Fixes: e459f49b4394 ("libbpf: Separate XDP program load with xsk socket creation") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203235440.2302137-1-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	d95b12da56	libbpf: Support attachment of BPF tracing programs to kernel modules Teach libbpf to search for BTF types in kernel modules for tracing BPF programs. This allows attachment of raw_tp/fentry/fexit/fmod_ret/etc BPF program types to tracepoints and functions in kernel modules. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-13-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	a1fd6dab54	libbpf: Factor out low-level BPF program loading helper Refactor low-level API for BPF program loading to not rely on public API types. This allows painless extension without constant efforts to cleverly not break backwards compatibility. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-12-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	fde1be5a9c	bpf: Allow to specify kernel module BTFs when attaching BPF programs Add ability for user-space programs to specify non-vmlinux BTF when attaching BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this, attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD of a module or vmlinux BTF object. For backwards compatibility reasons, 0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	6b08519a69	libbpf: Add kernel module BTF support for CO-RE relocations Teach libbpf to search for candidate types for CO-RE relocations across kernel module BTFs, in addition to vmlinux BTF. If at least one candidate type is found in vmlinux BTF, kernel module BTFs are not iterated. If vmlinux BTF has no matching candidates, then find all kernel module BTFs and search for all matching candidates across all of them. Kernel's support for module BTFs is inferred from the support for BTF name pointer in BPF UAPI. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-6-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	aff8028b6e	libbpf: Refactor CO-RE relocs to not assume a single BTF object Refactor CO-RE relocation candidate search to not expect a single BTF, rather return all candidate types with their corresponding BTF objects. This will allow to extend CO-RE relocations to accommodate kernel module BTFs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201203204634.1325171-5-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	10e321f100	libbpf: Add internal helper to load BTF data by FD Add a btf_get_from_fd() helper, which constructs struct btf from in-kernel BTF data by FD. This is used for loading module BTFs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201203204634.1325171-4-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Stanislav Fomichev	8051a539d8	libbpf: Cap retries in sys_bpf_prog_load I've seen a situation, where a process that's under pprof constantly generates SIGPROF which prevents program loading indefinitely. The right thing to do probably is to disable signals in the upper layers while loading, but it still would be nice to get some error from libbpf instead of an endless loop. Let's add some small retry limit to the program loading: try loading the program 5 (arbitrary) times and give up. v2: * 10 -> 5 retries (Andrii Nakryiko) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201202231332.3923644-1-sdf@google.com	2020-12-04 20:04:52 -08:00
Toke Høiland-Jørgensen	691c22dc0c	libbpf: Sanitise map names before pinning When we added sanitising of map names before loading programs to libbpf, we still allowed periods in the name. While the kernel will accept these for the map names themselves, they are not allowed in file names when pinning maps. This means that bpf_object__pin_maps() will fail if called on an object that contains internal maps (such as sections .rodata). Fix this by replacing periods with underscores when constructing map pin paths. This only affects the paths generated by libbpf when bpf_object__pin_maps() is called with a path argument. Any pin paths set by bpf_map__set_pin_path() are unaffected, and it will still be up to the caller to avoid invalid characters in those. Fixes: 113e6b7e15e2 ("libbpf: Sanitise internal map names so they are not rejected by the kernel") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201203093306.107676-1-toke@redhat.com	2020-12-04 20:04:52 -08:00
Andrei Matei	5fe9c1217a	libbpf: Fail early when loading programs with unspecified type Before this patch, a program with unspecified type (BPF_PROG_TYPE_UNSPEC) would be passed to the BPF syscall, only to have the kernel reject it with an opaque invalid argument error. This patch makes libbpf reject such programs with a nicer error message - in particular libbpf now tries to diagnose bad ELF section names at both open time and load time. Signed-off-by: Andrei Matei <andreimatei1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201203043410.59699-1-andreimatei1@gmail.com	2020-12-04 20:04:52 -08:00
Mariusz Dudek	78c76a1015	libbpf: Separate XDP program load with xsk socket creation Add support for separation of eBPF program load and xsk socket creation. This is needed for the use case where you want to provide as few privileges as possible to the data plane application that will handle xsk socket creation and incoming traffic. With this patch the data entity container can be run with only the CAP_NET_RAW capability to fulfill its purpose of creating the xsk socket and handling packets. In case your umem is larger than or equal to the process limit for MEMLOCK, you need to either increase the limit or add the CAP_IPC_LOCK capability. To resolve the privileges issue, two APIs are introduced: - xsk_setup_xdp_prog - loads the built-in XDP program. It can also return xsks_map_fd which is needed by an unprivileged process to update xsks_map with the AF_XDP socket "fd" - xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap for a particular xsk_socket Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20201203090546.11976-2-mariuszx.dudek@intel.com	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	a741bc6479	libbpf: Add base BTF accessor Add ability to get base BTF. It can be also used to check if BTF is split BTF. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Magnus Karlsson	65e4be6f5d	libbpf: Replace size_t with __u32 in xsk interfaces Replace size_t with __u32 in the xsk interfaces that contain this. There is no reason to have size_t since the internal variable that is manipulated is a __u32. The following APIs are affected: __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx) void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb) __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx) void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb) void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb) The "nb" variable and the return values have been changed from size_t to __u32. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1606383455-8243-1-git-send-email-magnus.karlsson@gmail.com	2020-12-04 20:04:52 -08:00
KP Singh	3a2739aa8a	bpf: Add a BPF helper for getting the IMA hash of an inode Provide a wrapper function to get the IMA hash of an inode. This helper is useful in fingerprinting files (e.g executables on execution) and using these fingerprints in detections like an executable unlinking itself. Since the ima_inode_hash can sleep, it's only allowed for sleepable LSM hooks. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201124151210.1081188-3-kpsingh@chromium.org	2020-12-04 20:04:52 -08:00
Li RongQing	dd2369d2a8	libbpf: Add support for canceling cached_cons advance Add a new function for returning descriptors the user received after an xsk_ring_cons__peek call. After the application has gotten a number of descriptors from a ring, it might not be able to or want to process them all for various reasons. Therefore, it would be useful to have an interface for returning or cancelling a number of them so that they are returned to the ring. This patch adds a new function called xsk_ring_cons__cancel that performs this operation on nb descriptors counted from the end of the batch of descriptors that was received through the peek call. Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> [ Magnus Karlsson: rewrote changelog ] Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/1606202474-8119-1-git-send-email-lirongqing@baidu.com	2020-12-04 20:04:52 -08:00
Dmitrii Banshchikov	39f5b2e75e	bpf: Add bpf_ktime_get_coarse_ns helper The helper uses CLOCK_MONOTONIC_COARSE source of time that is less accurate but more performant. We have a BPF CGROUP_SKB firewall that supports event logging through bpf_perf_event_output(). Each event has a timestamp and currently we use bpf_ktime_get_ns() for it. Use of bpf_ktime_get_coarse_ns() saves ~15-20 ns in time required for event logging. bpf_ktime_get_ns(): EgressLogByRemoteEndpoint 113.82ns 8.79M bpf_ktime_get_coarse_ns(): EgressLogByRemoteEndpoint 95.40ns 10.48M Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201117184549.257280-1-me@ubique.spb.ru	2020-12-04 20:04:52 -08:00
KP Singh	6969a44914	bpf: Add bpf_bprm_opts_set helper The helper allows modification of certain bits on the linux_binprm struct starting with the secureexec bit which can be updated using the BPF_F_BPRM_SECUREEXEC flag. secureexec can be set by the LSM for privilege gaining executions to set the AT_SECURE auxv for glibc. When set, the dynamic linker disables the use of certain environment variables (like LD_PRELOAD). Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201117232929.2156341-1-kpsingh@chromium.org	2020-12-04 20:04:52 -08:00
Alan Maguire	de2edae80d	libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types() When operating on split BTF, btf__find_by_name[_kind] will not iterate over all types since they use btf->nr_types to show the number of types to iterate over. For split BTF this is the number of types _on top of base BTF_, so it will underestimate the number of types to iterate over, especially for vmlinux + module BTF, where the latter is much smaller. Use btf__get_nr_types() instead. Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1605437195-2175-1-git-send-email-alan.maguire@oracle.com	2020-12-04 20:04:52 -08:00
Thomas Karlsson	2ea4ba9c96	macvlan: Support for high multicast packet rate Background: Broadcast and multicast packets are enqueued for later processing. This queue was previously hardcoded to 1000. This proved insufficient for handling very high packet rates. This resulted in packet drops for multicast, while at the same time unicast worked fine. The change: This patch makes the queue length adjustable to accommodate environments with very high multicast packet rate, but still keeps the default value of 1000 unless specified. The queue length is specified as a request per macvlan using the IFLA_MACVLAN_BC_QUEUE_LEN parameter. The actual used queue length will then be the maximum of any macvlan connected to the same port. The actual used queue length for the port can be retrieved (read only) by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification. This will be followed up by a patch to iproute2 in order to adjust the parameter from userspace. Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se> Link: https://lore.kernel.org/r/dd4673b2-7eab-edda-6815-85c67ce87f63@paneda.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	2dd5965052	libbpf: Don't attempt to load unused subprog as an entry-point BPF program If BPF code contains unused BPF subprogram and there are no other subprogram calls (which can realistically happen in real-world applications given sufficiently smart Clang code optimizations), libbpf will erroneously assume that subprograms are entry-point programs and will attempt to load them with UNSPEC program type. Fix by not relying on subcall instructions and rather detect it based on the structure of BPF object's sections. Fixes: 9a94f277c4fb ("tools: libbpf: restore the ability to load programs from .text section") Reported-by: Dmitrii Banshchikov <dbanschikov@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201107000251.256821-1-andrii@kernel.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	ef8820fea8	bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF objects in the system. To allow distinguishing vmlinux BTF (and later kernel module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as BTF name ("vmlinux" for vmlinux BTF, will equal to module's name for module BTF). We might want to later allow specifying BTF name for user-provided BTFs as well, if that makes sense. But currently this is reserved only for in-kernel BTFs. Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require in-kernel BTF type with ability to specify BTF types from kernel modules, not just vmlinux BTF. This will be implemented in a follow up patch set for fentry/fexit/fmod_ret/lsm/etc. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org	2020-12-04 20:04:52 -08:00
KP Singh	eae38a781c	bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID The currently available bpf_get_current_task returns an unsigned integer which can be used along with BPF_CORE_READ to read data from the task_struct but still cannot be used as an input argument to a helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct. In order to implement this helper a new return type, RET_PTR_TO_BTF_ID, is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not require checking the nullness of returned pointer. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-6-kpsingh@chromium.org	2020-12-04 20:04:52 -08:00
KP Singh	83c2c20acb	libbpf: Add support for task local storage Updates the bpf_probe_map_type API to also support BPF_MAP_TYPE_TASK_STORAGE similar to other local storage maps. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-4-kpsingh@chromium.org	2020-12-04 20:04:52 -08:00
KP Singh	00ae5bac8f	bpf: Implement task local storage Similar to bpf_local_storage for sockets and inodes add local storage for task_struct. The life-cycle of storage is managed with the life-cycle of the task_struct. i.e. the storage is destroyed along with the owning task with a callback to the bpf_task_storage_free from the task_free LSM hook. The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the security blob which are now stackable and can co-exist with other LSMs. The userspace map operations can be done by using a pid fd as a key passed to the lookup, update and delete operations. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20201106103747.2780972-3-kpsingh@chromium.org	2020-12-04 20:04:52 -08:00
Andrii Nakryiko	f99c252cbc	vmtest: update Kconfig to accommodate IMA test config test_progs's IMA selftests requires extra Kconfig values, so update latest.config to accommodate those. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-12-03 12:27:49 -08:00
Andrii Nakryiko	5ae2a2621c	readme: move gory sync details down and add libbpf-bootstrap references Move gory details about libbpf mirror and sync into a separate section at the bottom of README. Also add references to libbpf-bootstrap and blog about it, as well as libbpf-tools reference.	2020-11-29 13:34:03 -08:00
Andrii Nakryiko	5af3d86b5a	vmtests: blacklist two more tests on 5.5 tcpbpf_user uses cgroup bpf_link, not available in 5.5. hash_large_key is testing a more permissive verifier check, implemented in 5.11. So blacklist both. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	c55abf0752	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3cb12d27ff655e57e8efe3486dca2a22f4e30578 Checkpoint bpf-next commit: c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad Baseline bpf commit: c66dca98a24cb5f3493dd08d40bcfa94a220fa92 Checkpoint bpf commit: d3bec0138bfbe58606fc1d6f57a4cdc1a20218db Andrii Nakryiko (6): libbpf: Factor out common operations in BTF writing APIs libbpf: Unify and speed up BTF string deduplication libbpf: Implement basic split BTF support libbpf: Fix BTF data layout checks and allow empty BTF libbpf: Support BTF dedup of split BTFs libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays Ian Rogers (1): libbpf, hashmap: Fix undefined behavior in hash_bits Magnus Karlsson (2): libbpf: Fix null dereference in xsk_socket__delete libbpf: Fix possible use after free in xsk_socket__delete src/btf.c \| 807 +++++++++++++++++++++++++++++-------------------- src/btf.h \| 8 + src/hashmap.h \| 15 +- src/libbpf.map \| 9 + src/xsk.c \| 9 +- 5 files changed, 504 insertions(+), 344 deletions(-) -- 2.24.1	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	e30f758aab	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-11-05 21:20:45 -08:00
Magnus Karlsson	8caff995c7	libbpf: Fix possible use after free in xsk_socket__delete Fix a possible use after free in xsk_socket__delete that will happen if xsk_put_ctx() frees the ctx. To fix, save the umem reference taken from the context and just use that instead. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1604396490-12129-3-git-send-email-magnus.karlsson@gmail.com	2020-11-05 21:20:45 -08:00
Magnus Karlsson	539aa6bea5	libbpf: Fix null dereference in xsk_socket__delete Fix a possible null pointer dereference in xsk_socket__delete that will occur if a null pointer is fed into the function. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1604396490-12129-2-git-send-email-magnus.karlsson@gmail.com	2020-11-05 21:20:45 -08:00
Ian Rogers	224db2db07	libbpf, hashmap: Fix undefined behavior in hash_bits If bits is 0, the case when the map is empty, then the >> is the size of the register which is undefined behavior - on x86 it is the same as a shift by 0. Fix by handling the 0 case explicitly and guarding calls to hash_bits for empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe. Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Suggested-by: Andrii Nakryiko <andriin@fb.com>, Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	e6725d2467	libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays In some cases compiler seems to generate distinct DWARF types for identical arrays within the same CU. That seems like a bug, but it's already out there and breaks type graph equivalence checks, so accommodate it anyway by checking for identical arrays, regardless of their type ID. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-10-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	658ac1ec19	libbpf: Support BTF dedup of split BTFs Add support for deduplicating split BTFs. When deduplicating split BTF, base BTF is considered to be immutable and can't be modified or adjusted. 99% of BTF deduplication logic is left intact (modulo some type numbering adjustments). There are only two differences. First, each type in base BTF gets hashed (except VAR and DATASEC, of course, those are always considered to be self-canonical instances) and added into a table of canonical table candidates. Hashing is a shallow, fast operation, so mostly eliminates the overhead of having entire base BTF to be a part of BTF dedup. Second difference is very critical and subtle. While deduplicating split BTF types, it is possible to discover that one of immutable base BTF BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type from the split BTF part. This, obviously, can't happen because we can't modify the base BTF types anymore. So because of that, any type in split BTF that directly or indirectly references that newly-to-be-resolved FWD type can't be considered to be equivalent to the corresponding canonical types in base BTF, because that would result in a loss of type resolution information. So in such case, split BTF types will be deduplicated separately and will cause some duplication of type information, which is unavoidable. With those two changes, the rest of the algorithm manages to deduplicate split BTF correctly, pointing all the duplicates to their canonical counter-parts in base BTF, but also is deduplicating whatever unique types are present in split BTF on their own. Also, theoretically, split BTF after deduplication could end up with either empty type section or empty string section. This is handled by libbpf correctly in one of previous patches in the series. 
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-9-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	dd36215834	libbpf: Fix BTF data layout checks and allow empty BTF Make data section layout checks stricter, disallowing overlap of types and strings data. Additionally, allow BTFs with no type data. There is nothing inherently wrong with having BTF with no types (but potentially with some strings). This could be a situation with kernel module BTFs, if module doesn't introduce any new type information. Also fix invalid offset alignment check for btf->hdr->type_off. Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201105043402.2530976-8-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	2811d54f8b	libbpf: Implement basic split BTF support Support split BTF operation, in which one BTF (base BTF) provides basic set of types and strings, while another one (split BTF) builds on top of base's types and strings and adds its own new types and strings. From API standpoint, the fact that the split BTF is built on top of the base BTF is transparent. Type numeration is transparent. If the base BTF had last type ID #N, then all types in the split BTF start at type ID N+1. Any type in split BTF can reference base BTF types, but not vice versa. Programmatically construction of a split BTF on top of a base BTF is supported: one can create an empty split BTF with btf__new_empty_split() and pass base BTF as an input, or pass raw binary data to btf__new_split(), or use btf__parse_xxx_split() variants to get initial set of split types/strings from the ELF file with .BTF section. String offsets are similarly transparent and are a logical continuation of base BTF's strings. When building BTF programmatically and adding a new string (explicitly with btf__add_str() or implicitly through appending new types/members), string-to-be-added would first be looked up from the base BTF's string section and re-used if it's there. If not, it will be looked up and/or added to the split BTF string section. Similarly to type IDs, types in split BTF can refer to strings from base BTF absolutely transparently (but not vice versa, of course, because base BTF doesn't "know" about existence of split BTF). Internal type index is slightly adjusted to be zero-indexed, ignoring a fake [0] VOID type. This allows to handle split/base BTF type lookups transparently by using btf->start_id type ID offset, which is always 1 for base/non-split BTF and equals btf__get_nr_types(base_btf) + 1 for the split BTF. BTF deduplication is not yet supported for split BTF and support for it will be added in separate patch. 
Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	be2dc73ee2	libbpf: Unify and speed up BTF string deduplication Revamp BTF dedup's string deduplication to match the approach of writable BTF string management. This allows to transfer deduplicated strings index back to BTF object after deduplication without expensive extra memory copying and hash map re-construction. It also simplifies the code and speeds it up, because hashmap-based string deduplication is faster than sort + unique approach. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-4-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	4953827790	libbpf: Factor out common operations in BTF writing APIs Factor out committing of appended type data. Also extract fetching the very last type in the BTF (to append members to). These two operations are common across many APIs and will be easier to refactor with split BTF, if they are extracted into a single place. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20201105043402.2530976-2-andrii@kernel.org	2020-11-05 21:20:45 -08:00
Andrii Nakryiko	d1fd50d475	helpers: add `struct bpf_redir_neigh` forward declaration This avoids compilation warning if `struct bpf_redir_neigh` is not provided by other kernel headers. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-10-28 09:59:37 -07:00
Andrii Nakryiko	f0c6b6bdfb	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 376dcfe3a4e5a5475a84e6b5f926066a8614f887 Checkpoint bpf-next commit: 3cb12d27ff655e57e8efe3486dca2a22f4e30578 Baseline bpf commit: 28802e7c0c9954218d1830f7507edc9d49b03a00 Checkpoint bpf commit: c66dca98a24cb5f3493dd08d40bcfa94a220fa92 Daniel Borkmann (1): bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static Toke Høiland-Jørgensen (1): bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop include/uapi/linux/bpf.h \| 22 ++++++++++++++++++---- src/bpf_helpers.h \| 2 ++ 2 files changed, 20 insertions(+), 4 deletions(-) -- 2.24.1	2020-10-28 09:08:35 -07:00
Andrii Nakryiko	475ee87969	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-10-28 09:08:35 -07:00
Daniel Borkmann	f754860e35	bpf, libbpf: Guard bpf inline asm from bpf_tail_call_static Yaniv reported a compilation error after pulling latest libbpf: [...] ../libbpf/src/root/usr/include/bpf/bpf_helpers.h:99:10: error: unknown register name 'r0' in asm : "r0", "r1", "r2", "r3", "r4", "r5"); [...] The issue got triggered given Yaniv was compiling tracing programs with native target (e.g. x86) instead of BPF target, hence no BTF generated vmlinux.h nor CO-RE used, and later llc with -march=bpf was invoked to compile from LLVM IR to BPF object file. Given that clang was expecting x86 inline asm and not BPF one the error complained that these regs don't exist on the former. Guard bpf_tail_call_static() with defined(__bpf__) where BPF inline asm is valid to use. BPF tracing programs on more modern kernels use BPF target anyway and thus the bpf_tail_call_static() function will be available for them. BPF inline asm is supported since clang 7 (clang <= 6 otherwise throws same above error), and __bpf_unreachable() since clang 8, therefore include the latter condition in order to prevent compilation errors for older clang versions. Given even an old Ubuntu 18.04 LTS has official LLVM packages all the way up to llvm-10, I did not bother to special case the __bpf_unreachable() inside bpf_tail_call_static() further. Also, undo the sockex3_kern's use of bpf_tail_call_static() sample given they still have the old hacky way to even compile networking progs with native instead of BPF target so bpf_tail_call_static() won't be defined there anymore. 
Fixes: 0e9f6841f664 ("bpf, libbpf: Add bpf_tail_call_static helper for bpf programs") Reported-by: Yaniv Agman <yanivagman@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Tested-by: Yaniv Agman <yanivagman@gmail.com> Link: https://lore.kernel.org/bpf/CAMy7=ZUk08w5Gc2Z-EKi4JFtuUCaZYmE4yzhJjrExXpYKR4L8w@mail.gmail.com Link: https://lore.kernel.org/bpf/20201021203257.26223-1-daniel@iogearbox.net	2020-10-28 09:08:35 -07:00
Toke Høiland-Jørgensen	78d61150e9	bpf: Fix bpf_redirect_neigh helper api to support supplying nexthop Based on the discussion in [0], update the bpf_redirect_neigh() helper to accept an optional parameter specifying the nexthop information. This makes it possible to combine bpf_fib_lookup() and bpf_redirect_neigh() without incurring a duplicate FIB lookup - since the FIB lookup helper will return the nexthop information even if no neighbour is present, this can simply be passed on to bpf_redirect_neigh() if bpf_fib_lookup() returns BPF_FIB_LKUP_RET_NO_NEIGH. Thus fix & extend it before helper API is frozen. [0] https://lore.kernel.org/bpf/393e17fc-d187-3a8d-2f0d-a627c7c63fca@iogearbox.net/ Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/160322915615.32199.1187570224032024535.stgit@toke.dk	2020-10-28 09:08:35 -07:00
Andrii Nakryiko	49280406a2	readme: add Ubuntu mentions Ubuntu 20.10 is now a good version to do BPF + CO-RE development.	2020-10-26 21:16:14 -07:00
Andrii Nakryiko	de58d0cccf	sync: update 5.5.0 blacklist Blacklist 2 new selftests, which depend on 5.10 kernel. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	6fa81d4dbe	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f4d385e4d51d035c7f0d68a3e9564c9453c13aa4 Checkpoint bpf-next commit: 376dcfe3a4e5a5475a84e6b5f926066a8614f887 Baseline bpf commit: 9cf51446e68607136e42a4e531a30c888c472463 Checkpoint bpf commit: 28802e7c0c9954218d1830f7507edc9d49b03a00 Andrii Nakryiko (3): libbpf: Skip CO-RE relocations for not loaded BPF programs libbpf: Support safe subset of load/store instruction resizing with CO-RE libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override Daniel Borkmann (3): bpf: Improve bpf_redirect_neigh helper description bpf: Add redirect_peer helper bpf: Allow for map-in-map with dynamic inner array map entries Hangbin Liu (2): libbpf: Close map fd if init map slots failed libbpf: Check if pin_path was set even map fd exist Hao Luo (4): bpf: Introduce pseudo_btf_id bpf/libbpf: BTF support for typed ksyms bpf: Introduce bpf_per_cpu_ptr() bpf: Introducte bpf_this_cpu_ptr() Jakub Wilk (1): bpf: Fix typo in uapi/linux/bpf.h Luigi Rizzo (1): bpf, libbpf: Use valid btf in bpf_program__set_attach_target Magnus Karlsson (1): libbpf: Fix compatibility problem in xsk_socket__create Nikita V. Shirokov (1): bpf: Add tcp_notsent_lowat bpf setsockopt Song Liu (1): bpf: Introduce BPF_F_PRESERVE_ELEMS for perf event array include/uapi/linux/bpf.h \| 104 ++++++++++-- src/libbpf.c \| 348 ++++++++++++++++++++++++++++++++------- src/xsk.c \| 7 +- 3 files changed, 385 insertions(+), 74 deletions(-) -- 2.24.1	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	bc94c2b82f	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-10-12 14:27:04 -07:00
Daniel Borkmann	d47094a2ce	bpf: Allow for map-in-map with dynamic inner array map entries Recent work in f4d05259213f ("bpf: Add map_meta_equal map ops") and 134fede4eecf ("bpf: Relax max_entries check for most of the inner map types") added support for dynamic inner max elements for most map-in-map types. Exceptions were maps like array or prog array where the map_gen_lookup() callback uses the maps' max_entries field as a constant when emitting instructions. We recently implemented Maglev consistent hashing into Cilium's load balancer which uses map-in-map with an outer map being hash and inner being array holding the Maglev backend table for each service. This has been designed this way in order to reduce overall memory consumption given the outer hash map allows to avoid preallocating a large, flat memory area for all services. Also, the number of service mappings is not always known a-priori. The use case for dynamic inner array map entries is to further reduce memory overhead, for example, some services might just have a small number of back ends while others could have a large number. Right now the Maglev backend table for small and large number of backends would need to have the same inner array map entries which adds a lot of unneeded overhead. Dynamic inner array map entries can be realized by avoiding the inlined code generation for their lookup. The lookup will still be efficient since it will be calling into array_map_lookup_elem() directly and thus avoiding retpoline. The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips inline code generation and relaxes array_map_meta_equal() check to ignore both maps' max_entries. This also still allows to have faster lookups for map-in-map when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed. 
Example code generation where inner map is dynamic sized array: # bpftool p d x i 125 int handle__sys_enter(void * ctx): ; int handle__sys_enter(void *ctx) 0: (b4) w1 = 0 ; int key = 0; 1: (63) *(u32 *)(r10 -4) = r1 2: (bf) r2 = r10 ; 3: (07) r2 += -4 ; inner_map = bpf_map_lookup_elem(&outer_arr_dyn, &key); 4: (18) r1 = map[id:468] 6: (07) r1 += 272 7: (61) r0 = *(u32 *)(r2 +0) 8: (35) if r0 >= 0x3 goto pc+5 9: (67) r0 <<= 3 10: (0f) r0 += r1 11: (79) r0 = *(u64 *)(r0 +0) 12: (15) if r0 == 0x0 goto pc+1 13: (05) goto pc+1 14: (b7) r0 = 0 15: (b4) w6 = -1 ; if (!inner_map) 16: (15) if r0 == 0x0 goto pc+6 17: (bf) r2 = r10 ; 18: (07) r2 += -4 ; val = bpf_map_lookup_elem(inner_map, &key); 19: (bf) r1 = r0 \| No inlining but instead 20: (85) call array_map_lookup_elem#149280 \| call to array_map_lookup_elem() ; return val ? val : -1; \| for inner array lookup. 21: (15) if r0 == 0x0 goto pc+1 ; return val ? val : -1; 22: (61) r6 = *(u32 *)(r0 +0) ; } 23: (bc) w0 = w6 24: (95) exit Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net	2020-10-12 14:27:04 -07:00
Daniel Borkmann	4672fb6790	bpf: Add redirect_peer helper Add an efficient ingress to ingress netns switch that can be used out of tc BPF programs in order to redirect traffic from host ns ingress into a container veth device ingress without having to go via CPU backlog queue [0]. For local containers this can also be utilized and path via CPU backlog queue only needs to be taken once, not twice. On a high level this borrows from ipvlan which does similar switch in __netif_receive_skb_core() and then iterates via another_round. This helps to reduce latency for mentioned use cases. Pod to remote pod with redirect(), TCP_RR [1]: # percpu_netperf 10.217.1.33 RT_LATENCY: 122.450 (per CPU: 122.666 122.401 122.333 122.401 ) MEAN_LATENCY: 121.210 (per CPU: 121.100 121.260 121.320 121.160 ) STDDEV_LATENCY: 120.040 (per CPU: 119.420 119.910 125.460 115.370 ) MIN_LATENCY: 46.500 (per CPU: 47.000 47.000 47.000 45.000 ) P50_LATENCY: 118.500 (per CPU: 118.000 119.000 118.000 119.000 ) P90_LATENCY: 127.500 (per CPU: 127.000 128.000 127.000 128.000 ) P99_LATENCY: 130.750 (per CPU: 131.000 131.000 129.000 132.000 ) TRANSACTION_RATE: 32666.400 (per CPU: 8152.200 8169.842 8174.439 8169.897 ) Pod to remote pod with redirect_peer(), TCP_RR: # percpu_netperf 10.217.1.33 RT_LATENCY: 44.449 (per CPU: 43.767 43.127 45.279 45.622 ) MEAN_LATENCY: 45.065 (per CPU: 44.030 45.530 45.190 45.510 ) STDDEV_LATENCY: 84.823 (per CPU: 66.770 97.290 84.380 90.850 ) MIN_LATENCY: 33.500 (per CPU: 33.000 33.000 34.000 34.000 ) P50_LATENCY: 43.250 (per CPU: 43.000 43.000 43.000 44.000 ) P90_LATENCY: 46.750 (per CPU: 46.000 47.000 47.000 47.000 ) P99_LATENCY: 52.750 (per CPU: 51.000 54.000 53.000 53.000 ) TRANSACTION_RATE: 90039.500 (per CPU: 22848.186 23187.089 22085.077 21919.130 ) [0] https://linuxplumbersconf.org/event/7/contributions/674/attachments/568/1002/plumbers_2020_cilium_load_balancer.pdf [1] https://github.com/borkmann/netperf_scripts/blob/master/percpu_netperf Signed-off-by: Daniel Borkmann 
<daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201010234006.7075-3-daniel@iogearbox.net	2020-10-12 14:27:04 -07:00
Daniel Borkmann	a8a505a36f	bpf: Improve bpf_redirect_neigh helper description Follow-up to address David's feedback that we should better describe internals of the bpf_redirect_neigh() helper. Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Link: https://lore.kernel.org/bpf/20201010234006.7075-2-daniel@iogearbox.net	2020-10-12 14:27:04 -07:00
Nikita V. Shirokov	e3b9cf7aaa	bpf: Add tcp_notsent_lowat bpf setsockopt Adding support for TCP_NOTSENT_LOWAT sockoption (https://lwn.net/Articles/560082/) in tcp bpf programs. Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201009070325.226855-1-tehnerd@tehnerd.com	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	76764b891b	libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override Use generalized BTF parsing logic, making it possible to parse BTF both from ELF file, as well as a raw BTF dump. This makes it easier to write custom tests with manually generated BTFs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-4-andrii@kernel.org	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	8ef6a6e709	libbpf: Support safe subset of load/store instruction resizing with CO-RE Add support for patching instructions of the following form: - rX = *(T *)(rY + <off>); - *(T *)(rX + <off>) = rY; - *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. For such instructions, if the actual kernel field recorded in CO-RE relocation has a different size than the one recorded locally (e.g., from vmlinux.h), then libbpf will adjust T to an appropriate 1-, 2-, 4-, or 8-byte loads. In general, such transformation is not always correct and could lead to invalid final value being loaded or stored. But two classes of cases are always safe: - if both local and target (kernel) types are unsigned integers, but of different sizes, then it's OK to adjust load/store instruction according to the necessary memory size. Zero-extending nature of such instructions and unsignedness make sure that the final value is always correct; - pointer size mismatch between BPF target architecture (which is always 64-bit) and 32-bit host kernel architecture can be similarly resolved automatically, because pointer is essentially an unsigned integer. Loading 32-bit pointer into 64-bit BPF register with zero extension will leave correct pointer in the register. Both cases are necessary to support CO-RE on 32-bit kernels, as `unsigned long` in vmlinux.h generated from 32-bit kernel is 32-bit, but when compiled with BPF program for BPF target it will be treated by compiler as 64-bit integer. Similarly, pointers in vmlinux.h are 32-bit for kernel, but treated as 64-bit values by compiler for BPF target. Both problems are now resolved by libbpf for direct memory reads. But similar transformations are useful in general when kernel fields are "resized" from, e.g., unsigned int to unsigned long (or vice versa). Now, similar transformations for signed integers are not safe to perform as they will result in incorrect sign extension of the value. 
If such a situation is detected, libbpf will emit a helpful message and will poison the instruction. Not failing immediately means that it's possible to guard the instruction based on kernel version (or other conditions) and make sure it's not reachable. If there is a need to read signed integers that change sizes between different kernels, it's possible to use BPF_CORE_READ_BITFIELD() macro, which works both with bitfields and non-bitfield integers of any signedness and handles sign-extension properly. Also, bpf_core_read() with proper size and/or use of bpf_core_field_size() relocation could allow to deal with such complicated situations explicitly, if not so conveniently as direct memory reads. Selftests added in a separate patch in progs/test_core_autosize.c demonstrate both direct memory and probed use cases. BPF_CORE_READ() is not changed and it won't deal with such situations as automatically as direct memory reads due to the signedness integer limitations, which are much harder to detect and control with compiler macro magic. So it's encouraged to utilize direct memory reads as much as possible. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-3-andrii@kernel.org	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	44d5bc1709	libbpf: Skip CO-RE relocations for not loaded BPF programs Bypass CO-RE relocations step for BPF programs that are not going to be loaded. This allows to have BPF programs compiled in and disabled dynamically if kernel is not supposed to provide enough relocation information. In such case, there won't be unnecessary warnings about failed relocations. Fixes: d929758101fc ("libbpf: Support disabling auto-loading BPF programs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-2-andrii@kernel.org	2020-10-12 14:27:04 -07:00
Magnus Karlsson	95848b59b9	libbpf: Fix compatibility problem in xsk_socket__create Fix a compatibility problem when the old XDP_SHARED_UMEM mode is used together with the xsk_socket__create() call. In the old XDP_SHARED_UMEM mode, only sharing of the same device and queue id was allowed, and in this mode, the fill ring and completion ring were shared between the AF_XDP sockets. Therefore, it was perfectly fine to call the xsk_socket__create() API for each socket and not use the new xsk_socket__create_shared() API. This behavior was ruined by the commit introducing XDP_SHARED_UMEM support between different devices and/or queue ids. This patch restores the ability to use xsk_socket__create in these circumstances so that backward compatibility is not broken. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1602070946-11154-1-git-send-email-magnus.karlsson@gmail.com	2020-10-12 14:27:04 -07:00
Jakub Wilk	1bc08143b5	bpf: Fix typo in uapi/linux/bpf.h Reported-by: Samanta Navarro <ferivoz@riseup.net> Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201007055717.7319-1-jwilk@jwilk.net	2020-10-12 14:27:04 -07:00
Luigi Rizzo	b9682e291d	bpf, libbpf: Use valid btf in bpf_program__set_attach_target bpf_program__set_attach_target(prog, fd, ...) will always fail when fd = 0 (attach to a kernel symbol) because obj->btf_vmlinux is NULL and there is no way to set it (at the moment btf_vmlinux is meant to be temporary storage for use in bpf_object__load_xattr()). Fix this by using libbpf_find_vmlinux_btf_id(). At some point we may want to opportunistically cache btf_vmlinux so it can be reused with multiple programs. Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Petar Penkov <ppenkov@google.com> Link: https://lore.kernel.org/bpf/20201005224528.389097-1-lrizzo@google.com	2020-10-12 14:27:04 -07:00
Hangbin Liu	54fe2f1e26	libbpf: Check if pin_path was set even map fd exist Say a user reuses a map fd after creating a map manually and sets the pin_path, then loads the object via libbpf. In libbpf's bpf_object__create_maps(), bpf_object__reuse_map() will return 0 if there is no pinned map in map->pin_path. Then, after checking whether the map fd exists, we should also check if pin_path was set and do bpf_map__pin() instead of continuing the loop. Fix it by creating the map if the fd does not exist and continuing to check pin_path after that. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-3-liuhangbin@gmail.com	2020-10-12 14:27:04 -07:00
Hangbin Liu	fd28e0130a	libbpf: Close map fd if init map slots failed Previously we forgot to close the map fd if bpf_map_update_elem() failed during map slot init, which will leak map fd. Let's move map slot initialization to new function init_map_slots() to simplify the code. And close the map fd if init slot failed. Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-2-liuhangbin@gmail.com	2020-10-12 14:27:04 -07:00
Hao Luo	f908087023	bpf: Introducte bpf_this_cpu_ptr() Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This helper always returns a valid pointer, therefore no need to check returned value for NULL. Also note that all programs run with preemption disabled, which means that the returned pointer is stable during all the execution of the program. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-6-haoluo@google.com	2020-10-12 14:27:04 -07:00
Hao Luo	b3b297aa16	bpf: Introduce bpf_per_cpu_ptr() Add bpf_per_cpu_ptr() to help bpf programs access percpu vars. bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel except that it may return NULL. This happens when the cpu parameter is out of range. So the caller must check the returned value. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com	2020-10-12 14:27:04 -07:00
Hao Luo	6d0fcc3bd5	bpf/libbpf: BTF support for typed ksyms If a ksym is defined with a type, libbpf will try to find the ksym's btf information from kernel btf. If a valid btf entry for the ksym is found, libbpf can pass in the found btf id to the verifier, which validates the ksym's type and value. Typeless ksyms (i.e. those defined as 'void') will not have such btf_id, but it has the symbol's address (read from kallsyms) and its value is treated as a raw pointer. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-3-haoluo@google.com	2020-10-12 14:27:04 -07:00
Hao Luo	3706bf773b	bpf: Introduce pseudo_btf_id Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a ksym so that further dereferences on the ksym can use the BTF info to validate accesses. Internally, when seeing a pseudo_btf_id ld insn, the verifier reads the btf_id stored in the insn[0]'s imm field and marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND, which is encoded in btf_vmlinux by pahole. If the VAR is not of a struct type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID and the mem_size is resolved to the size of the VAR's type. From the VAR btf_id, the verifier can also read the address of the ksym's corresponding kernel var from kallsyms and use that to fill dst_reg. Therefore, the proper functionality of pseudo_btf_id depends on (1) kallsyms and (2) the encoding of kernel global VARs in pahole, which should be available since pahole v1.18. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-2-haoluo@google.com	2020-10-12 14:27:04 -07:00
Song Liu	09718f4ecd	bpf: Introduce BPF_F_PRESERVE_ELEMS for perf event array Currently, a perf event in a perf event array is removed from the array when the map fd used to add the event is closed. This behavior makes it difficult to share perf events with a perf event array. Introduce a perf event map that keeps the perf event open with a new flag BPF_F_PRESERVE_ELEMS. With this flag set, perf events in the array are not removed when the original map fd is closed. Instead, the perf event will stay in the map until 1) it is explicitly removed from the array; or 2) the array is freed. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200930224927.1936644-2-songliubraving@fb.com	2020-10-12 14:27:04 -07:00
Andrii Nakryiko	8205f37a56	sync: ignore libc_compat.h Libbpf doesn't rely on libc_compat.h anymore, so ignore it for the purposes of syncing libbpf sources into Github. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-10-12 12:18:53 -07:00
Andrii Nakryiko	ecbd504994	makefile: add quiet mode support Add quiet-by-default mode to Makefile, similar to libbpf Makefile in Linux repo. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2020-10-11 00:39:03 -07:00
Andrii Nakryiko	b6dd2f2b7d	vmtests: un-blacklist fixed selftests Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-30 18:19:55 -07:00
Andrii Nakryiko	a132697261	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b0efc216f577997bf563d76d51673ed79c3d5f71 Checkpoint bpf-next commit: f4d385e4d51d035c7f0d68a3e9564c9453c13aa4 Baseline bpf commit: 9cf51446e68607136e42a4e531a30c888c472463 Checkpoint bpf commit: 9cf51446e68607136e42a4e531a30c888c472463 Andrii Nakryiko (1): libbpf: Make btf_dump work with modifiable BTF Daniel Borkmann (3): bpf: Add classid helper only based on skb->sk bpf: Add redirect_neigh helper as redirect drop-in bpf, libbpf: Add bpf_tail_call_static helper for bpf programs include/uapi/linux/bpf.h \| 24 ++++++++++++++ src/bpf_helpers.h \| 46 +++++++++++++++++++++++++++ src/btf.c \| 17 ++++++++++ src/btf_dump.c \| 69 +++++++++++++++++++++++++++------------- src/libbpf_internal.h \| 1 + 5 files changed, 135 insertions(+), 22 deletions(-) -- 2.24.1	2020-09-30 18:19:55 -07:00
Andrii Nakryiko	2d0aa12ea3	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-09-30 18:19:55 -07:00
Andrii Nakryiko	317ef1c295	libbpf: Make btf_dump work with modifiable BTF Ensure that btf_dump can accommodate new BTF types being appended to BTF instance after struct btf_dump was created. This came up during an attempt to use btf_dump for raw type dumping in selftests, but given changes are not excessive, it's good to not have any gotchas in API usage, so I decided to support such use case in general. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929232843.1249318-2-andriin@fb.com	2020-09-30 18:19:55 -07:00
Daniel Borkmann	80c7838600	bpf, libbpf: Add bpf_tail_call_static helper for bpf programs Port of tail_call_static() helper function from Cilium's BPF code base [0] to libbpf, so others can easily consume it as well. We've been using this in production code for some time now. The main idea is that we guarantee that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the JITed BPF insns with direct jumps instead of having to fall back to using expensive retpolines. By using inline asm, we guarantee that the compiler won't merge the call from different paths with potentially different content of r2/r3. We're also using Cilium's __throw_build_bug() macro (here as: __bpf_unreachable()) in different places as a neat trick to trigger compilation errors when compiler does not remove code at compilation time. This works for the BPF back end as it does not implement the __builtin_trap(). [0] `f5537c2602` Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1656a082e077552eb46642d513b4a6bde9a7dd01.1601477936.git.daniel@iogearbox.net	2020-09-30 18:19:55 -07:00
Daniel Borkmann	750801a0d5	bpf: Add redirect_neigh helper as redirect drop-in Add a redirect_neigh() helper as redirect() drop-in replacement for the xmit side. Main idea for the helper is to be very similar in semantics to the latter just that the skb gets injected into the neighboring subsystem in order to let the stack do the work it knows best anyway to populate the L2 addresses of the packet and then hand over to dev_queue_xmit() as redirect() does. This solves two bigger items: i) skbs don't need to go up to the stack on the host facing veth ingress side for traffic egressing the container to achieve the same for populating L2 which also has the huge advantage that ii) the skb->sk won't get orphaned in ip_rcv_core() when entering the IP routing layer on the host stack. Given that skb->sk neither gets orphaned when crossing the netns as per 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb.") the helper can then push the skbs directly to the phys device where FQ scheduler can do its work and TCP stack gets proper backpressure given we hold on to skb->sk as long as skb is still residing in queues. With the helper used in BPF data path to then push the skb to the phys device, I observed a stable/consistent TCP_STREAM improvement on veth devices for traffic going container -> host -> host -> container from ~10Gbps to ~15Gbps for a single stream in my test environment. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/bpf/f207de81629e1724899b73b8112e0013be782d35.1601477936.git.daniel@iogearbox.net	2020-09-30 18:19:55 -07:00
Daniel Borkmann	b5fd4c774d	bpf: Add classid helper only based on skb->sk Similarly to 5a52ae4e32a6 ("bpf: Allow to retrieve cgroup v1 classid from v2 hooks"), add a helper to retrieve cgroup v1 classid solely based on the skb->sk, so it can be used as key as part of BPF map lookups out of tc from host ns, in particular given the skb->sk is retained these days when crossing net ns thanks to 9c4c325252c5 ("skbuff: preserve sock reference when scrubbing the skb."). This is similar to bpf_skb_cgroup_id() which implements the same for v2. Kubernetes ecosystem is still operating on v1 however, hence net_cls needs to be used there until this can be dropped in with the v2 helper of bpf_skb_cgroup_id(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@iogearbox.net	2020-09-30 18:19:55 -07:00
Vladimír Čunát	5a10cd2060	remove internal reallocarray() ... as it's covered by libbpf_reallocarray() since commit `dc70da9c70`.	2020-09-30 12:55:50 -07:00
Andrii Nakryiko	ff797cc905	vmtests: blacklist new tests for 5.5 Blacklist new tests that are depending on features in latest kernel. Also temporarily blacklist raw_tp_test_run test, until it is fixed upstream. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	21ea184818	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4 Checkpoint bpf-next commit: b0efc216f577997bf563d76d51673ed79c3d5f71 Baseline bpf commit: 87f92ac4c12758c4da3bbe4393f1d884b610b8a6 Checkpoint bpf commit: 9cf51446e68607136e42a4e531a30c888c472463 Alan Maguire (2): bpf: Add bpf_snprintf_btf helper bpf: Add bpf_seq_printf_btf helper Andrii Nakryiko (11): libbpf: Refactor internals of BTF type index libbpf: Remove assumption of single contiguous memory for BTF data libbpf: Generalize common logic for managing dynamically-sized arrays libbpf: Extract generic string hashing function for reuse libbpf: Allow modification of BTF and add btf__add_str API libbpf: Add btf__new_empty() to create an empty BTF object libbpf: Add BTF writing APIs libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset selftests/bpf: Test BTF writing APIs libbpf: Support BTF loading and raw data output in both endianness libbpf: Fix uninitialized variable in btf_parse_type_sec Martin KaFai Lau (4): bpf: Change bpf_sk_release and bpf_sk_cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON bpf: Change bpf_sk_storage_() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON Song Liu (3): bpf: Fix comment for helper bpf_current_task_under_cgroup() bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint libbpf: Support test run of raw tracepoint programs Toke Høiland-Jørgensen (2): bpf: Support attaching freplace programs to multiple attach points libbpf: Add support for freplace attachment in bpf_link_create YiFei Zhu (2): bpf: Add BPF_PROG_BIND_MAP syscall libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section Yonghong Song (1): libbpf: Fix a compilation error with xsk.c for ubuntu 16.04 include/uapi/linux/bpf.h \| 118 ++- 
src/bpf.c \| 67 +- src/bpf.h \| 39 +- src/btf.c \| 1851 ++++++++++++++++++++++++++++++++------ src/btf.h \| 51 ++ src/btf_dump.c \| 9 +- src/hashmap.h \| 12 + src/libbpf.c \| 113 ++- src/libbpf.h \| 3 + src/libbpf.map \| 28 + src/libbpf_internal.h \| 8 + src/xsk.c \| 1 + 12 files changed, 1997 insertions(+), 303 deletions(-) -- 2.24.1	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	760f71ec87	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	91e666c94c	libbpf: Fix uninitialized variable in btf_parse_type_sec Fix obvious uninitialized variable use that wasn't reported by compiler. libbpf Makefile changes to catch such errors are added separately. Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200929220604.833631-1-andriin@fb.com	2020-09-29 18:29:49 -07:00
Toke Høiland-Jørgensen	e40af4de0c	libbpf: Add support for freplace attachment in bpf_link_create This adds support for supplying a target btf ID for the bpf_link_create() operation, and adds a new bpf_program__attach_freplace() high-level API for attaching freplace functions with a target. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355387.48470.18026176785351166890.stgit@toke.dk	2020-09-29 18:29:49 -07:00
Toke Høiland-Jørgensen	5e359219aa	bpf: Support attaching freplace programs to multiple attach points This enables support for attaching freplace programs to multiple attach points. It does this by amending the UAPI for bpf_link_create with a target btf ID that can be used to supply the new attachment point along with the target program fd. The target must be compatible with the target that was supplied at program load time. The implementation reuses the checks that were factored out of check_attach_btf_id() to ensure compatibility between the BTF types of the old and new attachment. If these match, a new bpf_tracing_link will be created for the new attach target, allowing multiple attachments to co-exist simultaneously. The code could theoretically support multiple-attach of other types of tracing programs as well, but since I don't have a use case for any of those, there is no API support for doing so. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/160138355169.48470.17165680973640685368.stgit@toke.dk	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	488110df60	libbpf: Support BTF loading and raw data output in both endianness Teach BTF to recognize wrong endianness and transparently convert it internally to host endianness. Original endianness of BTF will be preserved and used during btf__get_raw_data() to convert resulting raw data to the same endianness as the source raw_data. This means that little-endian host can parse big-endian BTF with no issues, all the type data will be presented to the client application in native endianness, but when it's time for emitting BTF to persist it in a file (e.g., after BTF deduplication), original non-native endianness will be preserved and stored. It's possible to query original endianness of BTF data with new btf__endianness() API. It's also possible to override desired output endianness with btf__set_endianness(), so that if application needs to load, say, big-endian BTF and store it as little-endian BTF, it's possible to manually override this. If btf__set_endianness() was used to change endianness, btf__endianness() will reflect overridden endianness. Given there are no known use cases for supporting cross-endianness for .BTF.ext, loading .BTF.ext in non-native endianness is not supported. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200929043046.1324350-3-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	f007a6bfdf	selftests/bpf: Test BTF writing APIs Add selftests for BTF writer APIs. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-4-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	6f90197ab0	libbpf: Add btf__str_by_offset() as a more generic variant of name_by_offset BTF strings are used not just for names, they can be arbitrary strings used for CO-RE relocations, line/func infos, etc. Thus "name_by_offset" terminology is too specific and might be misleading. Instead, introduce btf__str_by_offset() API which uses generic string terminology. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-3-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	a388fcb0f5	libbpf: Add BTF writing APIs Add APIs for appending new BTF types at the end of BTF object. Each BTF kind has one API of the form btf__add_<kind>(). For types that have variable amount of additional items (struct/union, enum, func_proto, datasec), additional API is provided to emit each such item. E.g., for emitting a struct, one would use the following sequence of API calls: btf__add_struct(...); btf__add_field(...); ... btf__add_field(...); Each btf__add_field() will ensure that the last BTF type is of STRUCT or UNION kind and will automatically increment that type's vlen field. All the strings are provided as C strings (const char *), not a string offset. This significantly improves usability of BTF writer APIs. All such strings will be automatically appended to string section or existing string will be re-used, if such string was already added previously. Each API attempts to do all the reasonable validations, like enforcing non-empty names for entities with required names, proper value bounds, various bit offset restrictions, etc. Type ID validation is minimal because it's possible to emit a type that refers to type that will be emitted later, so libbpf has no way to enforce such cases. User must be careful to properly emit all the necessary types and specify type IDs that will be valid in the finally generated BTF. Each of the btf__add_<kind>() APIs returns new type ID on success or negative value on error. APIs like btf__add_field() that emit additional items return zero on success and negative value on error. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200929020533.711288-2-andriin@fb.com	2020-09-29 18:29:49 -07:00
Alan Maguire	2654268c79	bpf: Add bpf_seq_printf_btf helper A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com	2020-09-29 18:29:49 -07:00
Alan Maguire	e7647823a1	bpf: Add bpf_snprintf_btf helper A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced - the BTF id of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static struct btf_ptr b = { }; b.ptr = skb; b.type_id = __builtin_btf_type_id(struct sk_buff, 1); bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT: no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	3cfff16611	libbpf: Add btf__new_empty() to create an empty BTF object Add an ability to create an empty BTF object from scratch. This is going to be used by pahole for BTF encoding. And also by selftest for convenient creation of BTF objects. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	7ac1547f32	libbpf: Allow modification of BTF and add btf__add_str API Allow internal BTF representation to switch from default read-only mode, in which raw BTF data is a single non-modifiable block of memory with BTF header, types, and strings laid out sequentially and contiguously in memory, into a writable representation with types and strings data split out into separate memory regions, that can be dynamically expanded. Such writable internal representation is transparent to users of libbpf APIs, but allows to append new types and strings at the end of BTF, which is a typical use case when generating BTF programmatically. All the basic guarantees of BTF types and strings layout are preserved, i.e., user can get `struct btf_type *` pointer and read it directly. Such btf_type pointers might be invalidated if BTF is modified, so some care is required in such mixed read/write scenarios. Switch from read-only to writable configuration happens automatically the first time when user attempts to modify BTF by either adding a new type or new string. It is still possible to get raw BTF data, which is a single piece of memory that can be persisted in ELF section or into a file as raw BTF. Such raw data memory is also still owned by BTF and will be freed either when BTF object is freed or if another modification to BTF happens, as any modification invalidates BTF raw representation. This patch adds the first two BTF manipulation APIs: btf__add_str(), which allows to add arbitrary strings to BTF string section, and btf__find_str() which allows to find existing string offset, but not add it if it's missing. All the added strings are automatically deduplicated. This is achieved by maintaining an additional string lookup index for all unique strings. Such index is built when BTF is switched to modifiable mode. If at that time BTF strings section contained duplicate strings, they are not de-duplicated. 
This is done specifically to not modify the existing content of BTF (types, their string offsets, etc), which can cause confusion and is an especially important property if there is struct btf_ext associated with struct btf. By following this "imperfect deduplication" process, btf_ext is kept consistent and correct. If deduplication of strings is necessary, it can be forced by doing BTF deduplication, at which point all the strings will be eagerly deduplicated and all string offsets both in struct btf and struct btf_ext will be updated. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	897a0e79bd	libbpf: Extract generic string hashing function for reuse Calculating a hash of zero-terminated string is a common need when using hashmap, so extract it for reuse. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	063eed6105	libbpf: Generalize common logic for managing dynamically-sized arrays Managing a dynamically-sized array is a common, but not trivial functionality, which requires a significant amount of logic and code to implement properly. So instead of re-implementing it all the time, extract it into a helper function and reuse it. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	71e8af71c5	libbpf: Remove assumption of single contiguous memory for BTF data Refactor internals of struct btf to remove assumptions that BTF header, type data, and string data are laid out contiguously in memory in a single memory allocation. Now we have three separate pointers pointing to the start of each respective area: header, types, strings. In the next patches, these pointers will be re-assigned to point to independently allocated memory areas, if BTF needs to be modified. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	4023fbd99e	libbpf: Refactor internals of BTF type index Refactor implementation of internal BTF type index to not use direct pointers. Instead it uses offset relative to the start of types data section. This allows for types data to be reallocatable, enabling implementation of modifiable BTF. As now getting type by ID has an extra indirection step, convert all internal type lookups to a new helper btf_type_id(), that returns non-const pointer to a type by its ID. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com	2020-09-29 18:29:49 -07:00
Song Liu	b2e50daea8	libbpf: Support test run of raw tracepoint programs Add bpf_prog_test_run_opts() with support of new fields in bpf_attr.test, namely, flags and cpu. Also extend _opts operations to support outputs via opts. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200925205432.1777-3-songliubraving@fb.com	2020-09-29 18:29:49 -07:00
Song Liu	b6f1385458	bpf: Enable BPF_PROG_TEST_RUN for raw_tracepoint Add .test_run for raw_tracepoint. Also, introduce a new feature that runs the target program on a specific CPU. This is achieved by a new flag in bpf_attr.test, BPF_F_TEST_RUN_ON_CPU. When this flag is set, the program is triggered on the CPU with id bpf_attr.test.cpu. This feature is needed for BPF programs that handle perf_event and other percpu resources, as the program can access these resources locally. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200925205432.1777-2-songliubraving@fb.com	2020-09-29 18:29:49 -07:00
Martin KaFai Lau	146bdd7535	bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON This patch changes the bpf_sk_assign() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL". Meaning it specifically takes a literal NULL. ARG_PTR_TO_BTF_ID_SOCK_COMMON does not allow a literal NULL, so another ARG type is required for this purpose and another follow-up patch can be used if there is such need. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000415.3857374-1-kafai@fb.com	2020-09-29 18:29:49 -07:00
Martin KaFai Lau	76ee807ee3	bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON This patch changes the bpf_tcp_*_syncookie() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000409.3856725-1-kafai@fb.com	2020-09-29 18:29:49 -07:00
Martin KaFai Lau	32e5add48f	bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON This patch changes the bpf_sk_storage_*() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64bytes packet. The stats from "kernel.bpf_stats_enabled" shows no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20200925000402.3856307-1-kafai@fb.com	2020-09-29 18:29:49 -07:00
Martin KaFai Lau	120e99ccd8	bpf: Change bpf_sk_release and bpf_sk_cgroup_id to accept ARG_PTR_TO_BTF_ID_SOCK_COMMON The previous patch allows the networking bpf prog to use the bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer, e.g. "struct tcp_sock *". It allows the bpf prog to read all the fields of the tcp_sock. This patch changes the bpf_sk_release() and bpf_sk_cgroup_id() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. For example, the following will work: sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0); if (!sk) return; tp = bpf_skc_to_tcp_sock(sk); if (!tp) { bpf_sk_release(sk); return; } lsndtime = tp->lsndtime; /* Pass tp to bpf_sk_release() will also work */ bpf_sk_release(tp); Since PTR_TO_BTF_ID could be NULL, the helper taking ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime. A btf_id of "struct sock" may not always mean a fullsock. Regardless of whether the helper's running context may get a non-fullsock or not, considering fullsock check/handling is pretty cheap, it is better to keep the same verifier expectation that a helper taking ARG_PTR_TO_BTF_ID will be able to handle the minisock situation. In the bpf_sk_cgroup_id() case, it will try to get a fullsock by using sk_to_full_sk() as its skb variant bpf_sk"b"_cgroup_id() has already been doing. bpf_sk_release can already handle minisock, so nothing special has to be done. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200925000356.3856047-1-kafai@fb.com	2020-09-29 18:29:49 -07:00
YiFei Zhu	3cf3c6cd26	libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section The patch adds a simple wrapper bpf_prog_bind_map around the syscall. When the libbpf tries to load a program, it will probe the kernel for the support of this syscall and unconditionally bind .rodata section to the program. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-4-sdf@google.com	2020-09-29 18:29:49 -07:00
YiFei Zhu	f38fccf3cc	bpf: Add BPF_PROG_BIND_MAP syscall This syscall binds a map to a program. Returns success if the map is already bound to the program. Signed-off-by: YiFei Zhu <zhuyifei@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: YiFei Zhu <zhuyifei1999@gmail.com> Link: https://lore.kernel.org/bpf/20200915234543.3220146-3-sdf@google.com	2020-09-29 18:29:49 -07:00
Yonghong Song	08dc84e54a	libbpf: Fix a compilation error with xsk.c for ubuntu 16.04 When syncing latest libbpf repo to bcc, ubuntu 16.04 (4.4.0 LTS kernel) failed compilation for xsk.c: In file included from /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c:23:0: /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c: In function ‘xsk_get_ctx’: /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:81:9: warning: implicit declaration of function ‘container_of’ [-Wimplicit-function-declaration] container_of(ptr, type, member) ^ /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:83:9: note: in expansion of macro ‘list_entry’ list_entry((ptr)->next, type, member) ... src/cc/CMakeFiles/bpf-static.dir/build.make:209: recipe for target 'src/cc/CMakeFiles/bpf-static.dir/libbpf/src/xsk.c.o' failed Commit 2f6324a3937f ("libbpf: Support shared umems between queues and devices") added include file <linux/list.h>, which uses macro "container_of". xsk.c file also includes <linux/ethtool.h> before <linux/list.h>. In a more recent distro kernel, <linux/ethtool.h> includes <linux/kernel.h> which contains the macro definition for "container_of". So compilation is all fine. But in ubuntu 16.04 kernel, <linux/ethtool.h> does not include <linux/kernel.h>, which caused the above compilation error. Let's explicitly include <linux/kernel.h> in xsk.c to avoid the compilation error in old distros. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200914223210.1831262-1-yhs@fb.com	2020-09-29 18:29:49 -07:00
Song Liu	0102f65d72	bpf: Fix comment for helper bpf_current_task_under_cgroup() This should be "current" not "skb". Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)") Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/bpf/20200910203314.70018-1-songliubraving@fb.com	2020-09-29 18:29:49 -07:00
Andrii Nakryiko	f700cf6667	vmtests: unblacklist few tests They should be fixed by now. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-28 15:09:26 -07:00
Julia Kartseva	99921245f0	vmtest: update root fs, whitelist sk_{assign\|lookup} test 1. Update mkrootfs.sh building root fs - Remove /etc/fstab from root fs and mount each fs type separately in S10-mount script. - devtmpfs can be already mounted prior to S10-mount execution so make it opt-out. This addresses [0]. - set -eux for scripts 2. Add iproute2 to root fs and whitelist sk_assign test. Addresses [1][2]. Update INDEX file with 2020-09-27 version. [0] https://github.com/libbpf/libbpf/pull/145#issuecomment-609673493 [1] https://github.com/libbpf/libbpf/pull/144 [2] https://github.com/libbpf/libbpf/pull/145	2020-09-28 13:09:06 -07:00
Andrii Nakryiko	37c5973bb7	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4 Checkpoint bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4 Baseline bpf commit: 746f534a4809e07f427f7d13d10f3a6a9641e5c3 Checkpoint bpf commit: 87f92ac4c12758c4da3bbe4393f1d884b610b8a6 Andrii Nakryiko (1): libbpf: Fix XDP program load regression for old kernels Tony Ambardar (1): libbpf: Fix native endian assumption when parsing BTF src/btf.c \| 6 ++++++ src/libbpf.c \| 2 +- 2 files changed, 7 insertions(+), 1 deletion(-) -- 2.24.1	2020-09-24 10:56:51 -07:00
Andrii Nakryiko	2200fefd87	libbpf: Fix XDP program load regression for old kernels Fix regression in libbpf, introduced by XDP link change, which causes XDP programs to fail to be loaded into kernel due to specified BPF_XDP expected_attach_type. While kernel doesn't enforce expected_attach_type for BPF_PROG_TYPE_XDP, some old kernels already support XDP program, but they don't yet recognize expected_attach_type field in bpf_attr, so setting it to non-zero value causes program load to fail. Luckily, libbpf already has a mechanism to deal with such cases, so just make expected_attach_type optional for XDP programs. Fixes: dc8698cac7aa ("libbpf: Add support for BPF XDP link") Reported-by: Nikita Shirokov <tehnerd@tehnerd.com> Reported-by: Udip Pant <udippant@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200924171705.3803628-1-andriin@fb.com	2020-09-24 10:56:51 -07:00
Tony Ambardar	5f50b4b8c9	libbpf: Fix native endian assumption when parsing BTF Code in btf__parse_raw() fails to detect raw BTF of non-native endianness and assumes it must be ELF data, which then fails to parse as ELF and yields a misleading error message: root:/# bpftool btf dump file /sys/kernel/btf/vmlinux libbpf: failed to get EHDR from /sys/kernel/btf/vmlinux For example, this could occur after cross-compiling a BTF-enabled kernel for a target with non-native endianness, which is currently unsupported. Check for correct endianness and emit a clearer error message: root:/# bpftool btf dump file /sys/kernel/btf/vmlinux libbpf: non-native BTF endianness is not supported Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs") Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/90f81508ecc57bc0da318e0fe0f45cfe49b17ea7.1600417359.git.Tony.Ambardar@gmail.com	2020-09-24 10:56:51 -07:00
Andrii Nakryiko	787abf721e	vmtests: ensure rst2man is installed, needed for bpftool selftests Ensure rst2man package is installed. This is now a dependency for selftests/bpf. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-11 10:09:12 -07:00
Andrii Nakryiko	820813bd1b	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f9bec5d756b30d5b21aa5ff9b7d5d115741517c1 Checkpoint bpf-next commit: 2f7de9865ba3cbfcf8b504f07154fdb6124176a4 Baseline bpf commit: e6135df45e21f1815a5948f452593124b1544a3e Checkpoint bpf commit: 746f534a4809e07f427f7d13d10f3a6a9641e5c3 Quentin Monnet (1): tools, bpf: Synchronise BPF UAPI header with tools include/uapi/linux/bpf.h \| 87 +++++++++++++++++++++------------------- 1 file changed, 45 insertions(+), 42 deletions(-) -- 2.24.1	2020-09-11 10:09:12 -07:00
Andrii Nakryiko	8333e57e91	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-09-11 10:09:12 -07:00
Quentin Monnet	8052936468	tools, bpf: Synchronise BPF UAPI header with tools Synchronise the bpf.h header under tools, to report the fixes recently brought to the documentation for the BPF helpers. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200904161454.31135-4-quentin@isovalent.com	2020-09-11 10:09:12 -07:00
Vladimír Čunát	8b14cb43ff	Makefile: link against zlib Without this we would be missing symbols, as shown e.g. by ldd -r libbpf.so	2020-09-09 00:03:51 -07:00
Andrii Nakryiko	011700e68d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 95cec14b0308085c028c4d4fb3d09fad3902b4c3 Checkpoint bpf-next commit: f9bec5d756b30d5b21aa5ff9b7d5d115741517c1 Baseline bpf commit: e6135df45e21f1815a5948f452593124b1544a3e Checkpoint bpf commit: e6135df45e21f1815a5948f452593124b1544a3e Andrii Nakryiko (2): libbpf: Fix another __u64 cast in printf libbpf: Fix potential multiplication overflow src/libbpf.c \| 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- 2.24.1	2020-09-04 14:35:25 -07:00
Andrii Nakryiko	106e7dcf58	libbpf: Fix potential multiplication overflow Detected by the LGTM static analyzer in the Github repo, fix potential multiplication overflow before the result is cast to size_t. Fixes: 8505e8709b5e ("libbpf: Implement generalized .BTF.ext func/line info adjustment") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200904041611.1695163-2-andriin@fb.com	2020-09-04 14:35:25 -07:00
Andrii Nakryiko	3a2ebfc21e	libbpf: Fix another __u64 cast in printf Another issue of __u64 needing either %lu or %llu, depending on the architecture. Fix with cast to `unsigned long long`. Fixes: 7e06aad52929 ("libbpf: Add multi-prog section support for struct_ops") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200904041611.1695163-1-andriin@fb.com	2020-09-04 14:35:25 -07:00
Andrii Nakryiko	91001a9923	include: implement list_empty() and list_for_each_entry() Implement list_empty() function and list_for_each_entry() macro, newly used by xsk.c in 2f6324a3937f ("libbpf: Support shared umems between queues and devices") (Linux commit sha). Fixes: 5f630710f52e ("libbpf: Support shared umems between queues and devices") Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	6384ee1968	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2e80be60c465a4f8559327340eaf40845dd7797a Checkpoint bpf-next commit: 95cec14b0308085c028c4d4fb3d09fad3902b4c3 Baseline bpf commit: 7787b6fc938e16aa418613c4a765c1dbb268ed9f Checkpoint bpf commit: e6135df45e21f1815a5948f452593124b1544a3e Alexei Starovoitov (3): bpf: Introduce sleepable BPF programs bpf: Add bpf_copy_from_user() helper. libbpf: Support sleepable progs Andrii Nakryiko (7): libbpf: Ensure ELF symbols table is found before further ELF processing libbpf: Parse multi-function sections into multiple BPF programs libbpf: Support CO-RE relocations for multi-prog sections libbpf: Make RELO_CALL work for multi-prog sections and sub-program calls libbpf: Implement generalized .BTF.ext func/line info adjustment libbpf: Add multi-prog section support for struct_ops libbpf: Deprecate notion of BPF program "title" in favor of "section name" Magnus Karlsson (1): libbpf: Support shared umems between queues and devices Tony Ambardar (1): libbpf: Fix build failure from uninitialized variable warning Yonghong Song (1): bpf: Make bpf_link_info.iter similar to bpf_iter_link_info include/uapi/linux/bpf.h \| 22 +- src/btf.h \| 18 +- src/libbpf.c \| 1314 +++++++++++++++++++++++++------------- src/libbpf.h \| 5 +- src/libbpf.map \| 2 + src/libbpf_common.h \| 2 + src/xsk.c \| 376 +++++++---- src/xsk.h \| 9 + 8 files changed, 1156 insertions(+), 592 deletions(-) -- 2.24.1	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	3f9447bf92	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-09-03 21:21:34 -07:00
Tony Ambardar	3b80b6c77e	libbpf: Fix build failure from uninitialized variable warning While compiling libbpf, some GCC versions (at least 8.4.0) have difficulty determining control flow and emit a warning for potentially uninitialized usage of 'map', which results in a build error if using "-Werror": In file included from libbpf.c:56: libbpf.c: In function '__bpf_object__open': libbpf_internal.h:59:2: warning: 'map' may be used uninitialized in this function [-Wmaybe-uninitialized] libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \ ^~~~~~~~~~~~ libbpf.c:5032:18: note: 'map' was declared here struct bpf_map *map, *targ_map; ^~~ The warning/error is false based on code inspection, so silence it with a NULL initialization. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Reference: 063e68813391 ("libbpf: Fix false uninitialized variable warning") Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200831000304.1696435-1-Tony.Ambardar@gmail.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	78cdb58bdf	libbpf: Deprecate notion of BPF program "title" in favor of "section name" BPF program title is an ambiguous and misleading term. It is the ELF section name, so let's just call it that and deprecate bpf_program__title() API in favor of bpf_program__section_name(). Additionally, using bpf_object__find_program_by_title() is now inherently dangerous and ambiguous, as multiple BPF programs can have the same section name. So deprecate this API as well and recommend switching to the non-ambiguous bpf_object__find_program_by_name(). Internally, clean up usage and mis-usage of BPF program section name for denoting BPF program name. Shorten the field name to prog->sec_name to be consistent with all other prog->sec_* variables. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-11-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	4b60f82516	libbpf: Add multi-prog section support for struct_ops Adjust struct_ops handling code to work with multi-program ELF sections properly. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-7-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	2b28b4fa4d	libbpf: Implement generalized .BTF.ext func/line info adjustment Complete multi-prog sections and multi sub-prog support in libbpf by properly adjusting .BTF.ext's line and function information. Mark exposed btf_ext__reloc_func_info() and btf_ext__reloc_line_info() APIs as deprecated. These APIs have a simplistic assumption that all sub-programs are going to be appended to all main BPF programs, which doesn't hold in real life. It's unlikely there are any users of this API, as it's very libbpf internals-specific. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-6-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	448789ba27	libbpf: Make RELO_CALL work for multi-prog sections and sub-program calls This patch implements general and correct logic for bpf-to-bpf sub-program calls. Only sub-programs used (called into) from entry-point (main) BPF program are going to be appended at the end of main BPF program. This ensures that BPF verifier won't encounter any dead code due to copying unreferenced sub-program. This change means that each entry-point (main) BPF program might have a different set of sub-programs appended to it and potentially in different order. This has implications on how sub-program call relocations need to be handled, described below. All relocations are now split into two categories: data references (maps and global variables) and code references (sub-program calls). This distinction is important because data references need to be relocated just once per each BPF program and sub-program. These relocations are agnostic to instruction locations, because they are not code-relative and they are relocating against static targets (maps, variables with fixed offsets, etc). Sub-program RELO_CALL relocations, on the other hand, are highly-dependent on code position, because they are recorded as instruction-relative offset. So BPF sub-programs (those that do calls into other sub-programs) can't be relocated once, they need to be relocated each time such a sub-program is appended at the end of the main entry-point BPF program. As mentioned above, each main BPF program might have a different subset and different order of sub-programs, so call relocations can't be done just once. Splitting data reference and call relocations as described above allows doing this efficiently and cleanly. bpf_object__find_program_by_name() will now ignore non-entry BPF programs. Previously one could have looked up '.text' fake BPF program, but the existence of such BPF program was always an implementation detail and you can't do much useful with it. 
Now, though, all non-entry sub-programs get their own BPF program with name corresponding to a function name, so there is no more '.text' name for BPF program. This means there is no regression, effectively, w.r.t. API behavior. But this is important aspect to highlight, because it's going to be critical once libbpf implements static linking of BPF programs. Non-entry static BPF programs will be allowed to have conflicting names, but global and main-entry BPF program names should be unique. Just like with normal user-space linking process. So it's important to restrict this aspect right now, keep static and non-entry functions as internal implementation details, and not have to deal with regressions in behavior later. This patch leaves .BTF.ext adjustment as is until next patch. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200903203542.15944-5-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	a3abae5122	libbpf: Support CO-RE relocations for multi-prog sections Fix up CO-RE relocation code to handle relocations against ELF sections containing multiple BPF programs. This requires lookup of a BPF program by its section name and instruction index it contains. While it could have been done as a simple loop, it could run into performance issues pretty quickly, as number of CO-RE relocations can be quite large in real-world applications, and each CO-RE relocation incurs BPF program look up now. So instead of simple loop, implement a binary search by section name + insn offset. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-4-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	bb5e70706a	libbpf: Parse multi-function sections into multiple BPF programs Teach libbpf how to parse code sections into potentially multiple bpf_program instances, based on ELF FUNC symbols. Each BPF program will keep track of its position within containing ELF section for translating section instruction offsets into program instruction offsets: regardless of BPF program's location in ELF section, its first instruction is always at local instruction offset 0, so when libbpf is working with relocations (which use section-based instruction offsets) this is critical to make proper translations. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-3-andriin@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	994aae7fc8	libbpf: Ensure ELF symbols table is found before further ELF processing libbpf ELF parsing logic might need symbols available before ELF parsing is completed, so we need to make sure that symbols table section is found in a separate pass before all the subsequent sections are processed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200903203542.15944-2-andriin@fb.com	2020-09-03 21:21:34 -07:00
Magnus Karlsson	a6e9cf1532	libbpf: Support shared umems between queues and devices Add support for shared umems between hardware queues and devices to the AF_XDP part of libbpf. This is so that zero-copy can be achieved in applications that want to send and receive packets between HW queues on one device or between different devices/netdevs. In order to create sockets that share a umem between hardware queues and devices, a new function has been added called xsk_socket__create_shared(). It takes the same arguments as xsk_socket__create() plus references to a fill ring and a completion ring. So for every socket that shares a umem, you need to have one more set of fill and completion rings. This is in order to maintain the single-producer single-consumer semantics of the rings. You can create all the sockets via the new xsk_socket__create_shared() call, or create the first one with xsk_socket__create() and the rest with xsk_socket__create_shared(). Both methods work. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-14-git-send-email-magnus.karlsson@intel.com	2020-09-03 21:21:34 -07:00
Alexei Starovoitov	06ae1b0e38	libbpf: Support sleepable progs Pass request to load program as sleepable via ".s" suffix in the section name. If it happens in the future that all map types and helpers are allowed with BPF_F_SLEEPABLE flag "fmod_ret/" and "lsm/" can be aliased to "fmod_ret.s/" and "lsm.s/" to make all lsm and fmod_ret programs sleepable by default. The fentry and fexit programs would always need to have sleepable vs non-sleepable distinction, since not all fentry/fexit progs will be attached to sleepable kernel functions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@google.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-5-alexei.starovoitov@gmail.com	2020-09-03 21:21:34 -07:00
Alexei Starovoitov	b228eb84f1	bpf: Add bpf_copy_from_user() helper. Sleepable BPF programs can now use copy_from_user() to access user memory. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-4-alexei.starovoitov@gmail.com	2020-09-03 21:21:34 -07:00
Alexei Starovoitov	5bd7cae11d	bpf: Introduce sleepable BPF programs Introduce sleepable BPF programs that can request such property for themselves via BPF_F_SLEEPABLE flag at program load time. In such case they will be able to use helpers like bpf_copy_from_user() that might sleep. At present only fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only when they are attached to kernel functions that are known to allow sleeping. The non-sleepable programs are relying on implicit rcu_read_lock() and migrate_disable() to protect life time of programs, maps that they use and per-cpu kernel structures used to pass info between bpf programs and the kernel. The sleepable programs cannot be enclosed into rcu_read_lock(). migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs should not be enclosed in migrate_disable() as well. Therefore rcu_read_lock_trace is used to protect the life time of sleepable progs. There are many networking and tracing program types. In many cases the 'struct bpf_prog *' pointer itself is rcu protected within some other kernel data structure and the kernel code is using rcu_dereference() to load that program pointer and call BPF_PROG_RUN() on it. All these cases are not touched. Instead sleepable bpf programs are allowed with bpf trampoline only. The program pointers are hard-coded into generated assembly of bpf trampoline and synchronize_rcu_tasks_trace() is used to protect the life time of the program. The same trampoline can hold both sleepable and non-sleepable progs. When rcu_read_lock_trace is held it means that some sleepable bpf program is running from bpf trampoline. Those programs can use bpf arrays and preallocated hash/lru maps. These map types are waiting on programs to complete via synchronize_rcu_tasks_trace(); Updates to trampoline now has to do synchronize_rcu_tasks_trace() and synchronize_rcu_tasks() to wait for sleepable progs to finish and for trampoline assembly to finish. 
This is the first step of introducing sleepable progs. Eventually dynamically allocated hash maps can be allowed and networking program types can become sleepable too. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com	2020-09-03 21:21:34 -07:00
Yonghong Song	a454a08f53	bpf: Make bpf_link_info.iter similar to bpf_iter_link_info bpf_link_info.iter is used by link_query to return bpf_iter_link_info to user space. Fields may be different, e.g., map_fd vs. map_id, so we cannot reuse the exact structure. But make them similar, e.g., struct bpf_link_info { /* common fields */ union { struct { ... } raw_tracepoint; struct { ... } tracing; ... struct { /* common fields for iter */ union { struct { __u32 map_id; } map; /* other structs for other targets */ }; }; }; }; so the structure is extensible the same way as bpf_iter_link_info. Fixes: 6b0a249a301e ("bpf: Implement link_query for bpf iterators") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200828051922.758950-1-yhs@fb.com	2020-09-03 21:21:34 -07:00
Andrii Nakryiko	829e50fc15	sync: improve sync script to handle common issues Few recurring issues are fixed. 1. When there are patches in bpf tree that hasn't been synced yet, but bpf was already merged into bpf-next, merged patches would be applied twice, causing failures, requiring manual resolution. Now this is handled smarter and shouldn't happen. 2. When synced libbpf repo contains fixes from bpf that weren't yet merged into bpf-next, those bpf tree changes would cause inconsistency against bpf-next tree state. That's expected and usually is pretty easy for human to discard during consistency check, but is hard for automation. So instead of failing at the very end, ask human whether discrepancies look good. 3. If sync script detected no new patches needed syncing, it previously didn't restore linux repo state back. Fixed. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-09-03 20:14:51 -07:00
Andrii Nakryiko	66780a46cb	README.md: update Travis CI badge link Update Travis CI status badge to point to travis-ci.com, now that libbpf was migrated there.	2020-08-27 10:15:29 -07:00
Andrii Nakryiko	7bc52e6602	vmtests: blacklist 2 new feature tests and (temporarily) 3 existing selftest Permanently blacklist 2 new selftest on 5.5 and temporarily blacklist 3 existing selftests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-26 23:30:55 -07:00
Andrii Nakryiko	7267270f5f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0fcdfffe80346d015b920228203d0269284d8b13 Checkpoint bpf-next commit: 2e80be60c465a4f8559327340eaf40845dd7797a Baseline bpf commit: 7787b6fc938e16aa418613c4a765c1dbb268ed9f Checkpoint bpf commit: 7787b6fc938e16aa418613c4a765c1dbb268ed9f Alex Gartrell (1): libbpf: Fix unintentional success return code in bpf_object__load Andrii Nakryiko (1): libbpf: Fix compilation warnings for 64-bit printf args Jiri Olsa (1): bpf: Add d_path helper KP Singh (3): bpf: Generalize bpf_sk_storage bpf: Implement bpf_local_storage for inodes bpf: Allow local storage to be used from LSM programs include/uapi/linux/bpf.h \| 69 +++++++++++++++++++++++++++++++++++++--- src/libbpf.c \| 10 +++--- src/libbpf_probes.c \| 5 +-- 3 files changed, 73 insertions(+), 11 deletions(-) -- 2.24.1	2020-08-26 23:30:55 -07:00
Andrii Nakryiko	b16bc44bd3	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-08-26 23:30:55 -07:00
Andrii Nakryiko	4cdad1b34b	libbpf: Fix compilation warnings for 64-bit printf args Fix compilation warnings due to __u64 defined differently as `unsigned long` or `unsigned long long` on different architectures (e.g., ppc64le differs from x86-64). Also cast one argument to size_t to fix printf warning of similar nature. Fixes: eacaaed784e2 ("libbpf: Implement enum value-based CO-RE relocations") Fixes: 50e09460d9f8 ("libbpf: Skip well-known ELF sections when iterating ELF") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200827041109.3613090-1-andriin@fb.com	2020-08-26 23:30:55 -07:00
Alex Gartrell	f557d9e1fc	libbpf: Fix unintentional success return code in bpf_object__load There are code paths where EINVAL is returned directly without setting errno. In that case, errno could be 0, which would mask the failure. For example, if a careless programmer set log_level to 10000 out of laziness, they would have to spend a long time trying to figure out why. Fixes: 4f33ddb4e3e2 ("libbpf: Propagate EPERM to caller on program load") Signed-off-by: Alex Gartrell <alexgartrell@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200826075549.1858580-1-alexgartrell@gmail.com	2020-08-26 23:30:55 -07:00
Jiri Olsa	e82da07e2d	bpf: Add d_path helper Adding d_path helper function that returns full path for given 'struct path' object, which needs to be the kernel BTF 'path' object. The path is returned in buffer provided 'buf' of size 'sz' and is zero terminated. bpf_d_path(&file->f_path, buf, size); The helper calls directly d_path function, so there's only limited set of function it can be called from. Adding just very modest set for the start. Updating also bpf.h tools uapi header and adding 'path' to bpf_helpers_doc.py script. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200825192124.710397-11-jolsa@kernel.org	2020-08-26 23:30:55 -07:00
KP Singh	c42c140954	bpf: Allow local storage to be used from LSM programs Adds support for both bpf_{sk, inode}_storage_{get, delete} to be used in LSM programs. These helpers are not used for tracing programs (currently) as their usage is tied to the life-cycle of the object and should only be used where the owning object won't be freed (when the owning object is passed as an argument to the LSM hook). Thus, they are safer to use in LSM hooks than tracing. Usage of local storage in tracing programs will probably follow a per function based whitelist approach. Since the UAPI helper signature for bpf_sk_storage expect a bpf_sock, it, leads to a compilation warning for LSM programs, it's also updated to accept a void * pointer instead. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200825182919.1118197-7-kpsingh@chromium.org	2020-08-26 23:30:55 -07:00
KP Singh	e565f2bfe9	bpf: Implement bpf_local_storage for inodes Similar to bpf_local_storage for sockets, add local storage for inodes. The life-cycle of storage is managed with the life-cycle of the inode. i.e. the storage is destroyed along with the owning inode. The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the security blob which are now stackable and can co-exist with other LSMs. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825182919.1118197-6-kpsingh@chromium.org	2020-08-26 23:30:55 -07:00
KP Singh	2bd0d158d4	bpf: Generalize bpf_sk_storage Refactor the functionality in bpf_sk_storage.c so that concept of storage linked to kernel objects can be extended to other objects like inode, task_struct etc. Each new local storage will still be a separate map and provide its own set of helpers. This allows for future object specific extensions and still share a lot of the underlying implementation. This includes the changes suggested by Martin in: https://lore.kernel.org/bpf/20200725013047.4006241-1-kafai@fb.com/ adding new map operations to support bpf_local_storage maps: * storages for different kernel objects to optionally have different memory charging strategy (map_local_storage_charge, map_local_storage_uncharge) * Functionality to extract the storage pointer from a pointer to the owning object (map_owner_storage_ptr) Co-developed-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200825182919.1118197-4-kpsingh@chromium.org	2020-08-26 23:30:55 -07:00
Andrii Nakryiko	bbe442da7a	sync: allow 3-way merge for patching to simplify manual conflict resolution Allowing --3way leaves conflicts in the local files, which makes manual conflict resolution so much easier. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	3f7b5b32b8	vmtests: blacklist tcp_hdr_options selftest for 5.5 Blacklist selftests for a new feature, not supported by 5.5 kernel. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	5a913e9401	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: dca5612f8eb9d0cf1dc254eb2adff1f16a588a7d Checkpoint bpf-next commit: 0fcdfffe80346d015b920228203d0269284d8b13 Baseline bpf commit: 4af7b32f84aa4cd60e39b355bc8a1eab6cd8d8a4 Checkpoint bpf commit: 7787b6fc938e16aa418613c4a765c1dbb268ed9f Andrii Nakryiko (6): libbpf: Factor out common ELF operations and improve logging libbpf: Add __noinline macro to bpf_helpers.h libbpf: Skip well-known ELF sections when iterating ELF libbpf: Normalize and improve logging across few functions libbpf: Avoid false unuinitialized variable warning in bpf_core_apply_relo libbpf: Fix type compatibility check copy-paste error Martin KaFai Lau (6): tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt bpf: tcp: Add bpf_skops_parse_hdr() bpf: tcp: Add bpf_skops_hdr_opt_len() and bpf_skops_write_hdr_opt() bpf: tcp: Allow bpf prog to write and parse TCP header option tcp: bpf: Optionally store mac header in TCP_SAVE_SYN include/uapi/linux/bpf.h \| 306 ++++++++++++++++++++++- src/bpf_helpers.h \| 3 + src/libbpf.c \| 525 +++++++++++++++++++++++---------------- 3 files changed, 623 insertions(+), 211 deletions(-) -- 2.24.1	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	cead23ac75	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	66091d267c	libbpf: Fix type compatibility check copy-paste error Fix copy-paste error in types compatibility check. Local type is accidentally used instead of target type for the very first type check strictness check. This can result in potentially less strict candidate comparison. Fix the error. Fixes: 3fc32f40c402 ("libbpf: Implement type-based CO-RE relocations support") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821225653.2180782-1-andriin@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	2819b00b74	libbpf: Avoid false unuinitialized variable warning in bpf_core_apply_relo Some versions of GCC report uninitialized targ_spec usage. GCC is wrong, but let's avoid unnecessary warnings. Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821225556.2178419-1-andriin@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	cb4d6d6f1a	tcp: bpf: Optionally store mac header in TCP_SAVE_SYN This patch is adapted from Eric's patch in an earlier discussion [1]. The TCP_SAVE_SYN currently only stores the network header and tcp header. This patch allows it to optionally store the mac header also if the setsockopt's optval is 2. It requires one more bit for the "save_syn" bit field in tcp_sock. This patch achieves this by moving the syn_smc bit next to the is_mptcp. The syn_smc is currently used with the TCP experimental option. Since syn_smc is only used when CONFIG_SMC is enabled, this patch also puts the "IS_ENABLED(CONFIG_SMC)" around it like the is_mptcp did with "IS_ENABLED(CONFIG_MPTCP)". The mac_hdrlen is also stored in the "struct saved_syn" to allow a quick offset from the bpf prog if it chooses to start getting from the network header or the tcp header. [1]: https://lore.kernel.org/netdev/CANn89iLJNWh6bkH7DNhy_kmcAexuUCccqERqe7z2QsvPhGrYPQ@mail.gmail.com/ Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190123.2886935-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	4f160ed607	bpf: tcp: Allow bpf prog to write and parse TCP header option [ Note: The TCP changes here is mainly to implement the bpf pieces into the bpf_skops_*() functions introduced in the earlier patches. ] The earlier effort in BPF-TCP-CC allows the TCP Congestion Control algorithm to be written in BPF. It opens up opportunities to allow a faster turnaround time in testing/releasing new congestion control ideas to production environment. The same flexibility can be extended to writing TCP header option. It is not uncommon that people want to test new TCP header option to improve the TCP performance. Another use case is for data-center that has a more controlled environment and has more flexibility in putting header options for internal only use. For example, we want to test the idea in putting maximum delay ACK in TCP header option which is similar to a draft RFC proposal [1]. This patch introduces the necessary BPF API and use them in the TCP stack to allow BPF_PROG_TYPE_SOCK_OPS program to parse and write TCP header options. It currently supports most of the TCP packet except RST. Supported TCP header option: ─────────────────────────── This patch allows the bpf-prog to write any option kind. Different bpf-progs can write its own option by calling the new helper bpf_store_hdr_opt(). The helper will ensure there is no duplicated option in the header. By allowing bpf-prog to write any option kind, this gives a lot of flexibility to the bpf-prog. Different bpf-prog can write its own option kind. It could also allow the bpf-prog to support a recently standardized option on an older kernel. Sockops Callback Flags: ────────────────────── The bpf program will only be called to parse/write tcp header option if the following newly added callback flags are enabled in tp->bpf_sock_ops_cb_flags: BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG A few words on the PARSE CB flags. 
When the above PARSE CB flags are turned on, the bpf-prog will be called on packets received at a sk that has at least reached the ESTABLISHED state. The parsing of the SYN-SYNACK-ACK will be discussed in the "3 Way HandShake" section. The default is off for all of the above new CB flags, i.e. the bpf prog will not be called to parse or write bpf hdr option. There are details comment on these new cb flags in the UAPI bpf.h. sock_ops->skb_data and bpf_load_hdr_opt() ───────────────────────────────────────── sock_ops->skb_data and sock_ops->skb_data_end covers the whole TCP header and its options. They are read only. The new bpf_load_hdr_opt() helps to read a particular option "kind" from the skb_data. Please refer to the comment in UAPI bpf.h. It has details on what skb_data contains under different sock_ops->op. 3 Way HandShake ─────────────── The bpf-prog can learn if it is sending SYN or SYNACK by reading the sock_ops->skb_tcp_flags. Passive side When writing SYNACK (i.e. sock_ops->op == BPF_SOCK_OPS_WRITE_HDR_OPT_CB), the received SYN skb will be available to the bpf prog. The bpf prog can use the SYN skb (which may carry the header option sent from the remote bpf prog) to decide what bpf header option should be written to the outgoing SYNACK skb. The SYN packet can be obtained by getsockopt(TCP_BPF_SYN). More on this later. Also, the bpf prog can learn if it is in syncookie mode (by checking sock_ops->args[0] == BPF_WRITE_HDR_TCP_SYNACK_COOKIE). The bpf prog can store the received SYN pkt by using the existing bpf_setsockopt(TCP_SAVE_SYN). The example in a later patch does it. [ Note that the fullsock here is a listen sk, bpf_sk_storage is not very useful here since the listen sk will be shared by many concurrent connection requests. Extending bpf_sk_storage support to request_sock will add weight to the minisock and it is not necessary better than storing the whole ~100 bytes SYN pkt. 
] When the connection is established, the bpf prog will be called in the existing PASSIVE_ESTABLISHED_CB callback. At that time, the bpf prog can get the header option from the saved syn and then apply the needed operation to the newly established socket. The later patch will use the max delay ack specified in the SYN header and set the RTO of this newly established connection as an example. The received ACK (that concludes the 3WHS) will also be available to the bpf prog during PASSIVE_ESTABLISHED_CB through the sock_ops->skb_data. It could be useful in syncookie scenario. More on this later. There is an existing getsockopt "TCP_SAVED_SYN" to return the whole saved syn pkt which includes the IP[46] header and the TCP header. A few "TCP_BPF_SYN" getsockopt has been added to allow specifying where to start getting from, e.g. starting from TCP header, or from IP[46] header. The new getsockopt(TCP_BPF_SYN) will also know where it can get the SYN's packet from: - (a) the just received syn (available when the bpf prog is writing SYNACK) and it is the only way to get SYN during syncookie mode. or - (b) the saved syn (available in PASSIVE_ESTABLISHED_CB and also other existing CB). The bpf prog does not need to know where the SYN pkt is coming from. The getsockopt(TCP_BPF_SYN) will hide this details. Similarly, a flags "BPF_LOAD_HDR_OPT_TCP_SYN" is also added to bpf_load_hdr_opt() to read a particular header option from the SYN packet. * Fastopen Fastopen should work the same as the regular non fastopen case. This is a test in a later patch. * Syncookie For syncookie, the later example patch asks the active side's bpf prog to resend the header options in ACK. The server can use bpf_load_hdr_opt() to look at the options in this received ACK during PASSIVE_ESTABLISHED_CB. * Active side The bpf prog will get a chance to write the bpf header option in the SYN packet during WRITE_HDR_OPT_CB. 
The received SYNACK pkt will also be available to the bpf prog during the existing ACTIVE_ESTABLISHED_CB callback through the sock_ops->skb_data and bpf_load_hdr_opt(). * Turn off header CB flags after 3WHS If the bpf prog does not need to write/parse header options beyond the 3WHS, the bpf prog can clear the bpf_sock_ops_cb_flags to avoid being called for header options. Or the bpf-prog can select to leave the UNKNOWN_HDR_OPT_CB_FLAG on so that the kernel will only call it when there is option that the kernel cannot handle. [1]: draft-wang-tcpm-low-latency-opt-00 https://tools.ietf.org/html/draft-wang-tcpm-low-latency-opt-00 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820190104.2885895-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	647df00570	bpf: tcp: Add bpf_skops_hdr_opt_len() and bpf_skops_write_hdr_opt() The bpf prog needs to parse the SYN header to learn what options have been sent by the peer's bpf-prog before writing its options into SYNACK. This patch adds a "syn_skb" arg to tcp_make_synack() and send_synack(). This syn_skb will eventually be made available (as read-only) to the bpf prog. This will be the only SYN packet available to the bpf prog during syncookie. For other regular cases, the bpf prog can also use the saved_syn. When writing options, the bpf prog will first be called to tell the kernel its required number of bytes. It is done by the new bpf_skops_hdr_opt_len(). The bpf prog will only be called when the new BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG is set in tp->bpf_sock_ops_cb_flags. When the bpf prog returns, the kernel will know how many bytes are needed and then update the "remaining" arg accordingly. 4 byte alignment will be included in the "remaining" before this function returns. The 4 byte aligned number of bytes will also be stored into the opts->bpf_opt_len. "bpf_opt_len" is a newly added member to the struct tcp_out_options. Then the new bpf_skops_write_hdr_opt() will call the bpf prog to write the header options. The bpf prog is only called if it has reserved spaces before (opts->bpf_opt_len > 0). The bpf prog is the last one getting a chance to reserve header space and writing the header option. These two functions are half implemented to highlight the changes in TCP stack. The actual codes preparing the bpf running context and invoking the bpf prog will be added in the later patch with other necessary bpf pieces. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190052.2885316-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	44fdfd8e6e	bpf: tcp: Add bpf_skops_parse_hdr() The patch adds a function bpf_skops_parse_hdr(). It will call the bpf prog to parse the TCP header received at a tcp_sock that has at least reached the ESTABLISHED state. For the packets received during the 3WHS (SYN, SYNACK and ACK), the received skb will be available to the bpf prog during the callback in bpf_skops_established() introduced in the previous patch and in the bpf_skops_write_hdr_opt() that will be added in the next patch. Calling bpf prog to parse header is controlled by two new flags in tp->bpf_sock_ops_cb_flags: BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG and BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG. When BPF_SOCK_OPS_PARSE_UNKNOWN_HDR_OPT_CB_FLAG is set, the bpf prog will only be called when there is unknown option in the TCP header. When BPF_SOCK_OPS_PARSE_ALL_HDR_OPT_CB_FLAG is set, the bpf prog will be called on all received TCP header. This function is half implemented to highlight the changes in TCP stack. The actual codes preparing the bpf running context and invoking the bpf prog will be added in the later patch with other necessary bpf pieces. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200820190046.2885054-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	75d2adfe84	tcp: bpf: Add TCP_BPF_RTO_MIN for bpf_setsockopt This patch adds bpf_setsockopt(TCP_BPF_RTO_MIN) to allow bpf prog to set the min rto of a connection. It could be used together with the earlier patch which has added bpf_setsockopt(TCP_BPF_DELACK_MAX). A later selftest patch will communicate the max delay ack in a bpf tcp header option and then the receiving side can use bpf_setsockopt(TCP_BPF_RTO_MIN) to set a shorter rto. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200820190027.2884170-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Martin KaFai Lau	f0f75f36a7	tcp: bpf: Add TCP_BPF_DELACK_MAX setsockopt This change is mostly from an internal patch and adapts it from sysctl config to the bpf_setsockopt setup. The bpf_prog can set the max delay ack by using bpf_setsockopt(TCP_BPF_DELACK_MAX). This max delay ack can be communicated to its peer through bpf header option. The receiving peer can then use this max delay ack and set a potentially lower rto by using bpf_setsockopt(TCP_BPF_RTO_MIN) which will be introduced in the next patch. Another later selftest patch will also use it like the above to show how to write and parse bpf tcp header option. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200820190021.2884000-1-kafai@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	a8fa8b6eea	libbpf: Normalize and improve logging across few functions Make libbpf logs follow similar pattern and provide more context like section name or program name, where appropriate. Also, add BPF_INSN_SZ constant and use it throughout to clean up code a little bit. This commit doesn't have any functional changes and just removes some code changes out of the way before bigger refactoring in libbpf internals. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-6-andriin@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	8a1acb7dfe	libbpf: Skip well-known ELF sections when iterating ELF Skip and don't log ELF sections that libbpf knows about and ignores during ELF processing. This allows to not unnecessarily log details about those ELF sections and cleans up libbpf debug log. Ignored sections include DWARF data, string table, empty .text section and few special (e.g., .llvm_addrsig) useless sections. With such ELF sections out of the way, log unrecognized ELF sections at pr_info level to increase visibility. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-5-andriin@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	b6e179e67c	libbpf: Add __noinline macro to bpf_helpers.h __noinline is pretty frequently used, especially with BPF subprograms, so add them along the __always_inline, for user convenience and completeness. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-4-andriin@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	d81d872279	libbpf: Factor out common ELF operations and improve logging Factor out common ELF operations done throughout the libbpf. This simplifies usage across multiple places in libbpf, as well as hide error reporting from higher-level functions and make error logging more consistent. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200820231250.1293069-3-andriin@fb.com	2020-08-25 00:53:18 -07:00
Andrii Nakryiko	4001a658e0	vmtests: add log folding Sprinkle log folds around, including timing. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-22 00:57:32 -07:00
Andrii Nakryiko	dc1cd8503f	vmtests: use built-in BPF_PRELOAD_UMD=y config Modules might not be picked up properly in our qemu setup. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-21 19:06:11 -07:00
Andrii Nakryiko	9a3a42608d	vmtests: update latest.config Re-generate latest.config. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	63c78982c7	vmtests: harden fetching kernel sources Ensure that corrupted tar archive won't screw up build. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	28e26bdc3e	sync: add BPF_RAW_INSN macro Add BPF_RAW_INSNS macro used by libbpf. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	7297e38474	vmtests: add CONFIG_BPF_PRELOAD=y and CONFIG_BPF_PRELOAD_UMD=m Add new Kconfig values needed for selftests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	a44116bb1f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd Checkpoint bpf-next commit: dca5612f8eb9d0cf1dc254eb2adff1f16a588a7d Baseline bpf commit: 3fb1a96a91120877488071a167d26d76be4be977 Checkpoint bpf commit: 4af7b32f84aa4cd60e39b355bc8a1eab6cd8d8a4 Andrii Nakryiko (17): libbpf: Make kernel feature probing lazy libbpf: Factor out common logic of testing and closing FD libbpf: Sanitize BPF program code for bpf_probe_read_{kernel, user}[_str] libbpf: Switch tracing and CO-RE helper macros to bpf_probe_read_kernel() libbpf: Detect minimal BTF support and skip BTF loading, if missing libbpf: Improve error logging for mismatched BTF kind cases libbpf: Clean up and improve CO-RE reloc logging libbpf: Improve relocation ambiguity detection libbpf: Remove any use of reallocarray() in libbpf tools/bpftool: Remove libbpf_internal.h usage in bpftool libbpf: Centralize poisoning and poison reallocarray() tools: Remove feature-libelf-mmap feature detection libbpf: Implement type-based CO-RE relocations support libbpf: Implement enum value-based CO-RE relocations libbpf: Fix detection of BPF helper call instruction libbpf: Fix libbpf build on compilers missing __builtin_mul_overflow libbpf: Add perf_buffer APIs for better integration with outside epoll loop Tobias Klauser (1): bpf: Fix two typos in uapi/linux/bpf.h Toke Høiland-Jørgensen (1): libbpf: Fix map index used in error message Xu Wang (2): libbpf: Convert comma to semicolon libbpf: Simplify the return expression of build_map_pin_path() Yonghong Song (1): bpf: Implement link_query for bpf iterators include/uapi/linux/bpf.h \| 17 +- src/bpf.c \| 3 - src/bpf_core_read.h \| 120 +++- src/bpf_prog_linfo.c \| 3 - src/bpf_tracing.h \| 4 +- src/btf.c \| 31 +- src/btf.h \| 38 -- src/btf_dump.c \| 9 +- src/hashmap.c \| 3 + src/libbpf.c \| 1177 ++++++++++++++++++++++++++++---------- src/libbpf.h \| 4 + src/libbpf.map \| 8 + src/libbpf_internal.h \| 138 ++++- src/libbpf_probes.c \| 3 - src/netlink.c \| 128 +---- src/nlattr.c \| 9 +- src/ringbuf.c \| 8 +- src/xsk.c \| 3 - 18 files changed, 1149 insertions(+), 557 deletions(-) -- 2.24.1	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	4069acb787	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-08-21 18:22:15 -07:00
Tobias Klauser	c7d2b1f31b	bpf: Fix two typos in uapi/linux/bpf.h Also remove trailing whitespaces in bpf_skb_get_tunnel_key example code. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821133642.18870-1-tklauser@distanz.ch	2020-08-21 18:22:15 -07:00
Toke Høiland-Jørgensen	b06fb2312c	libbpf: Fix map index used in error message The error message emitted by bpf_object__init_user_btf_maps() was using the wrong section ID. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200819110534.9058-1-toke@redhat.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	1e2c7823f5	libbpf: Add perf_buffer APIs for better integration with outside epoll loop Add a set of APIs to perf_buffer manager to allow applications to integrate perf buffer polling into existing epoll-based infrastructure. One example is applications using libevent already and wanting to plug perf_buffer polling, instead of relying on perf_buffer__poll() and waste an extra thread to do it. But perf_buffer is still extremely useful to set up and consume perf buffer rings even for such use cases. So to accommodate such new use cases, add three new APIs: - perf_buffer__buffer_cnt() returns number of per-CPU buffers maintained by given instance of perf_buffer manager; - perf_buffer__buffer_fd() returns FD of perf_event corresponding to a specified per-CPU buffer; this FD is then polled independently; - perf_buffer__consume_buffer() consumes data from single per-CPU buffer, identified by its slot index. To support a simpler, but less efficient, way to integrate perf_buffer into external polling logic, also expose underlying epoll FD through perf_buffer__epoll_fd() API. It will need to be followed by perf_buffer__poll(), wasting extra syscall, or perf_buffer__consume(), wasting CPU to iterate buffers with no data. But could be simpler and more convenient for some cases. These APIs allow for great flexibility, but do not sacrifice general usability of perf_buffer. Also exercise and check new APIs in perf_buffer selftest. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/20200821165927.849538-1-andriin@fb.com	2020-08-21 18:22:15 -07:00
Yonghong Song	160917756a	bpf: Implement link_query for bpf iterators This patch implemented bpf_link callback functions show_fdinfo and fill_link_info to support link_query interface. The general interface for show_fdinfo and fill_link_info will print/fill the target_name. Each target can register show_fdinfo and fill_link_info callbacks to print/fill more target specific information. For example, below is an fdinfo result for a bpf task iterator. $ cat /proc/1749/fdinfo/7 pos: 0 flags: 02000000 mnt_id: 14 link_type: iter link_id: 11 prog_tag: 990e1f8152f7e54f prog_id: 59 target_name: task Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200821184418.574122-1-yhs@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	4a2f7ac55f	libbpf: Fix libbpf build on compilers missing __builtin_mul_overflow GCC compilers older than version 5 don't support __builtin_mul_overflow yet. Given GCC 4.9 is the minimal supported compiler for building kernel and the fact that libbpf is a dependency of resolve_btfids, which is dependency of CONFIG_DEBUG_INFO_BTF=y, this needs to be handled. This patch fixes the issue by falling back to slower detection of integer overflow in such cases. Fixes: 029258d7b228 ("libbpf: Remove any use of reallocarray() in libbpf") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200820061411.1755905-2-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	a8a3089b5e	libbpf: Fix detection of BPF helper call instruction BPF_CALL \| BPF_JMP32 is explicitly not allowed by verifier for BPF helper calls, so don't detect it as a valid call. Also drop the check on func_id pointer, as it's currently always non-null. Fixes: 109cea5a594f ("libbpf: Sanitize BPF program code for bpf_probe_read_{kernel, user}[_str]") Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200820061411.1755905-1-andriin@fb.com	2020-08-21 18:22:15 -07:00
Xu Wang	475843fbf4	libbpf: Simplify the return expression of build_map_pin_path() Simplify the return expression. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200819025324.14680-1-vulab@iscas.ac.cn	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	f89dab0903	libbpf: Implement enum value-based CO-RE relocations Implement two relocations of a new enumerator value-based CO-RE relocation kind: ENUMVAL_EXISTS and ENUMVAL_VALUE. First, ENUMVAL_EXISTS, allows to detect the presence of a named enumerator value in the target (kernel) BTF. This is useful to do BPF helper/map/program type support detection from BPF program side. bpf_core_enum_value_exists() macro helper is provided to simplify built-in usage. Second, ENUMVAL_VALUE, allows to capture enumerator integer value and relocate it according to the target BTF, if it changes. This is useful to have a guarantee against intentional or accidental re-ordering/re-numbering of some of the internal (non-UAPI) enumerations, where kernel developers don't care about UAPI backwards compatibility concerns. bpf_core_enum_value() allows to capture this succinctly and use correct enum values in code. LLVM uses ldimm64 instruction to capture enumerator value-based relocations, so add support for ldimm64 instruction patching as well. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200819194519.3375898-5-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	cdb21b05e5	libbpf: Implement type-based CO-RE relocations support Implement support for TYPE_EXISTS/TYPE_SIZE/TYPE_ID_LOCAL/TYPE_ID_REMOTE relocations. These are examples of type-based relocations, as opposed to field-based relocations supported already. The difference is that they are calculating relocation values based on the type itself, not a field within a struct/union. Type-based relos have slightly different semantics when matching local types to kernel target types, see comments in bpf_core_types_are_compat() for details. Their behavior on failure to find target type in kernel BTF also differs. Instead of "poisoning" relocatable instruction and failing load subsequently in kernel, they return 0 (which is rarely a valid return result, so user BPF code can use that to detect success/failure of the relocation and deal with it without extra "guarding" relocations). Also, it's always possible to check existence of the type in target kernel with TYPE_EXISTS relocation, similarly to a field-based FIELD_EXISTS. TYPE_ID_LOCAL relocation is a bit special in that it always succeeds (barring any libbpf/Clang bugs) and resolved to BTF ID using local BTF info of BPF program itself. Tests in subsequent patches demonstrate the usage and semantics of new relocations. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200819194519.3375898-2-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	a734ef0803	tools: Remove feature-libelf-mmap feature detection It's trivial to handle missing ELF_C_READ_MMAP support in libelf the way that objtool has solved it in ("774bec3fddcc objtool: Add fallback from ELF_C_READ_MMAP to ELF_C_READ"). So instead of having an entire feature detector for that, just do what objtool does for perf and libbpf. And keep their Makefiles a bit simpler. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200819013607.3607269-5-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	3c4954d5a6	libbpf: Centralize poisoning and poison reallocarray() Most of libbpf source files already include libbpf_internal.h, so it's a good place to centralize identifier poisoning. So move kernel integer type poisoning there. And also add reallocarray to a poison list to prevent accidental use of it. libbpf_reallocarray() should be used universally instead. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200819013607.3607269-4-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	c3b1c66810	tools/bpftool: Remove libbpf_internal.h usage in bpftool Most netlink-related functions were unique to bpftool usage, so I moved them into net.c. Few functions are still used by both bpftool and libbpf itself internally, so I've copy-pasted them (libbpf_nl_get_link, libbpf_netlink_open). It's a bit of duplication of code, but better separation of libbpf as a library with public API and bpftool, relying on unexposed functions in libbpf. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200819013607.3607269-3-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	dc70da9c70	libbpf: Remove any use of reallocarray() in libbpf Re-implement glibc's reallocarray() for libbpf internal-only use. reallocarray(), unfortunately, is not available in all versions of glibc, so requires extra feature detection and using reallocarray() stub from <tools/libc_compat.h> and COMPAT_NEED_REALLOCARRAY. All this complicates build of libbpf unnecessarily and is just a maintenance burden. Instead, it's trivial to implement libbpf-specific internal version and use it throughout libbpf. Which is what this patch does, along with converting some realloc() uses that should really have been reallocarray() in the first place. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200819013607.3607269-2-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	9106c3028b	libbpf: Improve relocation ambiguity detection Split the instruction patching logic into relocation value calculation and application of relocation to instruction. Using this, evaluate relocation against each matching candidate and validate that all candidates agree on relocated value. If not, report ambiguity and fail load. This logic is necessary to avoid dangerous (however unlikely) accidental match against two incompatible candidate types. Without this change, libbpf will pick a random type as the candidate and apply potentially invalid relocation. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818223921.2911963-4-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	3bde6ca8e8	libbpf: Clean up and improve CO-RE reloc logging Add logging of local/target type kind (struct/union/typedef/etc). Preserve unresolved root type ID (for cases of typedef). Improve the format of CO-RE reloc spec output format to contain only relevant and succinct info. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818223921.2911963-3-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	d13e96ee32	libbpf: Improve error logging for mismatched BTF kind cases Instead of printing out integer value of BTF kind, print out a string representation of a kind. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818223921.2911963-2-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	dc1b3e2a45	libbpf: Detect minimal BTF support and skip BTF loading, if missing Detect whether a kernel supports any BTF at all, and if not, don't even attempt loading BTF to avoid unnecessary log messages like: libbpf: Error loading BTF: Invalid argument(22) libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818213356.2629020-8-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	30c61391bf	libbpf: Switch tracing and CO-RE helper macros to bpf_probe_read_kernel() Now that libbpf can automatically fallback to bpf_probe_read() on old kernels not yet supporting bpf_probe_read_kernel(), switch libbpf BPF-side helper macros to use appropriate BPF helper for reading kernel data. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20200818213356.2629020-7-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	e6f118dddd	libbpf: Sanitize BPF program code for bpf_probe_read_{kernel, user}[_str] Add BPF program code sanitization pass, replacing calls to BPF bpf_probe_read_{kernel,user}[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that kernel doesn't support new variants. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818213356.2629020-5-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	5d4075553b	libbpf: Factor out common logic of testing and closing FD Factor out common piece of logic that detects support for a feature based on successfully created FD. Also take care of closing FD, if it was created. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818213356.2629020-4-andriin@fb.com	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	5205159359	libbpf: Make kernel feature probing lazy Turn libbpf's kernel feature probing into lazily-performed checks. This allows to skip performing unnecessary feature checks, if a given BPF application doesn't rely on a particular kernel feature. As we grow number of feature probes, libbpf might perform fewer unnecessary syscalls and scale better with number of feature probes long-term. By decoupling feature checks from bpf_object, it's also possible to perform feature probing from libbpf static helpers and low-level APIs, if necessary. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818213356.2629020-3-andriin@fb.com	2020-08-21 18:22:15 -07:00
Xu Wang	87d7f1a32b	libbpf: Convert comma to semicolon Replace a comma between expression statements by a semicolon. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200818071611.21923-1-vulab@iscas.ac.cn	2020-08-21 18:22:15 -07:00
Andrii Nakryiko	e8547bd4f7	vmtests: fix selftests checkout script Fix the script. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	93959e4e43	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8 Checkpoint bpf-next commit: 06a4ec1d9dc652e17ee3ac2ceb6c7cf6c2b75cdd Baseline bpf commit: 929e54a989680c6f134b02293732030b897475dc Checkpoint bpf commit: 3fb1a96a91120877488071a167d26d76be4be977 Andrii Nakryiko (4): libbpf: Fix BTF-defined map-in-map initialization on 32-bit host arches libbpf: Handle BTF pointer sizes more carefully libbpf: Enforce 64-bitness of BTF for BPF object files libbpf: Fix build on ppc64le architecture Jean-Philippe Brucker (1): libbpf: Handle GCC built-in types for Arm NEON Toke Høiland-Jørgensen (1): libbpf: Prevent overriding errno when logging errors Yonghong Song (1): libbpf: Do not use __builtin_offsetof for offsetof src/bpf_helpers.h \| 2 +- src/btf.c \| 83 +++++++++++++++++++++++++++++++++++++++++++++-- src/btf.h \| 2 ++ src/btf_dump.c \| 39 ++++++++++++++++++++-- src/libbpf.c \| 32 +++++++++++------- src/libbpf.map \| 2 ++ 6 files changed, 143 insertions(+), 17 deletions(-) -- 2.24.1	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	7ee1f12f94	libbpf: Fix build on ppc64le architecture On ppc64le we get the following warning: In file included from btf_dump.c:16:0: btf_dump.c: In function ‘btf_dump_emit_struct_def’: ../include/linux/kernel.h:20:17: error: comparison of distinct pointer types lacks a cast [-Werror] (void) (&_max1 == &_max2); \ ^ btf_dump.c:882:11: note: in expansion of macro ‘max’ m_sz = max(0LL, btf__resolve_size(d->btf, m->type)); ^~~ Fix by explicitly casting to __s64, which is a return type from btf__resolve_size(). Fixes: 702eddc77a90 ("libbpf: Handle GCC built-in types for Arm NEON") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200818164456.1181661-1-andriin@fb.com	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	ff09ad9dac	libbpf: Enforce 64-bitness of BTF for BPF object files BPF object files are always targeting 64-bit BPF target architecture, so enforce that at BTF level as well. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200813204945.1020225-7-andriin@fb.com	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	025fcdc306	libbpf: Handle BTF pointer sizes more carefully With libbpf and BTF it is pretty common to have libbpf built for one architecture, while BTF information was generated for a different architecture (typically, but not always, BPF). In such case, the size of a pointer might differ between architectures. libbpf previously was always making an assumption that pointer size for BTF is the same as native architecture pointer size, but that breaks for cases where libbpf is built as 32-bit library, while BTF is for 64-bit architecture. To solve this, add heuristic to determine pointer size by searching for `long` or `unsigned long` integer type and using its size as a pointer size. Also, allow to override the pointer size with a new API btf__set_pointer_size(), for cases where application knows which pointer size should be used. User application can check what libbpf "guessed" by looking at the result of btf__pointer_size(). If it's not 0, then libbpf successfully determined a pointer size, otherwise native arch pointer size will be used. For cases where BTF is parsed from ELF file, use ELF's class (32-bit or 64-bit) to determine pointer size. Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf") Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200813204945.1020225-5-andriin@fb.com	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	b3405fcb08	libbpf: Fix BTF-defined map-in-map initialization on 32-bit host arches Libbpf built in 32-bit mode should be careful about not conflating 64-bit BPF pointers in BPF ELF file and host architecture pointers. This patch fixes the issue of incorrect initialization of map-in-map inner map slots due to such difference. Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200813204945.1020225-4-andriin@fb.com	2020-08-18 11:37:43 -07:00
Toke Høiland-Jørgensen	1194953749	libbpf: Prevent overriding errno when logging errors Turns out there were a few more instances where libbpf didn't save the errno before writing an error message, causing errno to be overridden by the printf() return and the error disappearing if logging is enabled. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200813142905.160381-1-toke@redhat.com	2020-08-18 11:37:43 -07:00
Jean-Philippe Brucker	1d76180057	libbpf: Handle GCC built-in types for Arm NEON When building Arm NEON (SIMD) code from lib/raid6/neon.uc, GCC emits DWARF information using a base type "__Poly8_t", which is internal to GCC and not recognized by Clang. This causes build failures when building with Clang a vmlinux.h generated from an arm64 kernel that was built with GCC. vmlinux.h:47284:9: error: unknown type name '__Poly8_t' typedef __Poly8_t poly8x16_t[16]; ^~~~~~~~~ The polyX_t types are defined as unsigned integers in the "Arm C Language Extension" document (101028_Q220_00_en). Emit typedefs based on standard integer types for the GCC internal types, similar to those emitted by Clang. Including linux/kernel.h to use ARRAY_SIZE() incidentally redefined max(), causing a build bug due to different types, hence the seemingly unrelated change. Reported-by: Jakov Petrina <jakov.petrina@sartura.hr> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200812143909.3293280-1-jean-philippe@linaro.org	2020-08-18 11:37:43 -07:00
Yonghong Song	048bf21dac	libbpf: Do not use __builtin_offsetof for offsetof Commit 5fbc220862fc ("tools/libpf: Add offsetof/container_of macro in bpf_helpers.h") added a macro offsetof() to get the offset of a structure member: #define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER) In certain use cases, size_t type may not be available so Commit da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof") changed to use __builtin_offsetof which removed the dependency on type size_t, which I suggested. But using __builtin_offsetof will prevent CO-RE relocation generation in case that, e.g., TYPE is annotated with "preserve_access_index" where a relocation is desirable in case the member offset is changed in a different kernel version. So this patch reverted back to the original macro but using "unsigned long" instead of "size_t". Fixes: da7a35062bcc ("libbpf bpf_helpers: Use __builtin_offsetof for offsetof") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/bpf/20200811030852.3396929-1-yhs@fb.com	2020-08-18 11:37:43 -07:00
Andrii Nakryiko	e954437a76	travis-ci: flatten build stages to gain more speed ups Do both builds and selftest runs as part of a single build step. This would allow to complete CI testing faster, as builds will happen in parallel with "Kernel LATEST + selftests" run. Also re-enable s390x build. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-10 22:38:26 -07:00
Andrii Nakryiko	c57be0b4d6	vmtests: speed up fetching of bpf-next sources Attempt to first fetch bpf-next tree from a snapshot, falling back to shallow clone, and if that is not enough, doing a full bpf-next clone. This should both improve speed and (because of full clone fallback) improve test reliability if libbpf wasn't synced in a while. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-10 22:31:52 -07:00
Andrii Nakryiko	bf3ab4b0d8	travis-ci: remove s390x build as it fails to be queued by Travis CI It's been failing for few days. Comment it out for now. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-09 13:19:59 -07:00
Andrii Nakryiko	663f66decf	vmtests: blacklist problematic tests Blacklist btf_map_in_map permanently for 5.5. bpf_verif_scale is broken due to Clang issues on latest. Do not run ALU32 flavor for test_progs on 4.9.0, which doesn't support ALU32 yet. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-07 16:37:09 -07:00
Andrii Nakryiko	ed187d0400	vmtest: bump LLVM_VER to 12 Bump LLVM_VER variable used in selftest build to 12. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-07 16:37:09 -07:00
Andrii Nakryiko	80453d4b2d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3c4f850e8441ac8b3b6dbaa6107604c4199ef01f Checkpoint bpf-next commit: bfdd5aaa54b0a44d9df550fe4c9db7e1470a11b8 Baseline bpf commit: 5b801dfb7feb2738975d80223efc2fc193e55573 Checkpoint bpf commit: 929e54a989680c6f134b02293732030b897475dc Andrii Nakryiko (3): libbpf: Make destructors more robust by handling ERR_PTR(err) cases libbpf: Add bpf_link detach APIs libbpf: Add btf__parse_raw() and generic btf__parse() APIs Daniel T. Lee (1): libbf: Fix uninitialized pointer at btf__parse_raw() Jerry Crunchtime (1): libbpf: Fix register in PT_REGS MIPS macros Yonghong Song (1): tools/bpf: Support new uapi for map element bpf iterator include/uapi/linux/bpf.h \| 20 ++++--- src/bpf.c \| 13 +++++ src/bpf.h \| 7 ++- src/bpf_tracing.h \| 4 +- src/btf.c \| 118 ++++++++++++++++++++++++++------------- src/btf.h \| 5 +- src/btf_dump.c \| 2 +- src/libbpf.c \| 20 ++++--- src/libbpf.h \| 6 +- src/libbpf.map \| 4 ++ 10 files changed, 137 insertions(+), 62 deletions(-) -- 2.24.1	2020-08-07 16:37:09 -07:00
Daniel T. Lee	7f96c4b1d2	libbf: Fix uninitialized pointer at btf__parse_raw() Commit 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs") added a new libbpf API that parses BTF from a raw data file (btf__parse_raw()). That commit breaks the samples/bpf build because of an access to an uninitialized pointer in btf__parse_raw(). btf.c: In function btf__parse_raw: btf.c:625:28: error: btf may be used uninitialized in this function 625 \| return err ? ERR_PTR(err) : btf; \| ~~~~~~~~~~~~~~~~~~~^~~~~ This commit fixes the build failure of samples/bpf by initializing the btf pointer to NULL. Fixes: 94a1fedd63ed ("libbpf: Add btf__parse_raw() and generic btf__parse() APIs") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200805223359.32109-1-danieltimlee@gmail.com	2020-08-07 16:37:09 -07:00
Yonghong Song	2be293cb4a	tools/bpf: Support new uapi for map element bpf iterator Previous commit adjusted kernel uapi for map element bpf iterator. This patch adjusted libbpf API due to uapi change. bpftool and bpf_iter selftests are also changed accordingly. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200805055058.1457623-1-yhs@fb.com	2020-08-07 16:37:09 -07:00
Andrii Nakryiko	a0334e97aa	libbpf: Add btf__parse_raw() and generic btf__parse() APIs Add public APIs to parse BTF from raw data file (e.g., /sys/kernel/btf/vmlinux), as well as generic btf__parse(), which will try to determine correct format, currently either raw or ELF. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200802013219.864880-2-andriin@fb.com	2020-08-07 16:37:09 -07:00
Andrii Nakryiko	2d97d4097f	libbpf: Add bpf_link detach APIs Add low-level bpf_link_detach() API. Also add higher-level bpf_link__detach() one. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200731182830.286260-3-andriin@fb.com	2020-08-07 16:37:09 -07:00
Jerry Crunchtime	80a52e3252	libbpf: Fix register in PT_REGS MIPS macros The o32, n32 and n64 calling conventions require the return value to be stored in $v0 which maps to $2 register, i.e., the register 2. Fixes: c1932cd ("bpf: Add MIPS support to samples/bpf.") Signed-off-by: Jerry Crunchtime <jerry.c.t@web.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/43707d31-0210-e8f0-9226-1af140907641@web.de	2020-08-07 16:37:09 -07:00
Andrii Nakryiko	2dc7cbd893	libbpf: Make destructors more robust by handling ERR_PTR(err) cases Most of libbpf "constructors" on failure return ERR_PTR(err) result encoded as a pointer. It's a common mistake to eventually pass such malformed pointers into xxx__destroy()/xxx__free() "destructors". So instead of fixing up clean up code in selftests and user programs, handle such error pointers in destructors themselves. This works beautifully for NULL pointers passed to destructors, so might as well just work for error pointers. Suggested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200729232148.896125-1-andriin@fb.com	2020-08-07 16:37:09 -07:00
Thomas Hebb	0466b9833b	README: Add Arch to list of downstream distros Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>	2020-08-06 21:21:02 -07:00
Andrii Nakryiko	ba8d45968b	vmtests: specify v12 of clang/llvm for now Whatever happened, clang-11 and llvm-11, to which clang/llvm packages resolve, respectively, are not there anymore. Seems like clang-12/llvm-12 are the latest now, but for whatever reason clang/llvm don't resolve to them yet. Hard-code version 12 for now. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-08-06 17:32:33 -07:00
Thomas Hebb	734b3f0afe	check-reallocarray.sh: Use the same compiler Make does Currently we hardcode "gcc", which means we get a bogus result any time a non-default CC is passed to Make. In fact, it's bogus even when CC is not explicitly set, since Make's default is "cc", which isn't necessarily the same as "gcc". Fix the issue by passing the compiler to use to check-reallocarray.sh. Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>	2020-07-28 14:05:35 -07:00
Andrii Nakryiko	f56874ba8a	vmtests: blacklist sk_lookup on LATEST and cg_storage_multi on 5.5 Blacklist two failing tests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	3f26bf1adf	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9a97c9d2af5ca798377342debf7f0f44281d050e Checkpoint bpf-next commit: 3c4f850e8441ac8b3b6dbaa6107604c4199ef01f Baseline bpf commit: 5b801dfb7feb2738975d80223efc2fc193e55573 Checkpoint bpf commit: 5b801dfb7feb2738975d80223efc2fc193e55573 Andrii Nakryiko (1): bpf: Fix bpf_ringbuf_output() signature to return long include/uapi/linux/bpf.h \| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.24.1	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	ab01213b35	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb Checkpoint bpf-next commit: 9a97c9d2af5ca798377342debf7f0f44281d050e Baseline bpf commit: b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c Checkpoint bpf commit: 5b801dfb7feb2738975d80223efc2fc193e55573 Andrii Nakryiko (3): libbpf: Support stripping modifiers for btf_dump tools/bpftool: Strip away modifiers from global variables libbpf: Add support for BPF XDP link Ciara Loftus (1): xsk: Add new statistics Horatiu Vultur (1): net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN Ian Rogers (1): libbpf bpf_helpers: Use __builtin_offsetof for offsetof Jakub Sitnicki (2): bpf: Sync linux/bpf.h to tools/ libbpf: Add support for SK_LOOKUP program type Lorenzo Bianconi (3): cpumap: Formalize map value as a named struct bpf: cpumap: Add the possibility to attach an eBPF program to cpumap libbpf: Add SEC name for xdp programs attached to CPUMAP Quentin Monnet (1): bpf: Fix formatting in documentation for BPF helpers Randy Dunlap (1): bpf: Drop duplicated words in uapi helper comments Song Liu (1): libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO Yonghong Song (2): bpf: Implement bpf iterator for map elements tools/libbpf: Add support for bpf map element iterator include/uapi/linux/bpf.h \| 155 +++++++++++++++++++++++++++++------ include/uapi/linux/if_link.h \| 1 + include/uapi/linux/if_xdp.h \| 5 +- src/bpf.c \| 1 + src/bpf.h \| 3 +- src/bpf_helpers.h \| 2 +- src/btf.h \| 4 +- src/btf_dump.c \| 10 ++- src/libbpf.c \| 27 +++++- src/libbpf.h \| 7 +- src/libbpf.map \| 3 + src/libbpf_probes.c \| 3 + 12 files changed, 188 insertions(+), 33 deletions(-) -- 2.24.1	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	8af35e73a2	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	a290d45322	libbpf: Add support for BPF XDP link Sync UAPI header and add support for using bpf_link-based XDP attachment. Make xdp/ prog type set expected attach type. Kernel didn't enforce attach_type for XDP programs before, so there are no backwards compatibility issues there. Also fix section_names selftest to recognize that xdp prog types now have expected attach type. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-8-andriin@fb.com	2020-07-28 14:03:17 -07:00
Song Liu	3f6b428909	libbpf: Print hint when PERF_EVENT_IOC_SET_BPF returns -EPROTO The kernel prevents potential unwinder warnings and crashes by blocking BPF program with bpf_get_[stack\|stackid] on perf_event without PERF_SAMPLE_CALLCHAIN, or with exclude_callchain_[kernel\|user]. Print a hint message in libbpf to help the user debug such issues. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723180648.1429892-4-songliubraving@fb.com	2020-07-28 14:03:17 -07:00
Yonghong Song	5efd8395ef	tools/libbpf: Add support for bpf map element iterator Add map_fd to bpf_iter_attach_opts and flags to bpf_link_create_opts. Later on, bpftool or selftest will be able to create a bpf map element iterator by passing map_fd to the kernel during link creation time. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184117.590673-1-yhs@fb.com	2020-07-28 14:03:17 -07:00
Yonghong Song	b1720407ff	bpf: Implement bpf iterator for map elements The bpf iterator for map elements is implemented. The bpf program will receive four parameters: bpf_iter_meta *meta: the meta data; bpf_map *map: the bpf_map whose elements are traversed; void *key: the key of one element; void *value: the value of the same element. Here, meta and map pointers are always valid, and key has register type PTR_TO_RDONLY_BUF_OR_NULL and value has register type PTR_TO_RDWR_BUF_OR_NULL. The kernel will track the access range of key and value during verification time. Later, these values will be compared against the values in the actual map to ensure all accesses are within range. A new field iter_seq_info is added to bpf_map_ops which is used to add map type specific information, i.e., seq_ops, init/fini seq_file func and seq_file private data size. Subsequent patches will have actual implementation for bpf_map_ops->iter_seq_info. In user space, BPF_ITER_LINK_MAP_FD needs to be specified in prog attr->link_create.flags, which indicates that attr->link_create.target_fd is a map_fd. The reason for such an explicit flag is for possible future cases where one bpf iterator may allow more than one possible customization, e.g., pid and cgroup id for task_file. Current kernel internal implementation only allows the target to register at most one required bpf_iter_link_info. To support the above case, optional bpf_iter_link_info's are needed, the target can be extended to register such link infos, and user provided link_info needs to match one of target supported ones. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184112.590360-1-yhs@fb.com	2020-07-28 14:03:17 -07:00
Ian Rogers	698820a9d9	libbpf bpf_helpers: Use __builtin_offsetof for offsetof The non-builtin route for offsetof has a dependency on size_t from stdlib.h/stdint.h that is undeclared and may break targets. The offsetof macro in bpf_helpers may disable the same macro in other headers that have a #ifdef offsetof guard. Rather than add additional dependencies improve the offsetof macro declared here to use the builtin that is available since llvm 3.7 (the first with a BPF backend). Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200720061741.1514673-1-irogers@google.com	2020-07-28 14:03:17 -07:00
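A minimal sketch of the idea behind the offsetof commit above, assuming a GCC- or Clang-compatible compiler: routing offsetof through the compiler builtin removes any dependency on size_t or on extra headers. The `#ifndef` guard, `struct sample`, and `sample_value_offset()` are hypothetical and exist only for illustration; they are not quoted from libbpf.

```c
/* Define offsetof through the compiler builtin; no size_t or header needed.
 * The #ifndef guard is an assumption of this sketch, not quoted from libbpf. */
#ifndef offsetof
#define offsetof(TYPE, MEMBER) __builtin_offsetof(TYPE, MEMBER)
#endif

/* Hypothetical struct used only to exercise the macro. */
struct sample {
	char tag;
	long long value;
};

/* Returns the byte offset of 'value' within struct sample. */
static unsigned long sample_value_offset(void)
{
	return offsetof(struct sample, value);
}
```

The builtin is evaluated at compile time, so the result can also be used in constant expressions such as array sizes.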
Jakub Sitnicki	6d92249be0	libbpf: Add support for SK_LOOKUP program type Make libbpf aware of the newly added program type, and assign it a section name. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200717103536.397595-13-jakub@cloudflare.com	2020-07-28 14:03:17 -07:00
Jakub Sitnicki	1736996279	bpf: Sync linux/bpf.h to tools/ Newly added program, context type and helper is used by tests in a subsequent patch. Synchronize the header file. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200717103536.397595-12-jakub@cloudflare.com	2020-07-28 14:03:17 -07:00
Randy Dunlap	f9f5f054d2	bpf: Drop duplicated words in uapi helper comments Drop doubled words "will" and "attach". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/6b9f71ae-4f8e-0259-2c5d-187ddaefe6eb@infradead.org	2020-07-28 14:03:17 -07:00
Lorenzo Bianconi	4a5aecf034	libbpf: Add SEC name for xdp programs attached to CPUMAP As for DEVMAP, support SEC("xdp_cpumap/") as a short cut for loading the program with type BPF_PROG_TYPE_XDP and expected attach type BPF_XDP_CPUMAP. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/33174c41993a6d860d9c7c1f280a2477ee39ed11.1594734381.git.lorenzo@kernel.org	2020-07-28 14:03:17 -07:00
Lorenzo Bianconi	77f11b3674	bpf: cpumap: Add the possibility to attach an eBPF program to cpumap Introduce the capability to attach an eBPF program to cpumap entries. The idea behind this feature is to add the possibility to define on which CPU run the eBPF program if the underlying hw does not support RSS. Current supported verdicts are XDP_DROP and XDP_PASS. This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu sample available in the kernel tree to identify possible performance regressions. Results show there are no observable differences in packet-per-second: $./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1 rx: 354.8 Kpps rx: 356.0 Kpps rx: 356.8 Kpps rx: 356.3 Kpps rx: 356.6 Kpps rx: 356.6 Kpps rx: 356.7 Kpps rx: 355.8 Kpps rx: 356.8 Kpps rx: 356.8 Kpps Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org	2020-07-28 14:03:17 -07:00
Lorenzo Bianconi	cd46c9d67e	cpumap: Formalize map value as a named struct As it has been already done for devmap, introduce 'struct bpf_cpumap_val' to formalize the expected values that can be passed in for a CPUMAP. Update cpumap code to use the struct. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org	2020-07-28 14:03:17 -07:00
Horatiu Vultur	41054a32df	net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN This patch adds a new port attribute, IFLA_BRPORT_MRP_IN_OPEN, which allows notifying userspace when the node has lost the continuity of MRP_InTest frames. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	852b4c8e73	tools/bpftool: Strip away modifiers from global variables Reliably remove all the type modifiers from read-only (.rodata) global variable definitions, including cases of inner field const modifiers and arrays of const values. Also modify one of selftests to ensure that const volatile struct doesn't prevent user-space from modifying .rodata variable. Fixes: 985ead416df3 ("bpftool: Add skeleton codegen command") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200713232409.3062144-3-andriin@fb.com	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	de60a31eba	libbpf: Support stripping modifiers for btf_dump One important use case when emitting const/volatile/restrict is undesirable is BPF skeleton generation of DATASEC layout. These are further memory-mapped and can be written/read from user-space directly. For important case of .rodata variables, bpftool strips away first-level modifiers, to make their use on user-space side simple and not requiring extra type casts to override compiler complaining about writing to const variables. This logic works mostly fine, but breaks in some more complicated cases. E.g.: const volatile int params[10]; Because in BTF it's a chain of ARRAY -> CONST -> VOLATILE -> INT, bpftool stops at ARRAY and doesn't strip CONST and VOLATILE. In skeleton this variable will be emitted as is. So when used from user-space, compiler will complain about writing to const array. This is problematic, as also mentioned in [0]. To solve this for arrays and other non-trivial cases (e.g., inner const/volatile fields inside the struct), teach btf_dump to strip away any modifier, when requested. This is done as an extra option on btf_dump__emit_type_decl() API. Reported-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200713232409.3062144-2-andriin@fb.com	2020-07-28 14:03:17 -07:00
Ciara Loftus	8ec7d86efe	xsk: Add new statistics It can be useful for the user to know the reason behind a dropped packet. Introduce new counters which track drops on the receive path caused by: 1. rx ring being full 2. fill ring being empty Also, on the tx path introduce a counter which tracks the number of times we attempt pull from the tx ring when it is empty. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com	2020-07-28 14:03:17 -07:00
Andrii Nakryiko	c3984343bc	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac Checkpoint bpf-next commit: 5c3320d7fece4612d4a413aa3c8e82cdb5b49fcb Baseline bpf commit: b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c Checkpoint bpf commit: b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c Andrii Nakryiko (1): libbpf: Fix memory leak and optimize BTF sanitization src/btf.c \| 2 +- src/btf.h \| 2 +- src/libbpf.c \| 11 +++-------- 3 files changed, 5 insertions(+), 10 deletions(-) -- 2.24.1	2020-07-10 09:11:41 -07:00
Andrii Nakryiko	5255eb2799	libbpf: Fix memory leak and optimize BTF sanitization Coverity's static analysis helpfully reported a memory leak introduced by 0f0e55d8247c ("libbpf: Improve BTF sanitization handling"). While fixing it, I realized that btf__new() already creates a memory copy, so there is no need to do this. So this patch also fixes misleading btf__new() signature to make data into a `const void *` input parameter. And it avoids unnecessary memory allocation and copy in BTF sanitization code altogether. Fixes: 0f0e55d8247c ("libbpf: Improve BTF sanitization handling") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200710011023.1655008-1-andriin@fb.com	2020-07-10 09:11:41 -07:00
Andrii Nakryiko	8b5e81a17a	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac Baseline bpf commit: 0f57a1e522f413e87852e632f55de4723e511939 Checkpoint bpf commit: b2f9f1535bb93ee5fa2ea30ac1c26fa0d676154c Jakub Bogusz (1): libbpf: Fix libbpf hashmap on (I)LP32 architectures src/hashmap.h \| 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) -- 2.24.1	2020-07-09 22:00:15 -07:00
Jakub Bogusz	cd016d93f7	libbpf: Fix libbpf hashmap on (I)LP32 architectures On ILP32, 64-bit result was shifted by value calculated for 32-bit long type and returned value was much outside hashmap capacity. As advised by Andrii Nakryiko, this patch uses different hashing variant for architectures with size_t shorter than long long. Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com	2020-07-09 22:00:15 -07:00
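The hashmap fix above can be illustrated with a short sketch, assuming a GCC- or Clang-compatible compiler that provides the `__SIZEOF_*__` macros. It shows multiplicative hashing in which the shift width always matches the width of the product, which is the property that was violated on ILP32. The function name and the golden-ratio multipliers are assumptions of this sketch and should not be read as the exact libbpf source.

```c
#include <stddef.h>

/* Multiplicative hashing: multiply by a golden-ratio constant sized for
 * the word, then keep the top 'bits' bits. The shift amount must match
 * the width of the product; pairing a 64-bit product with a shift computed
 * for a 32-bit long yields values far outside the table capacity.
 * 'bits' must be between 1 and the word width. */
static inline size_t hash_bits(size_t h, int bits)
{
#if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
	/* 64-bit size_t (LP64): 64-bit product, 64-bit shift base */
	return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
#elif (__SIZEOF_SIZE_T__ <= __SIZEOF_LONG__)
	/* size_t narrower than long long (ILP32): stay within long */
	return (h * 2654435769lu) >> (__SIZEOF_LONG__ * 8 - bits);
#else
#	error "Unsupported size_t size"
#endif
}
```

With `bits` set to the log2 of the table capacity, every result is a valid bucket index on both 32-bit and 64-bit targets.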
Andrii Nakryiko	deaee9541d	vmtests: update blacklist for 5.5 Add two tests (sockopt_sk and udp_limit) to blacklist of 5.5 kernel. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	daa2c7f851	ci: re-arrange tests to prioritize higher-signal tests Put selftests in first stage. Put long-running LATEST build & test case first, so that it can be better parallelized with 4.9 and 5.5. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	006904d416	vmtests: whitelist core_retro for 4.9 tests Add core_retro to whitelist for 4.9, as it is supposed to work on old kernels. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	e47ebc895d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 6b207d66aa9fad0deed13d5f824e1ea193b0a777 Checkpoint bpf-next commit: 2977282b63c3b6f112145ecf0bcefff0c65bd3ac Baseline bpf commit: e708e2bd55c921f5bb554fa5837d132a878951cf Checkpoint bpf commit: 0f57a1e522f413e87852e632f55de4723e511939 Andrii Nakryiko (4): libbpf: Make BTF finalization strict libbpf: Add btf__set_fd() for more control over loaded BTF FD libbpf: Improve BTF sanitization handling libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in perf_buffer Stanislav Fomichev (1): libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE include/uapi/linux/bpf.h \| 1 + src/btf.c \| 7 +- src/btf.h \| 1 + src/libbpf.c \| 154 ++++++++++++++++++++++----------------- src/libbpf.map \| 1 + 5 files changed, 95 insertions(+), 69 deletions(-) -- 2.24.1	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	3b2837e296	libbpf: Handle missing BPF_OBJ_GET_INFO_BY_FD gracefully in perf_buffer perf_buffer__new() is relying on BPF_OBJ_GET_INFO_BY_FD availability for few sanity checks. OBJ_GET_INFO for maps is actually much more recent feature than perf_buffer support itself, so this causes unnecessary problems on old kernels before BPF_OBJ_GET_INFO_BY_FD was added. This patch makes those sanity checks optional and just assumes best if command is not supported. If user specified something incorrectly (e.g., wrong map type), kernel will reject it later anyway, except user won't get a nice explanation as to why it failed. This seems like a good trade off for supporting perf_buffer on old kernels. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200708015318.3827358-6-andriin@fb.com	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	90716e9e14	libbpf: Improve BTF sanitization handling Change sanitization process to preserve original BTF, which might be used by libbpf itself for Kconfig externs, CO-RE relocs, etc, even if kernel is old and doesn't support BTF. To achieve that, if libbpf detects the need for BTF sanitization, it would clone original BTF, sanitize it in-place, attempt to load it into kernel, and if successful, will preserve loaded BTF FD in original `struct btf`, while freeing sanitized local copy. If kernel doesn't support any BTF, original btf and btf_ext will still be preserved to be used later for CO-RE relocation and other BTF-dependent libbpf features, which don't depend on kernel BTF support. Patch takes care to not specify BTF and BTF.ext features when loading BPF programs and/or maps, if it was detected that kernel doesn't support BTF features. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200708015318.3827358-4-andriin@fb.com	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	d5a36e2070	libbpf: Add btf__set_fd() for more control over loaded BTF FD Add setter for BTF FD to allow application more fine-grained control in more advanced scenarios. Storing BTF FD inside `struct btf` provides little benefit and probably would be better done differently (e.g., btf__load() could just return FD on success), but we are stuck with this due to backwards compatibility. The main problem is that it's impossible to load BTF and then free user-space memory, but keep FD intact, because `struct btf` assumes ownership of that FD upon successful load and will attempt to close it during btf__free(). To allow callers (e.g., libbpf itself for BTF sanitization) to have more control over this, add btf__set_fd() to allow to reset FD arbitrarily, if necessary. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200708015318.3827358-3-andriin@fb.com	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	133543c202	libbpf: Make BTF finalization strict With valid ELF and valid BTF, there is no reason (apart from bugs) why BTF finalization should fail. So make it strict and return error if it fails. This makes CO-RE relocation more reliable, as they are not going to be just silently skipped, if BTF finalization failed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200708015318.3827358-2-andriin@fb.com	2020-07-08 17:12:53 -07:00
Stanislav Fomichev	abb82202da	libbpf: Add support for BPF_CGROUP_INET_SOCK_RELEASE Add auto-detection for the cgroup/sock_release programs. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200706230128.4073544-3-sdf@google.com	2020-07-08 17:12:53 -07:00
Andrii Nakryiko	5020fdf8fc	vmtests: fix 4.9 build Drop blacklist and instead use a small whitelist of tests that are still supposed to work on old 4.9 kernel. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-07 11:10:16 -07:00
Andrii Nakryiko	a846caca79	vmtests: test no-alu32 variant of test_progs Add testing of no-alu32 flavor of test_progs. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-07 10:41:57 -07:00
Julia Kartseva	1b42b15b5e	travis_ci: run tests for 4.9 kernel Make sure that libbpf sanitizes BTF properly for older kernels. Add a stage for 4.9.0 kernel in TravisCI. For now make test failures non-blocking by adding 4.9.0 to `allow_failures` section. Blacklist is copy-pasted 5.5.0 kernel blacklist.	2020-07-01 15:38:31 -07:00
Andrii Nakryiko	a2b27a1b62	vmtests: remove custom 5.5 selftest preparation actions Now that pre-generated vmlinux.h is used for compilation of non-latest tests, we don't need custom adjustments for 5.5 kernel selftests. Adjust blacklist now that those new self-tests are built into test_progs. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-07-01 15:19:18 -07:00
Andrii Nakryiko	7b9d71b21d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ca4db6389d611eee2eb7c1dfe710b62d8ea06772 Checkpoint bpf-next commit: 6b207d66aa9fad0deed13d5f824e1ea193b0a777 Baseline bpf commit: 2bdeb3ed547d8822b2566797afa6c2584abdb119 Checkpoint bpf commit: e708e2bd55c921f5bb554fa5837d132a878951cf Andrii Nakryiko (1): libbpf: Make bpf_endian co-exist with vmlinux.h Song Liu (1): bpf: Introduce helper bpf_get_task_stack() include/uapi/linux/bpf.h \| 37 +++++++++++++++++++++++++++++++++- src/bpf_endian.h \| 43 ++++++++++++++++++++++++++++++++-------- 2 files changed, 71 insertions(+), 9 deletions(-) -- 2.24.1	2020-07-01 14:36:55 -07:00
Andrii Nakryiko	89f7f0796a	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-07-01 14:36:55 -07:00
Song Liu	c054d91247	bpf: Introduce helper bpf_get_task_stack() Introduce helper bpf_get_task_stack(), which dumps stack trace of given task. This is different to bpf_get_stack(), which gets stack trace of current task. One potential use case of bpf_get_task_stack() is to call it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file. bpf_get_task_stack() uses stack_trace_save_tsk() instead of get_perf_callchain() for kernel stack. The benefit of this choice is that stack_trace_save_tsk() doesn't require changes in arch/. The downside of using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the stack trace to unsigned long array. For 32-bit systems, we need to translate it to u64 array. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com	2020-07-01 14:36:55 -07:00
Andrii Nakryiko	9c104b1637	libbpf: Make bpf_endian co-exist with vmlinux.h Make bpf_endian.h compatible with vmlinux.h. It is a frequent request from users wanting to use bpf_endian.h in their BPF applications using CO-RE and vmlinux.h. To achieve that, re-implement byte swap macros and drop all the header includes. This way it can be used both with linux header includes, as well as with a vmlinux.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200630152125.3631920-2-andriin@fb.com	2020-07-01 14:36:55 -07:00
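As a rough illustration of what "re-implement byte swap macros and drop all the header includes" can look like, here is a sketch of header-free byte swaps built only from shifts and masks. The `sketch_bswap16` and `sketch_bswap32` names are hypothetical and are not the identifiers used in bpf_endian.h.

```c
/* Header-free byte swaps: only casts, masks and shifts, so they can be
 * compiled alongside either regular Linux headers or a vmlinux.h.
 * Macro names are hypothetical, chosen for this sketch. */
#define sketch_bswap16(x) \
	((unsigned short)((((unsigned short)(x) & 0xff00U) >> 8) | \
			  (((unsigned short)(x) & 0x00ffU) << 8)))

#define sketch_bswap32(x) \
	((unsigned int)((((unsigned int)(x) & 0xff000000U) >> 24) | \
			(((unsigned int)(x) & 0x00ff0000U) >> 8) | \
			(((unsigned int)(x) & 0x0000ff00U) << 8) | \
			(((unsigned int)(x) & 0x000000ffU) << 24)))
```

Because the macros use only built-in integer types, they introduce no type definitions that could clash with those already provided by vmlinux.h.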
Andrii Nakryiko	d08d57cd91	vmtests: check in vmlinux.h and use it for non-latest builds Manually generate vmlinux.h based on latest.config to be used for non-latest selftest build. This will keep bpftool and newest selftests builds succeeding, while at runtime blacklist will skip them. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-30 18:09:33 -07:00
Andrii Nakryiko	803243cc33	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b3eece09e2e69f528a1ab6104861550dec149083 Checkpoint bpf-next commit: afa12644c877d3f627281bb6493d7ca8f9976e3d Baseline bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Checkpoint bpf commit: 2bdeb3ed547d8822b2566797afa6c2584abdb119 Andrii Nakryiko (4): bpf: Switch most helper return values from 32-bit int to 64-bit long libbpf: Prevent loading vmlinux BTF twice libbpf: Support disabling auto-loading BPF programs libbpf: Fix CO-RE relocs against .text section Colin Ian King (1): libbpf: Fix spelling mistake "kallasyms" -> "kallsyms" Dmitry Yakunin (1): bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt Jesper Dangaard Brouer (1): libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP Quentin Monnet (1): bpf: Fix formatting in documentation for BPF helpers Yonghong Song (3): bpf: Add bpf_skc_to_tcp6_sock() helper bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers bpf: Add bpf_skc_to_udp6_sock() helper include/uapi/linux/bpf.h \| 277 ++++++++++++++++++++++----------------- src/libbpf.c \| 93 +++++++++---- src/libbpf.h \| 2 + src/libbpf.map \| 2 + 4 files changed, 233 insertions(+), 141 deletions(-) -- 2.24.1	2020-06-29 13:46:33 -07:00
Andrii Nakryiko	d707f8027b	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-06-29 13:46:33 -07:00
Jesper Dangaard Brouer	652f2c0a40	libbpf: Adjust SEC short cut for expected attach type BPF_XDP_DEVMAP Adjust the SEC("xdp_devmap/") prog type prefix to contain a slash "/" for expected attach type BPF_XDP_DEVMAP. This is consistent with other prog types like tracing. Fixes: 2778797037a6 ("libbpf: Add SEC name for xdp programs attached to device map") Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159309521882.821855.6873145686353617509.stgit@firesoul	2020-06-29 13:46:33 -07:00
Quentin Monnet	2fcd394505	bpf: Fix formatting in documentation for BPF helpers When producing the bpf-helpers.7 man page from the documentation from the BPF user space header file, rst2man complains: <stdin>:2636: (ERROR/3) Unexpected indentation. <stdin>:2640: (WARNING/2) Block quote ends without a blank line; unexpected unindent. Let's fix formatting for the relevant chunk (item list in bpf_ringbuf_query()'s description), and for a couple other functions. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200623153935.6215-1-quentin@isovalent.com	2020-06-29 13:46:33 -07:00
Andrii Nakryiko	af3c9f9fc4	libbpf: Fix CO-RE relocs against .text section bpf_object__find_program_by_title(), used by CO-RE relocation code, doesn't return .text "BPF program", if it is a function storage for sub-programs. Because of that, any CO-RE relocation in helper non-inlined functions will fail. Fix this by searching for .text-corresponding BPF program manually. Adjust one of bpf_iter selftest to exhibit this pattern. Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200619230423.691274-1-andriin@fb.com	2020-06-29 13:46:33 -07:00
Andrii Nakryiko	a62b08dd0c	libbpf: Support disabling auto-loading BPF programs Currently, bpf_object__load() (and by induction skeleton's load), will always attempt to prepare, relocate, and load into kernel every single BPF program found inside the BPF object file. This is often convenient and the right thing to do and what users expect. But there are plenty of cases (especially with BPF development constantly picking up the pace), where BPF application is intended to work with old kernels, with potentially reduced set of features. But on kernels supporting extra features, it would like to take a full advantage of them, by employing extra BPF program. This could be a choice of using fentry/fexit over kprobe/kretprobe, if kernel is recent enough and is built with BTF. Or BPF program might be providing optimized bpf_iter-based solution that user-space might want to use, whenever available. And so on. With libbpf and BPF CO-RE in particular, it's advantageous to not have to maintain two separate BPF object files to achieve this. So to enable such use cases, this patch adds ability to request not auto-loading chosen BPF programs. In such case, libbpf won't attempt to perform relocations (which might fail due to old kernel), won't try to resolve BTF types for BTF-aware (tp_btf/fentry/fexit/etc) program types, because BTF might not be present, and so on. Skeleton will also automatically skip auto-attachment step for such not loaded BPF programs. Overall, this feature allows to simplify development and deployment of real-world BPF applications with complicated compatibility requirements. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200625232629.3444003-2-andriin@fb.com	2020-06-29 13:46:33 -07:00
Yonghong Song	318ed9d544	bpf: Add bpf_skc_to_udp6_sock() helper The helper is used in tracing programs to cast a socket pointer to a udp6_sock pointer. The return value could be NULL if the casting is illegal. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/bpf/20200623230815.3988481-1-yhs@fb.com	2020-06-29 13:46:33 -07:00
Yonghong Song	47370741be	bpf: Add bpf_skc_to_{tcp, tcp_timewait, tcp_request}_sock() helpers Three more helpers are added to cast a sock_common pointer to a tcp_sock, tcp_timewait_sock or a tcp_request_sock for tracing programs. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230811.3988277-1-yhs@fb.com	2020-06-29 13:46:33 -07:00
Yonghong Song	26e5e7dcb0	bpf: Add bpf_skc_to_tcp6_sock() helper The helper is used in tracing programs to cast a socket pointer to a tcp6_sock pointer. The return value could be NULL if the casting is illegal. A new helper return type RET_PTR_TO_BTF_ID_OR_NULL is added so the verifier is able to deduce proper return types for the helper. Different from the previous BTF_ID based helpers, the bpf_skc_to_tcp6_sock() argument can be several possible btf_ids. More specifically, all possible socket data structures with sock_common appearing first in the memory layout. This patch only added socket types related to tcp and udp. All possible argument btf_id and return value btf_id for helper bpf_skc_to_tcp6_sock() are pre-calculated and cached. In the future, it is even possible to precompute these btf_id's at kernel build time. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200623230809.3988195-1-yhs@fb.com	2020-06-29 13:46:33 -07:00
Dmitry Yakunin	cd469e21e8	bpf: Add SO_KEEPALIVE and related options to bpf_setsockopt This patch adds support of SO_KEEPALIVE flag and TCP related options to bpf_setsockopt() routine. This is helpful if we want to enable or tune TCP keepalive for applications which don't do it in the userspace code. v3: - update kernel-doc in uapi (Nikita Vetoshkin <nekto0n@yandex-team.ru>) v4: - update kernel-doc in tools too (Alexei Starovoitov) - add test to selftests (Alexei Starovoitov) Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200620153052.9439-3-zeil@yandex-team.ru	2020-06-29 13:46:33 -07:00
Andrii Nakryiko	18bfe12dc1	libbpf: Prevent loading vmlinux BTF twice Prevent loading/parsing vmlinux BTF twice in some cases: for CO-RE relocations and for BTF-aware hooks (tp_btf, fentry/fexit, etc). Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200624043805.1794620-1-andriin@fb.com	2020-06-29 13:46:33 -07:00
Colin Ian King	fef856084a	libbpf: Fix spelling mistake "kallasyms" -> "kallsyms" There is a spelling mistake in a pr_warn message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200623084207.149253-1-colin.king@canonical.com	2020-06-29 13:46:33 -07:00
Andrii Nakryiko	6f8e021c3c	bpf: Switch most helper return values from 32-bit int to 64-bit long Switch most of BPF helper definitions from returning int to long. These definitions are coming from comments in BPF UAPI header and are used to generate bpf_helper_defs.h (under libbpf) to be later included and used from BPF programs. In actual in-kernel implementation, all the helpers are defined as returning u64, but due to some historical reasons, most of them are actually defined as returning int in UAPI (usually, to return 0 on success, and negative value on error). This actually causes Clang to quite often generate sub-optimal code, because compiler believes that return value is 32-bit, and in a lot of cases has to be up-converted (usually with a pair of 32-bit bit shifts) to 64-bit values, before they can be used further in BPF code. Besides just "polluting" the code, these 32-bit shifts quite often cause problems for cases in which return value matters. This is especially the case for the family of bpf_probe_read_str() functions. There are few other similar helpers (e.g., bpf_read_branch_records()), in which return value is used by BPF program logic to record variable-length data and process it. For such cases, BPF program logic carefully manages offsets within some array or map to read variable-length data. For such uses, it's crucial for BPF verifier to track possible range of register values to prove that all the accesses happen within given memory bounds. Those extraneous zero-extending bit shifts, inserted by Clang (and quite often interleaved with other code, which makes the issues even more challenging and sometimes requires employing extra per-variable compiler barriers), throws off verifier logic and makes it mark registers as having unknown variable offset. We'll study this pattern a bit later below. Another common pattern is to check return of BPF helper for non-zero state to detect error conditions and attempt alternative actions in such case. 
Even in this simple and straightforward case, this 32-bit vs BPF's native 64-bit mode quite often leads to sub-optimal and unnecessary extra code. We'll look at this pattern as well. Clang's BPF target supports two modes of code generation: ALU32, in which it is capable of using lower 32-bit parts of registers, and no-ALU32, in which only full 64-bit registers are being used. ALU32 mode somewhat mitigates the above described problems, but not in all cases. This patch switches all the cases in which BPF helpers return 0 or negative error from returning int to returning long. It is shown below that such change in definition leads to equivalent or better code. No-ALU32 mode benefits more, but ALU32 mode doesn't degrade or still gets improved code generation. Another class of cases switched from int to long are bpf_probe_read_str()-like helpers, which encode successful case as non-negative values, while still returning negative value for errors. In all of such cases, correctness is preserved due to two's complement encoding of negative values and the fact that all helpers return values with 32-bit absolute value. Two's complement ensures that for negative values higher 32 bits are all ones and when truncated, leave valid negative 32-bit value with the same value. Non-negative values have upper 32 bits set to zero and similarly preserve value when high 32 bits are truncated. This means that just casting to int/u32 is correct and efficient (and in ALU32 mode doesn't require any extra shifts). To minimize the chances of regressions, two code patterns were investigated, as mentioned above. For both patterns, BPF assembly was analyzed in ALU32/NO-ALU32 compiler modes, both with current 32-bit int return type and new 64-bit long return type. Case 1. Variable-length data reading and concatenation. This is quite ubiquitous pattern in tracing/monitoring applications, reading data like process's environment variables, file path, etc. 
In such case, many pieces of string-like variable-length data are read into a single big buffer, and at the end of the process, only a part of array containing actual data is sent to user-space for further processing. This case is tested in test_varlen.c selftest (in the next patch). Code flow is roughly as follows: void *payload = &sample->payload; u64 len; len = bpf_probe_read_kernel_str(payload, MAX_SZ1, &source_data1); if (len <= MAX_SZ1) { payload += len; sample->len1 = len; } len = bpf_probe_read_kernel_str(payload, MAX_SZ2, &source_data2); if (len <= MAX_SZ2) { payload += len; sample->len2 = len; } /* and so on */ sample->total_len = payload - &sample->payload; /* send over, e.g., perf buffer */ There could be two variations with slightly different code generated: when len is 64-bit integer and when it is 32-bit integer. Both variations were analysed. BPF assembly instructions between two successive invocations of bpf_probe_read_kernel_str() were used to check code regressions. Results are below, followed by short analysis. Left side is using helpers with int return type, the right one is after the switch to long. 
ALU32 + INT ALU32 + LONG =========== ============ 64-BIT (13 insns): 64-BIT (10 insns): ------------------------------------ ------------------------------------ 17: call 115 17: call 115 18: if w0 > 256 goto +9 <LBB0_4> 18: if r0 > 256 goto +6 <LBB0_4> 19: w1 = w0 19: r1 = 0 ll 20: r1 <<= 32 21: (u64 )(r1 + 0) = r0 21: r1 s>>= 32 22: r6 = 0 ll 22: r2 = 0 ll 24: r6 += r0 24: (u64 )(r2 + 0) = r1 00000000000000c8 <LBB0_4>: 25: r6 = 0 ll 25: r1 = r6 27: r6 += r1 26: w2 = 256 00000000000000e0 <LBB0_4>: 27: r3 = 0 ll 28: r1 = r6 29: call 115 29: w2 = 256 30: r3 = 0 ll 32: call 115 32-BIT (11 insns): 32-BIT (12 insns): ------------------------------------ ------------------------------------ 17: call 115 17: call 115 18: if w0 > 256 goto +7 <LBB1_4> 18: if w0 > 256 goto +8 <LBB1_4> 19: r1 = 0 ll 19: r1 = 0 ll 21: (u32 )(r1 + 0) = r0 21: (u32 )(r1 + 0) = r0 22: w1 = w0 22: r0 <<= 32 23: r6 = 0 ll 23: r0 >>= 32 25: r6 += r1 24: r6 = 0 ll 00000000000000d0 <LBB1_4>: 26: r6 += r0 26: r1 = r6 00000000000000d8 <LBB1_4>: 27: w2 = 256 27: r1 = r6 28: r3 = 0 ll 28: w2 = 256 30: call 115 29: r3 = 0 ll 31: call 115 In ALU32 mode, the variant using 64-bit length variable clearly wins and avoids unnecessary zero-extension bit shifts. In practice, this is even more important and good, because BPF code won't need to do extra checks to "prove" that payload/len are within good bounds. 32-bit len is one instruction longer. Clang decided to do 64-to-32 casting with two bit shifts, instead of equivalent `w1 = w0` assignment. The former uses extra register. The latter might potentially lose some range information, but not for 32-bit value. So in this case, verifier infers that r0 is [0, 256] after check at 18:, and shifting 32 bits left/right keeps that range intact. We should probably look into Clang's logic and see why it chooses bitshifts over sub-register assignments for this. 
NO-ALU32 + INT NO-ALU32 + LONG ============== =============== 64-BIT (14 insns): 64-BIT (10 insns): ------------------------------------ ------------------------------------ 17: call 115 17: call 115 18: r0 <<= 32 18: if r0 > 256 goto +6 <LBB0_4> 19: r1 = r0 19: r1 = 0 ll 20: r1 >>= 32 21: (u64 )(r1 + 0) = r0 21: if r1 > 256 goto +7 <LBB0_4> 22: r6 = 0 ll 22: r0 s>>= 32 24: r6 += r0 23: r1 = 0 ll 00000000000000c8 <LBB0_4>: 25: (u64 )(r1 + 0) = r0 25: r1 = r6 26: r6 = 0 ll 26: r2 = 256 28: r6 += r0 27: r3 = 0 ll 00000000000000e8 <LBB0_4>: 29: call 115 29: r1 = r6 30: r2 = 256 31: r3 = 0 ll 33: call 115 32-BIT (13 insns): 32-BIT (13 insns): ------------------------------------ ------------------------------------ 17: call 115 17: call 115 18: r1 = r0 18: r1 = r0 19: r1 <<= 32 19: r1 <<= 32 20: r1 >>= 32 20: r1 >>= 32 21: if r1 > 256 goto +6 <LBB1_4> 21: if r1 > 256 goto +6 <LBB1_4> 22: r2 = 0 ll 22: r2 = 0 ll 24: (u32 )(r2 + 0) = r0 24: (u32 *)(r2 + 0) = r0 25: r6 = 0 ll 25: r6 = 0 ll 27: r6 += r1 27: r6 += r1 00000000000000e0 <LBB1_4>: 00000000000000e0 <LBB1_4>: 28: r1 = r6 28: r1 = r6 29: r2 = 256 29: r2 = 256 30: r3 = 0 ll 30: r3 = 0 ll 32: call 115 32: call 115 In NO-ALU32 mode, for the case of 64-bit len variable, Clang generates much superior code, as expected, eliminating unnecessary bit shifts. For 32-bit len, code is identical. So overall, only ALU-32 32-bit len case is more-or-less equivalent and the difference stems from internal Clang decision, rather than compiler lacking enough information about types. Case 2. Let's look at the simpler case of checking return result of BPF helper for errors. 
The code is very simple: long bla; if (bpf_probe_read_kernel(&bla, sizeof(bla), 0)) return 1; else return 0; ALU32 + CHECK (9 insns) ALU32 + CHECK (9 insns) ==================================== ==================================== 0: r1 = r10 0: r1 = r10 1: r1 += -8 1: r1 += -8 2: w2 = 8 2: w2 = 8 3: r3 = 0 3: r3 = 0 4: call 113 4: call 113 5: w1 = w0 5: r1 = r0 6: w0 = 1 6: w0 = 1 7: if w1 != 0 goto +1 <LBB2_2> 7: if r1 != 0 goto +1 <LBB2_2> 8: w0 = 0 8: w0 = 0 0000000000000048 <LBB2_2>: 0000000000000048 <LBB2_2>: 9: exit 9: exit Almost identical code, the only difference is the use of full register assignment (r1 = r0) vs half-registers (w1 = w0) in instruction #5. On 32-bit architectures, new BPF assembly might be slightly less optimal, in theory. But one can argue that's not a big issue, given that use of full registers is still prevalent (e.g., for parameter passing). NO-ALU32 + CHECK (11 insns) NO-ALU32 + CHECK (9 insns) ==================================== ==================================== 0: r1 = r10 0: r1 = r10 1: r1 += -8 1: r1 += -8 2: r2 = 8 2: r2 = 8 3: r3 = 0 3: r3 = 0 4: call 113 4: call 113 5: r1 = r0 5: r1 = r0 6: r1 <<= 32 6: r0 = 1 7: r1 >>= 32 7: if r1 != 0 goto +1 <LBB2_2> 8: r0 = 1 8: r0 = 0 9: if r1 != 0 goto +1 <LBB2_2> 0000000000000048 <LBB2_2>: 10: r0 = 0 9: exit 0000000000000058 <LBB2_2>: 11: exit NO-ALU32 is a clear improvement, getting rid of unnecessary zero-extension bit shifts. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200623032224.4020118-1-andriin@fb.com	2020-06-29 13:46:33 -07:00
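The two's complement argument in the commit above can be checked in plain user-space C. The following is a minimal sketch, not code from the patch: `narrow_ret` and `zext32_by_shifts` are hypothetical names, and the 64-bit integer simply stands in for a BPF register holding a helper's return value.

```c
#include <stdint.h>

/* Hypothetical model of narrowing a helper result held in a 64-bit register.
 * The conversion preserves the value as long as it fits in 32 bits, which
 * the commit message states is the case for all helpers. */
static int32_t narrow_ret(int64_t ret)
{
    return (int32_t)ret;
}

/* Hypothetical model of the shift pair (r <<= 32; r >>= 32) that appears in
 * the no-ALU32 listings: it zero-extends the low 32 bits of the register. */
static uint64_t zext32_by_shifts(uint64_t reg)
{
    reg <<= 32;
    reg >>= 32;
    return reg;
}
```

For a negative error code such as -14, the upper 32 bits of the 64-bit value are all ones, so dropping them leaves the same negative 32-bit value; for a non-negative length such as 256 the upper bits are zero and the shift pair is a no-op.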
Andrii Nakryiko	143213eb82	README: info on routing general BPF/libbpf questions We keep getting more and more questions about BPF/libbpf usage. This repo is not the right place to ask them, as not that many people monitor it. Re-route folks to bpf@vger.kernel.org Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-22 20:52:42 -07:00
Andrii Nakryiko	ac74ee188d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b Checkpoint bpf-next commit: b3eece09e2e69f528a1ab6104861550dec149083 Baseline bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Checkpoint bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Andrii Nakryiko (3): libbpf: Generalize libbpf externs support libbpf: Add support for extracting kernel symbol addresses libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses src/bpf_core_read.h \| 8 +- src/bpf_helpers.h \| 1 + src/btf.h \| 5 + src/libbpf.c \| 482 +++++++++++++++++++++++++++++++------------- 4 files changed, 350 insertions(+), 146 deletions(-) -- 2.24.1	2020-06-22 20:31:52 -07:00
Andrii Nakryiko	15943906dc	libbpf: Wrap source argument of BPF_CORE_READ macro in parentheses Wrap source argument of BPF_CORE_READ family of macros into parentheses to allow uses like this: BPF_CORE_READ((struct cast_struct *)src, a, b, c); Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200619231703.738941-8-andriin@fb.com	2020-06-22 20:31:52 -07:00
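Why the parentheses matter can be shown with an ordinary C macro. This is a toy illustration, not the BPF_CORE_READ implementation: `struct inner`, `READ_A`, and `read_through_cast` are made-up names.

```c
struct inner {
    int a;
};

/* Toy macro (hypothetical). Because `src` is wrapped in parentheses, a cast
 * expression can be passed as the argument. Without them, the expansion of
 * READ_A((struct inner *)p) would read (struct inner *)p->a, where -> binds
 * to the void pointer before the cast applies, and would not compile. */
#define READ_A(src) ((src)->a)

static int read_through_cast(void *p)
{
    return READ_A((struct inner *)p);
}
```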
Andrii Nakryiko	85749135a6	libbpf: Add support for extracting kernel symbol addresses Add support for another (in addition to existing Kconfig) special kind of externs in BPF code, kernel symbol externs. Such externs allow BPF code to "know" kernel symbol address and either use it for comparisons with kernel data structures (e.g., struct file's f_op pointer, to distinguish different kinds of file), or, with the help of bpf_probe_read_kernel(), to follow pointers and read data from global variables. Kernel symbol addresses are found through /proc/kallsyms, which should be present in the system. Currently, such kernel symbol variables are typeless: they have to be defined as `extern const void <symbol>` and the only operation you can do (in C code) with them is to take its address. Such extern should reside in a special section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong vs weak semantics stays the same as with Kconfig externs. If symbol is not found in /proc/kallsyms, this will be a failure for strong (non-weak) extern, but will be defaulted to 0 for weak externs. If the same symbol is defined multiple times in /proc/kallsyms, then it will be an error if any of the associated addresses differs. In that case, address is ambiguous, so libbpf errs on the side of caution, rather than confusing user with randomly chosen address. In the future, once kernel is extended with variables BTF information, such ksym externs will be supported in a typed version, which will allow BPF program to read variable's contents directly, similarly to how it's done for fentry/fexit input arguments. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20200619231703.738941-3-andriin@fb.com	2020-06-22 20:31:52 -07:00
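The lookup rule described above (a duplicate symbol is fine unless its addresses differ) can be sketched in user-space C. This is an illustrative assumption, not libbpf's code: `resolve_ksym` is a hypothetical function, it scans an in-memory string in the "address type name" layout of /proc/kallsyms instead of opening the file, and the error codes are placeholders.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find `name` in kallsyms-formatted text.
 * Returns 0 and stores the address on success, -ENOENT if the symbol is
 * absent, -EINVAL if it appears more than once with different addresses. */
static int resolve_ksym(const char *text, const char *name,
                        unsigned long long *addr_out)
{
    unsigned long long addr, found_addr = 0;
    char type, sym[256];
    int found = 0;
    const char *line = text;

    while (line && *line) {
        if (sscanf(line, "%llx %c %255s", &addr, &type, sym) == 3 &&
            strcmp(sym, name) == 0) {
            if (found && addr != found_addr)
                return -EINVAL; /* ambiguous: refuse to pick one */
            found_addr = addr;
            found = 1;
        }
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next record */
    }
    if (!found)
        return -ENOENT;
    *addr_out = found_addr;
    return 0;
}
```

A caller modelling a weak extern would treat the not-found result as address 0, while a strong extern would treat it as a load failure.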
Andrii Nakryiko	3b320677cd	libbpf: Generalize libbpf externs support Switch existing Kconfig externs to be just one of few possible kinds of more generic externs. This refactoring is in preparation for ksymbol extern support, added in the follow up patch. There are no functional changes intended. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/bpf/20200619231703.738941-2-andriin@fb.com	2020-06-22 20:31:52 -07:00
Andrii Nakryiko	15fee53503	vmtests: blacklist test using RINGBUF Test was updated to use BPF_MAP_TYPE_RINGBUF, which is only available starting from 5.8 version. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-22 17:11:04 -07:00
Andrii Nakryiko	169d35c746	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 69119673bd50b176ded34032fadd41530fb5af21 Checkpoint bpf-next commit: 1bdb6c9a1c43fdf9b83b2331dfc6229bd2e71d9b Baseline bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Checkpoint bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Andrii Nakryiko (2): libbpf: Bump version to 0.1.0 libbpf: Add a bunch of attribute getters/setters for map definitions src/libbpf.c \| 100 +++++++++++++++++++++++++++++++++++++++++++++---- src/libbpf.h \| 30 +++++++++++++-- src/libbpf.map \| 17 +++++++++ 3 files changed, 137 insertions(+), 10 deletions(-) -- 2.24.1	2020-06-22 17:11:04 -07:00
Andrii Nakryiko	d8d4713476	libbpf: Add a bunch of attribute getters/setters for map definitions Add a bunch of getters for various aspects of BPF map. Some of these attributes (e.g., key_size, value_size, type, etc) are available right now in struct bpf_map_def, but this patch adds getters allowing to fetch them individually. bpf_map_def approach isn't very scalable, when ABI stability requirements are taken into account. It's much easier to extend libbpf and add support for new features, when each aspect of BPF map has separate getter/setter. Getters follow the common naming convention of not explicitly having "get" in their name: bpf_map__type() returns map type, bpf_map__key_size() returns key_size. Setters, though, explicitly have set in their name: bpf_map__set_type(), bpf_map__set_key_size(). This patch ensures we now have a getter and a setter for the following map attributes: - type; - max_entries; - map_flags; - numa_node; - key_size; - value_size; - ifindex. bpf_map__resize() enforces unnecessary restriction of max_entries > 0. It is unnecessary, because libbpf actually supports zero max_entries for some cases (e.g., for PERF_EVENT_ARRAY map) and treats it specially during map creation time. To allow setting max_entries=0, new bpf_map__set_max_entries() setter is added. bpf_map__resize()'s behavior is preserved for backwards compatibility reasons. Map ifindex getter is added as well. There is a setter already, but no corresponding getter. Fix this asymmetry as well. bpf_map__set_ifindex() itself is converted from void function into error-returning one, similar to other setters. The only error returned right now is -EBUSY, if BPF map is already loaded and has corresponding FD. One lacking attribute with no ability to get/set or even specify it declaratively is numa_node. This patch fixes this gap and both adds programmatic getter/setter, as well as adds support for numa_node field in BTF-defined map. 
Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200621062112.3006313-1-andriin@fb.com	2020-06-22 17:11:04 -07:00
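The naming convention and the -EBUSY rule from the commit above can be illustrated with a stand-in type. This sketch does not use libbpf: `struct toy_map` and the `toy_map__*` functions are hypothetical, and the "already loaded" check is assumed to be a non-negative file descriptor.

```c
#include <errno.h>

/* Hypothetical stand-in for libbpf's opaque struct bpf_map. */
struct toy_map {
    int fd;                   /* -1 until the map exists in the kernel */
    unsigned int max_entries;
};

/* Getter: named after the attribute, with no "get" in the name. */
static unsigned int toy_map__max_entries(const struct toy_map *map)
{
    return map->max_entries;
}

/* Setter: has "set" in the name and returns an error code. Zero is
 * accepted here, mirroring the relaxed restriction described above. */
static int toy_map__set_max_entries(struct toy_map *map,
                                    unsigned int max_entries)
{
    if (map->fd >= 0)
        return -EBUSY; /* map already created, attribute is frozen */
    map->max_entries = max_entries;
    return 0;
}
```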
Andrii Nakryiko	ef26b4c37f	libbpf: Bump version to 0.1.0 Bump libbpf version to 0.1.0, as new development cycle starts. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200617183132.1970836-1-andriin@fb.com	2020-06-22 17:11:04 -07:00
Andrii Nakryiko	d7b2934cf9	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 69119673bd50b176ded34032fadd41530fb5af21 Checkpoint bpf-next commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Baseline bpf commit: 6903cdae9f9f08d61e49c16cbef11c293e33a615 Checkpoint bpf commit: 4e15507fea70c0c312d79610efa46b6853ccf8e0 Andrii Nakryiko (1): libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI headers src/bpf.h \| 2 ++ 1 file changed, 2 insertions(+) -- 2.24.1	2020-06-22 15:43:44 -07:00
Andrii Nakryiko	c83d2166e8	libbpf: Forward-declare bpf_stats_type for systems with outdated UAPI headers On systems that don't yet have the very latest linux/bpf.h header, enum bpf_stats_type will be undefined, causing compilation warnings. Prevent this by forward-declaring the enum. Fixes: 0bee106716cf ("libbpf: Add support for command BPF_ENABLE_STATS") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200621031159.2279101-1-andriin@fb.com	2020-06-22 15:43:44 -07:00
Andrii Nakryiko	fb27968bf1	vmtests: blacklist 5.5 test and temporary blacklist core_reloc test Permanently blacklist load_bytes_relative test on 5.5 due to missing functionality. Also temporarily blacklist core_reloc test due to failure on latest kernel. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-17 11:48:22 -07:00
Andrii Nakryiko	d6ae406429	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2 Checkpoint bpf-next commit: 69119673bd50b176ded34032fadd41530fb5af21 Baseline bpf commit: 47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb Checkpoint bpf commit: 6903cdae9f9f08d61e49c16cbef11c293e33a615 Andrii Nakryiko (2): libbpf: Support pre-initializing .bss global variables bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments include/uapi/linux/bpf.h \| 2 +- src/libbpf.c \| 4 ---- 2 files changed, 1 insertion(+), 5 deletions(-) -- 2.24.1	2020-06-17 11:48:22 -07:00
Andrii Nakryiko	cb174c5b8d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-06-17 11:48:22 -07:00
Andrii Nakryiko	17f747ed38	bpf: Fix definition of bpf_ringbuf_output() helper in UAPI comments Fix definition of bpf_ringbuf_output() in UAPI header comments, which is used to generate libbpf's bpf_helper_defs.h header. Return value is a number (error code), not a pointer. Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200615214926.3638836-1-andriin@fb.com	2020-06-17 11:48:22 -07:00
Andrii Nakryiko	bf34234885	libbpf: Support pre-initializing .bss global variables Remove invalid assumption in libbpf that .bss map doesn't have to be updated in kernel. With addition of skeleton and memory-mapped initialization image, .bss doesn't have to be all zeroes when BPF map is created, because user-code might have initialized those variables from user-space. Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200612194504.557844-1-andriin@fb.com	2020-06-17 11:48:22 -07:00
Andrii Nakryiko	46c272f9b4	sync: don't check and warn about non-empty merges anymore Initial versions of sync script couldn't handle non-empty merges. But since then, script became smarter, more interactive and thus more powerful and can handle some complicated situations easily on its own, while falling back to human intervention for even more complicated situations. This non-empty merge check has outlived its purpose and is just an annoying bump in sync process. Drop it. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-10 13:59:07 -07:00
Andrii Nakryiko	40e69c9538	vmtests: un-blacklist ringbuf and cls_redirect selftests Both tests should be fixed now. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-10 13:58:45 -07:00
Andrii Nakryiko	a975d8ea28	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9bc499befeef07a4d79f4924bfca05634ad8fc97 Checkpoint bpf-next commit: cb8e59cc87201af93dfbb6c3dccc8fcad72a09c2 Baseline bpf commit: bdc48fa11e46f867ea4d75fa59ee87a7f48be144 Checkpoint bpf commit: 47f6bc4ce1ff70d7ba0924c2f1c218c96cd585fb Andrii Nakryiko (1): libbpf: Handle GCC noreturn-turned-volatile quirk Arnaldo Carvalho de Melo (1): libbpf: Define __WORDSIZE if not available Jesper Dangaard Brouer (1): bpf: Selftests and tools use struct bpf_devmap_val from uapi include/uapi/linux/bpf.h \| 13 +++++++++++++ src/btf_dump.c \| 33 ++++++++++++++++++++++++--------- src/hashmap.h \| 7 +++---- 3 files changed, 40 insertions(+), 13 deletions(-) -- 2.24.1	2020-06-10 13:58:45 -07:00
Andrii Nakryiko	45f7113925	libbpf: Handle GCC noreturn-turned-volatile quirk Handle a GCC quirk of emitting extra volatile modifier in DWARF (and subsequently preserved in BTF by pahole) for function pointers marked as __attribute__((noreturn)). This was the way to mark such functions before GCC 2.5 added noreturn attribute. Drop such func_proto modifiers, similarly to how it's done for array (also to handle GCC quirk/bug). Such volatile attribute is emitted by GCC only, so existing selftests can't express such test. Simple repro is like this (compiled with GCC + BTF generated by pahole): struct my_struct { void __attribute__((noreturn)) (*fn)(int); }; struct my_struct a; Without this fix, output will be: struct my_struct { voidvolatile (*fn)(int); }; With the fix: struct my_struct { void (*fn)(int); }; Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Reported-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Link: https://lore.kernel.org/bpf/20200610052335.2862559-1-andriin@fb.com	2020-06-10 13:58:45 -07:00
Arnaldo Carvalho de Melo	6816734203	libbpf: Define __WORDSIZE if not available Some systems, such as Android, don't have a define for __WORDSIZE, do it in terms of __SIZEOF_LONG__, as done in perf since 2012: http://git.kernel.org/torvalds/c/3f34f6c0233ae055b5 For reference: https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html I build tested it here and Andrii did some Travis CI build tests too. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200608161150.GA3073@kernel.org	2020-06-10 13:58:45 -07:00
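A fallback of the kind the commit above describes might look like the sketch below. The exact form in libbpf's headers is not shown in this log, so treat the macro body as an assumption; `word_size_bits` is a hypothetical wrapper added only so the value can be inspected.

```c
#include <limits.h>

/* Assumed form of the fallback: only takes effect when libc has not already
 * defined __WORDSIZE. __SIZEOF_LONG__ is predefined by GCC and Clang. */
#ifndef __WORDSIZE
#define __WORDSIZE (__SIZEOF_LONG__ * 8)
#endif

/* Hypothetical accessor returning the word size in bits. */
static int word_size_bits(void)
{
    return __WORDSIZE;
}
```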
Jesper Dangaard Brouer	11d2a59689	bpf: Selftests and tools use struct bpf_devmap_val from uapi Sync tools uapi bpf.h header file and update selftests that use struct bpf_devmap_val. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/159170951195.2102545.1833108712124273987.stgit@firesoul	2020-06-10 13:58:45 -07:00
Andrii Nakryiko	8c7527ea88	travis-ci: fix travis_terminate invocation travis_terminate expects integer argument for exit code. Add it. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-10 12:12:01 -07:00
Toke Høiland-Jørgensen	c569e03985	README: Add BTF and Clang information for Arch Linux Arch recently added BTF to their distribution kernels - see https://bugs.archlinux.org/task/66260 Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>	2020-06-08 09:33:59 -07:00
Andrii Nakryiko	1862741fb0	vmtest: disable ringbuf test on latest for now ringbuf selftest is flaky, disable it for now. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-04 10:48:08 -07:00
Andrii Nakryiko	6a269cf458	README: add OpenSUSE BTF availability info Add note about OpenSUSE Tumbleweed and BTF.	2020-06-04 10:42:40 -07:00
Andrii Nakryiko	6e15a022db	README: add BTF and CO-RE info Add list of Linux distributions with kernel BTF built-in. Give few useful links to BPF CO-RE-related material to help users get started.	2020-06-03 11:26:00 -07:00
Andrii Nakryiko	20d9816471	vmtest: temporary blacklist changes to make CI green Coarse-grained blacklisting until test_progs blacklisting w/ subtests works better. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-02 18:09:36 -07:00
Andrii Nakryiko	538b3f4ce7	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9a25c1df24a6fea9dc79eec950453c4e00f707fd Checkpoint bpf-next commit: 9bc499befeef07a4d79f4924bfca05634ad8fc97 Baseline bpf commit: bdc48fa11e46f867ea4d75fa59ee87a7f48be144 Checkpoint bpf commit: bdc48fa11e46f867ea4d75fa59ee87a7f48be144 Daniel Borkmann (2): bpf: Fix up bpf_skb_adjust_room helper's skb csum setting bpf: Add csum_level helper for fixing up csum levels include/uapi/linux/bpf.h \| 51 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-) -- 2.24.1	2020-06-02 18:09:36 -07:00
Andrii Nakryiko	f2610ca9cf	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-06-02 18:09:36 -07:00
Daniel Borkmann	adb5dd203c	bpf: Add csum_level helper for fixing up csum levels Add a bpf_csum_level() helper which BPF programs can use in combination with bpf_skb_adjust_room() when they pass in BPF_F_ADJ_ROOM_NO_CSUM_RESET flag to the latter to avoid falling back to CHECKSUM_NONE. The bpf_csum_level() allows to adjust CHECKSUM_UNNECESSARY skb->csum_levels via BPF_CSUM_LEVEL_{INC,DEC} which calls __skb_{incr,decr}_checksum_unnecessary() on the skb. The helper also allows a BPF_CSUM_LEVEL_RESET which sets the skb's csum to CHECKSUM_NONE as well as a BPF_CSUM_LEVEL_QUERY to just return the current level. Without this helper, there is no way to otherwise adjust the skb->csum_level. I did not add an extra dummy flags as there is plenty of free bitspace in level argument itself iff ever needed in future. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/279ae3717cb3d03c0ffeb511493c93c450a01e1a.1591108731.git.daniel@iogearbox.net	2020-06-02 18:09:36 -07:00
Daniel Borkmann	3aadd91e97	bpf: Fix up bpf_skb_adjust_room helper's skb csum setting Lorenz recently reported: In our TC classifier cls_redirect [0], we use the following sequence of helper calls to decapsulate a GUE (basically IP + UDP + custom header) encapsulated packet: bpf_skb_adjust_room(skb, -encap_len, BPF_ADJ_ROOM_MAC, BPF_F_ADJ_ROOM_FIXED_GSO) bpf_redirect(skb->ifindex, BPF_F_INGRESS) It seems like some checksums of the inner headers are not validated in this case. For example, a TCP SYN packet with invalid TCP checksum is still accepted by the network stack and elicits a SYN ACK. [...] That is, we receive the following packet from the driver: \| ETH \| IP \| UDP \| GUE \| IP \| TCP \| skb->ip_summed == CHECKSUM_UNNECESSARY ip_summed is CHECKSUM_UNNECESSARY because our NICs do rx checksum offloading. On this packet we run skb_adjust_room_mac(-encap_len), and get the following: \| ETH \| IP \| TCP \| skb->ip_summed == CHECKSUM_UNNECESSARY Note that ip_summed is still CHECKSUM_UNNECESSARY. After bpf_redirect()'ing into the ingress, we end up in tcp_v4_rcv(). There, skb_checksum_init() is turned into a no-op due to CHECKSUM_UNNECESSARY. The bpf_skb_adjust_room() helper is not aware of protocol specifics. Internally, it handles the CHECKSUM_COMPLETE case via skb_postpull_rcsum(), but that does not cover CHECKSUM_UNNECESSARY. In this case skb->csum_level of the original skb prior to bpf_skb_adjust_room() call was 0, that is, covering UDP. Right now there is no way to adjust the skb->csum_level. NICs that have checksum offload disabled (CHECKSUM_NONE) or that support CHECKSUM_COMPLETE are not affected. Use a safe default for CHECKSUM_UNNECESSARY by resetting to CHECKSUM_NONE and add a flag to the helper called BPF_F_ADJ_ROOM_NO_CSUM_RESET that allows users from opting out. Opting out is useful for the case where we don't remove/add full protocol headers, or for the case where a user wants to adjust the csum level manually e.g. 
through bpf_csum_level() helper that is added in subsequent patch. The bpf_skb_proto_{4_to_6,6_to_4}() for NAT64/46 translation from the BPF bpf_skb_change_proto() helper uses bpf_skb_net_hdr_{push,pop}() pair internally as well but doesn't change layers, only transitions between v4 to v6 and vice versa, therefore no adoption is required there. [0] https://lore.kernel.org/bpf/20200424185556.7358-1-lmb@cloudflare.com/ Fixes: 2be7e212d541 ("bpf: add bpf_skb_adjust_room helper") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Reported-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/bpf/CACAyw9-uU_52esMd1JjuA80fRPHJv5vsSg8GnfW3t_qDU4aVKQ@mail.gmail.com/ Link: https://lore.kernel.org/bpf/11a90472e7cce83e76ddbfce81fdfce7bfc68808.1591108731.git.daniel@iogearbox.net	2020-06-02 18:09:36 -07:00
Andrii Nakryiko	1206ab0e75	vmtest: optionally adjust selftest files depending on kernel version Some selftests can't be compiled on older kernels. This allows to fix these problems, if necessary. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	70eac9941d	Makefile: add ringbuf.o to the list of object files Add newly added ringbuf.o to the list of OBJS. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	2fdbf42f98	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: dda18a5c0b75461d1ed228f80b59c67434b8d601 Checkpoint bpf-next commit: 9a25c1df24a6fea9dc79eec950453c4e00f707fd Baseline bpf commit: f85c1598ddfe83f61d0656bd1d2025fa3b148b99 Checkpoint bpf commit: bdc48fa11e46f867ea4d75fa59ee87a7f48be144 Alexei Starovoitov (1): tools/bpf: sync bpf.h Andrii Nakryiko (3): bpf: Implement BPF ring buffer and verifier support for it libbpf: Add BPF ring buffer support libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c David Ahern (3): bpf: Add support to attach bpf program to a devmap entry xdp: Add xdp_txq_info to xdp_buff libbpf: Add SEC name for xdp programs attached to device map Eelco Chaudron (2): libbpf: Add API to consume the perf ring buffer content libbpf: Fix perf_buffer__free() API for sparse allocs Jakub Sitnicki (2): bpf: Add link-based BPF program attachment to network namespace libbpf: Add support for bpf_link-based netns attachment John Fastabend (1): bpf, sk_msg: Add get socket storage helpers include/uapi/linux/bpf.h \| 95 ++++++++++++- src/libbpf.c \| 49 ++++++- src/libbpf.h \| 24 ++++ src/libbpf.map \| 7 + src/libbpf_probes.c \| 5 + src/ringbuf.c \| 288 +++++++++++++++++++++++++++++++++++++++ 6 files changed, 461 insertions(+), 7 deletions(-) create mode 100644 src/ringbuf.c -- 2.24.1	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	365e4805a1	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-06-01 22:22:32 -07:00
Jakub Sitnicki	890f25520a	libbpf: Add support for bpf_link-based netns attachment Add bpf_program__attach_netns(), which uses LINK_CREATE subcommand to create an FD-based kernel bpf_link, for attach types tied to network namespace, that is BPF_FLOW_DISSECTOR for the moment. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-7-jakub@cloudflare.com	2020-06-01 22:22:32 -07:00
Jakub Sitnicki	fbdee96fa1	bpf: Add link-based BPF program attachment to network namespace Extend bpf() syscall subcommands that operate on bpf_link, that is LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO, to accept attach types tied to network namespaces (only flow dissector at the moment). Link-based and prog-based attachment can be used interchangeably, but only one can exist at a time. Attempts to attach a link when a prog is already attached directly, and the other way around, will be met with -EEXIST. Attempts to detach a program when link exists result in -EINVAL. Attachment of multiple links of same attach type to one netns is not supported with the intention to lift the restriction when a use-case presents itself. Because of that link create returns -E2BIG when trying to create another netns link, when one already exists. Link-based attachments to netns don't keep a netns alive by holding a ref to it. Instead links get auto-detached from netns when the latter is being destroyed, using a pernet pre_exit callback. When auto-detached, link lives in defunct state as long as there are open FDs for it. -ENOLINK is returned if a user tries to update a defunct link. Because bpf_link to netns doesn't hold a ref to struct net, special care is taken when releasing, updating, or filling link info. The netns might be getting torn down when any of these link operations are in progress. That is why auto-detach and update/release/fill_info are synchronized by the same mutex. Also, link ops have to always check if auto-detach has not happened yet and if netns is still alive (refcnt > 0). Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200531082846.2117903-5-jakub@cloudflare.com	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	f54c56be0d	libbpf: Add _GNU_SOURCE for reallocarray to ringbuf.c On systems with recent enough glibc, reallocarray compat won't kick in, so reallocarray() itself has to come from stdlib.h include. But _GNU_SOURCE is necessary to enable it. So add it. Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200601202601.2139477-1-andriin@fb.com	2020-06-01 22:22:32 -07:00
Alexei Starovoitov	8dc4b38871	tools/bpf: sync bpf.h Sync bpf.h into tool/include/uapi/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
David Ahern	ed023acd35	libbpf: Add SEC name for xdp programs attached to device map Support SEC("xdp_devmap*") as a short cut for loading the program with type BPF_PROG_TYPE_XDP and expected attach type BPF_XDP_DEVMAP. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200529220716.75383-5-dsahern@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
David Ahern	ff3116bfcb	xdp: Add xdp_txq_info to xdp_buff Add xdp_txq_info as the Tx counterpart to xdp_rxq_info. At the moment only the device is added. Other fields (queue_index) can be added as use cases arise. From a UAPI perspective, add egress_ifindex to xdp context for bpf programs to see the Tx device. Update the verifier to only allow accesses to egress_ifindex by XDP programs with BPF_XDP_DEVMAP expected attach type. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200529220716.75383-4-dsahern@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
David Ahern	65f4b3ba4c	bpf: Add support to attach bpf program to a devmap entry Add BPF_XDP_DEVMAP attach type for use with programs associated with a DEVMAP entry. Allow DEVMAPs to associate a program with a device entry by adding a bpf_prog.fd to 'struct bpf_devmap_val'. Values read show the program id, so the fd and id are a union. bpf programs can get access to the struct via vmlinux.h. The program associated with the fd must have type XDP with expected attach type BPF_XDP_DEVMAP. When a program is associated with a device index, the program is run on an XDP_REDIRECT and before the buffer is added to the per-cpu queue. At this point rxq data is still valid; the next patch adds tx device information allowing the program to see both ingress and egress device indices. XDP generic is skb based and XDP programs do not work with skb's. Block the use case by walking maps used by a program that is to be attached via xdpgeneric and fail if any of them are DEVMAP / DEVMAP_HASH with Block attach of BPF_XDP_DEVMAP programs to devices. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200529220716.75383-3-dsahern@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	e1bf7a787e	libbpf: Add BPF ring buffer support Declaring and instantiating BPF ring buffer doesn't require any changes to libbpf, as it's just another type of map. So using existing BTF-defined maps syntax with __uint(type, BPF_MAP_TYPE_RINGBUF) and __uint(max_entries, <size-of-ring-buf>) is all that's necessary to create and use BPF ring buffer. This patch adds BPF ring buffer consumer to libbpf. It is very similar to perf_buffer implementation in terms of API, but also attempts to fix some minor problems and inconveniences with existing perf_buffer API. ring_buffer supports both the single ring buffer use case (using just ring_buffer__new()) and adding more ring buffers, each with its own callback and context. This allows to efficiently poll and consume multiple, potentially completely independent, ring buffers, using single epoll instance. The latter is actually a problem in practice for applications that are using multiple sets of perf buffers. They have to create multiple instances for struct perf_buffer and poll them independently or in a loop, each approach having its own problems (e.g., inability to use a common poll timeout). struct ring_buffer eliminates this problem by aggregating many independent ring buffer instances under the single "ring buffer manager". Second, perf_buffer's callback can't return error, so applications that need to stop polling due to error in data or data signalling the end, have to use extra mechanisms to signal that polling has to stop. ring_buffer's callback can return error, which will be passed through back to user code and can be acted upon appropriately. 
Two APIs allow to consume ring buffer data: - ring_buffer__poll(), which will wait for data availability notification and will consume data only from reported ring buffer(s); this API allows to efficiently use resources by reading data only when it becomes available; - ring_buffer__consume(), will attempt to read new records regardless of data availability notification sub-system. This API is useful for cases when lowest latency is required, at the expense of burning CPU resources. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200529075424.3139988-3-andriin@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	17a6d61898	bpf: Implement BPF ring buffer and verifier support for it This commit adds a new MPSC ring buffer implementation into BPF ecosystem, which allows multiple CPUs to submit data to a single shared ring buffer. On the consumption side, only single consumer is assumed. Motivation ---------- There are two distinctive motivators for this work, which are not satisfied by existing perf buffer, which prompted creation of a new ring buffer implementation. - more efficient memory utilization by sharing ring buffer across CPUs; - preserving ordering of events that happen sequentially in time, even across multiple CPUs (e.g., fork/exec/exit events for a task). These two problems are independent, but perf buffer fails to satisfy both. Both are a result of a choice to have per-CPU perf ring buffer. Both can be also solved by having an MPSC implementation of ring buffer. The ordering problem could technically be solved for perf buffer with some in-kernel counting, but given the first one requires an MPSC buffer, the same solution would solve the second problem automatically. Semantics and APIs ------------------ Single ring buffer is presented to BPF programs as an instance of BPF map of type BPF_MAP_TYPE_RINGBUF. Two other alternatives were considered, but ultimately rejected. One way would be to, similar to BPF_MAP_TYPE_PERF_EVENT_ARRAY, make BPF_MAP_TYPE_RINGBUF represent an array of ring buffers, but not enforce "same CPU only" rule. This would be more familiar interface compatible with existing perf buffer use in BPF, but would fail if application needed more advanced logic to lookup ring buffer by arbitrary key. HASH_OF_MAPS addresses this with current approach. Additionally, given the performance of BPF ringbuf, many use cases would just opt into a simple single ring buffer shared among all CPUs, for which current approach would be an overkill. 
Another approach could introduce a new concept, alongside BPF map, to represent generic "container" object, which doesn't necessarily have key/value interface with lookup/update/delete operations. This approach would add a lot of extra infrastructure that has to be built for observability and verifier support. It would also add another concept that BPF developers would have to familiarize themselves with, new syntax in libbpf, etc. But then would really provide no additional benefits over the approach of using a map. BPF_MAP_TYPE_RINGBUF doesn't support lookup/update/delete operations, but neither do a few other map types (e.g., queue and stack; array doesn't support delete, etc). The approach chosen has an advantage of re-using existing BPF map infrastructure (introspection APIs in kernel, libbpf support, etc), being familiar concept (no need to teach users a new type of object in BPF program), and utilizing existing tooling (bpftool). For common scenario of using a single ring buffer for all CPUs, it's as simple and straightforward, as would be with a dedicated "container" object. On the other hand, by being a map, it can be combined with ARRAY_OF_MAPS and HASH_OF_MAPS map-in-maps to implement a wide variety of topologies, from one ring buffer for each CPU (e.g., as a replacement for perf buffer use cases), to a complicated application hashing/sharding of ring buffers (e.g., having a small pool of ring buffers with hashed task's tgid being a look up key to preserve order, but reduce contention). Key and value sizes are enforced to be zero. max_entries is used to specify the size of ring buffer and has to be a power of 2 value. 
There are a bunch of similarities between perf buffer (BPF_MAP_TYPE_PERF_EVENT_ARRAY) and new BPF ring buffer semantics: - variable-length records; - if there is no more space left in ring buffer, reservation fails, no blocking; - memory-mappable data area for user-space applications for ease of consumption and high performance; - epoll notifications for new incoming data; - but still the ability to do busy polling for new data to achieve the lowest latency, if necessary. BPF ringbuf provides two sets of APIs to BPF programs: - bpf_ringbuf_output() allows to copy data from one place to a ring buffer, similarly to bpf_perf_event_output(); - bpf_ringbuf_reserve()/bpf_ringbuf_commit()/bpf_ringbuf_discard() APIs split the whole process into two steps. First, a fixed amount of space is reserved. If successful, a pointer to a data inside ring buffer data area is returned, which BPF programs can use similarly to a data inside array/hash maps. Once ready, this piece of memory is either committed or discarded. Discard is similar to commit, but makes consumer ignore the record. bpf_ringbuf_output() has disadvantage of incurring extra memory copy, because record has to be prepared in some other place first. But it allows to submit records of the length that's not known to verifier beforehand. It also closely matches bpf_perf_event_output(), so will simplify migration significantly. bpf_ringbuf_reserve() avoids the extra copy of memory by providing a memory pointer directly to ring buffer memory. In a lot of cases records are larger than BPF stack space allows, so many programs have to use extra per-CPU array as a temporary heap for preparing sample. bpf_ringbuf_reserve() avoids this need completely. But in exchange, it only allows a known constant size of memory to be reserved, such that verifier can verify that BPF program can't access memory outside its reserved record space. 
bpf_ringbuf_output(), while slightly slower due to extra memory copy, covers some use cases that are not suitable for bpf_ringbuf_reserve(). The difference between commit and discard is very small. Discard just marks a record as discarded, and such records are supposed to be ignored by consumer code. Discard is useful for some advanced use-cases, such as ensuring all-or-nothing multi-record submission, or emulating temporary malloc()/free() within single BPF program invocation. Each reserved record is tracked by verifier through existing reference-tracking logic, similar to socket ref-tracking. It is thus impossible to reserve a record, but forget to submit (or discard) it. bpf_ringbuf_query() helper allows to query various properties of ring buffer. Currently 4 are supported: - BPF_RB_AVAIL_DATA returns amount of unconsumed data in ring buffer; - BPF_RB_RING_SIZE returns the size of ring buffer; - BPF_RB_CONS_POS/BPF_RB_PROD_POS returns current logical position of consumer/producer, respectively. Returned values are momentary snapshots of ring buffer state and could be off by the time helper returns, so this should be used only for debugging/reporting reasons or for implementing various heuristics, that take into account highly-changeable nature of some of those characteristics. One such heuristic might involve more fine-grained control over poll/epoll notifications about new data availability in ring buffer. Together with BPF_RB_NO_WAKEUP/BPF_RB_FORCE_WAKEUP flags for output/commit/discard helpers, it allows BPF program a high degree of control and, e.g., more efficient batched notifications. Default self-balancing strategy, though, should be adequate for most applications and will work reliably and efficiently already. 
Design and implementation ------------------------- This reserve/commit schema allows a natural way for multiple producers, either on different CPUs or even on the same CPU/in the same BPF program, to reserve independent records and work with them without blocking other producers. This means that if BPF program was interrupted by another BPF program sharing the same ring buffer, they will both get a record reserved (provided there is enough space left) and can work with it and submit it independently. This applies to NMI context as well, except that due to using a spinlock during reservation, in NMI context, bpf_ringbuf_reserve() might fail to get a lock, in which case reservation will fail even if ring buffer is not full. The ring buffer itself internally is implemented as a power-of-2 sized circular buffer, with two logical and ever-increasing counters (which might wrap around on 32-bit architectures, that's not a problem): - consumer counter shows up to which logical position consumer consumed the data; - producer counter denotes amount of data reserved by all producers. Each time a record is reserved, producer that "owns" the record will successfully advance producer counter. At that point, data is still not yet ready to be consumed, though. Each record has 8 byte header, which contains the length of reserved record, as well as two extra bits: busy bit to denote that record is still being worked on, and discard bit, which might be set at commit time if record is discarded. In the latter case, consumer is supposed to skip the record and move on to the next one. Record header also encodes record's relative offset from the beginning of ring buffer data area (in pages). This allows bpf_ringbuf_commit()/bpf_ringbuf_discard() to accept only the pointer to the record itself, without requiring also the pointer to ring buffer itself. Ring buffer memory location will be restored from record metadata header. 
This significantly simplifies verifier, as well as improving API usability. Producer counter increments are serialized under spinlock, so there is a strict ordering between reservations. Commits, on the other hand, are completely lockless and independent. All records become available to consumer in the order of reservations, but only after all previous records were already committed. It is thus possible for slow producers to temporarily hold off submitted records, that were reserved later. Reservation/commit/consumer protocol is verified by litmus tests in Documentation/litmus-test/bpf-rb. One interesting implementation bit, that significantly simplifies (and thus speeds up as well) implementation of both producers and consumers is how data area is mapped twice contiguously back-to-back in the virtual memory. This allows to not take any special measures for samples that have to wrap around at the end of the circular buffer data area, because the next page after the last data page would be first data page again, and thus the sample will still appear completely contiguous in virtual memory. See comment and a simple ASCII diagram showing this visually in bpf_ringbuf_area_alloc(). Another feature that distinguishes BPF ringbuf from perf ring buffer is self-pacing notifications of new data availability. bpf_ringbuf_commit() implementation will send a notification of new record being available after commit only if consumer has already caught up right up to the record being committed. If not, consumer still has to catch up and thus will see new data anyways without needing an extra poll notification. Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbuf.c) show that this allows to achieve a very high throughput without having to resort to tricks like "notify only every Nth sample", which are necessary with perf buffer. 
For extreme cases, when BPF program wants more manual control of notifications, commit/discard/output helpers accept BPF_RB_NO_WAKEUP and BPF_RB_FORCE_WAKEUP flags, which give full control over notifications of data availability, but require extra caution and diligence in using this API. Comparison to alternatives -------------------------- Before considering implementing BPF ring buffer from scratch existing alternatives in kernel were evaluated, but didn't seem to meet the needs. They largely fell into a few categories: - per-CPU buffers (perf, ftrace, etc), which don't satisfy two motivations outlined above (ordering and memory consumption); - linked list-based implementations; while some were multi-producer designs, consuming these from user-space would be very complicated and most probably not performant; memory-mapping contiguous piece of memory is simpler and more performant for user-space consumers; - io_uring is SPSC, but also requires fixed-sized elements. Naively turning SPSC queue into MPSC w/ lock would have subpar performance compared to locked reserve + lockless commit, as with BPF ring buffer. Fixed sized elements would be too limiting for BPF programs, given existing BPF programs heavily rely on variable-sized perf buffer already; - specialized implementations (like a new printk ring buffer, [0]) with lots of printk-specific limitations and implications, that didn't seem to fit well for intended use with BPF programs. [0] https://lwn.net/Articles/779550/ Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200529075424.3139988-2-andriin@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
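The record header and the reserve/commit protocol described in the commit above can be modeled in plain user-space C. What follows is a minimal single-threaded sketch under stated assumptions, not the kernel implementation: `toy_ringbuf`, `toy_reserve()`, `toy_finish()` and `toy_next()` are hypothetical names, the spinlock and memory barriers are left out, and a doubled array merely stands in for the double mapping of the data area. Only the flag bit positions and the 8-byte header size follow the BPF UAPI header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit positions and header size as in the BPF UAPI header. */
#define RB_BUSY_BIT    (1U << 31)
#define RB_DISCARD_BIT (1U << 30)
#define RB_HDR_SZ      8U
#define RB_SIZE        64U /* toy data area size; must be a power of 2 */

struct toy_ringbuf {
	uint64_t producer_pos; /* ever-increasing logical positions */
	uint64_t consumer_pos;
	/* Twice the size so a record starting near the end stays contiguous
	 * (a stand-in for the kernel's double mapping, not a copy of it). */
	uint8_t data[2 * RB_SIZE];
};

static uint32_t round_up8(uint32_t n)
{
	return (n + 7U) & ~7U;
}

/* Reserve len payload bytes. Returns NULL when there is no room. */
static void *toy_reserve(struct toy_ringbuf *rb, uint32_t len)
{
	uint32_t total, hdr;
	uint8_t *rec;

	assert(rb != NULL);
	if (len > RB_SIZE - RB_HDR_SZ)
		return NULL;
	total = round_up8(len + RB_HDR_SZ);
	if (rb->producer_pos + total - rb->consumer_pos > RB_SIZE)
		return NULL; /* no blocking: the reservation simply fails */

	rec = rb->data + (rb->producer_pos & (RB_SIZE - 1));
	hdr = len | RB_BUSY_BIT; /* length plus busy bit; page offset omitted */
	memcpy(rec, &hdr, sizeof(hdr));
	rb->producer_pos += total;
	return rec + RB_HDR_SZ;
}

/* Commit (discard == 0) or discard a record, given only its payload pointer. */
static void toy_finish(void *payload, int discard)
{
	uint8_t *rec = (uint8_t *)payload - RB_HDR_SZ;
	uint32_t hdr;

	memcpy(&hdr, rec, sizeof(hdr));
	hdr &= ~RB_BUSY_BIT;
	if (discard)
		hdr |= RB_DISCARD_BIT;
	memcpy(rec, &hdr, sizeof(hdr));
}

/* Return the next committed record, skipping discarded ones. Returns NULL
 * when the buffer is empty or the oldest record is still busy. The returned
 * pointer is only valid until the next reservation in this toy model. */
static const void *toy_next(struct toy_ringbuf *rb, uint32_t *len)
{
	while (rb->consumer_pos < rb->producer_pos) {
		const uint8_t *rec = rb->data + (rb->consumer_pos & (RB_SIZE - 1));
		uint32_t hdr;

		memcpy(&hdr, rec, sizeof(hdr));
		if (hdr & RB_BUSY_BIT)
			return NULL; /* records become visible in reservation order */
		*len = hdr & ~(RB_BUSY_BIT | RB_DISCARD_BIT);
		rb->consumer_pos += round_up8(*len + RB_HDR_SZ);
		if (!(hdr & RB_DISCARD_BIT))
			return rec + RB_HDR_SZ;
	}
	return NULL;
}
```

Reserving two records and committing the second one first shows the ordering rule from the commit message: `toy_next()` keeps returning NULL until the earlier record is committed or discarded.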
Eelco Chaudron	ff2322b879	libbpf: Fix perf_buffer__free() API for sparse allocs In case the cpu_bufs are sparsely allocated they are not all free'ed. These changes will fix this. Fixes: fb84b8224655 ("libbpf: add perf buffer API") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159056888305.330763.9684536967379110349.stgit@ebuild Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
John Fastabend	ab1b4f3844	bpf, sk_msg: Add get socket storage helpers Add helpers to use local socket storage. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/159033907577.12355.14740125020572756560.stgit@john-Precision-5820-Tower Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
Eelco Chaudron	df9a526f99	libbpf: Add API to consume the perf ring buffer content This new API, perf_buffer__consume, can be used as follows: - When you have a perf ring where wakeup_events is higher than 1, and you have remaining data in the rings you would like to pull out on exit (or maybe based on a timeout). - For low latency cases where you burn a CPU that constantly polls the queues. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/159048487929.89441.7465713173442594608.stgit@ebuild Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-01 22:22:32 -07:00
Andrii Nakryiko	3b23942542	ci: blacklist bpf_iter tests Disable a bunch of new kernel selftests that can't succeed on 5.5 kernel. Flatten Travis tests into a single stage to parallelize and speed them up. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-05-20 01:00:06 -07:00
Andrii Nakryiko	90941cde5f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: c321022244708aec4675de4f032ef1ba9ff0c640 Checkpoint bpf-next commit: dda18a5c0b75461d1ed228f80b59c67434b8d601 Baseline bpf commit: 7f645462ca01d01abb94d75e6768c8b3ed3a188b Checkpoint bpf commit: f85c1598ddfe83f61d0656bd1d2025fa3b148b99 Alexei Starovoitov (1): tools/bpf: sync bpf.h Andrey Ignatov (2): bpf: Support narrow loads from bpf_sock_addr.user_port bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers Daniel Borkmann (2): bpf: Add get{peer, sock}name attach types for sock_addr bpf, libbpf: Enable get{peer, sock}name attach types Eelco Chaudron (1): libbpf: Fix probe code to return EPERM if encountered Gustavo A. R. Silva (1): bpf, libbpf: Replace zero-length array with flexible-array Horatiu Vultur (1): net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN Ian Rogers (2): libbpf, hashmap: Remove unused #include libbpf, hashmap: Fix signedness warnings Quentin Monnet (1): tools, bpf: Synchronise BPF UAPI header with tools Song Liu (2): bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS libbpf: Add support for command BPF_ENABLE_STATS Stanislav Fomichev (2): bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr bpf: Allow any port in bpf_bind helper Sumanth Korikkar (1): libbpf: Fix register naming in PT_REGS s390 macros Yonghong Song (7): bpf: Allow loading of a bpf_iter program bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE bpf: Create anonymous bpf iterator bpf: Add bpf_seq_printf and bpf_seq_write helpers tools/libbpf: Add bpf_iter support tools/libpf: Add offsetof/container_of macro in bpf_helpers.h bpf: Change btf_iter func proto prefix to "bpf_iter_" include/uapi/linux/bpf.h \| 208 +++++++++++++++++++++++++++-------- include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 20 ++++ src/bpf.h \| 3 + src/bpf_helpers.h \| 14 +++ src/bpf_tracing.h \| 20 +++- src/hashmap.c \| 5 +- src/hashmap.h \| 1 - 
src/libbpf.c \| 98 +++++++++++++++-- src/libbpf.h \| 9 ++ src/libbpf.map \| 3 + src/libbpf_internal.h \| 2 +- 12 files changed, 322 insertions(+), 62 deletions(-) -- 2.24.1	2020-05-20 01:00:06 -07:00
Andrii Nakryiko	97a0d1e7b5	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-05-20 01:00:06 -07:00
Alexei Starovoitov	d650751a9b	tools/bpf: sync bpf.h Sync tools/include/uapi/linux/bpf.h from include/uapi. Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-05-20 01:00:06 -07:00
Daniel Borkmann	dcb0c5ac44	bpf, libbpf: Enable get{peer, sock}name attach types Trivial patch to add the new get{peer,sock}name attach types to the section definitions in order to hook them up to sock_addr cgroup program type. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net	2020-05-20 01:00:06 -07:00
Daniel Borkmann	2c892f1aa1	bpf: Add get{peer, sock}name attach types for sock_addr As stated in 983695fa6765 ("bpf: fix unconnected udp hooks"), the objective for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be transparent to applications. In Cilium we make use of these hooks [0] in order to enable E-W load balancing for existing Kubernetes service types for all Cilium managed nodes in the cluster. Those backends can be local or remote. The main advantage of this approach is that it operates as close as possible to the socket, and therefore allows to avoid packet-based NAT given in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses. This also allows to expose NodePort services on loopback addresses in the host namespace, for example. As another advantage, this also efficiently blocks bind requests for applications in the host namespace for exposed ports. However, one missing item is that we also need to perform reverse xlation for inet{,6}_getname() hooks such that we can return the service IP/port tuple back to the application instead of the remote peer address. The vast majority of applications does not bother about getpeername(), but in a few occasions we've seen breakage when validating the peer's address since it returns unexpectedly the backend tuple instead of the service one. Therefore, this trivial patch allows to customise and adds a getpeername() as well as getsockname() BPF cgroup hook for both IPv4 and IPv6 in order to address this situation. Simple example: # ./cilium/cilium service list ID Frontend Service Type Backend 1 1.2.3.4:80 ClusterIP 1 => 10.0.0.10:80 Before; curl's verbose output example, no getpeername() reverse xlation: # curl --verbose 1.2.3.4 * Rebuilt URL to: 1.2.3.4/ * Trying 1.2.3.4... * TCP_NODELAY set * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0) > GET / HTTP/1.1 > Host: 1.2.3.4 > User-Agent: curl/7.58.0 > Accept: / [...] 
After; with getpeername() reverse xlation: # curl --verbose 1.2.3.4 * Rebuilt URL to: 1.2.3.4/ * Trying 1.2.3.4... * TCP_NODELAY set * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0) > GET / HTTP/1.1 > Host: 1.2.3.4 > User-Agent: curl/7.58.0 > Accept: / [...] Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed peer to the context similar as in inet{,6}_getname() fashion, but API-wise this is suboptimal as it always enforces programs having to test for ctx->peer which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split. Similarly, the checked return code is on tnum_range(1, 1), but if a use case comes up in future, it can easily be changed to return an error code instead. Helper and ctx member access is the same as with connect/sendmsg/etc hooks. [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net	2020-05-20 01:00:06 -07:00
Ian Rogers	46407182c7	libbpf, hashmap: Fix signedness warnings Fixes the following warnings: hashmap.c: In function ‘hashmap__clear’: hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 150 \| for (bkt = 0; bkt < map->cap; bkt++) \ hashmap.c: In function ‘hashmap_grow’: hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 150 \| for (bkt = 0; bkt < map->cap; bkt++) \ Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200515165007.217120-4-irogers@google.com	2020-05-20 01:00:06 -07:00
Ian Rogers	a00d463bb9	libbpf, hashmap: Remove unused #include Remove #include of libbpf_internal.h that is unused. Discussed in this thread: https://lore.kernel.org/lkml/CAEf4BzZRmiEds_8R8g4vaAeWvJzPb4xYLnpF0X2VNY8oTzkphQ@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200515165007.217120-3-irogers@google.com	2020-05-20 01:00:06 -07:00
Sumanth Korikkar	d8fdd1e848	libbpf: Fix register naming in PT_REGS s390 macros Fix register naming in PT_REGS s390 macros Fixes: b8ebce86ffe6 ("libbpf: Provide CO-RE variants of PT_REGS macros") Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200513154414.29972-1-sumanthk@linux.ibm.com	2020-05-20 01:00:06 -07:00
Andrey Ignatov	b8482d74a1	bpf: Introduce bpf_sk_{, ancestor_}cgroup_id helpers With having ability to lookup sockets in cgroup skb programs it becomes useful to access cgroup id of retrieved sockets so that policies can be implemented based on origin cgroup of such socket. For example, a container running in a cgroup can have cgroup skb ingress program that can lookup peer socket that is sending packets to a process inside the container and decide whether those packets should be allowed or denied based on cgroup id of the peer. More specifically such ingress program can implement intra-host policy "allow incoming packets only from this same container and not from any other container on same host" w/o relying on source IP addresses since quite often it can be the case that containers share same IP address on the host. Introduce two new helpers for this use-case: bpf_sk_cgroup_id() and bpf_sk_ancestor_cgroup_id(). These helpers are similar to existing bpf_skb_{,ancestor_}cgroup_id helpers with the only difference that sk is used to get cgroup id instead of skb, and share code with them. See documentation in UAPI for more details. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/f5884981249ce911f63e9b57ecd5d7d19154ff39.1589486450.git.rdna@fb.com	2020-05-20 01:00:06 -07:00
Andrey Ignatov	3cd9cac8fb	bpf: Support narrow loads from bpf_sock_addr.user_port bpf_sock_addr.user_port supports only 4-byte load and it leads to ugly code in BPF programs, like: volatile __u32 user_port = ctx->user_port; __u16 port = bpf_ntohs(user_port); Since otherwise clang may optimize the load to be 2-byte and it's rejected by verifier. Add support for 1- and 2-byte loads same way as it's supported for other fields in bpf_sock_addr like user_ip4, msg_src_ip4, etc. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/c1e983f4c17573032601d0b2b1f9d1274f24bc16.1589420814.git.rdna@fb.com	2020-05-20 01:00:06 -07:00
Yonghong Song	70e6075d1d	bpf: Change btf_iter func proto prefix to "bpf_iter_" This is to be consistent with tracing and lsm programs which have prefix "bpf_trace_" and "bpf_lsm_" respectively. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200513180216.2949387-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Eelco Chaudron	d71e9baa8b	libbpf: Fix probe code to return EPERM if encountered When the probe code was failing for any reason ENOTSUP was returned, even if this was due to not having enough lock space. This patch fixes this by returning EPERM to the user application, so it can respond and increase the RLIMIT_MEMLOCK size. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/158927424896.2342.10402475603585742943.stgit@ebuild	2020-05-20 01:00:06 -07:00
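An application that sees EPERM from the probe code can react by raising its locked-memory limit, as the commit above suggests. The snippet below is a hedged sketch of that reaction using only the standard `getrlimit()`/`setrlimit()` calls; `raise_memlock_soft_limit()` is a hypothetical helper name, and it only lifts the soft limit up to the current hard limit, because going beyond the hard limit generally needs extra privileges.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>

/* Raise the soft RLIMIT_MEMLOCK to the hard limit.
 * Returns 0 on success or a negative errno value. */
static int raise_memlock_soft_limit(void)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_MEMLOCK, &lim))
		return -errno;

	/* A limit read back from the kernel has soft <= hard. */
	assert(lim.rlim_cur <= lim.rlim_max);

	lim.rlim_cur = lim.rlim_max;
	if (setrlimit(RLIMIT_MEMLOCK, &lim))
		return -errno;
	return 0;
}
```

A caller would typically invoke this once after the first EPERM and then retry the failed load or probe.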
Quentin Monnet	b41c6d34a4	tools, bpf: Synchronise BPF UAPI header with tools Synchronise the bpf.h header under tools, to report the fixes recently brought to the documentation for the BPF helpers. Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200511161536.29853-5-quentin@isovalent.com	2020-05-20 01:00:06 -07:00
Gustavo A. R. Silva	9029d18d9b	bpf, libbpf: Replace zero-length array with flexible-array The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200507185057.GA13981@embeddedor	2020-05-20 01:00:06 -07:00
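To illustrate the flexible array member described above, here is a small sketch of allocating and indexing such a struct. The `count` field and the `foo_alloc()`/`foo_set()` helpers are assumptions added for the example; only the `struct foo` shape with a trailing `array[]` comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct boo {
	int value;
};

struct foo {
	int stuff;
	size_t count;        /* assumed field: number of elements in array[] */
	struct boo array[];  /* C99 flexible array member, must come last */
};

/* Allocate a struct foo with room for `count` trailing elements. */
struct foo *foo_alloc(size_t count)
{
	struct foo *f;

	/* Guard against overflow in the size computation. */
	if (count > (SIZE_MAX - sizeof(*f)) / sizeof(f->array[0]))
		return NULL;
	f = calloc(1, sizeof(*f) + count * sizeof(f->array[0]));
	if (!f)
		return NULL;
	f->count = count;
	return f;
}

/* Store a value in element `i`, checking the bound first. */
void foo_set(struct foo *f, size_t i, int value)
{
	assert(f != NULL && i < f->count);
	f->array[i].value = value;
}
```

Note that `sizeof` is applied to the struct and to a single element, never to the flexible array member itself, which has incomplete type.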
Yonghong Song	f81f504e12	tools/libpf: Add offsetof/container_of macro in bpf_helpers.h These two helpers will be used later in bpf_iter bpf program bpf_iter_netlink.c. Put them in bpf_helpers.h since they could be useful in other cases. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175919.2477104-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
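For readers unfamiliar with the two macros named above, the following is a portable user-space sketch of the idea. It uses `offsetof` from `<stddef.h>` and a plain cast, so it is not necessarily identical to the definition added to bpf_helpers.h; `struct list_node`, `struct task_item` and `pid_from_node()` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Portable sketch: recover a pointer to the enclosing struct from a
 * pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {
	struct list_node *next;
};

struct task_item {
	int pid;
	struct list_node node; /* embedded member we hold a pointer to */
};

/* Given only the embedded node, read a field of the outer struct. */
int pid_from_node(struct list_node *n)
{
	assert(n != NULL);
	return container_of(n, struct task_item, node)->pid;
}
```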
Yonghong Song	021e35fba2	tools/libbpf: Add bpf_iter support Two new libbpf APIs are added to support bpf_iter: - bpf_program__attach_iter Given a bpf program and additional parameters, which is none now, returns a bpf_link. - bpf_iter_create syscall level API to create a bpf iterator. The macro BPF_SEQ_PRINTF are also introduced. The format looks like: BPF_SEQ_PRINTF(seq, "task id %d\n", pid); This macro can help bpf program writers with nicer bpf_seq_printf syntax similar to the kernel one. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175917.2476936-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Yonghong Song	7112841ade	bpf: Add bpf_seq_printf and bpf_seq_write helpers Two helpers bpf_seq_printf and bpf_seq_write, are added for writing data to the seq_file buffer. bpf_seq_printf supports common format string flag/width/type fields so at least I can get identical results for netlink and ipv6_route targets. For bpf_seq_printf and bpf_seq_write, return value -EOVERFLOW specifically indicates a write failure due to overflow, which means the object will be repeated in the next bpf invocation if object collection stays the same. Note that if the object collection is changed, depending how collection traversal is done, even if the object still in the collection, it may not be visited. For bpf_seq_printf, format %s, %p{i,I}{4,6} needs to read kernel memory. Reading kernel memory may fail in the following two cases: - invalid kernel address, or - valid kernel address but requiring a major fault If reading kernel memory failed, the %s string will be an empty string and %p{i,I}{4,6} will be all 0. Not returning error to bpf program is consistent with what bpf_trace_printk() does for now. bpf_seq_printf may return -EBUSY meaning that internal percpu buffer for memory copy of strings or other pointees is not available. Bpf program can return 1 to indicate it wants the same object to be repeated. Right now, this should not happen on no-RT kernels since migrate_disable(), which guards bpf prog call, calls preempt_disable(). Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175914.2476661-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Yonghong Song	940f4df57b	bpf: Create anonymous bpf iterator A new bpf command BPF_ITER_CREATE is added. The anonymous bpf iterator is seq_file based. The seq_file private data are referenced by targets. The bpf_iter infrastructure allocates additional space at seq_file->private before the space used by targets to store some meta data, e.g., prog: prog to run; session_id: a unique id for each opened seq_file; seq_num: how many times bpf programs are queried in this session; done_stop: an internal state to decide whether bpf program should be called in seq_ops->stop() or not. The seq_num will start from 0 for valid objects. The bpf program may see the same seq_num more than once if - seq_file buffer overflow happens and the same object is retried by bpf_seq_read(), or - the bpf program explicitly requests a retry of the same object. Since modules are not supported for bpf_iter, all target registration happens at __init time, so there is no need to change bpf_iter_unreg_target() as it is used mostly in error path of the init function at which time no bpf iterators have been created yet. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175905.2475770-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Yonghong Song	46c906b6d1	bpf: Support bpf tracing/iter programs for BPF_LINK_CREATE Given a bpf program, the steps to create an anonymous bpf iterator are: - create a bpf_iter_link, which combines bpf program and the target. In the future, there could be more information recorded in the link. A link_fd will be returned to the user space. - create an anonymous bpf iterator with the given link_fd. The bpf_iter_link can be pinned to bpffs mount file system to create a file based bpf iterator as well. The benefits of using bpf_iter_link: - using bpf link simplifies design and implementation as bpf link is used for other tracing bpf programs. - for file based bpf iterator, bpf_iter_link provides a standard way to replace underlying bpf programs. - for both anonymous and file based iterators, bpf link query capability can be leveraged. The patch added support of tracing/iter programs for BPF_LINK_CREATE. A new link type BPF_LINK_TYPE_ITER is added to facilitate link querying. Currently, only prog_id is needed, so no additional in-kernel show_fdinfo() and fill_link_info() hook is needed for BPF_LINK_TYPE_ITER link. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175901.2475084-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Yonghong Song	9dc3736a7f	bpf: Allow loading of a bpf_iter program A bpf_iter program is a tracing program with attach type BPF_TRACE_ITER. The load attribute attach_btf_id is used by the verifier against a particular kernel function, which represents a target, e.g., __bpf_iter__bpf_map for target bpf_map which is implemented later. The program return value must be 0 or 1 for now. 0 : successful, except potential seq_file buffer overflow which is handled by seq_file reader. 1 : request to restart the same object. In the future, other return values may be used for filtering or terminating the iterator. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200509175900.2474947-1-yhs@fb.com	2020-05-20 01:00:06 -07:00
Stanislav Fomichev	8b3cbf12a2	bpf: Allow any port in bpf_bind helper We want to have a tighter control on what ports we bind to in the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means connect() becomes slightly more expensive. The expensive part comes from the fact that we now need to call inet_csk_get_port() that verifies that the port is not used and allocates an entry in the hash table for it. Since we can't rely on "snum \|\| !bind_address_no_port" to prevent us from calling POST_BIND hook anymore, let's add another bind flag to indicate that the call site is BPF program. v5: * fix wrong AF_INET (should be AF_INET6) in the bpf program for v6 v3: * More bpf_bind documentation refinements (Martin KaFai Lau) * Add UDP tests as well (Martin KaFai Lau) * Don't start the thread, just do socket+bind+listen (Martin KaFai Lau) v2: * Update documentation (Andrey Ignatov) * Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200508174611.228805-5-sdf@google.com	2020-05-20 01:00:06 -07:00
Stanislav Fomichev	dfa07417ff	bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the 'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program. Let's generalize them and make them available for 'struct bpf_sock_addr'. That way, in the future, we can allow those helpers in more places. As an example, let's expose those 'struct bpf_sock_addr' based helpers to BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the connection is made. v3: * Expose custom helpers for bpf_sock_addr context instead of doing generic bpf_sock argument (as suggested by Daniel). Even with try_socket_lock that doesn't sleep we have a problem where context sk is already locked and socket lock is non-nestable. v2: * s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/ Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200430233152.199403-1-sdf@google.com	2020-05-20 01:00:06 -07:00
Song Liu	5c1c96c579	libbpf: Add support for command BPF_ENABLE_STATS bpf_enable_stats() is added to enable given stats. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200430071506.1408910-3-songliubraving@fb.com	2020-05-20 01:00:06 -07:00
Song Liu	83f269b088	bpf: Sharing bpf runtime stats with BPF_ENABLE_STATS Currently, sysctl kernel.bpf_stats_enabled controls BPF runtime stats. Typical userspace tools use kernel.bpf_stats_enabled as follows: 1. Enable kernel.bpf_stats_enabled; 2. Check program run_time_ns; 3. Sleep for the monitoring period; 4. Check program run_time_ns again, calculate the difference; 5. Disable kernel.bpf_stats_enabled. The problem with this approach is that only one userspace tool can toggle this sysctl. If multiple tools toggle the sysctl at the same time, the measurement may be inaccurate. To fix this problem while keep backward compatibility, introduce a new bpf command BPF_ENABLE_STATS. On success, this command enables stats and returns a valid fd. BPF_ENABLE_STATS takes argument "type". Currently, only one type, BPF_STATS_RUN_TIME, is supported. We can extend the command to support other types of stats in the future. With BPF_ENABLE_STATS, user space tool would have the following flow: 1. Get a fd with BPF_ENABLE_STATS, and make sure it is valid; 2. Check program run_time_ns; 3. Sleep for the monitoring period; 4. Check program run_time_ns again, calculate the difference; 5. Close the fd. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200430071506.1408910-2-songliubraving@fb.com	2020-05-20 01:00:06 -07:00
Horatiu Vultur	597d350e4a	net: bridge: Add port attribute IFLA_BRPORT_MRP_RING_OPEN This patch adds a new port attribute, IFLA_BRPORT_MRP_RING_OPEN, which notifies userspace when the port loses continuity of MRP frames. This attribute is set by the kernel whenever the SW or HW detects that the ring is being opened or closed. Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-05-20 01:00:06 -07:00
Andrii Nakryiko	7fc4d5025b	vmtest: add bpf_obj_id to 5.5.0 blacklist bpf_obj_id selftest added testing of bpf_link related operations, which are not implemented in 5.5.0. Blacklist it. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	bd9e2feb2a	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2fcd80144b93ff90836a44f2054b4d82133d3a85 Checkpoint bpf-next commit: c321022244708aec4675de4f032ef1ba9ff0c640 Baseline bpf commit: edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab Checkpoint bpf commit: 7f645462ca01d01abb94d75e6768c8b3ed3a188b Andrii Nakryiko (8): bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link libbpf: Add low-level APIs for new bpf_link commands libbpf: Refactor BTF-defined map definition parsing logic libbpf: Refactor map creation logic and fix cleanup leak libbpf: Add BTF-defined map-in-map support libbpf: Fix memory leak and possible double-free in hashmap__clear libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id() libbpf: Fix false uninitialized variable warning David Ahern (1): libbpf: Only check mode flags in get_xdp_id Jakub Wilk (1): bpf: Fix reStructuredText markup Maciej Żenczykowski (1): bpf: add bpf_ktime_get_boot_ns() Mao Wenan (1): libbpf: Return err if bpf_object__load failed Yoshiki Komachi (1): bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h Zou Wei (1): libbpf: Remove unneeded semicolon in btf_dump_emit_type include/uapi/linux/bpf.h \| 46 ++- src/bpf.c \| 19 +- src/bpf.h \| 4 +- src/bpf_helpers.h \| 7 + src/btf_dump.c \| 2 +- src/hashmap.c \| 7 + src/libbpf.c \| 705 +++++++++++++++++++++++++++------------ src/libbpf.map \| 6 + src/netlink.c \| 2 + 9 files changed, 572 insertions(+), 226 deletions(-) -- 2.24.1	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	814ed5011f	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	f8faf2b33d	libbpf: Fix false uninitialized variable warning Some versions of GCC falsely detect that vi might not be initialized. That's not true, but let's silence it with NULL initialization. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200430021436.1522502-1-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	3cb0b3fd52	libbpf: Fix huge memory leak in libbpf_find_vmlinux_btf_id() BTF object wasn't freed. Fixes: a6ed02cac690 ("libbpf: Load btf_vmlinux only once per object.") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: KP Singh <kpsingh@google.com> Link: https://lore.kernel.org/bpf/20200429012111.277390-9-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	edb1aaa8dc	libbpf: Fix memory leak and possible double-free in hashmap__clear Fix memory leak in hashmap_clear() not freeing hashmap_entry structs for each of the remaining entries. Also NULL-out bucket list to prevent possible double-free between hashmap__clear() and hashmap__free(). Running test_progs-asan flavor clearly showed this problem. Reported-by: Alston Tang <alston64@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200429012111.277390-5-andriin@fb.com	2020-05-01 18:58:47 -07:00
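The two problems fixed above (entries left unfreed by a clear operation, and a possible double-free when clear is followed by free) can be shown on a toy chained hash map. This is an illustrative sketch only: `struct toy_map` and all `toy_map_*` functions are invented for the example and are not libbpf's hashmap implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_entry {
	int key;
	int value;
	struct toy_entry *next;
};

struct toy_map {
	struct toy_entry **buckets;
	size_t cap;
	size_t sz;
};

struct toy_map *toy_map_new(size_t cap)
{
	struct toy_map *m;

	if (cap == 0)
		return NULL;
	m = calloc(1, sizeof(*m));
	if (!m)
		return NULL;
	m->buckets = calloc(cap, sizeof(*m->buckets));
	if (!m->buckets) {
		free(m);
		return NULL;
	}
	m->cap = cap;
	return m;
}

int toy_map_add(struct toy_map *m, int key, int value)
{
	struct toy_entry *e;
	size_t slot;

	if (!m->buckets)
		return -1; /* map was cleared, nothing to insert into */
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	slot = (size_t)(unsigned int)key % m->cap;
	e->key = key;
	e->value = value;
	e->next = m->buckets[slot];
	m->buckets[slot] = e;
	m->sz++;
	return 0;
}

void toy_map_clear(struct toy_map *m)
{
	size_t i;

	assert(m != NULL);
	/* Free every remaining entry, not just the bucket array. */
	for (i = 0; i < m->cap; i++) {
		struct toy_entry *e = m->buckets[i];

		while (e) {
			struct toy_entry *next = e->next;

			free(e);
			e = next;
		}
	}
	free(m->buckets);
	/* NULL-out the bucket list so a later clear or free is harmless. */
	m->buckets = NULL;
	m->cap = 0;
	m->sz = 0;
}

void toy_map_free(struct toy_map *m)
{
	if (!m)
		return;
	toy_map_clear(m); /* safe even if the map was already cleared */
	free(m);
}
```

Because `toy_map_clear()` resets `buckets` and `cap`, calling it twice or following it with `toy_map_free()` never touches freed memory.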
Andrii Nakryiko	f3271942dd	libbpf: Add BTF-defined map-in-map support As discussed at LPC 2019 ([0]), this patch brings (a quite belated) support for declarative BTF-defined map-in-map support in libbpf. It allows to define ARRAY_OF_MAPS and HASH_OF_MAPS BPF maps without any user-space initialization code involved. Additionally, it allows to initialize outer map's slots with references to respective inner maps at load time, also completely declaratively. Despite a weak type system of C, the way BTF-defined map-in-map definition works, it's actually quite hard to accidentally initialize outer map with incompatible inner maps. This being C, of course, it's still possible, but even that would be caught at load time and error returned with helpful debug log pointing exactly to the slot that failed to be initialized. As an example, here's a rather advanced HASH_OF_MAPS declaration and initialization example, filling slots #0 and #4 with two inner maps: #include <bpf/bpf_helpers.h> struct inner_map { __uint(type, BPF_MAP_TYPE_ARRAY); __uint(max_entries, 1); __type(key, int); __type(value, int); } inner_map1 SEC(".maps"), inner_map2 SEC(".maps"); struct outer_hash { __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS); __uint(max_entries, 5); __uint(key_size, sizeof(int)); __array(values, struct inner_map); } outer_hash SEC(".maps") = { .values = { [0] = &inner_map2, [4] = &inner_map1, }, }; Here's the relevant part of libbpf debug log showing pretty clearly of what's going on with map-in-map initialization: libbpf: .maps relo #0: for 6 value 0 rel.r_offset 96 name 260 ('inner_map1') libbpf: .maps relo #0: map 'outer_arr' slot [0] points to map 'inner_map1' libbpf: .maps relo #1: for 7 value 32 rel.r_offset 112 name 249 ('inner_map2') libbpf: .maps relo #1: map 'outer_arr' slot [2] points to map 'inner_map2' libbpf: .maps relo #2: for 7 value 32 rel.r_offset 144 name 249 ('inner_map2') libbpf: .maps relo #2: map 'outer_hash' slot [0] points to map 'inner_map2' libbpf: .maps relo #3: for 6 value 0 rel.r_offset 176 name 260 ('inner_map1') libbpf: .maps relo #3: map 'outer_hash' slot [4] points to map 'inner_map1' libbpf: map 'inner_map1': created successfully, fd=4 libbpf: map 'inner_map2': created successfully, fd=5 libbpf: map 'outer_hash': created successfully, fd=7 libbpf: map 'outer_hash': slot [0] set to map 'inner_map2' fd=5 libbpf: map 'outer_hash': slot [4] set to map 'inner_map1' fd=4 Notice from the log above that fd=6 (not logged explicitly) is used for inner "prototype" map, necessary for creation of outer map. It is destroyed immediately after outer map is created. See also included selftest with some extra comments explaining extra details of usage. Additionally, similar initialization syntax and libbpf functionality can be used to do initialization of BPF_PROG_ARRAY with references to BPF sub-programs. This can be done in follow up patches, if there will be a demand for this. [0] https://linuxplumbersconf.org/event/4/contributions/448/ Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200429002739.48006-4-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	040f73a7c7	libbpf: Refactor map creation logic and fix cleanup leak Factor out map creation and destruction logic to simplify code and especially error handling. Also fix map FD leak in case of partially successful map creation during bpf_object load operation. Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200429002739.48006-3-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	35283f89c6	libbpf: Refactor BTF-defined map definition parsing logic Factor out BTF map definition logic into stand-alone routine for easier reuse for map-in-map case. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200429002739.48006-2-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	1c4c845e79	libbpf: Add low-level APIs for new bpf_link commands Add low-level API calls for bpf_link_get_next_id() and bpf_link_get_fd_by_id(). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200429001614.1544-6-andriin@fb.com	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	2a374b5df0	bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link Add ability to fetch bpf_link details through BPF_OBJ_GET_INFO_BY_FD command. Also enhance show_fdinfo to potentially include bpf_link type-specific information (similarly to obj_info). Also introduce enum bpf_link_type stored in bpf_link itself and expose it in UAPI. bpf_link_tracing also now will store and return bpf_attach_type. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200429001614.1544-5-andriin@fb.com	2020-05-01 18:58:47 -07:00
Zou Wei	7878754030	libbpf: Remove unneeded semicolon in btf_dump_emit_type Fixes the following coccicheck warning: tools/lib/bpf/btf_dump.c:661:4-5: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1588064829-70613-1-git-send-email-zou_wei@huawei.com	2020-05-01 18:58:47 -07:00
Mao Wenan	da5aa114e2	libbpf: Return err if bpf_object__load failed bpf_object__load() can fail with various return codes; when it fails to load an object, it must return err instead of -EINVAL. Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200426063635.130680-3-maowenan@huawei.com	2020-05-01 18:58:47 -07:00
Maciej Żenczykowski	625f64a126	bpf: add bpf_ktime_get_boot_ns() On a device like a cellphone which is constantly suspending and resuming CLOCK_MONOTONIC is not particularly useful for keeping track of or reacting to external network events. Instead you want to use CLOCK_BOOTTIME. Hence add bpf_ktime_get_boot_ns() as a mirror of bpf_ktime_get_ns() based around CLOCK_BOOTTIME instead of CLOCK_MONOTONIC. Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-05-01 18:58:47 -07:00
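The difference between the two clocks mentioned above can be observed from user space with `clock_gettime()`. The sketch below is not BPF program code and does not call `bpf_ktime_get_boot_ns()`; it only shows, under the assumption of a Linux system that exposes CLOCK_BOOTTIME, how the boot-time clock keeps counting across suspend while the monotonic clock does not. The helper names are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a clock as nanoseconds; returns 0 if the clock is unavailable. */
uint64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return 0;
	assert(ts.tv_sec >= 0);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Approximate time spent suspended since boot: CLOCK_BOOTTIME includes
 * suspend time, CLOCK_MONOTONIC does not. */
uint64_t suspended_ns(void)
{
	uint64_t mono = clock_ns(CLOCK_MONOTONIC);
	uint64_t boot = clock_ns(CLOCK_BOOTTIME);

	return boot > mono ? boot - mono : 0;
}
```

On a machine that never suspends, `suspended_ns()` stays close to zero; on a phone-like device it grows with every suspend cycle, which is the gap the new helper is meant to cover on the BPF side.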
Yoshiki Komachi	ba344d9494	bpf_helpers.h: Add note for building with vmlinux.h or linux/types.h The following error was shown when a bpf program was compiled without vmlinux.h auto-generated from BTF: # clang -I./linux/tools/lib/ -I/lib/modules/$(uname -r)/build/include/ \ -O2 -Wall -target bpf -emit-llvm -c bpf_prog.c -o bpf_prog.bc ... In file included from linux/tools/lib/bpf/bpf_helpers.h:5: linux/tools/lib/bpf/bpf_helper_defs.h:56:82: error: unknown type name '__u64' ... It seems that bpf programs are intended for being built together with the vmlinux.h (which will have all the __u64 and other typedefs). But users may mistakenly think "include <linux/types.h>" is missing because the vmlinux.h is not common for non-bpf developers. IMO, an explicit comment therefore should be added to bpf_helpers.h as this patch shows. Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/1587427527-29399-1-git-send-email-komachi.yoshiki@gmail.com	2020-05-01 18:58:47 -07:00
Jakub Wilk	976e29343d	bpf: Fix reStructuredText markup The patch fixes: $ scripts/bpf_helpers_doc.py > bpf-helpers.rst $ rst2man bpf-helpers.rst > bpf-helpers.7 bpf-helpers.rst:1105: (WARNING/2) Inline strong start-string without end-string. Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200422082324.2030-1-jwilk@jwilk.net	2020-05-01 18:58:47 -07:00
David Ahern	b3da63d59d	libbpf: Only check mode flags in get_xdp_id The commit in the Fixes tag changed get_xdp_id to only return prog_id if flags is 0, but there are other XDP flags than the modes - e.g., XDP_FLAGS_UPDATE_IF_NOEXIST. Since the intention was only to look at MODE flags, clear other ones before checking if flags is 0. Fixes: f07cbad29741 ("libbpf: Fix bpf_get_link_xdp_id flags handling") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrey Ignatov <rdna@fb.com>	2020-05-01 18:58:47 -07:00
Andrii Nakryiko	902ba3fd33	README: add Debian libbpf package link Debian is now packaging libbpf from this repo. Add link to the package to README.	2020-05-01 18:20:43 -07:00
Andrii Nakryiko	cf3fc46ea8	sync: squelch annoying warning from filter-branch git command Newer git started emitting warning about dangerousness of filter-branch. Squelch it with FILTER_BRANCH_SQUELCH_WARNING=1 envvar. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-04-29 23:01:56 -07:00
Andrii Nakryiko	6a1615c263	vmtests: blacklist mmap test on 5.5 The 5.5 kernel has a bug that allows violating read-only access to an mmap()-ed map. Disable the selftest that is now failing. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-04-17 15:31:03 -07:00
Andrii Nakryiko	e66d297441	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 1a323ea5356edbb3073dc59d51b9e6b86908857d Checkpoint bpf-next commit: 2fcd80144b93ff90836a44f2054b4d82133d3a85 Baseline bpf commit: 94b18a87efdd1626a1e6aef87271af4a7c616d36 Checkpoint bpf commit: edadedf1c5b4e4404192a0a4c3c0c05e3b7672ab Andrey Ignatov (1): libbpf: Fix bpf_get_link_xdp_id flags handling Andrii Nakryiko (1): libbpf: Always specify expected_attach_type on program load if supported Jeremy Cline (1): libbpf: Initialize *nl_pid so gcc 10 is happy Toke Høiland-Jørgensen (1): libbpf: Fix type of old_fd in bpf_xdp_set_link_opts src/libbpf.c \| 126 ++++++++++++++++++++++++++++++++------------------ src/libbpf.h \| 2 +- src/netlink.c \| 6 +-- 3 files changed, 86 insertions(+), 48 deletions(-) -- 2.24.1	2020-04-17 15:31:03 -07:00
Toke Høiland-Jørgensen	632afdff45	libbpf: Fix type of old_fd in bpf_xdp_set_link_opts The 'old_fd' parameter used for atomic replacement of XDP programs is supposed to be an FD, but was left as a u32 from an earlier iteration of the patch that added it. It was converted to an int when read, so things worked correctly even with negative values, but better change the definition to correctly reflect the intention. Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200414145025.182163-1-toke@redhat.com	2020-04-17 15:31:03 -07:00
Andrii Nakryiko	6e706b38bd	libbpf: Always specify expected_attach_type on program load if supported For some types of BPF programs that utilize expected_attach_type, libbpf won't set load_attr.expected_attach_type, even if expected_attach_type is known from section definition. This was done to preserve backwards compatibility with old kernels that didn't recognize expected_attach_type attribute yet (which was added in 5e43f899b03a ("bpf: Check attach type at prog load time")). But this is problematic for some BPF programs that utilize newer features that require kernel to know specific expected_attach_type (e.g., extended set of return codes for cgroup_skb/egress programs). This patch makes libbpf specify expected_attach_type by default, but also detect support for this field in kernel and not set it during program load when the kernel does not support it. This provides good metadata for bpf_program (e.g., bpf_program__get_expected_attach_type()), while still working with old kernels (for cases where it can work at all). Additionally, due to expected_attach_type being always set for recognized program types, bpf_program__attach_cgroup doesn't have to do extra checks to determine correct attach type, so remove that additional logic. Also adjust section_names selftest to account for this change. More detailed discussion can be found in [0]. [0] https://lore.kernel.org/bpf/20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com/ Fixes: 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3") Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time") Reported-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Link: https://lore.kernel.org/bpf/20200414182645.1368174-1-andriin@fb.com	2020-04-17 15:31:03 -07:00
Andrey Ignatov	850293ba1c	libbpf: Fix bpf_get_link_xdp_id flags handling Currently if one of XDP_FLAGS_{DRV,HW,SKB}_MODE flags is passed to bpf_get_link_xdp_id() and there is a single XDP program attached to ifindex, that program's id will be returned by bpf_get_link_xdp_id() in prog_id argument no matter what mode the program is attached in, i.e. flags argument is not taken into account. For example, if there is a single program attached with XDP_FLAGS_SKB_MODE but user calls bpf_get_link_xdp_id() with flags = XDP_FLAGS_DRV_MODE, that skb program will be returned. Fix it by returning info->prog_id only if user didn't specify flags. If flags is specified then return corresponding mode-specific-field from struct xdp_link_info. The initial error was introduced in commit 50db9f073188 ("libbpf: Add a support for getting xdp prog id on ifindex") and then refactored in 473f4e133a12 so 473f4e133a12 is used in the Fixes tag. Fixes: 473f4e133a12 ("libbpf: Add bpf_get_link_xdp_info() function to get more XDP information") Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/0e9e30490b44b447bb2bebc69c7135e7fe7e4e40.1586236080.git.rdna@fb.com	2020-04-17 15:31:03 -07:00
Jeremy Cline	fb528063b2	libbpf: Initialize nl_pid so gcc 10 is happy Builds of Fedora's kernel-tools package started to fail with "may be used uninitialized" warnings for nl_pid in bpf_set_link_xdp_fd() and bpf_get_link_xdp_info() on the s390 architecture. Although libbpf_netlink_open() always returns a negative number when it does not set nl_pid, the compiler does not determine this and thus believes the variable might be used uninitialized. Assuage gcc's fears by explicitly initializing nl_pid. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1807781 Signed-off-by: Jeremy Cline <jcline@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200404051430.698058-1-jcline@redhat.com	2020-04-17 15:31:03 -07:00
Andrii Nakryiko	97ada10bd8	ci: update blacklists and Kconfig Disable some of newest selftests on 5.5.0, turn on BPF_LSM on latest. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-04-02 00:02:25 -07:00
Andrii Nakryiko	9a35753b42	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 483d7a30f538e2f8addd32aa9a3d2e94ae55fa65 Checkpoint bpf-next commit: 1a323ea5356edbb3073dc59d51b9e6b86908857d Baseline bpf commit: 94b18a87efdd1626a1e6aef87271af4a7c616d36 Checkpoint bpf commit: 94b18a87efdd1626a1e6aef87271af4a7c616d36 Andrii Nakryiko (2): bpf: Implement bpf_link-based cgroup BPF program attachment libbpf: Add support for bpf_link-based cgroup attachment Antoine Tenart (1): net: macsec: add support for offloading to the MAC Daniel Borkmann (2): bpf: Add netns cookie and enable it for bpf cgroup hooks bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id Fletcher Dunn (1): libbpf, xsk: Init all ring members in xsk_umem__create and xsk_socket__create Joe Stringer (1): bpf: Add socket assign support KP Singh (2): bpf: Introduce BPF_PROG_TYPE_LSM tools/libbpf: Add support for BPF_PROG_TYPE_LSM Mark Starovoytov (1): net: macsec: add support for specifying offload upon link creation Stanislav Fomichev (1): libbpf: Don't allocate 16M for log buffer by default Tobias Klauser (1): libbpf: Remove unused parameter `def` to get_map_field_int Toke Høiland-Jørgensen (3): tools: Add EXPECTED_FD-related definitions in if_link.h libbpf: Add function to set link XDP fd while specifying old program libbpf: Add setter for initial value for internal maps include/uapi/linux/bpf.h \| 82 ++++++++++++++++++++- include/uapi/linux/if_link.h \| 6 +- src/bpf.c \| 37 +++++++++- src/bpf.h \| 19 +++++ src/btf.c \| 20 ++++-- src/libbpf.c \| 134 +++++++++++++++++++++++++++++------ src/libbpf.h \| 22 +++++- src/libbpf.map \| 9 +++ src/libbpf_probes.c \| 1 + src/netlink.c \| 34 ++++++++- src/xsk.c \| 16 ++++- 11 files changed, 345 insertions(+), 35 deletions(-) -- 2.24.1	2020-04-02 00:02:25 -07:00
Andrii Nakryiko	c4af2093cc	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-04-02 00:02:25 -07:00
Andrii Nakryiko	1543a19f36	libbpf: Add support for bpf_link-based cgroup attachment Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API. If expected_attach_type is not specified explicitly with bpf_program__set_expected_attach_type(), libbpf will try to determine proper attach type from BPF program's section definition. Also add support for bpf_link's underlying BPF program replacement: - unconditional through high-level bpf_link__update_program() API; - cmpxchg-like with specifying expected current BPF program through low-level bpf_link_update() API. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com	2020-04-02 00:02:25 -07:00
Andrii Nakryiko	8b41602694	bpf: Implement bpf_link-based cgroup BPF program attachment Implement new sub-command to attach cgroup BPF programs and return FD-based bpf_link back on success. bpf_link, once attached to cgroup, cannot be replaced, except by owner having its FD. Cgroup bpf_link supports only BPF_F_ALLOW_MULTI semantics. Both link-based and prog-based BPF_F_ALLOW_MULTI attachments can be freely intermixed. To prevent bpf_cgroup_link from keeping cgroup alive past the point when no BPF program can be executed, implement auto-detachment of link. When cgroup_bpf_release() is called, all attached bpf_links are forced to release cgroup refcounts, but they leave bpf_link otherwise active and allocated, as well as still owning underlying bpf_prog. This is because user-space might still have FDs open and active, so bpf_link as a user-referenced object can't be freed yet. Once last active FD is closed, bpf_link will be freed and underlying bpf_prog refcount will be dropped. But cgroup refcount won't be touched, because cgroup is released already. The inherent race between bpf_cgroup_link release (from closing last FD) and cgroup_bpf_release() is resolved by both operations taking cgroup_mutex. So the only additional check required is when bpf_cgroup_link attempts to detach itself from cgroup. At that time we need to check whether there is still cgroup associated with that link. And if not, exit with success, because bpf_cgroup_link was already successfully detached. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Link: https://lore.kernel.org/bpf/20200330030001.2312810-2-andriin@fb.com	2020-04-02 00:02:25 -07:00
Joe Stringer	cecb299ac4	bpf: Add socket assign support Add support for TPROXY via a new bpf helper, bpf_sk_assign(). This helper requires the BPF program to discover the socket via a call to bpf_sk_lookup_*(), then pass this socket to the new helper. The helper takes its own reference to the socket in addition to any existing reference that may or may not currently be obtained for the duration of BPF processing. For the destination socket to receive the traffic, the traffic must be routed towards that socket via local route. The simplest example route is below, but in practice you may want to route traffic more narrowly (eg by CIDR): $ ip route add local default dev lo This patch avoids trying to introduce an extra bit into the skb->sk, as that would require more invasive changes to all code interacting with the socket to ensure that the bit is handled correctly, such as all error-handling cases along the path from the helper in BPF through to the orphan path in the input. Instead, we opt to use the destructor variable to switch on the prefetch of the socket. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz	2020-04-02 00:02:25 -07:00
KP Singh	90e89264b9	tools/libbpf: Add support for BPF_PROG_TYPE_LSM Since BPF_PROG_TYPE_LSM uses the same attaching mechanism as BPF_PROG_TYPE_TRACING, the common logic is refactored into a static function bpf_program__attach_btf_id. A new API call bpf_program__attach_lsm is still added to avoid userspace conflicts if this ever changes in the future. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: James Morris <jamorris@linux.microsoft.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-7-kpsingh@chromium.org	2020-04-02 00:02:25 -07:00
KP Singh	f69cc97272	bpf: Introduce BPF_PROG_TYPE_LSM Introduce types and configs for bpf programs that can be attached to LSM hooks. The programs can be enabled by the config option CONFIG_BPF_LSM. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Florent Revest <revest@google.com> Reviewed-by: Thomas Garnier <thgarnie@google.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: James Morris <jamorris@linux.microsoft.com> Link: https://lore.kernel.org/bpf/20200329004356.27286-2-kpsingh@chromium.org	2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen	a6e9750c8a	libbpf: Add setter for initial value for internal maps For internal maps (most notably the maps backing global variables), libbpf uses an internal mmaped area to store the data after opening the object. This data is subsequently copied into the kernel map when the object is loaded. This adds a function to set a new value for that data, which can be used before it is loaded into the kernel. This is especially relevant for RODATA maps, since those are frozen on load. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200329132253.232541-1-toke@redhat.com	2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen	60bade6674	libbpf: Add function to set link XDP fd while specifying old program This adds a new function to set the XDP fd while specifying the FD of the program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink parameter. The new function uses the opts struct mechanism to be extendable in the future. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158515700857.92963.7052131201257841700.stgit@toke.dk	2020-04-02 00:02:25 -07:00
Toke Høiland-Jørgensen	e13c1b7b85	tools: Add EXPECTED_FD-related definitions in if_link.h This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the XDP_FLAGS_REPLACE flag to if_link.h in tools/include. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/158515700747.92963.8615391897417388586.stgit@toke.dk	2020-04-02 00:02:25 -07:00
Fletcher Dunn	1d8451ccaf	libbpf, xsk: Init all ring members in xsk_umem__create and xsk_socket__create Fix a sharp edge in xsk_umem__create and xsk_socket__create. Almost all of the members of the ring buffer structs are initialized, but the "cached_xxx" variables are not all initialized. The caller is required to zero them. This is needlessly dangerous. The results if you don't do it can be very bad. For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return values greater than the size of the queue. xsk_ring_cons__peek can return an index that does not refer to an item that has been queued. I have confirmed that without this change, my program misbehaves unless I memset the ring buffers to zero before calling the function. Afterwards, my program works without (or with) the memset. Signed-off-by: Fletcher Dunn <fletcherd@valvesoftware.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/85f12913cde94b19bfcb598344701c38@valvesoftware.com	2020-04-02 00:02:25 -07:00
Daniel Borkmann	fad6e249ea	bpf: Enable bpf cgroup hooks to retrieve cgroup v2 and ancestor id Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(), recvmsg() and bind-related hooks in order to retrieve the cgroup v2 context which can then be used as part of the key for BPF map lookups, for example. Given these hooks operate in process context 'current' is always valid and pointing to the app that is performing mentioned syscalls if it's subject to a v2 cgroup. Also with same motivation of commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper") enable retrieval of ancestor from current so the cgroup id can be used for policy lookups which can then forbid connect() / bind(), for example. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net	2020-04-02 00:02:25 -07:00
Daniel Borkmann	64f7fa917c	bpf: Add netns cookie and enable it for bpf cgroup hooks In Cilium we're mainly using BPF cgroup hooks today in order to implement kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*), ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic between Cilium managed nodes. While this works in its current shape and avoids packet-level NAT for inter Cilium managed node traffic, there is one major limitation we're facing today, that is, lack of netns awareness. In Kubernetes, the concept of Pods (which hold one or multiple containers) has been built around network namespaces, so while we can use the global scope of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing NodePort ports on loopback addresses), we also have the need to differentiate between initial network namespaces and non-initial one. For example, ExternalIP services mandate that non-local service IPs are not to be translated from the host (initial) network namespace as one example. Right now, we have an ugly work-around in place where non-local service IPs for ExternalIP services are not xlated from connect() and friends BPF hooks but instead via less efficient packet-level NAT on the veth tc ingress hook for Pod traffic. On top of determining whether we're in initial or non-initial network namespace we also have a need for a socket-cookie like mechanism for network namespaces scope. Socket cookies have the nice property that they can be combined as part of the key structure e.g. for BPF LRU maps without having to worry that the cookie could be recycled. We are planning to use this for our sessionAffinity implementation for services. Therefore, add a new bpf_get_netns_cookie() helper which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would provide the cookie for the initial network namespace while passing the context instead of NULL would provide the cookie from the application's network namespace. We're using a hole, so no size increase; the assignment happens only once. Therefore this allows for a comparison on initial namespace as well as regular cookie usage as we have today with socket cookies. We could later on enable this helper for other program types as well as we would see need. (*) Both externalTrafficPolicy={Local\|Cluster} types [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net	2020-04-02 00:02:25 -07:00
Stanislav Fomichev	240b8fa098	libbpf: Don't allocate 16M for log buffer by default For each prog/btf load we allocate and free 16 megs of verifier buffer. On production systems it doesn't really make sense because the programs/btf have gone through extensive testing and (mostly) guaranteed to successfully load. Let's assume successful case by default and skip buffer allocation on the first try. If there is an error, start with BPF_LOG_BUF_SIZE and double it on each ENOSPC iteration. v3: * Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko) v2: * Don't allocate the buffer at all on the first try (Andrii Nakryiko) Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200325195521.112210-1-sdf@google.com	2020-04-02 00:02:25 -07:00
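The retry strategy described in the commit above (no log buffer on the first attempt, then a buffer that doubles on each ENOSPC) can be sketched in plain C. This is a minimal model, not libbpf's code: `load_with_lazy_log`, `load_fn`, `fake_load`, and the `LOG_BUF_START` value are hypothetical names and numbers introduced only for illustration; the commit itself starts from `BPF_LOG_BUF_SIZE`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in starting size; the commit starts from BPF_LOG_BUF_SIZE. */
#define LOG_BUF_START 4096

/* Hypothetical loader callback: returns 0 on success, negative errno on failure. */
typedef int (*load_fn)(char *log_buf, size_t log_size, void *ctx);

/* Try to load without a log buffer first; allocate and grow one only on failure.
 * On return, *final_size holds the size of the last buffer used (0 if none). */
static int load_with_lazy_log(load_fn load, void *ctx, size_t *final_size)
{
    char *buf = NULL;
    size_t size = 0;
    int err;

    assert(load != NULL && final_size != NULL);

    /* First attempt: assume success and skip the allocation entirely. */
    err = load(NULL, 0, ctx);
    if (err == 0) {
        *final_size = 0;
        return 0;
    }

    /* Failure: retry with a log buffer, doubling it while the log is truncated. */
    size = LOG_BUF_START;
    for (;;) {
        char *tmp = realloc(buf, size);

        if (tmp == NULL) {
            free(buf);
            *final_size = 0;
            return -ENOMEM;
        }
        buf = tmp;
        buf[0] = '\0';

        err = load(buf, size, ctx);
        if (err != -ENOSPC)
            break;
        size *= 2;
    }

    *final_size = size;
    free(buf);
    return err;
}

/* Fake loader for demonstration: ctx points to the log size it "needs".
 * A need of 0 means the load succeeds outright. */
static int fake_load(char *log_buf, size_t log_size, void *ctx)
{
    size_t need = *(size_t *)ctx;

    if (need == 0)
        return 0;
    if (log_buf == NULL)
        return -EINVAL;   /* load failed and there was nowhere to put the log */
    if (log_size < need)
        return -ENOSPC;   /* log truncated: the caller should grow the buffer */
    return -EINVAL;       /* load failed, full log captured */
}
```

With this model, a load that succeeds never allocates, while a failing load whose log needs 10000 bytes goes through buffers of 4096, 8192, and 16384 bytes before the real error is reported.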
Tobias Klauser	3756d20499	libbpf: Remove unused parameter `def` to get_map_field_int Has been unused since commit ef99b02b23ef ("libbpf: capture value in BTF type info for BTF-defined map defs"). Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200325113655.19341-1-tklauser@distanz.ch	2020-04-02 00:02:25 -07:00
Mark Starovoytov	9e8b23289f	net: macsec: add support for specifying offload upon link creation This patch adds new netlink attribute to allow a user to (optionally) specify the desired offload mode immediately upon MACSec link creation. Separate iproute patch will be required to support this from user space. Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-04-02 00:02:25 -07:00
Antoine Tenart	902eca48e5	net: macsec: add support for offloading to the MAC This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC, allowing a user to select a MAC as a provider for MACsec offloading operations. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-04-02 00:02:25 -07:00
Andrii Nakryiko	9f0d55c24a	vmtests: organize blacklists, enable sockmap_listen tests Enable now-fixed sockmap_listen tests. Disabled vmlinux test on 5.5, on which hrtimer_nanosleep() signature is incompatible. Filled out reasons for the remaining permanently disabled tests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-03-17 14:56:36 -07:00
Andrii Nakryiko	e53dd1c436	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 9b79c0be350d3825ef26ed9eebac6ae50df506bc Checkpoint bpf-next commit: 483d7a30f538e2f8addd32aa9a3d2e94ae55fa65 Baseline bpf commit: 90db6d772f749e38171d04619a5e3cd8804a6d02 Checkpoint bpf commit: 94b18a87efdd1626a1e6aef87271af4a7c616d36 Andrii Nakryiko (2): libbpf: Ignore incompatible types with matching name during CO-RE relocation libbpf: Provide CO-RE variants of PT_REGS macros Wenbo Zhang (1): bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition src/bpf_tracing.h \| 105 +++++++++++++++++++++++++++++++++++++++++++++- src/libbpf.c \| 4 ++ 2 files changed, 108 insertions(+), 1 deletion(-) -- 2.17.1	2020-03-17 14:56:36 -07:00
Andrii Nakryiko	da790d6014	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-03-17 14:56:36 -07:00
Wenbo Zhang	3d81b13b36	bpf, libbpf: Fix ___bpf_kretprobe_args1(x) macro definition Use PT_REGS_RC instead of PT_REGS_RET to get ret correctly. Fixes: df8ff35311c8 ("libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h") Signed-off-by: Wenbo Zhang <ethercflow@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200315083252.22274-1-ethercflow@gmail.com	2020-03-17 14:56:36 -07:00
Andrii Nakryiko	64bd9e074b	libbpf: Provide CO-RE variants of PT_REGS macros Syscall raw tracepoints have struct pt_regs pointer as tracepoint's first argument. After that, reading any of pt_regs fields requires bpf_probe_read(), even for tp_btf programs. Due to that, PT_REGS_PARMx macros are not usable as is. This patch adds CO-RE variants of those macros that use BPF_CORE_READ() to read necessary fields. This provides relocatable architecture-agnostic pt_regs field accesses. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-4-andriin@fb.com	2020-03-17 14:56:36 -07:00
Andrii Nakryiko	53d473dd8e	libbpf: Ignore incompatible types with matching name during CO-RE relocation When finding target type candidates, ignore forward declarations, functions, and other named types of incompatible kind. Not doing this can cause false errors. See [0] for one such case (due to struct pt_regs forward declaration). [0] https://github.com/iovisor/bcc/pull/2806#issuecomment-598543645 Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: Wenbo Zhang <ethercflow@gmail.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200313172336.1879637-3-andriin@fb.com	2020-03-17 14:56:36 -07:00
Andrii Nakryiko	6d64d927a2	vmtests: enable previously failing kprobe selftests With fixes in selftests, these tests should now pass. Also add ability to add comments to blacklist/whitelist to explain why certain test is disabled. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-03-12 22:57:51 -07:00
Andrii Nakryiko	cd87f1568e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: abbc61a5f26d52a5d3abbbe552b275360b2c6631 Checkpoint bpf-next commit: 9b79c0be350d3825ef26ed9eebac6ae50df506bc Baseline bpf commit: 542bf38f11d11bf98c69b2f83f3519ada8a76e95 Checkpoint bpf commit: 90db6d772f749e38171d04619a5e3cd8804a6d02 Andrii Nakryiko (4): libbpf: Fix handling of optional field_name in btf_dump__emit_type_decl bpf: Switch BPF UAPI #define constants used from BPF program side to enums libbpf: Assume unsigned values for BTF_KIND_ENUM libbpf: Split BTF presence checks into libbpf- and kernel-specific parts Carlos Neira (1): bpf: Added new helper bpf_get_ns_current_pid_tgid Eelco Chaudron (1): bpf: Add bpf_xdp_output() helper KP Singh (2): bpf: Introduce BPF_MODIFY_RETURN tools/libbpf: Add support for BPF_MODIFY_RETURN Willem de Bruijn (1): bpf: Sync uapi bpf.h to tools/ include/uapi/linux/bpf.h \| 223 +++++++++++++++++++++++++++------------ src/btf_dump.c \| 10 +- src/libbpf.c \| 21 +++- 3 files changed, 176 insertions(+), 78 deletions(-) -- 2.17.1	2020-03-12 22:57:51 -07:00
Andrii Nakryiko	c417a4cb6f	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-03-12 22:57:51 -07:00
Eelco Chaudron	fa21d33fff	bpf: Add bpf_xdp_output() helper Introduce new helper that reuses existing xdp perf_event output implementation, but can be called from raw_tracepoint programs that receive 'struct xdp_buff *' as a tracepoint argument. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/158348514556.2239.11050972434793741444.stgit@xdp-tutorial	2020-03-12 22:57:51 -07:00
Carlos Neira	84cf76de9c	bpf: Added new helper bpf_get_ns_current_pid_tgid Add a new BPF helper, bpf_get_ns_current_pid_tgid. This helper returns the pid and tgid of the current task whose namespace matches the provided dev_t and inode number, which allows us to instrument a process inside a container. Signed-off-by: Carlos Neira <cneirabustos@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200304204157.58695-3-cneirabustos@gmail.com	2020-03-12 22:57:51 -07:00
Andrii Nakryiko	2ef4fdac6c	libbpf: Split BTF presence checks into libbpf- and kernel-specific parts The need for application BTF to be present differs between user-space libbpf and the kernel. Currently, BTF is mandatory in the kernel only when the BPF application is using STRUCT_OPS, while libbpf itself relies more heavily on the presence of BTF: - for BTF-defined maps; - for Kconfig externs; - for STRUCT_OPS as well. Thus, checks for presence and validity of bpf_object's BTF need to be performed separately, which is what this patch does. Fixes: 5327644614a1 ("libbpf: Relax check whether BTF is mandatory") Reported-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Cc: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200312185033.736911-1-andriin@fb.com	2020-03-12 22:57:51 -07:00
KP Singh	1d72c9c382	tools/libbpf: Add support for BPF_MODIFY_RETURN Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200304191853.1529-6-kpsingh@chromium.org	2020-03-12 22:57:51 -07:00
KP Singh	7930230b43	bpf: Introduce BPF_MODIFY_RETURN When multiple programs are attached, each program receives the return value from the previous program on the stack and the last program provides the return value to the attached function. The fmod_ret bpf programs are run after the fentry programs and before the fexit programs. The original function is only called if all the fmod_ret programs return 0 to avoid any unintended side-effects. The success value, i.e. 0 is not currently configurable but can be made so where user-space can specify it at load time. For example: int func_to_be_attached(int a, int b) { <--- do_fentry do_fmod_ret: <update ret by calling fmod_ret> if (ret != 0) goto do_fexit; original_function: <side_effects_happen_here> } <--- do_fexit The fmod_ret program attached to this function can be defined as: SEC("fmod_ret/func_to_be_attached") int BPF_PROG(func_name, int a, int b, int ret) { // This will skip the original function logic. return 1; } The first fmod_ret program is passed 0 in its return argument. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200304191853.1529-4-kpsingh@chromium.org	2020-03-12 22:57:51 -07:00
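The control flow in the commit above can be modelled in ordinary user-space C to make the ordering concrete. This is only a sketch of the described semantics, not kernel or libbpf code: `call_with_fmod_ret`, `fmod_ret_fn`, `pass_through`, `skip_original`, and the `original_calls` counter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature of an fmod_ret program: the attached function's
 * arguments plus the return value produced by the previous program. */
typedef int (*fmod_ret_fn)(int a, int b, int ret);

/* Counts how often the original function body (the side effects) ran. */
static int original_calls;

static int original_function(int a, int b)
{
    original_calls++;
    return a + b;
}

/* Returns 0, so the chain continues and the original function may run. */
static int pass_through(int a, int b, int ret)
{
    (void)a; (void)b; (void)ret;
    return 0;
}

/* Returns non-zero, so the original function is skipped. */
static int skip_original(int a, int b, int ret)
{
    (void)a; (void)b; (void)ret;
    return 1;
}

/* Models the flow from the commit message: fentry, then the fmod_ret
 * programs in order, then the original function only if every program
 * returned 0, then fexit. */
static int call_with_fmod_ret(const fmod_ret_fn *progs, size_t n, int a, int b)
{
    int ret = 0; /* the first fmod_ret program is passed 0 */

    /* fentry programs would run here */
    for (size_t i = 0; i < n; i++) {
        ret = progs[i](a, b, ret);
        if (ret != 0)
            goto do_fexit;
    }
    ret = original_function(a, b);

do_fexit:
    /* fexit programs would run here */
    return ret;
}
```

Calling `call_with_fmod_ret` with only `pass_through` attached runs the original function and returns its result; adding `skip_original` to the chain returns 1 to the caller without the original function's side effects.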
Andrii Nakryiko	483a8c238f	libbpf: Assume unsigned values for BTF_KIND_ENUM Currently, BTF_KIND_ENUM type doesn't record whether enum values should be interpreted as signed or unsigned. In Linux, most enums are unsigned, though, so interpreting them as unsigned matches real world better. Change btf_dump test case to test maximum 32-bit value, instead of negative value. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200303003233.3496043-3-andriin@fb.com	2020-03-12 22:57:51 -07:00
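The effect of the change above can be illustrated with a small, self-contained C sketch. It assumes the enumerator value is held in a signed 32-bit field and shows how reinterpreting it as unsigned turns a stored -1 into the maximum 32-bit value. `format_enum_unsigned` is a hypothetical helper written for this example, not a libbpf function.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Formats a 32-bit enumerator value as unsigned, which is the
 * interpretation the commit adopts. Returns the number of characters
 * that snprintf would write, excluding the terminating NUL. */
static int format_enum_unsigned(int32_t val, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRIu32, (uint32_t)val);
}
```

Under this interpretation a stored value of -1 is emitted as the ten-digit maximum 32-bit value instead of a negative number, which matches the commit's note that the btf_dump test case now checks the maximum 32-bit value.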
Andrii Nakryiko	26cbe2384c	bpf: Switch BPF UAPI #define constants used from BPF program side to enums Switch BPF UAPI constants, previously defined as #define macro, to anonymous enum values. This preserves constants values and behavior in expressions, but has the added advantage of being captured as part of DWARF and, subsequently, BTF type info. Which, in turn, greatly improves usefulness of generated vmlinux.h for BPF applications, as it will not require BPF users to copy/paste various flags and constants, which are frequently used with BPF helpers. Only those constants that are used/useful from BPF program side are converted. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200303003233.3496043-2-andriin@fb.com	2020-03-12 22:57:51 -07:00
Andrii Nakryiko	cb4a430c8a	libbpf: Fix handling of optional field_name in btf_dump__emit_type_decl Internal functions, used by btf_dump__emit_type_decl(), assume field_name is never going to be NULL. Ensure it's always the case. Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200303180800.3303471-1-andriin@fb.com	2020-03-12 22:57:51 -07:00
Willem de Bruijn	f67d535cdb	bpf: Sync uapi bpf.h to tools/ sync tools/include/uapi/linux/bpf.h to match include/uapi/linux/bpf.h Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200303200503.226217-3-willemdebruijn.kernel@gmail.com	2020-03-12 22:57:51 -07:00
Julia Kartseva	ef4785f065	vmtest: libbpf#137 follow-ups - Run test_{maps\|verifier} only with the latest kernel - Mount run control script - Style Signed-off-by: Julia Kartseva (hex@fb.com)	2020-03-12 21:36:30 -07:00
Andrii Nakryiko	9a424bea42	vmtests: add few missing Kconfig settings Add few missing Kconfig settings that might be relied on in selftests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-03-11 14:44:28 -07:00
Julia Kartseva	10e4311ad7	vmtest: add mkrootfs.sh to build Arch Linux disk image Generate a disk image for libbpf testing in compressed *.zst format. The mkrootfs.sh has the following stages: - run pacstrap to install libbpf and selftests dependencies. - create /etc/fstab w/ bpffs and debugfs filesystems - create /etc/init.d/rcS to mount at boot time - create /etc/inittab to invoke /etc/init.d/rcS - compress an image In addition ./travis-ci/vmtest/run.sh sets up ext4 fs and mounts it as a loop device: mkfs.ext4 -q "$tmp" mount -o loop "$tmp" "$mnt" Signed-off-by: Julia Kartseva (hex@fb.com)	2020-03-11 08:31:13 -07:00
Julia Kartseva	50febacba1	vmtest: disk image update; run test_{maps\|verifier}; blacklist update The disk image is updated to 2020-03-11. blacklist for LATEST kernel: attach_probe (needs root cause) perf_buffer (needs root cause) send_signal (flaky) sockmap_listen (flaky) Run test_maps and test_verifier. test_maps is not expected to pass for kernels other than LATEST. Signed-off-by: Julia Kartseva (hex@fb.com)	2020-03-11 08:31:13 -07:00
Andrii Nakryiko	ef7d57fcec	vmtest: blacklist link_pinning selftest on 5.5.0 Link pinning is not supported by 5.5.0 and older kernels. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-03-03 00:05:56 -08:00
Andrii Nakryiko	7e7a15321e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 503d539a6e417b018616bf3060e0b5814fafce47 Checkpoint bpf-next commit: abbc61a5f26d52a5d3abbbe552b275360b2c6631 Baseline bpf commit: 41f57cfde186dba6e357f9db25eafbed017e4487 Checkpoint bpf commit: 542bf38f11d11bf98c69b2f83f3519ada8a76e95 Andrii Nakryiko (3): libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h libbpf: Add bpf_link pinning/unpinning src/bpf_tracing.h \| 120 +++++++++++++++++++++++++++++++++++++++++- src/libbpf.c \| 131 ++++++++++++++++++++++++++++++++++++---------- src/libbpf.h \| 5 ++ src/libbpf.map \| 5 ++ 4 files changed, 233 insertions(+), 28 deletions(-) -- 2.17.1	2020-03-03 00:05:56 -08:00
Andrii Nakryiko	77ac09c3eb	libbpf: Add bpf_link pinning/unpinning With bpf_link abstraction supported by kernel explicitly, add pinning/unpinning API for links. Also allow to create (open) bpf_link from BPF FS file. This API allows to have an "ephemeral" FD-based BPF links (like raw tracepoint or fexit/freplace attachments) surviving user process exit, by pinning them in a BPF FS, which is an important use case for long-running BPF programs. As part of this, expose underlying FD for bpf_link. While legacy bpf_link's might not have a FD associated with them (which will be expressed as a bpf_link with fd=-1), kernel's abstraction is based around FD-based usage, so match it closely. This, subsequently, allows to have a generic pinning/unpinning API for generalized bpf_link. For some types of bpf_links kernel might not support pinning, in which case bpf_link__pin() will return error. With FD being part of generic bpf_link, also get rid of bpf_link_fd in favor of using vanilla bpf_link. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200303043159.323675-3-andriin@fb.com	2020-03-03 00:05:56 -08:00
Andrii Nakryiko	40a08ef216	libbpf: Merge selftests' bpf_trace_helpers.h into libbpf's bpf_tracing.h Move BPF_PROG, BPF_KPROBE, and BPF_KRETPROBE macro into libbpf's bpf_tracing.h header to make it available for non-selftests users. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200229231112.1240137-5-andriin@fb.com	2020-03-03 00:05:56 -08:00
Andrii Nakryiko	b6683d1aeb	libbpf: Fix use of PT_REGS_PARM macros with vmlinux.h Add detection of vmlinux.h to bpf_tracing.h header for PT_REGS macro. Currently, BPF applications have to define __KERNEL__ symbol to use correct definition of struct pt_regs on x86 arch. This is due to different field names under internal kernel vs UAPI conditions. To make this more transparent for users, detect vmlinux.h by checking __VMLINUX_H__ symbol. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200229231112.1240137-3-andriin@fb.com	2020-03-03 00:05:56 -08:00
Julia Kartseva	5247b0b0dc	vmtest: enable more networking kernel selftests Set up loopback to enable more tests: - bpf_tcp_ca - cgroup_attach_autodetach - cgroup_attach_multi - cgroup_attach_override - select_reuseport - sockmap_ktls Signed-off-by: Julia Kartseva hex@fb.com	2020-02-26 14:02:34 -08:00
Andrii Nakryiko	c2b01ad4f3	vmtest: trim down kernel config to minimize build time Remove unnecessary drivers and features to speed up kernel compilation. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-26 12:08:31 -08:00
Andrii Nakryiko	c4468dec74	sync: bump kernel commit to latest to pull in latest selftests Manually bump sync commit from kernel repo. There are no libbpf changes, but we need latest selftest patches to try to debug more of crashing selftests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-25 20:01:06 -08:00
Andrii Nakryiko	40229b3ffd	ci: enable more test_progs tests Trim tests blacklist. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-22 09:20:41 -08:00
Andrii Nakryiko	7f2d538c27	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5327644614a18f5d0ff845844a4e9976210b3d8d Checkpoint bpf-next commit: 8eece07c011f88da0ccf4127fca9a4e4faaf58ae Baseline bpf commit: 41f57cfde186dba6e357f9db25eafbed017e4487 Checkpoint bpf commit: 41f57cfde186dba6e357f9db25eafbed017e4487 Eelco Chaudron (2): libbpf: Bump libpf current version to v0.0.8 libbpf: Add support for dynamic program attach target src/libbpf.c \| 34 ++++++++++++++++++++++++++++++---- src/libbpf.h \| 4 ++++ src/libbpf.map \| 5 +++++ 3 files changed, 39 insertions(+), 4 deletions(-) -- 2.17.1	2020-02-22 09:20:41 -08:00
Eelco Chaudron	b7c162a433	libbpf: Add support for dynamic program attach target Currently when you want to attach a trace program to a bpf program the section name needs to match the tracepoint/function semantics. However the addition of the bpf_program__set_attach_target() API allows you to specify the tracepoint/function dynamically. The call flow would look something like this: xdp_fd = bpf_prog_get_fd_by_id(id); trace_obj = bpf_object__open_file("func.o", NULL); prog = bpf_object__find_program_by_title(trace_obj, "fentry/myfunc"); bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY); bpf_program__set_attach_target(prog, xdp_fd, "xdpfilt_blk_all"); bpf_object__load(trace_obj) Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/158220519486.127661.7964708960649051384.stgit@xdp-tutorial	2020-02-22 09:20:41 -08:00
Eelco Chaudron	36c26f12f1	libbpf: Bump libpf current version to v0.0.8 New development cycle starts, bump to v0.0.8. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/158220518424.127661.8278643006567775528.stgit@xdp-tutorial	2020-02-22 09:20:41 -08:00
Andrii Nakryiko	22d5d40493	ci: fetch and build latest pahole Build latest pahole from sources and not rely on hacky Ubuntu repository approach. Also enable tests for latest kernel that rely on pahole 1.16. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-21 20:39:35 -08:00
Andrii Nakryiko	17c26b7da6	ci: clean up .travis.yaml Clean up Travis CI config, extract multi-step initializations into scripts. Also, move kernel-building tests to happen last to not block lightweight Debian and Ubuntu tests. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-21 20:39:35 -08:00
Andrii Nakryiko	e287979374	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 35b9211c0a2427e8f39e534f442f43804fc8d5ca Checkpoint bpf-next commit: 5327644614a18f5d0ff845844a4e9976210b3d8d Baseline bpf commit: 08dc225d8868d5094ada62f471ebdfcce9dbc298 Checkpoint bpf commit: 41f57cfde186dba6e357f9db25eafbed017e4487 Andrii Nakryiko (1): libbpf: Relax check whether BTF is mandatory Daniel Xu (1): selftests/bpf: Add bpf_read_branch_records() selftest Toke Høiland-Jørgensen (2): bpf, uapi: Remove text about bpf_redirect_map() giving higher performance libbpf: Sanitise internal map names so they are not rejected by the kernel include/uapi/linux/bpf.h \| 41 ++++++++++++++++++++++++++++++---------- src/libbpf.c \| 12 ++++++++---- 2 files changed, 39 insertions(+), 14 deletions(-) -- 2.17.1	2020-02-20 17:56:42 -08:00
Andrii Nakryiko	552af3d963	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-02-20 17:56:42 -08:00
Toke Høiland-Jørgensen	c772c9cbde	libbpf: Sanitise internal map names so they are not rejected by the kernel The kernel only accepts map names with alphanumeric characters, underscores and periods in their name. However, the auto-generated internal map names used by libbpf takes their prefix from the user-supplied BPF object name, which has no such restriction. This can lead to "Invalid argument" errors when trying to load a BPF program using global variables. Fix this by sanitising the map names, replacing any non-allowed characters with underscores. Fixes: d859900c4c56 ("bpf, libbpf: support global data/bss/rodata sections") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200217171701.215215-1-toke@redhat.com	2020-02-20 17:56:42 -08:00
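The sanitising step described in the commit above can be sketched as follows. This is a minimal illustration rather than libbpf's actual code: `sanitize_map_name` is a hypothetical name, and the allowed character set (alphanumerics, underscore, period) is taken from the commit message.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not libbpf's real function: replace every
 * character the kernel would reject in a map name with '_'.
 * Allowed characters are alphanumerics, '_' and '.', as stated in
 * the commit message. The string is modified in place. */
static void sanitize_map_name(char *name)
{
	for (char *p = name; *p != '\0'; p++) {
		unsigned char c = (unsigned char)*p;

		if (!isalnum(c) && c != '_' && c != '.')
			*p = '_';
	}
}
```

With this sketch, an object-derived name such as `my-obj.bss` would become `my_obj.bss`, which uses only characters the kernel accepts.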
Toke Høiland-Jørgensen	031a38cceb	bpf, uapi: Remove text about bpf_redirect_map() giving higher performance The performance of bpf_redirect() is now roughly the same as that of bpf_redirect_map(). However, David Ahern pointed out that the header file has not been updated to reflect this, and still says that a significant performance increase is possible when using bpf_redirect_map(). Remove this text from the bpf_redirect_map() description, and reword the description in bpf_redirect() slightly. Also fix the 'Return' section of the bpf_redirect_map() documentation. Fixes: 1d233886dd90 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com	2020-02-20 17:56:42 -08:00
Andrii Nakryiko	6ff5062480	libbpf: Relax check whether BTF is mandatory If BPF program is using BTF-defined maps, BTF is required only for libbpf itself to process map definitions. If after that BTF fails to be loaded into kernel (e.g., if it doesn't support BTF at all), this shouldn't prevent valid BPF program from loading. Existing retry-without-BTF logic for creating maps will succeed to create such maps without any problems. So, presence of .maps section shouldn't make BTF required for kernel. Update the check accordingly. Validated by ensuring simple BPF program with BTF-defined maps is still loaded on old kernel without BTF support and map is correctly parsed and created. Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF") Reported-by: Julia Kartseva <hex@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200220062635.1497872-1-andriin@fb.com	2020-02-20 17:56:42 -08:00
Daniel Xu	fdff85e63e	selftests/bpf: Add bpf_read_branch_records() selftest Add a selftest to test: * default bpf_read_branch_records() behavior * BPF_F_GET_BRANCH_RECORDS_SIZE flag behavior * error path on non branch record perf events * using helper to write to stack * using helper to write to global On host with hardware counter support: # ./test_progs -t perf_branches #27/1 perf_branches_hw:OK #27/2 perf_branches_no_hw:OK #27 perf_branches:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED On host without hardware counter support (VM): # ./test_progs -t perf_branches #27/1 perf_branches_hw:OK #27/2 perf_branches_no_hw:OK #27 perf_branches:OK Summary: 1/2 PASSED, 1 SKIPPED, 0 FAILED Also sync tools/include/uapi/linux/bpf.h. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200218030432.4600-3-dxu@dxuuu.xyz	2020-02-20 17:56:42 -08:00
Andrii Nakryiko	5c7661fd5e	vmtest: update and sort blacklists Update blacklists to omit some of the newest selftests. Also ensure that blacklist is sorted alphabetically. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-20 17:56:42 -08:00
Andrii Nakryiko	1feb21b081	vmtest: remove temporary runqslower fix It's now in bpf-next and this work around is not needed anymore. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-20 11:48:22 -08:00
Andrii Nakryiko	fa8cb316fb	sync: fix commit signature determination in sync script Commit signature, used to determine already synced commits, includes a short stats per each file relevant. Fix this script to include only files that are actually synced (i.e., exclude Makefile, Build file, etc). Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-02-20 11:12:59 -08:00
Julia Kartseva	f72fe00e70	vmtest: #121 follow-ups. Loop increase bpf-next git fetch depth - The previously introduced git fetch depth of bpf-next tree is not sufficient when bpf-next tree is far ahead from libbpf checkpoint commit, so increase the depth up to 128 max. Since 128 may be an overkill for a general case, increase exponentially in a loop until max is reached. - Do not fetch bpf-next twice - Remove setup_example.sh	2020-02-19 15:01:47 -08:00
Julia Kartseva	583bddce6b	vmtest: build and run bpf kernel selftests against various kernels Run kernel selftests in vmtest with the goal to test libbpf backward compatibility with older kernels. The list of kernels should be specified in .travis.yml config in `jobs` section, e.g. KERNEL=5.5.0. Enlisted kernel releases - 5.5.0 # built from main - 5.5.0-rc6 # built from bpf-next - LATEST The kernel specified as 'LATEST' in .travis.yml is built from bpf-next kernel tree, the rest of the kernels are downloaded from the locations specified in the INDEX file. The kernel sources from bpf-next are manually patched with [1] from bpf tree to fix runqslower build. This workaround should be removed after the patch is merged from bpf to bpf-next tree. Due to kernel sources being checked out the duration of the LATEST kernel test is ~30m. bpf selftests are built from tools/testing/selftests/bpf/ of bpf-next tree with HEAD revision set to CHECKPOINT-COMMIT specified in libbpf so selftests and libbpf are in sync. Currently only programs are tested with test_progs program, test_maps and test_verifier should follow. test_progs are run with blacklist required due to: - some features, e.g. fentry/fexit are not supported in older kernels - environment limitations, e.g. an absence of the recent pahole in Debian - incomplete disk image The blacklist is passed to test_progs with -b option as specified in [2] patch set. Most of the preceding tests are disabled due to incomplete disk image currently lacking proper networking settings. For the LATEST kernel some fentry/fexit tests are disabled because pahole v1.16 is not available in Debian yet. Next steps are resolving issues with blacklisted tests, enabling maps and verifier testing, expanding the list of tested kernels. [1] https://lore.kernel.org/bpf/908498f794661c44dca54da9e09dc0c382df6fcb.1580425879.git.hex@fb.com/t.mbox.gz [2] https://www.spinics.net/lists/netdev/msg625192.html	2020-02-17 22:12:17 -08:00
Julia Kartseva	a52fb86a96	vmtest: add configs for bpf kernel selftests vmtest is run as a TravisCI job in order to test libbpf backward compatibility with the older kernels Add config files required to build and run bpf kernel selftests in vmtest: - latest.config: latest kernel config - INDEX: links to binaries (kernels, disk image) to download - blacklist/BLACKLIST-${kernel}: blacklisted bpf program tests for ${kernel}	2020-02-17 22:12:17 -08:00
Andrii Nakryiko	e5dbc1a96f	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a6ed02cac690b635dbb938690e795812ce1e14ca Checkpoint bpf-next commit: 35b9211c0a2427e8f39e534f442f43804fc8d5ca Baseline bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Checkpoint bpf commit: 08dc225d8868d5094ada62f471ebdfcce9dbc298 Alexei Starovoitov (1): libbpf: Add support for program extensions Andrii Nakryiko (2): libbpf: Improve handling of failed CO-RE relocations libbpf: Fix realloc usage in bpf_core_find_cands Antoine Tenart (1): net: macsec: introduce the macsec_context structure Martin KaFai Lau (1): bpf: Sync uapi bpf.h to tools/ include/uapi/linux/bpf.h \| 10 +++- include/uapi/linux/if_link.h \| 7 +++ src/bpf.c \| 3 +- src/libbpf.c \| 112 +++++++++++++++++++++-------------- src/libbpf.h \| 8 ++- src/libbpf.map \| 2 + src/libbpf_probes.c \| 1 + 7 files changed, 97 insertions(+), 46 deletions(-) -- 2.17.1	2020-01-24 14:08:27 -08:00
Andrii Nakryiko	96333403ca	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-01-24 14:08:27 -08:00
Andrii Nakryiko	928f2fc146	libbpf: Fix realloc usage in bpf_core_find_cands Fix bug requesting invalid size of reallocated array when constructing CO-RE relocation candidate list. This can cause problems if there are many potential candidates and a very fine-grained memory allocator bucket sizes are used. Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: William Smith <williampsmith@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com	2020-01-24 14:08:27 -08:00
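The commit above does not show the diff, so the following is only a sketch of the general pitfall it names: `realloc()` takes a size in bytes, and passing an element count under-allocates the array. The `id_list` struct and `id_list_append` function are hypothetical stand-ins, not libbpf's candidate-list code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable list of 32-bit ids. */
struct id_list {
	uint32_t *data;
	size_t len;
};

/* Append one id, growing the array by one element.
 * Returns 0 on success, -1 on allocation failure. */
static int id_list_append(struct id_list *l, uint32_t id)
{
	uint32_t *tmp;

	/* The size argument is in BYTES: element count times element
	 * size. Passing just (l->len + 1) would request too little. */
	tmp = realloc(l->data, (l->len + 1) * sizeof(*l->data));
	if (!tmp)
		return -1; /* l->data is untouched and still valid */

	l->data = tmp;
	l->data[l->len++] = id;
	return 0;
}
```

Assigning the result to a temporary first keeps the original pointer usable if the allocation fails.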
Andrii Nakryiko	8fd8b5bb46	libbpf: Improve handling of failed CO-RE relocations Previously, if libbpf failed to resolve CO-RE relocation for some instructions, it would either return error immediately, or, if .relaxed_core_relocs option was set, would replace relocatable offset/imm part of an instruction with a bogus value (-1). Neither approach is good, because there are many possible scenarios where relocation is expected to fail (e.g., when some field knowingly can be missing on specific kernel versions). On the other hand, replacing offset with invalid one can hide programmer errors, if this relocation failure wasn't anticipated. This patch deprecates .relaxed_core_relocs option and changes the approach to always replacing instruction, for which relocation failed, with invalid BPF helper call instruction. For cases where this is expected, BPF program should already ensure that the instruction is unreachable, in which case this invalid instruction is going to be silently ignored. But if instruction wasn't guarded, BPF program will be rejected at verification step with verifier log pointing precisely to the place in assembly where the problem is. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200124053837.2434679-1-andriin@fb.com	2020-01-24 14:08:27 -08:00
Martin KaFai Lau	b999e8f2c1	bpf: Sync uapi bpf.h to tools/ This patch sync uapi bpf.h to tools/. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200122233652.903348-1-kafai@fb.com	2020-01-24 14:08:27 -08:00
Alexei Starovoitov	c6c86a53f2	libbpf: Add support for program extensions Add minimal support for program extensions. bpf_object_open_opts() needs to be called with attach_prog_fd = target_prog_fd and BPF program extension needs to have in .c file section definition like SEC("freplace/func_to_be_replaced"). libbpf will search for "func_to_be_replaced" in the target_prog_fd's BTF and will pass it in attach_btf_id to the kernel. This approach works for tests, but more complex use cases may need to request function name (and attach_btf_id that kernel sees) to be more dynamic. Such API will be added in future patches. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200121005348.2769920-3-ast@kernel.org	2020-01-24 14:08:27 -08:00
Antoine Tenart	c69f0d12f3	net: macsec: introduce the macsec_context structure This patch introduces the macsec_context structure. It will be used in the kernel to exchange information between the common MACsec implementation (macsec.c) and the MACsec hardware offloading implementations. This structure contains pointers to MACsec specific structures which contain the actual MACsec configuration, and to the underlying device (phydev for now). Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-24 14:08:27 -08:00
Andrii Nakryiko	033ad7ee78	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 20f21d98cf12b8ecd69e8defc93fae9e3b353b13 Checkpoint bpf-next commit: a6ed02cac690b635dbb938690e795812ce1e14ca Baseline bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Checkpoint bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Andrii Nakryiko (3): libbpf: Fix error handling bug in btf_dump__new libbpf: Simplify BTF initialization logic libbpf: Fix potential multiplication overflow in mmap() size calculation KP Singh (1): libbpf: Load btf_vmlinux only once per object. src/btf_dump.c \| 1 + src/libbpf.c \| 174 ++++++++++++++++++++++++++++++------------------- 2 files changed, 109 insertions(+), 66 deletions(-) -- 2.17.1	2020-01-17 20:33:22 -08:00
KP Singh	397db2175d	libbpf: Load btf_vmlinux only once per object. As more programs (TRACING, STRUCT_OPS, and upcoming LSM) use vmlinux BTF information, loading the BTF vmlinux information for every program in an object is sub-optimal. The fix was originally proposed in: https://lore.kernel.org/bpf/CAEf4BzZodr3LKJuM7QwD38BiEH02Cc1UbtnGpVkCJ00Mf+V_Qg@mail.gmail.com/ The btf_vmlinux is populated in the object if any of the programs in the object requires it just before the programs are loaded and freed after the programs finish loading. Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Brendan Jackman <jackmanb@chromium.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200117212825.11755-1-kpsingh@chromium.org	2020-01-17 20:33:22 -08:00
Andrii Nakryiko	6756bdc96e	libbpf: Fix potential multiplication overflow in mmap() size calculation Prevent potential overflow performed in 32-bit integers, before assigning result to size_t. Reported by LGTM static analysis. Fixes: eba9c5f498a1 ("libbpf: Refactor global data map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200117060801.1311525-4-andriin@fb.com	2020-01-17 20:33:22 -08:00
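The class of bug fixed above can be illustrated with a short sketch. The function name `mmap_size` and its parameters are hypothetical, not the libbpf code itself; the point is only that one operand has to be widened before the multiplication, not after it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total byte size of max_entries values of
 * value_size bytes each. */
static size_t mmap_size(uint32_t value_size, uint32_t max_entries)
{
	/* Widen one operand first so the product is computed in size_t.
	 * Writing `value_size * max_entries` would multiply in 32 bits
	 * and wrap before the result is converted to size_t. */
	return (size_t)value_size * max_entries;
}
```

On a platform where `size_t` is 64 bits wide, `mmap_size(65536, 65536)` yields 4294967296 rather than wrapping to 0.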
Andrii Nakryiko	9c1ae55dbd	libbpf: Simplify BTF initialization logic Current implementation of bpf_object's BTF initialization is very convoluted and thus prone to errors. It doesn't have to be like that. This patch simplifies it significantly. This code also triggered static analysis issues over logically dead code due to redundant error checks. This simplification should fix that as well. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200117060801.1311525-3-andriin@fb.com	2020-01-17 20:33:22 -08:00
Andrii Nakryiko	091f073ff0	libbpf: Fix error handling bug in btf_dump__new Fix missing jump to error handling in btf_dump__new, found by Coverity static code analysis. Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200117060801.1311525-2-andriin@fb.com	2020-01-17 20:33:22 -08:00
Andrii Nakryiko	ad51a528dc	travis-ci: make sure before_script override is non-empty Travis CI seems to be ignoring empty before_script override. Let's make sure it's a non-empty no-op. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2020-01-17 20:00:47 -08:00
Andrii Nakryiko	fa29cc01ff	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 858e284f0ec18bff2620d9a6afe764dc683f8ba1 Checkpoint bpf-next commit: 20f21d98cf12b8ecd69e8defc93fae9e3b353b13 Baseline bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Checkpoint bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Andrii Nakryiko (1): libbpf: Revert bpf_helper_defs.h inclusion regression src/bpf_helpers.h \| 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.17.1	2020-01-16 20:06:41 -08:00
Andrii Nakryiko	a2bec08412	libbpf: Revert bpf_helper_defs.h inclusion regression Revert bpf_helpers.h's change to include auto-generated bpf_helper_defs.h through <> instead of "", which causes it to be searched in include path. This can break existing applications that don't have their include path pointing directly to where libbpf installs its headers. There is ongoing work to make all (not just bpf_helper_defs.h) includes more consistent across libbpf and its consumers, but this unbreaks user code as is right now without any regressions. Selftests still behave sub-optimally (taking bpf_helper_defs.h from libbpf's source directory, if it's present there), which will be fixed in subsequent patches. Fixes: 6910d7d3867a ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir") Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200117004103.148068-1-andriin@fb.com	2020-01-16 20:06:41 -08:00
Andrii Nakryiko	080fd68e9c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 1d1a3bcffe360a56fd8cc287ed74d4c3066daf42 Checkpoint bpf-next commit: 858e284f0ec18bff2620d9a6afe764dc683f8ba1 Baseline bpf commit: e7a5f1f1cd0008e5ad379270a8657e121eedb669 Checkpoint bpf commit: 1712b2fff8c682d145c7889d2290696647d82dab Andrii Nakryiko (2): tools: Sync uapi/linux/if_link.h libbpf: Support .text sub-calls relocations Brian Vazquez (1): libbpf: Fix unneeded extra initialization in bpf_map_batch_common Martin KaFai Lau (1): libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API Yonghong Song (3): bpf: Add bpf_send_signal_thread() helper tools/bpf: Sync uapi header bpf.h libbpf: Add libbpf support to batch ops include/uapi/linux/bpf.h \| 40 +++++++++++- include/uapi/linux/if_link.h \| 1 + src/bpf.c \| 58 +++++++++++++++++ src/bpf.h \| 22 +++++++ src/btf.c \| 102 +++++++++++++++++++++++++++-- src/btf.h \| 2 + src/libbpf.c \| 122 +++++++---------------------------- src/libbpf.map \| 5 ++ 8 files changed, 247 insertions(+), 105 deletions(-) -- 2.17.1	2020-01-16 10:38:48 -08:00
Andrii Nakryiko	437f57042c	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-01-16 10:38:48 -08:00
Brian Vazquez	f4f271b068	libbpf: Fix unneeded extra initialization in bpf_map_batch_common bpf_attr doesn't need to be declared with '= {}' as memset is used in the code. Fixes: 2ab3d86ea1859 ("libbpf: Add libbpf support to batch ops") Reported-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200116045918.75597-1-brianvv@google.com	2020-01-16 10:38:48 -08:00
Martin KaFai Lau	bd35a43bb3	libbpf: Expose bpf_find_kernel_btf as a LIBBPF_API This patch exposes bpf_find_kernel_btf() as a LIBBPF_API. It will be used in 'bpftool map dump' in a following patch to dump a map with btf_vmlinux_value_type_id set. bpf_find_kernel_btf() is renamed to libbpf_find_kernel_btf() and moved to btf.c. As <linux/kernel.h> is included, some of the max/min type casting needs to be fixed. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200115230031.1102305-1-kafai@fb.com	2020-01-16 10:38:48 -08:00
Yonghong Song	d91f681d3b	libbpf: Add libbpf support to batch ops Added four libbpf API functions to support map batch operations: . int bpf_map_delete_batch( ... ) . int bpf_map_lookup_batch( ... ) . int bpf_map_lookup_and_delete_batch( ... ) . int bpf_map_update_batch( ... ) Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115184308.162644-8-brianvv@google.com	2020-01-16 10:38:48 -08:00
Yonghong Song	1e51491d05	tools/bpf: Sync uapi header bpf.h sync uapi header include/uapi/linux/bpf.h to tools/include/uapi/linux/bpf.h Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115184308.162644-7-brianvv@google.com	2020-01-16 10:38:48 -08:00
Andrii Nakryiko	37440e95d1	libbpf: Support .text sub-calls relocations The LLVM patch https://reviews.llvm.org/D72197 makes LLVM emit function call relocations within the same section. This includes a default .text section, which contains any BPF sub-programs. This wasn't the case before and so libbpf was able to get away with slightly simpler handling of subprogram call relocations. This patch adds support for .text section relocations. It needs to ensure correct order of relocations, so does two passes: - first, relocate .text instructions, if there are any relocations in it; - then process all the other programs and copy over patched .text instructions for all sub-program calls. v1->v2: - break early once .text program is processed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115190856.2391325-1-andriin@fb.com	2020-01-16 10:38:48 -08:00
Yonghong Song	681f2f9291	bpf: Add bpf_send_signal_thread() helper Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper") added helper bpf_send_signal() which permits bpf program to send a signal to the current process. The signal may be delivered to any threads in the process. We found a use case where sending the signal to the current thread is more preferable. - A bpf program will collect the stack trace and then send signal to the user application. - The user application will add some thread specific information to the just collected stack trace for later analysis. If bpf_send_signal() is used, user application will need to check whether the thread receiving the signal matches the thread collecting the stack by checking thread id. If not, it will need to send signal to another thread through pthread_kill(). This patch proposed a new helper bpf_send_signal_thread(), which sends the signal to the thread corresponding to the current kernel task. This way, user space is guaranteed that bpf_program execution context and user space signal handling context are the same thread. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com	2020-01-16 10:38:48 -08:00
Andrii Nakryiko	0e4638ec14	tools: Sync uapi/linux/if_link.h Sync uapi/linux/if_link.h into tools to avoid out of sync warnings during libbpf build. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200113073143.1779940-2-andriin@fb.com	2020-01-16 10:38:48 -08:00
hex	234a45a128	Update README with Distribution section - List of current distros having libbpf packaged from GH - Rationale of having libbpf packaged from GH - List of package dependencies	2020-01-14 16:12:42 -08:00
Andrii Nakryiko	8687395198	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2bbc078f812d45b8decb55935dab21199bd21489 Checkpoint bpf-next commit: 1d1a3bcffe360a56fd8cc287ed74d4c3066daf42 Baseline bpf commit: 3c2f450e553ce47fcb0d6141807a8858e3213c9c Checkpoint bpf commit: e7a5f1f1cd0008e5ad379270a8657e121eedb669 Alexei Starovoitov (1): libbpf: Sanitize global functions Andrey Ignatov (1): bpf: Document BPF_F_QUERY_EFFECTIVE flag Andrii Nakryiko (3): libbpf: Make bpf_map order and indices stable selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir libbpf: Poison kernel-only integer types Martin KaFai Lau (2): bpf: Synch uapi bpf.h to tools/ bpf: libbpf: Add STRUCT_OPS support Michal Rostecki (1): libbpf: Add probe for large INSN limit include/uapi/linux/bpf.h \| 26 +- include/uapi/linux/btf.h \| 6 + src/bpf.c \| 13 +- src/bpf.h \| 5 +- src/bpf_helpers.h \| 2 +- src/bpf_prog_linfo.c \| 3 + src/btf.c \| 3 + src/btf_dump.c \| 3 + src/hashmap.c \| 3 + src/libbpf.c \| 701 +++++++++++++++++++++++++++++++++++++-- src/libbpf.h \| 6 +- src/libbpf.map \| 4 + src/libbpf_errno.c \| 3 + src/libbpf_probes.c \| 26 ++ src/netlink.c \| 3 + src/nlattr.c \| 3 + src/str_error.c \| 3 + src/xsk.c \| 3 + 18 files changed, 784 insertions(+), 32 deletions(-) -- 2.17.1	2020-01-10 11:15:12 -08:00
Andrii Nakryiko	0cccc9ff28	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2020-01-10 11:15:12 -08:00
Andrii Nakryiko	b50eb28758	libbpf: Poison kernel-only integer types It's been a recurring issue with types like u32 slipping into libbpf source code accidentally. This is not detected during builds inside kernel source tree, but becomes a compilation error in libbpf's Github repo. Libbpf is supposed to use only __{s,u}{8,16,32,64} typedefs, so poison {s,u}{8,16,32,64} explicitly in every .c file. Doing that in a bit more centralized way, e.g., inside libbpf_internal.h breaks selftests, which are both using kernel u32 and libbpf_internal.h. This patch also fixes a new u32 occurrence in libbpf.c, added recently. Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200110181916.271446-1-andriin@fb.com	2020-01-10 11:15:12 -08:00
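Poisoning identifiers, as described in the commit above, relies on the `#pragma GCC poison` directive, which GCC and Clang both support. The sketch below shows the mechanism in isolation; `example_u32` and `add_one` are hypothetical names, and `example_u32` merely stands in for the `__u32` typedef from `<linux/types.h>` to keep the example portable.

```c
#include <stdint.h>

/* After this directive, any later use of these identifiers in the
 * translation unit is a hard compile error. */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64

/* Stand-in for __u32; a hypothetical typedef for this sketch only. */
typedef uint32_t example_u32;

static example_u32 add_one(example_u32 x)
{
	/* Declaring a variable with the kernel-only spelling of this
	 * type here would now fail to compile, because that identifier
	 * is poisoned. Mentions inside comments are unaffected. */
	return x + 1;
}
```

Because the directive only affects tokens that appear after it, it is placed below the includes, so headers themselves are not affected.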
Alexei Starovoitov	8d936a1570	libbpf: Sanitize global functions In case the kernel doesn't support BTF_FUNC_GLOBAL sanitize BTF produced by the compiler for global functions. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200110064124.1760511-2-ast@kernel.org	2020-01-10 11:15:12 -08:00
Andrii Nakryiko	2c8602eb54	selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir Reorder includes search path to ensure $(OUTPUT) and $(CURDIR) go before libbpf's directory. Also fix bpf_helpers.h to include bpf_helper_defs.h in such a way as to leverage includes search path. This allows selftests to not use libbpf's local and potentially stale bpf_helper_defs.h. It's important because selftests/bpf's Makefile only re-generates bpf_helper_defs.h in selftests' output directory, not the one in libbpf's directory. Also force regeneration of bpf_helper_defs.h when libbpf.a is updated to reduce staleness. Fixes: fa633a0f8919 ("libbpf: Fix build on read-only filesystems") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110051716.1591485-3-andriin@fb.com	2020-01-10 11:15:12 -08:00
Andrii Nakryiko	1ef23426e7	libbpf: Make bpf_map order and indices stable Currently, libbpf re-sorts bpf_map structs after all the maps are added and initialized, which might change their relative order and invalidate any bpf_map pointer or index taken before that. This is inconvenient and error-prone. For instance, it can cause .kconfig map index to point to a wrong map. Furthermore, libbpf itself doesn't rely on any specific ordering of bpf_maps, so it's just an unnecessary complication right now. This patch drops sorting of maps and makes their relative positions fixed. If efficient index is ever needed, it's better to have a separate array of pointers as a search index, instead of reordering bpf_map struct in-place. This will be less error-prone and will allow multiple independent orderings, if necessary (e.g., either by section index or by name). Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Reported-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200110034247.1220142-1-andriin@fb.com	2020-01-10 11:15:12 -08:00
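The alternative the commit above suggests, a separate array of pointers used as a search index, can be sketched like this. All names here (`map_entry`, `cmp_by_name`, `build_name_index`) are hypothetical; `map_entry` only stands in for `struct bpf_map`, whose real layout is not shown in this log.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_map. */
struct map_entry {
	const char *name;
	int sec_idx;
};

/* qsort comparator over an array of pointers to map_entry. */
static int cmp_by_name(const void *a, const void *b)
{
	const struct map_entry *const *ma = a;
	const struct map_entry *const *mb = b;

	return strcmp((*ma)->name, (*mb)->name);
}

/* Build a name-sorted index over maps[0..n) without moving the
 * entries themselves, so existing pointers and indices stay valid.
 * The caller frees the returned array. Returns NULL if n is 0 or
 * the allocation fails. */
static struct map_entry **build_name_index(struct map_entry *maps, size_t n)
{
	struct map_entry **idx;
	size_t i;

	if (n == 0)
		return NULL;

	idx = malloc(n * sizeof(*idx));
	if (!idx)
		return NULL;

	for (i = 0; i < n; i++)
		idx[i] = &maps[i];

	qsort(idx, n, sizeof(*idx), cmp_by_name);
	return idx;
}
```

A second index ordered by `sec_idx` could be built the same way with a different comparator, which matches the "multiple independent orderings" point in the commit message.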
Andrey Ignatov	d2100072b9	bpf: Document BPF_F_QUERY_EFFECTIVE flag Document BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it affects attach_flags, which may not be obvious and may lead to confusion. Specifically attach_flags is returned only for target_fd but if programs are inherited from an ancestor cgroup then returned attach_flags for current cgroup may be confusing. For example, two effective programs of same attach_type can be returned but w/o BPF_F_ALLOW_MULTI in attach_flags. Simple repro: # bpftool c s /sys/fs/cgroup/path/to/task ID AttachType AttachFlags Name # bpftool c s /sys/fs/cgroup/path/to/task effective ID AttachType AttachFlags Name 95043 ingress tw_ipt_ingress 95048 ingress tw_ingress Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com	2020-01-10 11:15:12 -08:00
Martin KaFai Lau	f3edca46e5	bpf: libbpf: Add STRUCT_OPS support This patch adds BPF STRUCT_OPS support to libbpf. The only sec_name convention is SEC(".struct_ops") to identify the struct_ops implemented in BPF, e.g. To implement a tcp_congestion_ops: SEC(".struct_ops") struct tcp_congestion_ops dctcp = { .init = (void *)dctcp_init, /* <-- a bpf_prog */ /* ... some more func ptrs ... */ .name = "bpf_dctcp", }; Each struct_ops is defined as a global variable under SEC(".struct_ops") as above. libbpf creates a map for each variable and the variable name is the map's name. Multiple struct_ops is supported under SEC(".struct_ops"). In the bpf_object__open phase, libbpf will look for the SEC(".struct_ops") section and find out what is the btf-type the struct_ops is implementing. Note that the btf-type here is referring to a type in the bpf_prog.o's btf. A "struct bpf_map" is added by bpf_object__add_map() as other maps do. It will then collect (through SHT_REL) where are the bpf progs that the func ptrs are referring to. No btf_vmlinux is needed in the open phase. In the bpf_object__load phase, the map-fields, which depend on the btf_vmlinux, are initialized (in bpf_map__init_kern_struct_ops()). It will also set the prog->type, prog->attach_btf_id, and prog->expected_attach_type. Thus, the prog's properties do not rely on its section name. [ Currently, the bpf_prog's btf-type ==> btf_vmlinux's btf-type matching process is as simple as: member-name match + btf-kind match + size match. If these matching conditions fail, libbpf will reject. The current targeting support is "struct tcp_congestion_ops" which most of its members are function pointers. The member ordering of the bpf_prog's btf-type can be different from the btf_vmlinux's btf-type. ] Then, all obj->maps are created as usual (in bpf_object__create_maps()). Once the maps are created and prog's properties are all set, the libbpf will proceed to load all the progs. bpf_map__attach_struct_ops() is added to register a struct_ops map to a kernel subsystem. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200109003514.3856730-1-kafai@fb.com	2020-01-10 11:15:12 -08:00
Martin KaFai Lau	cabb077325	bpf: Synch uapi bpf.h to tools/ This patch sync uapi bpf.h to tools/ Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200109003512.3856559-1-kafai@fb.com	2020-01-10 11:15:12 -08:00
Michal Rostecki	b95b281039	libbpf: Add probe for large INSN limit Introduce a new probe which checks whether kernel has large maximum program size which was increased in the following commit: c04c0d2b968a ("bpf: increase complexity limit and maximum program size") Based on the similar check in Cilium[0], authored by Daniel Borkmann. [0] `657d0f585a` Co-authored-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Link: https://lore.kernel.org/bpf/20200108162428.25014-2-mrostecki@opensuse.org	2020-01-10 11:15:12 -08:00
Andrii Nakryiko	5033d7177e	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 7745ff9842617323adbe24e71c495d5ebd9aa352 Checkpoint bpf-next commit: 2bbc078f812d45b8decb55935dab21199bd21489 Baseline bpf commit: 1148f9adbe71415836a18a36c1b4ece999ab0973 Checkpoint bpf commit: 3c2f450e553ce47fcb0d6141807a8858e3213c9c Andrey Ignatov (2): bpf: Support replacing cgroup-bpf program in MULTI mode libbpf: Introduce bpf_prog_attach_xattr Andrii Nakryiko (1): libbpf: Support CO-RE relocations for LDX/ST/STX instructions include/uapi/linux/bpf.h \| 10 ++++++++++ src/bpf.c \| 17 ++++++++++++++++- src/bpf.h \| 11 +++++++++++ src/libbpf.c \| 31 ++++++++++++++++++++++++++++--- src/libbpf.map \| 1 + 5 files changed, 66 insertions(+), 4 deletions(-) -- 2.17.1	2019-12-30 12:05:19 -08:00
Andrii Nakryiko	49058f8c6f	libbpf: Support CO-RE relocations for LDX/ST/STX instructions Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions (i.e, shifts and arithmetic operations), as well as generic load/store instructions. The former ones are already supported by libbpf as is. This patch adds further support for load/store instructions. Relocatable field offset is encoded in BPF instruction's 16-bit offset section and are adjusted by libbpf based on target kernel BTF. These Clang changes and corresponding libbpf changes allow for more succinct generated BPF code by encoding relocatable field reads as a single ST/LDX/STX instruction. It also enables relocatable access to BPF context. Previously, if context struct (e.g., __sk_buff) was accessed with CO-RE relocations (e.g., due to preserve_access_index attribute), it would be rejected by BPF verifier due to modified context pointer dereference. With Clang patch, such context accesses are both relocatable and have a fixed offset from the point of view of BPF verifier. [0] https://reviews.llvm.org/D71790 Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com	2019-12-30 12:05:19 -08:00
Andrey Ignatov	8b20ffa4b9	libbpf: Introduce bpf_prog_attach_xattr Introduce a new bpf_prog_attach_xattr function that, in addition to program fd, target fd and attach type, accepts an extendable struct bpf_prog_attach_opts. bpf_prog_attach_opts relies on DECLARE_LIBBPF_OPTS macro to maintain backward and forward compatibility and has the following "optional" attach attributes: * existing attach_flags, since it's not required when attaching in NONE mode. Even though it's quite often used in MULTI and OVERRIDE mode it seems to be a good idea to reduce number of arguments to bpf_prog_attach_xattr; * newly introduced attribute of BPF_PROG_ATTACH command: replace_prog_fd that is fd of previously attached cgroup-bpf program to replace if BPF_F_REPLACE flag is used. The new function is named to be consistent with other xattr-functions (bpf_prog_test_run_xattr, bpf_create_map_xattr, bpf_load_program_xattr). The struct bpf_prog_attach_opts is supposed to be used with DECLARE_LIBBPF_OPTS macro. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/bd6e0732303eb14e4b79cb128268d9e9ad6db208.1576741281.git.rdna@fb.com	2019-12-30 12:05:19 -08:00
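
The extendable-options idea described in the commit above (a struct whose first field records its own size, so older and newer callers can interoperate) can be sketched in isolation. This is a hypothetical re-implementation for illustration only, not libbpf's actual macro or validation code: `DEMO_OPTS`, `struct demo_attach_opts`, and `demo_opts_valid` are invented names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct: the first field records the struct size
 * the caller was compiled against. */
struct demo_attach_opts {
	size_t sz;
	unsigned int flags;
	int replace_prog_fd;
};

/* Declare NAME, zero every byte (including padding), then set sz and any
 * fields the caller names. Written as plain statements, so it must be used
 * inside a function body. */
#define DEMO_OPTS(TYPE, NAME, ...)                                      \
	struct TYPE NAME;                                               \
	memset(&NAME, 0, sizeof(struct TYPE));                          \
	NAME = (struct TYPE){ .sz = sizeof(struct TYPE), __VA_ARGS__ }

/* Library-side check (assumed policy): a caller struct may be smaller or
 * larger than the layout this library knows (known_sz bytes), provided
 * every byte the library does not understand is zero. */
static bool demo_opts_valid(const void *opts, size_t known_sz)
{
	const unsigned char *p = opts;
	size_t user_sz, i;

	if (!opts)
		return true;		/* no options given: defaults apply */
	memcpy(&user_sz, opts, sizeof(user_sz));
	if (user_sz < sizeof(size_t))
		return false;		/* too small to even hold sz */
	for (i = known_sz; i < user_sz; i++)
		if (p[i])
			return false;	/* unknown field is set */
	return true;
}
```

A caller would write `DEMO_OPTS(demo_attach_opts, opts, .flags = 1);` and pass `&opts` to the library. Zeroing the whole struct first is what makes the trailing-bytes check meaningful, since unset fields and padding are then guaranteed to be zero.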
Andrey Ignatov	1b1e30679f	bpf: Support replacing cgroup-bpf program in MULTI mode The common use-case in production is to have multiple cgroup-bpf programs per attach type that cover multiple use-cases. Such programs are attached with BPF_F_ALLOW_MULTI and can be maintained by different people. Order of programs usually matters, for example imagine two egress programs: the first one drops packets and the second one counts packets. If they're swapped the result of counting program will be different. It brings operational challenges with updating cgroup-bpf program(s) attached with BPF_F_ALLOW_MULTI since there is no way to replace a program: * One way to update is to detach all programs first and then attach the new version(s) again in the right order. This introduces an interruption in the work a program is doing and may not be acceptable (e.g. if it's egress firewall); * Another way is to attach the new version of a program first and only then detach the old version. This introduces a time interval when two versions of the same program are working, which may not be acceptable if a program is not idempotent. It also imposes additional burden on program developers to make sure that two versions of their program can co-exist. Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to replace. That way user can replace any program among those attached with BPF_F_ALLOW_MULTI flag without the problems described above. Details of the new API: * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid descriptor of BPF program, BPF_PROG_ATTACH will return corresponding error (EINVAL or EBADF). * If replace_bpf_fd has valid descriptor of BPF program but such a program is not attached to specified cgroup, BPF_PROG_ATTACH will return ENOENT. 
BPF_F_REPLACE is introduced to make the user intent clear, since replace_bpf_fd alone can't be used for this (its default value, 0, is a valid fd). BPF_F_REPLACE also makes it possible to extend the API in the future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed). Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Narkyiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com	2019-12-30 12:05:19 -08:00
Andrii Nakryiko	e7a82fc033	sync: add zlib dependency and libbpf_common.h to list of installed headers zlib is now a direct dependency of libbpf (previously zlib was only dependency of libelf, on which libbpf depends as well). For non-pkg-config case, specify `-lz` compiler flag explicitly. Recent sync also added another public header to libbpf. Include it in a list of headers that are installed on target system. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	7c5583ab2d	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 679152d3a32e305c213f83160c328c37566ae8bc Checkpoint bpf-next commit: 7745ff9842617323adbe24e71c495d5ebd9aa352 Baseline bpf commit: fe3300897cbfd76c6cb825776e5ac0ca50a91ca4 Checkpoint bpf commit: 1148f9adbe71415836a18a36c1b4ece999ab0973 Andrii Nakryiko (26): libbpf: Extract and generalize CPU mask parsing logic libbpf: Don't attach perf_buffer to offline/missing CPUs libbpf: Don't require root for bpf_object__open() libbpf: Add generic bpf_program__attach() libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files libbpf: Extract common user-facing helpers libbpf: Expose btf__align_of() API libbpf: Expose BTF-to-C type declaration emitting API libbpf: Expose BPF program's function name libbpf: Refactor global data map initialization libbpf: Postpone BTF ID finding for TRACING programs to load phase libbpf: Reduce log level of supported section names dump libbpf: Add BPF object skeleton support libbpf: Extract internal map names into constants libbpf: Support libbpf-provided extern variables bpftool: Generate externs datasec in BPF skeleton libbpf: Support flexible arrays in CO-RE libbpf: Add zlib as a dependency in pkg-config template libbpf: Reduce log level for custom section names libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource libbpf: Put Kconfig externs into .kconfig section libbpf: Allow to augment system Kconfig through extra optional config libbpf: BTF is required when externs are present libbpf: Fix another __u64 printf warning Jakub Sitnicki (1): libbpf: Recognize SK_REUSEPORT programs from section name Prashant Bhole (1): libbpf: Fix build by renaming variables Toke Høiland-Jørgensen (4): libbpf: Print hint about ulimit when getting permission denied error libbpf: Fix 
libbpf_common.h when installing libbpf through 'make install' libbpf: Add missing newline in opts validation macro libbpf: Fix printing of ulimit value include/uapi/linux/btf.h \| 7 +- src/bpf.h \| 6 +- src/bpf_helpers.h \| 11 + src/btf.c \| 48 +- src/btf.h \| 29 +- src/btf_dump.c \| 115 ++- src/libbpf.c \| 1673 ++++++++++++++++++++++++++++++++------ src/libbpf.h \| 107 +-- src/libbpf.map \| 12 + src/libbpf.pc.template \| 2 +- src/libbpf_common.h \| 40 + src/libbpf_internal.h \| 21 +- 12 files changed, 1678 insertions(+), 393 deletions(-) create mode 100644 src/libbpf_common.h -- 2.17.1	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	0d9d85e345	libbpf: Fix another __u64 printf warning Fix yet another printf warning for %llu specifier on ppc64le. This time size_t casting won't work, so cast to verbose `unsigned long long`. Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219052103.3515-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen	c8c4edf4c9	libbpf: Fix printing of ulimit value Naresh pointed out that libbpf builds fail on 32-bit architectures because rlimit.rlim_cur is defined as 'unsigned long long' on those architectures. Fix this by using %zu in printf and casting to size_t. Fixes: dc3a2d254782 ("libbpf: Print hint about ulimit when getting permission denied error") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219090236.905059-1-toke@redhat.com	2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen	23983fd75b	libbpf: Add missing newline in opts validation macro The error log output in the opts validation macro was missing a newline. Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191219120714.928380-1-toke@redhat.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	4bbdefdce1	libbpf: BTF is required when externs are present BTF is required to get type information about extern variables. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-4-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	5bc09e54fa	libbpf: Allow to augment system Kconfig through extra optional config Instead of all or nothing approach of overriding Kconfig file location, allow to extend it with extra values and override chosen subset of values through optional user-provided extra config, passed as a string through open options' .kconfig option. If same config key is present in both user-supplied config and Kconfig, user-supplied one wins. This allows applications to more easily test various conditions despite host kernel's real configuration. If all of BPF object's __kconfig externs are satisfied from user-supplied config, system Kconfig won't be read at all. Simplify selftests by not needing to create temporary Kconfig files. Suggested-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-3-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	9f61b5b95c	libbpf: Put Kconfig externs into .kconfig section Move Kconfig-provided externs into custom .kconfig section. Add __kconfig into bpf_helpers.h for user convenience. Update selftests accordingly. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191219002837.3074619-2-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	baa3268b13	libbpf: Add bpf_link__disconnect() API to preserve underlying BPF resource There are cases in which BPF resource (program, map, etc) has to outlive userspace program that "installed" it in the system in the first place. When BPF program is attached, libbpf returns bpf_link object, which is supposed to be destroyed after no longer necessary through bpf_link__destroy() API. Currently, bpf_link destruction causes both automatic detachment and frees up any resources allocated for the bpf_link in-memory representation. This is inconvenient for the case described above because of coupling of detachment and resource freeing. This patch introduces bpf_link__disconnect() API call, which marks bpf_link as disconnected from its underlying BPF resources. This means that when bpf_link is destroyed later, all its memory resources will be freed, but BPF resource itself won't be detached. This design allows to follow strict and resource-leak-free design by default, while giving easy and straightforward way for user code to opt for keeping BPF resource attached beyond lifetime of a bpf_link. For some BPF programs (i.e., FS-based tracepoints, kprobes, raw tracepoint, etc), user has to make sure to pin BPF program to prevent the kernel from automatically detaching it on process exit. This should typically be achieved by pinning BPF program (or map in some cases) in BPF FS. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191218225039.2668205-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	f5599ef856	libbpf: Remove BPF_EMBED_OBJ macro from libbpf.h Drop BPF_EMBED_OBJ and struct bpf_embed_data now that skeleton automatically embeds contents of its source object file. While BPF_EMBED_OBJ is useful independently of skeleton, we currently don't have any use cases utilizing it, so let's remove them until/if we need them. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191218052552.2915188-3-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	99c65fed78	libbpf: Reduce log level for custom section names Libbpf is trying to recognize BPF program type based on its section name during bpf_object__open() phase. This is not strictly enforced and user code has ability to specify/override correct BPF program type after open. But if BPF program is using custom section name, libbpf will still emit warnings, which can be quite annoying to users. This patch reduces log level of information messages emitted by libbpf if section name is not canonical. User can still get a list of all supported section names as debug-level message. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191217234228.1739308-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen	e9adfa851f	libbpf: Fix libbpf_common.h when installing libbpf through 'make install' This fixes two issues with the newly introduced libbpf_common.h file: - The header failed to include <string.h> for the definition of memset() - The new file was not included in the install_headers rule in the Makefile Both of these issues cause breakage when installing libbpf with 'make install' and trying to use it in applications. Fixes: 544402d4b493 ("libbpf: Extract common user-facing helpers") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191217112810.768078-1-toke@redhat.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	8363b8d4e6	libbpf: Add zlib as a dependency in pkg-config template List zlib as another dependency of libbpf in pkg-config template. Verified it is correctly resolved to proper -lz flag: $ make DESTDIR=/tmp/libbpf-install install $ pkg-config --libs /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc -L/usr/local/lib64 -lbpf $ pkg-config --libs --static /tmp/libbpf-install/usr/local/lib64/pkgconfig/libbpf.pc -L/usr/local/lib64 -lbpf -lelf -lz Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Cc: Luca Boccassi <bluca@debian.org> Link: https://lore.kernel.org/bpf/20191216183830.3972964-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Toke Høiland-Jørgensen	f892b464d0	libbpf: Print hint about ulimit when getting permission denied error Probably the single most common error newcomers to XDP are stumped by is the 'permission denied' error they get when trying to load their program and 'ulimit -l' is set too low. For examples, see [0], [1]. Since the error code is UAPI, we can't change that. Instead, this patch adds a few heuristics in libbpf and outputs an additional hint if they are met: If an EPERM is returned on map create or program load, and geteuid() shows we are root, and the current RLIMIT_MEMLOCK is not infinity, we output a hint about raising 'ulimit -l' as an additional log line. [0] https://marc.info/?l=xdp-newbies&m=157043612505624&w=2 [1] https://github.com/xdp-project/xdp-tutorial/issues/86 Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191216181204.724953-1-toke@redhat.com	2019-12-19 15:34:27 -08:00
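
The heuristic in the commit above (EPERM, effective uid 0, finite RLIMIT_MEMLOCK) can be sketched as a small stand-alone check. This is an illustrative sketch, not libbpf's code: the function names and the hint wording are assumptions. The decision is split from the system queries so that it can be tested without root.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/* Pure decision (hypothetical helper): hint only for EPERM, when running
 * as root, with a memlock limit that is not unlimited. */
static bool memlock_hint_applies(int err, uid_t euid, rlim_t memlock_cur)
{
	return err == EPERM && euid == 0 && memlock_cur != RLIM_INFINITY;
}

/* Query the current process state and print a hint if the heuristic
 * matches. rlim_t differs in width across architectures, so the value is
 * cast to unsigned long long before printing with %llu. */
static void maybe_print_memlock_hint(int err)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_MEMLOCK, &lim))
		return;
	if (memlock_hint_applies(err, geteuid(), lim.rlim_cur))
		fprintf(stderr,
			"hint: RLIMIT_MEMLOCK is %llu bytes; consider raising 'ulimit -l'\n",
			(unsigned long long)lim.rlim_cur);
}
```

A caller would invoke `maybe_print_memlock_hint(errno)` right after a failed map creation or program load. The error code itself is left untouched, which matches the commit's point that the UAPI error cannot change.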
Prashant Bhole	a4132d1590	libbpf: Fix build by renaming variables In btf__align_of() variable name 't' is shadowed by inner block declaration of another variable with same name. Patch renames variables in order to fix it. CC sharedobjs/btf.o btf.c: In function ‘btf__align_of’: btf.c:303:21: error: declaration of ‘t’ shadows a previous local [-Werror=shadow] 303 \| int i, align = 1, t; \| ^ btf.c:283:25: note: shadowed declaration is here 283 \| const struct btf_type *t = btf__type_by_id(btf, id); \| Fixes: 3d208f4ca111 ("libbpf: Expose btf__align_of() API") Signed-off-by: Prashant Bhole <prashantbhole.linux@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191216082738.28421-1-prashantbhole.linux@gmail.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	303916a126	libbpf: Support flexible arrays in CO-RE Some data structures in kernel are defined with either zero-sized array or flexible (dimensionless) array at the end of a struct. Actual data of such array follows in memory immediately after the end of that struct, forming its variable-sized "body" of elements. Support such access pattern in CO-RE relocation handling. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191215070844.1014385-2-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	8eea7ed8e8	bpftool: Generate externs datasec in BPF skeleton Add support for generation of mmap()-ed read-only view of libbpf-provided extern variables. As externs are not supposed to be provided by user code (that's what .data, .bss, and .rodata is for), don't mmap() it initially. Only after skeleton load is performed, map .extern contents as read-only memory. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-4-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	fa030ffd20	libbpf: Support libbpf-provided extern variables Add support for extern variables, provided to BPF program by libbpf. Currently the following extern variables are supported: - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte long; - CONFIG_xxx values; a set of values of actual kernel config. Tristate, boolean, strings, and integer values are supported. Set of possible values is determined by declared type of extern variable. Supported types of variables are: - Tristate values. Are represented as `enum libbpf_tristate`. Accepted values are strictly 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO, or TRI_MODULE, respectively. - Boolean values. Are represented as bool (_Bool) types. Accepted values are 'y' and 'n' only, turning into true/false values, respectively. - Single-character values. Can be used both as a substitute for bool/tristate, or as a small-range integer: - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm'; - integers in a range [-128, 127] or [0, 255] (depending on signedness of char in target architecture) are recognized and represented with respective values of char type. - Strings. String values are declared as fixed-length char arrays. String of up to that length will be accepted and put in first N bytes of char array, with the rest of bytes zeroed out. If config string value is longer than space allotted, it will be truncated and warning message emitted. Char array is always zero terminated. String literals in config have to be enclosed in double quotes, just like C-style string literals. - Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and unsigned variants. Libbpf enforces parsed config value to be in the supported range of corresponding integer type. 
Integers values in config can be: - decimal integers, with optional + and - signs; - hexadecimal integers, prefixed with 0x or 0X; - octal integers, starting with 0. Config file itself is searched in /boot/config-$(uname -r) location with fallback to /proc/config.gz, unless config path is specified explicitly through bpf_object_open_opts' kernel_config_path option. Both gzipped and plain text formats are supported. Libbpf adds explicit dependency on zlib because of this, but this shouldn't be a problem, given libelf already depends on zlib. All detected extern variables, are put into a separate .extern internal map. It, similarly to .rodata map, is marked as read-only from BPF program side, as well as is frozen on load. This allows BPF verifier to track extern values as constants and perform enhanced branch prediction and dead code elimination. This can be relied upon for doing kernel version/feature detection and using potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF program, while still having a single version of BPF program running on old and new kernels. Selftests are validating this explicitly for unexisting BPF helper. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com	2019-12-19 15:34:27 -08:00
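
The integer rules listed in the commit above (decimal with optional sign, `0x`/`0X` hexadecimal, leading-`0` octal, and a range check against the declared type) map closely onto `strtoll()`/`strtoull()` with base 0. The following is a minimal sketch under that assumption, not libbpf's actual parser: `demo_parse_config_int` and `DEMO_KERNEL_VERSION` are hypothetical names, and the result is returned as raw bits in an `unsigned long long`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Same packing convention as the kernel's KERNEL_VERSION() macro. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Parse a config integer and check that it fits a signed or unsigned
 * integer of `size` bytes (1, 2, 4 or 8). Returns 0 on success, -EINVAL
 * for malformed input, -ERANGE when the value does not fit. */
static int demo_parse_config_int(const char *s, int size, bool is_signed,
				 unsigned long long *out)
{
	int bits = size * 8;
	char *end;

	if (size != 1 && size != 2 && size != 4 && size != 8)
		return -EINVAL;

	errno = 0;
	if (is_signed) {
		/* Largest positive value of a signed `bits`-wide integer. */
		long long max = (long long)(~0ULL >> (65 - bits));
		long long v = strtoll(s, &end, 0);	/* base 0: dec/hex/octal */

		if (end == s || *end != '\0')
			return -EINVAL;
		if (errno == ERANGE || v < -max - 1 || v > max)
			return -ERANGE;
		*out = (unsigned long long)v;
	} else {
		unsigned long long max = ~0ULL >> (64 - bits);
		unsigned long long v;

		if (*s == '-')
			return -EINVAL;	/* strtoull would silently negate */
		v = strtoull(s, &end, 0);
		if (end == s || *end != '\0')
			return -EINVAL;
		if (errno == ERANGE || v > max)
			return -ERANGE;
		*out = v;
	}
	return 0;
}
```

For example, `"0x10"` parses to 16 and `"010"` to 8, while `"128"` is rejected for a signed 8-bit target. A version check in the same spirit would compare a kernel version value against `DEMO_KERNEL_VERSION(5, 4, 0)`.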
Andrii Nakryiko	ea06bc30fa	libbpf: Extract internal map names into constants Instead of duplicating string literals, keep them in one place and consistent. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014710.3449601-2-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	531ac0e65f	libbpf: Add BPF object skeleton support Add new set of APIs, allowing to open/load/attach BPF object through BPF object skeleton, generated by bpftool for a specific BPF object file. All the xxx_skeleton() APIs wrap up corresponding bpf_object_xxx() APIs, but additionally also automate map/program lookups by name, global data initialization and mmap()-ing, etc. All this greatly improves and simplifies userspace usability of working with BPF programs. See follow up patches for examples. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-13-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	e35cb347ce	libbpf: Reduce log level of supported section names dump It's quite spammy. And now that bpf_object__open() is trying to determine program type from its section name, we are getting these verbose messages all the time. Reduce their log level to DEBUG. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-12-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	68fa3f0b57	libbpf: Postpone BTF ID finding for TRACING programs to load phase Move BTF ID determination for BPF_PROG_TYPE_TRACING programs to a load phase. Performing it at open step is inconvenient, because it prevents BPF skeleton generation on older host kernel, which doesn't contain BTF_KIND_FUNCs information in vmlinux BTF. This is a common set up, though, when, e.g., selftests are compiled on older host kernel, but the test program itself is executed in qemu VM with bleeding edge kernel. Having this BTF searching performed at load time allows to successfully use bpf_object__open() for codegen and inspection of BPF object file. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-11-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	1c145f0fda	libbpf: Refactor global data map initialization Refactor global data map initialization to use anonymous mmap()-ed memory instead of malloc()-ed one. This allows to do a transparent re-mmap()-ing of already existing memory address to point to BPF map's memory after bpf_object__load() step (done in follow up patch). This choreographed setup allows to have a nice and unsurprising way to pre-initialize read-only (and r/w as well) maps by user and after BPF map creation keep working with mmap()-ed contents of this map. All in a way that doesn't require user code to update any pointers: the illusion of working with memory contents is preserved before and after actual BPF map instantiation. Selftests and runqslower example demonstrate this feature in follow up patches. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-10-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	418c07226a	libbpf: Expose BPF program's function name Add APIs to get BPF program function name, as opposed to bpf_program__title(), which returns BPF program function's section name. Function name has a benefit of being a valid C identifier and uniquely identifies a specific BPF program, while section name can be duplicated across multiple independent BPF programs. Add also bpf_object__find_program_by_name(), similar to bpf_object__find_program_by_title(), to facilitate looking up BPF programs by their C function names. Convert one of selftests to new API for look up. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-9-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	5ec0ba6530	libbpf: Expose BTF-to-C type declaration emitting API Expose API that allows to emit type declaration and field/variable definition (if optional field name is specified) in valid C syntax for any provided BTF type. This is going to be used by bpftool when emitting data section layout as a struct. As part of making this API useful in a stand-alone fashion, move initialization of some of the internal btf_dump state to earlier phase. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-8-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	600ba1c5e1	libbpf: Expose btf__align_of() API Expose BTF API that calculates type alignment requirements. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014341.3442258-7-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	aa73e35dc3	libbpf: Extract common user-facing helpers LIBBPF_API and DECLARE_LIBBPF_OPTS are needed in many public libbpf API headers. Extract them into libbpf_common.h to avoid unnecessary interdependency between btf.h, libbpf.h, and bpf.h or code duplication. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191214014341.3442258-6-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	dca6176410	libbpf: Add BPF_EMBED_OBJ macro for embedding BPF .o files Add a convenience macro BPF_EMBED_OBJ, which allows to embed other files (typically used to embed BPF .o files) into a hosting userspace programs. To C program it is exposed as struct bpf_embed_data, containing a pointer to raw data and its size in bytes. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-5-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	6f88f26945	libbpf: Move non-public APIs from libbpf.h to libbpf_internal.h Few libbpf APIs are not public but currently exposed through libbpf.h to be used by bpftool. Move them to libbpf_internal.h, where intent of being non-stable and non-public is much more obvious. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-4-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	f7af143516	libbpf: Add generic bpf_program__attach() Generalize BPF program attaching and allow libbpf to auto-detect type (and extra parameters, where applicable) and attach supported BPF program types based on program sections. Currently this is supported for: - kprobe/kretprobe; - tracepoint; - raw tracepoint; - tracing programs (typed raw TP/fentry/fexit). More types support can be trivially added within this framework. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-3-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	4f3c7b3e13	libbpf: Don't require root for bpf_object__open() Reorganize bpf_object__open and bpf_object__load steps such that bpf_object__open doesn't need root access. This was previously done for feature probing and BTF sanitization. This doesn't have to happen on open, though, so move all those steps into the load phase. This is important, because it makes it possible for tools like bpftool to just open a BPF object file and inspect its contents: programs, maps, BTF, etc. For such operations it is prohibitive to require root access. On the other hand, there is a lot of custom libbpf logic in those steps, so it's best for tools to avoid reimplementing all that on their own. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191214014341.3442258-2-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	b85e83f6cb	libbpf: Don't attach perf_buffer to offline/missing CPUs It's quite common on some systems to have more CPUs listed as "possible" than there are (and could ever be) present/online CPUs. In such cases, perf_buffer creation will fail due to inability to create a perf event on a missing CPU, with an error like this: libbpf: failed to open perf buffer event on cpu #16: No such device This patch fixes the logic of perf_buffer__new() to ignore CPUs that are missing or currently offline. In rare cases where the user explicitly listed specific CPUs to connect to, behavior is unchanged: libbpf will try to open a perf event buffer on the specified CPU(s) anyway. Fixes: fb84b8224655 ("libbpf: add perf buffer API") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191212013609.1691168-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Andrii Nakryiko	33d1fbea57	libbpf: Extract and generalize CPU mask parsing logic This logic is re-used for parsing a set of online CPUs. Having it as an isolated piece of code working with an input string makes it convenient to test this logic as well. While refactoring, also improve the robustness of the original implementation. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191212013548.1690564-1-andriin@fb.com	2019-12-19 15:34:27 -08:00
Jakub Sitnicki	b234d12c97	libbpf: Recognize SK_REUSEPORT programs from section name Allow loading BPF object files that contain SK_REUSEPORT programs without having to manually set the program type before load if the section name is set to "sk_reuseport". This makes the user-space code needed to load an SK_REUSEPORT BPF program more concise. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191212102259.418536-2-jakub@cloudflare.com	2019-12-19 15:34:27 -08:00
hex	7a1d185108	libbpf: fix Coverity scan CI A follow-up to [1]. Travis CI stages use default phases when no override is provided. This causes the Coverity scan stage to fail because it executes the default before_script: phase of VMTEST. Fix this with an explicit override with an empty value. [1] https://github.com/libbpf/libbpf/pull/108	2019-12-17 16:46:57 -08:00
hex	76d5bb6a13	libbpf: Add VMTEST to CI Extend continuous integration tests by adding testing against various kernel versions. The code is based on vmtest CI scripts implemented by osandov@ for drgn [1] with the following modifications: - The downloadables are stored in Amazon S3 cloud indexed in [2] - `--setup-cmd` command line option is added to vmtest/run.sh so setup commands run on VM boot can be set in e.g. `.travis.yml` - Travis build matrix [3] is introduced for VM tests so VM tests are followed by the existing CI tests. The matrix has `KERNEL` and `VMTEST_SETUPCMD` dimensions. - Minor style fixes. The vmtest extension code is located in travis-ci/vmtest and contains `run.sh` and `setup_example.sh` - `run.sh` is responsible for the vmtest workflow: downloading vmlinux and rootfs image from the cloud, fs mounting, syncing libbpf sources to the image, setting up scripts run on VM boot, starting VM using QEMU. `run.sh` covers more use cases than a script for a job run in TravisCI, e.g. it can build a kernel w/ `--build` option. - `setup_example.sh` is an example of a script run in VM which can be modified to e.g. run actual libbpf tests. A setup script should have executable permission. To set up a new kernel version for a test: 1) upload vmlinuz.* and vmlinux.*\.zst to Amazon S3 store located at [4]; 2) modify INDEX [2] file. [1] https://github.com/osandov/drgn [2] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX [3] https://docs.travis-ci.com/user/build-matrix [4] https://libbpf-vmtest.s3-us-west-1.amazonaws.com/	2019-12-16 21:04:03 -08:00
Frantisek Sumsal	c42bfcbf0e	travis: build on ppc64le as well	2019-12-13 01:04:46 -08:00
Andrii Nakryiko	c2fc7c15a3	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e7096c131e5161fa3b8e52a650d7719d2857adfd Checkpoint bpf-next commit: 679152d3a32e305c213f83160c328c37566ae8bc Baseline bpf commit: e42617b825f8073569da76dc4510bfa019b1c35a Checkpoint bpf commit: fe3300897cbfd76c6cb825776e5ac0ca50a91ca4 Andrii Nakryiko (2): libbpf: Bump libpf current version to v0.0.7 libbpf: Fix printf compilation warnings on ppc64le arch src/libbpf.c \| 37 +++++++++++++++++++------------------ src/libbpf.map \| 3 +++ 2 files changed, 22 insertions(+), 18 deletions(-) -- 2.17.1	2019-12-12 14:40:26 -08:00
Andrii Nakryiko	4060a65222	libbpf: Fix printf compilation warnings on ppc64le arch On ppc64le __s64 and __u64 are defined as long int and unsigned long int, respectively. This causes the compiler to emit warnings when %lld/%llu are used to printf 64-bit numbers. Fix this by casting to size_t/ssize_t with %zu and %zd format specifiers, respectively. v1->v2: - use size_t/ssize_t instead of custom typedefs (Martin). Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling") Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191212171918.638010-1-andriin@fb.com	2019-12-12 14:40:26 -08:00
Andrii Nakryiko	a26f6b1375	libbpf: Bump libpf current version to v0.0.7 New development cycle starts, bump to v0.0.7 proactively. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191209224022.3544519-1-andriin@fb.com	2019-12-12 14:40:26 -08:00
Toke Høiland-Jørgensen	6e686c26fa	Makefile: Add cscope and tags rules These were added to the kernel repo, but not in Github. However, they are useful for browsing the source in Github while prototyping new features and compiling them into userspace utilities. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>	2019-12-11 10:38:48 -08:00
Andrii Nakryiko	ab067ed371	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b615e5a1e067dcb327482d1af7463268b89b1629 Checkpoint bpf-next commit: e7096c131e5161fa3b8e52a650d7719d2857adfd Baseline bpf commit: 34e59836565e36fade1464e054a3551c1a0364be Checkpoint bpf commit: e42617b825f8073569da76dc4510bfa019b1c35a Alexei Starovoitov (2): libbpf: Fix sym->st_value print on 32-bit arches selftests/bpf: Add test for BPF trampoline Andrii Nakryiko (1): libbpf: Fix global variable relocation Martin KaFai Lau (1): bpf: Introduce BPF_TRACE_x helper for the tracing tests src/libbpf.c \| 45 ++++++++++++++++++++------------------------- 1 file changed, 20 insertions(+), 25 deletions(-) -- 2.17.1	2019-12-09 09:44:20 -08:00
Martin KaFai Lau	9b69fbe4d1	bpf: Introduce BPF_TRACE_x helper for the tracing tests For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64. This patch borrows the idea from BPF_CALL_x in filter.h to convert a u64 to the arg type of the traced function. The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog. It will be used in the future TCP-ops bpf_prog that may return "void". The new macros are defined in the new header file "bpf_trace_helpers.h". It is under selftests/bpf/ for now. It could be moved to libbpf later after seeing more upcoming non-tracing use cases. The tests are changed to use these new macros also. Hence, the k[s]u8/16/32/64 are no longer needed and they are removed from the bpf_helpers.h. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com	2019-12-09 09:44:20 -08:00
Alexei Starovoitov	04d8fc50ab	selftests/bpf: Add test for BPF trampoline Add sanity test for BPF trampoline that checks kernel functions with up to 6 arguments of different sizes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org	2019-12-09 09:44:20 -08:00
Alexei Starovoitov	ceff1e0363	libbpf: Fix sym->st_value print on 32-bit arches The st_value field is a 64-bit value and causing this error on 32-bit arches: In file included from libbpf.c:52: libbpf.c: In function 'bpf_program__record_reloc': libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=] Fix it with (__u64) cast. Fixes: 1f8e2bcb2cd5 ("libbpf: Refactor relocation handling") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-12-09 09:44:20 -08:00
Andrii Nakryiko	d28acc595f	libbpf: Fix global variable relocation Similarly to a0d7da26ce86 ("libbpf: Fix call relocation offset calculation bug"), relocations against global variables need to take into account referenced symbol's st_value, which holds offset into a corresponding data section (and, subsequently, offset into internal backing map). For static variables this offset is always zero and data offset is completely described by respective instruction's imm field. Convert a bunch of selftests to global variables. Previously they were relying on `static volatile` trick to ensure Clang doesn't inline static variables, which with global variables is not necessary anymore. Fixes: 393cdfbee809 ("libbpf: Support initialized global variables") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com	2019-12-09 09:44:20 -08:00
Andrii Nakryiko	9ef191ea7d	license: add LICENSE with dual-license SPDX expression Add LICENSE specifying dual-license expression. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-11-26 11:06:43 -08:00
Andrii Nakryiko	1add860402	license: add license note to README Add mention of dual-licensing to README Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-11-26 11:02:19 -08:00
Andrii Nakryiko	c658f21738	libbpf: add BSD-2-Clause and LGPL-2.1 licenses Libbpf is dual-licensed under BSD-2-Clause and LGPL-2.1 licenses. Include their texts in the root of the repo. Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-11-26 09:54:43 -08:00
Andrii Nakryiko	9f519af7f4	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: e47a179997ceee6864fbae620eee09ea9c345a4d Checkpoint bpf-next commit: b615e5a1e067dcb327482d1af7463268b89b1629 Baseline bpf commit: d0fbb51dfaa612f960519b798387be436e8f83c5 Checkpoint bpf commit: 34e59836565e36fade1464e054a3551c1a0364be Alexei Starovoitov (4): libbpf: Introduce btf__find_by_name_kind() libbpf: Add support to attach to fentry/fexit tracing progs selftests/bpf: Add test for BPF trampoline libbpf: Add support for attaching BPF programs to other BPF programs Andrii Nakryiko (8): bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY libbpf: Make global data internal arrays mmap()-able, if possible libbpf: Fix call relocation offset calculation bug libbpf: Refactor relocation handling libbpf: Fix various errors and warning reported by checkpatch.pl libbpf: Support initialized global variables libbpf: Fix bpf_object name determination for bpf_object__open_file() libbpf: Fix usage of u32 in userspace code Luigi Rizzo (1): net-af_xdp: Use correct number of channels from ethtool Martin KaFai Lau (1): bpf: Introduce BPF_TRACE_x helper for the tracing tests include/uapi/linux/bpf.h \| 6 + src/bpf.c \| 8 +- src/bpf.h \| 5 +- src/btf.c \| 22 ++ src/btf.h \| 2 + src/libbpf.c \| 478 ++++++++++++++++++++++++++------------- src/libbpf.h \| 7 +- src/libbpf.map \| 3 + src/xsk.c \| 11 +- 9 files changed, 371 insertions(+), 171 deletions(-) -- 2.17.1	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	b7bdc604ef	libbpf: Fix usage of u32 in userspace code u32 is not defined for libbpf when compiled outside of kernel sources (e.g., in Github projection). Use __u32 instead. Fixes: b8c54ea455dc ("libbpf: Add support to attach to fentry/fexit tracing progs") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191125212948.1163343-1-andriin@fb.com	2019-11-25 16:55:44 -08:00
Martin KaFai Lau	354dd9844e	bpf: Introduce BPF_TRACE_x helper for the tracing tests For BPF_PROG_TYPE_TRACING, the bpf_prog's ctx is an array of u64. This patch borrows the idea from BPF_CALL_x in filter.h to convert a u64 to the arg type of the traced function. The new BPF_TRACE_x has an arg to specify the return type of a bpf_prog. It will be used in the future TCP-ops bpf_prog that may return "void". The new macros are defined in the new header file "bpf_trace_helpers.h". It is under selftests/bpf/ for now. It could be moved to libbpf later after seeing more upcoming non-tracing use cases. The tests are changed to use these new macros also. Hence, the k[s]u8/16/32/64 are no longer needed and they are removed from the bpf_helpers.h. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191123202504.1502696-1-kafai@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	9b91dce691	libbpf: Fix bpf_object name determination for bpf_object__open_file() If bpf_object__open_file() gets a path like "some/dir/obj.o", it should derive the BPF object's name as "obj" (unless overridden through opts->object_name). Instead, due to using `path` as a fallback value for opts->object_name, the path is used as is for the object name, so for the above example the BPF object's name will be verbatim "some/dir/obj", which leads to all sorts of trouble, especially where internal maps are concerned (they use up to 8 characters of the object name). Fix that by ensuring object_name stays NULL, unless overridden. Fixes: 291ee02b5e40 ("libbpf: Refactor bpf_object__open APIs to use common opts") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191122003527.551556-1-andriin@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	83535cb2bf	libbpf: Support initialized global variables Initialized global variables are no different in ELF from static variables, and don't require any extra support from libbpf. But they are matching semantics of global data (backed by BPF maps) more closely, preventing LLVM/Clang from aggressively inlining constant values and not requiring volatile incantations to prevent those. This patch enables global variables. It still disables uninitialized variables, which will be put into special COM (common) ELF section, because BPF doesn't allow uninitialized data to be accessed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191121070743.1309473-5-andriin@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	3f05b513d4	libbpf: Fix various errors and warning reported by checkpatch.pl Fix a bunch of warnings and errors reported by checkpatch.pl, to make it easier to spot new problems. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191121070743.1309473-4-andriin@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	0d0d05de08	libbpf: Refactor relocation handling Relocation handling code is convoluted and unnecessarily deeply nested. Split out per-relocation logic into a separate function. Also refactor the logic to be more a sequence of per-relocation type checks and processing steps, making it simpler to follow control flow. This makes it easier to further extend it to new kinds of relocations (e.g., support for extern variables). This patch also makes relocation's section verification more robust. Previously relocations against not yet supported externs were silently ignored because obj->efile.text_shndx was zero, when all BPF programs had custom section names and there was no .text section. Also, invalid LDIMM64 relocations against non-map sections were passed through, if they were pointing to a .text section (or 0, which is an invalid section). All these bugs are fixed within this refactoring and checks are made more appropriate for each type of relocation. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191121070743.1309473-3-andriin@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	44409068f7	libbpf: Fix call relocation offset calculation bug When relocating subprogram call, libbpf doesn't take into account relo->text_off, which comes from symbol's value. This generally works fine for subprograms implemented as static functions, but breaks for global functions. Taking a simplified test_pkt_access.c as an example: __attribute__ ((noinline)) static int test_pkt_access_subprog1(volatile struct __sk_buff *skb) { return skb->len * 2; } __attribute__ ((noinline)) static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb) { return skb->len + val; } SEC("classifier/test_pkt_access") int test_pkt_access(struct __sk_buff *skb) { if (test_pkt_access_subprog1(skb) != skb->len * 2) return TC_ACT_SHOT; if (test_pkt_access_subprog2(2, skb) != skb->len + 2) return TC_ACT_SHOT; return TC_ACT_UNSPEC; } When compiled, we get two relocations, pointing to '.text' symbol. .text has st_value set to 0 (it points to the beginning of .text section): 0000000000000008 000000050000000a R_BPF_64_32 0000000000000000 .text 0000000000000040 000000050000000a R_BPF_64_32 0000000000000000 .text test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two calls) are encoded within call instruction's imm32 part as -1 and 2, respectively: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 4: 04 00 00 00 02 00 00 00 w0 += 2 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3> 7: bf 61 00 00 00 00 00 00 r1 = r6 ===> 8: 85 10 00 00 02 00 00 00 call 2 9: 
bc 01 00 00 00 00 00 00 w1 = w0 10: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 11: 04 02 00 00 02 00 00 00 w2 += 2 12: b4 00 00 00 ff ff ff ff w0 = -1 13: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3> 14: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000078 LBB0_3: 15: 95 00 00 00 00 00 00 00 exit Now, if we compile example with global functions, the setup changes. Relocations are now against specifically test_pkt_access_subprog1 and test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24 bytes into its respective section (.text), i.e., 3 instructions in: 0000000000000008 000000070000000a R_BPF_64_32 0000000000000000 test_pkt_access_subprog1 0000000000000048 000000080000000a R_BPF_64_32 0000000000000018 test_pkt_access_subprog2 Call instructions now encode offsets relative to function symbols and are both set to -1: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0) 4: 0c 10 00 00 00 00 00 00 w0 += w1 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3> 7: b4 01 00 00 02 00 00 00 w1 = 2 8: bf 62 00 00 00 00 00 00 r2 = r6 ===> 9: 85 10 00 00 ff ff ff ff call -1 10: bc 01 00 00 00 00 00 00 w1 = w0 11: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 12: 04 02 00 00 02 00 00 00 w2 += 2 13: b4 00 00 00 ff ff ff ff w0 = -1 14: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3> 15: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000080 LBB2_3: 16: 95 00 00 00 00 00 00 00 exit Thus the right formula to calculate target call offset after relocation should take into account 
relocation's target symbol value (offset within section), call instruction's imm32 offset, and (subtracting, to get relative instruction offset) instruction index of call instruction itself. All that is shifted by number of instructions in main program, given all sub-programs are copied over after main program. Convert few selftests relying on bpf-to-bpf calls to use global functions instead of static ones. Fixes: 48cca7e44f9f ("libbpf: add support for bpf_call") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com	2019-11-25 16:55:44 -08:00
Luigi Rizzo	16ecc53e73	net-af_xdp: Use correct number of channels from ethtool Drivers use different fields to report the number of channels, so take the maximum of all data channels (rx, tx, combined) when determining the size of the xsk map. The current code used only 'combined' which was set to 0 in some drivers e.g. mlx4. Tested: compiled and run xdpsock -q 3 -r -S on mlx4 Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20191119001951.92930-1-lrizzo@google.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	38f66776db	libbpf: Make global data internal arrays mmap()-able, if possible Add detection of BPF_F_MMAPABLE flag support for arrays and add it as an extra flag to internal global data maps, if supported by kernel. This allows users to memory-map global data and use it without BPF map operations, greatly simplifying user experience. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20191117172806.2195367-5-andriin@fb.com	2019-11-25 16:55:44 -08:00
Andrii Nakryiko	e9d33df74d	bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY Add ability to memory-map contents of BPF array map. This is extremely useful for working with BPF global data from userspace programs. It avoids the typical bpf_map_{lookup,update}_elem operations, improving both performance and usability. There had to be special considerations for map freezing, to avoid having a writable memory view into a frozen map. To solve this issue, map freezing and mmap-ing now happen under a mutex: - if map is already frozen, no writable mapping is allowed; - if map has writable memory mappings active (accounted in map->writecnt), map freezing will keep failing with -EBUSY; - once number of writable memory mappings drops to zero, map freezing can be performed again. Only non-per-CPU plain arrays are supported right now. Maps with spinlocks can't be memory mapped either. For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc() to be mmap()'able. We also need to make sure that array data memory is page-sized and page-aligned, so we over-allocate memory in such a way that struct bpf_array is at the end of a single page of memory with array->value being aligned with the start of the second page. On deallocation we need to accommodate this memory arrangement to free vmalloc()'ed memory correctly. One important consideration concerns how the memory-mapping subsystem functions. The memory-mapping subsystem provides a few optional callbacks, among them open() and close(). close() is called for each memory region that is unmapped, so that users can decrease their reference counters and free up resources, if necessary. open() is almost symmetrical: it's called for each memory region that is being mapped, except the very first one. So bpf_map_mmap does the initial refcnt bump, while open() will do any extra ones after that. Thus the number of close() calls is equal to the number of open() calls plus one more. 
Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com	2019-11-25 16:55:44 -08:00
Alexei Starovoitov	05b515de7d	libbpf: Add support for attaching BPF programs to other BPF programs Extend libbpf api to pass attach_prog_fd into bpf_object__open. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-19-ast@kernel.org	2019-11-25 16:55:44 -08:00
Alexei Starovoitov	c2bbeaa900	selftests/bpf: Add test for BPF trampoline Add sanity test for BPF trampoline that checks kernel functions with up to 6 arguments of different sizes. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-10-ast@kernel.org	2019-11-25 16:55:44 -08:00
Alexei Starovoitov	799d153f41	libbpf: Add support to attach to fentry/fexit tracing progs Teach libbpf to recognize tracing programs types and attach them to fentry/fexit. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-7-ast@kernel.org	2019-11-25 16:55:44 -08:00
Alexei Starovoitov	69ff3960eb	libbpf: Introduce btf__find_by_name_kind() Introduce btf__find_by_name_kind() helper to search BTF by name and kind, since name alone can be ambiguous. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191114185720.1641606-6-ast@kernel.org	2019-11-25 16:55:44 -08:00
Frantisek Sumsal	b91f53ec5f	travis: use travis_terminate instead of set {+,-}e combo Apart from that it looks a bit nicer, it also acts as a workaround for https://travis-ci.community/t/exit-0-cannot-exit-successfully-on-arm/5731/4	2019-11-14 13:49:21 -08:00
Frantisek Sumsal	dd8f1bdd45	travis: bump the Ubuntu release to Bionic The main reason why this is necessary is that gcc 5.x on Xenial doesn't support ASan on s390x. Bumping the release to Bionic with gcc 7.x allows us to build libbpf on s390x with ASan without issues.	2019-11-14 13:49:21 -08:00
Frantisek Sumsal	3720f31852	travis: add an s390x job Travis now supports IBM Z and IBM Power architectures, so let's enable them in our CI as well. As libbpf won't compile on ppc64le right now (with current CFLAGS), let's skip it until the issue is resolved, see discussion in https://github.com/libbpf/libbpf/pull/98#issuecomment-553873098 See: https://blog.travis-ci.com/2019-11-12-multi-cpu-architecture-ibm-power-ibm-z	2019-11-14 13:49:21 -08:00
Andrii Nakryiko	c51c492a65	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: ed578021210e14f15a654c825fba6a700c9a39a7 Checkpoint bpf-next commit: e47a179997ceee6864fbae620eee09ea9c345a4d Baseline bpf commit: 7de086909365cd60a5619a45af3f4152516fd75c Checkpoint bpf commit: d0fbb51dfaa612f960519b798387be436e8f83c5 Andrii Nakryiko (6): libbpf: Fix negative FD close() in xsk_setup_xdp_prog() libbpf: Fix memory leak/double free issue libbpf: Fix potential overflow issue libbpf: Fix another potential overflow issue in bpf_prog_linfo libbpf: Make btf__resolve_size logic always check size error condition libbpf: Improve handling of corrupted ELF during map initialization Magnus Karlsson (2): libbpf: Support XDP_SHARED_UMEM with external XDP program libbpf: Allow for creating Rx or Tx only AF_XDP sockets Toke Høiland-Jørgensen (5): libbpf: Unpin auto-pinned maps if loading fails libbpf: Propagate EPERM to caller on program load libbpf: Use pr_warn() when printing netlink errors libbpf: Add bpf_get_link_xdp_info() function to get more XDP information libbpf: Add getter for program size src/bpf.c \| 2 +- src/bpf_prog_linfo.c \| 14 +++---- src/btf.c \| 3 +- src/libbpf.c \| 47 ++++++++++++++---------- src/libbpf.h \| 13 +++++++ src/libbpf.map \| 2 + src/netlink.c \| 87 +++++++++++++++++++++++++++++--------------- src/nlattr.c \| 10 ++--- src/xsk.c \| 34 +++++++++++------ 9 files changed, 136 insertions(+), 76 deletions(-) -- 2.17.1	2019-11-13 16:39:58 -08:00
Magnus Karlsson	d3e68e036e	libbpf: Allow for creating Rx or Tx only AF_XDP sockets The libbpf AF_XDP code is extended to allow for the creation of Rx only or Tx only sockets. Previously it returned an error if the socket was not initialized for both Rx and Tx. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-4-git-send-email-magnus.karlsson@intel.com	2019-11-13 16:39:58 -08:00
Magnus Karlsson	6ce8910d4d	libbpf: Support XDP_SHARED_UMEM with external XDP program Add support in libbpf to create multiple sockets that share a single umem. Note that an external XDP program needs to be supplied that routes the incoming traffic to the desired sockets. So you need to supply the libbpf_flag XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD and load your own XDP program. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: William Tu <u9012063@gmail.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1573148860-30254-2-git-send-email-magnus.karlsson@intel.com	2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen	79b1d813f9	libbpf: Add getter for program size This adds a new getter for the BPF program size (in bytes). This is useful for a caller that is trying to predict how much memory will be locked by loading a BPF object into the kernel. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185272.88376.10996937115395724683.stgit@toke.dk	2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen	26954e103d	libbpf: Add bpf_get_link_xdp_info() function to get more XDP information Currently, libbpf only provides a function to get a single ID for the XDP program attached to the interface. However, it can be useful to get the full set of program IDs attached, along with the attachment mode, in one go. Add a new getter function to support this, using an extendible structure to carry the information. Express the old bpf_get_link_id() function in terms of the new function. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157333185164.88376.7520653040667637246.stgit@toke.dk	2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen	c8c02fca3a	libbpf: Use pr_warn() when printing netlink errors The netlink functions were using fprintf(stderr, ) directly to print out error messages, instead of going through the usual logging macros. This makes it impossible for the calling application to silence or redirect those error messages. Fix this by switching to pr_warn() in nlattr.c and netlink.c. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333185055.88376.15999360127117901443.stgit@toke.dk	2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen	0e2f5f9615	libbpf: Propagate EPERM to caller on program load When loading an eBPF program, libbpf overrides the return code for EPERM errors instead of returning it to the caller. This makes it hard to figure out what went wrong on load. In particular, EPERM is returned when the system rlimit is too low to lock the memory required for the BPF program. Previously, this was somewhat obscured because the rlimit error would be hit on map creation (which does return it correctly). However, since maps can now be reused, object load can proceed all the way to loading programs without hitting the error; propagating it even in this case makes it possible for the caller to react appropriately (and, e.g., attempt to raise the rlimit before retrying). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184946.88376.11768171652794234561.stgit@toke.dk	2019-11-13 16:39:58 -08:00
Toke Høiland-Jørgensen	b539321838	libbpf: Unpin auto-pinned maps if loading fails Since the automatic map-pinning happens during load, it will leave pinned maps around if the load fails at a later stage. Fix this by unpinning any pinned maps on cleanup. To avoid unpinning pinned maps that were reused rather than newly pinned, add a new boolean property on struct bpf_map to keep track of whether that map was reused or not; and only unpin those maps that were not reused. Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/157333184731.88376.9992935027056165873.stgit@toke.dk	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	0f15f88443	libbpf: Improve handling of corrupted ELF during map initialization If we get ELF file with "maps" section, but no symbols pointing to it, we'll end up with division by zero. Add check against this situation and exit early with error. Found by Coverity scan against Github libbpf sources. Fixes: bf82927125dd ("libbpf: refactor map initialization") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-6-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	bada95a5f3	libbpf: Make btf__resolve_size logic always check size error condition Perform size check always in btf__resolve_size. Makes the logic a bit more robust against corrupted BTF and silences LGTM/Coverity complaining about always true (size < 0) check. Fixes: 69eaab04c675 ("btf: extract BTF type size calculation") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-5-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	fb929625dc	libbpf: Fix another potential overflow issue in bpf_prog_linfo Fix few issues found by Coverity and LGTM. Fixes: b053b439b72a ("bpf: libbpf: bpftool: Print bpf_line_info during prog dump") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-4-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	1a828b3d58	libbpf: Fix potential overflow issue Fix a potential overflow issue found by LGTM analysis, based on Github libbpf source code. Fixes: 3d65014146c6 ("bpf: libbpf: Add btf_line_info support to libbpf") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-3-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	330f4683e2	libbpf: Fix memory leak/double free issue Coverity scan against Github libbpf code found the issue of not freeing memory and leaving already freed memory still referenced from bpf_program. Fix it by re-assigning successfully reallocated memory sooner. Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107020855.3834758-2-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	2ef7f5607c	libbpf: Fix negative FD close() in xsk_setup_xdp_prog() Fix issue reported by static analysis (Coverity). If bpf_prog_get_fd_by_id() fails, xsk_lookup_bpf_maps() will fail as well and clean-up code will attempt close() with fd=-1. Fix by checking bpf_prog_get_fd_by_id() return result and exiting early. Fixes: 10a13bb40e54 ("libbpf: remove qidconf and better support external bpf programs.") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191107054059.313884-1-andriin@fb.com	2019-11-13 16:39:58 -08:00
Andrii Nakryiko	4da243c179	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f23c7ce341c2dfd187d4e3712ba6c110969463a0 Checkpoint bpf-next commit: ed578021210e14f15a654c825fba6a700c9a39a7 Baseline bpf commit: 7de086909365cd60a5619a45af3f4152516fd75c Checkpoint bpf commit: 7de086909365cd60a5619a45af3f4152516fd75c Andrii Nakryiko (1): libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage src/bpf_core_read.h \| 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-) -- 2.17.1	2019-11-06 14:11:45 -08:00
Andrii Nakryiko	4d8fc6d438	libbpf: Simplify BPF_CORE_READ_BITFIELD_PROBED usage Streamline BPF_CORE_READ_BITFIELD_PROBED interface to follow BPF_CORE_READ_BITFIELD (direct) and BPF_CORE_READ, in general, i.e., just return read result or 0, if underlying bpf_probe_read() failed. In practice, real applications rarely check bpf_probe_read() result, because it has to always work or otherwise it's a bug. So propagating internal bpf_probe_read() error from this macro hurts usability without providing real benefits in practice. This patch fixes the issue and simplifies usage, noticeable even in selftest itself. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191106201500.2582438-1-andriin@fb.com	2019-11-06 14:11:45 -08:00
Andrii Nakryiko	6d4abdda08	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: a566e35f1e8b4b3be1e96a804d1cca38b578167c Checkpoint bpf-next commit: f23c7ce341c2dfd187d4e3712ba6c110969463a0 Baseline bpf commit: fc11078dd3514c65eabce166b8431a56d8a667cb Checkpoint bpf commit: 7de086909365cd60a5619a45af3f4152516fd75c Alexei Starovoitov (1): libbpf: Add support for prog_tracing Andrii Nakryiko (2): libbpf: Add support for relocatable bitfields libbpf: Add support for field size relocations Daniel Borkmann (1): bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers Toke Høiland-Jørgensen (4): libbpf: Fix error handling in bpf_map__reuse_fd() libbpf: Store map pin path and status in struct bpf_map libbpf: Move directory creation into _pin() functions libbpf: Add auto-pinning of maps when loading BPF objects include/uapi/linux/bpf.h \| 124 ++++--- src/bpf.c \| 8 +- src/bpf.h \| 5 +- src/bpf_core_read.h \| 79 +++++ src/bpf_helpers.h \| 6 + src/libbpf.c \| 707 ++++++++++++++++++++++++++++++--------- src/libbpf.h \| 23 +- src/libbpf.map \| 5 + src/libbpf_internal.h \| 4 + src/libbpf_probes.c \| 1 + 10 files changed, 749 insertions(+), 213 deletions(-) -- 2.17.1	2019-11-05 16:00:11 -08:00
Andrii Nakryiko	67ab4c0f82	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2019-11-05 16:00:11 -08:00
Andrii Nakryiko	df45cf7a3e	libbpf: Add support for field size relocations Add bpf_core_field_size() macro, capturing a relocation against field size. Adjust bits of internal libbpf relocation logic to allow capturing size relocations of various field types: arrays, structs/unions, enums, etc. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-4-andriin@fb.com	2019-11-05 16:00:11 -08:00
Andrii Nakryiko	4438972ccc	libbpf: Add support for relocatable bitfields Add support for the new field relocation kinds, necessary to support relocatable bitfield reads. Provide macro for abstracting necessary code doing full relocatable bitfield extraction into u64 value. Two separate macros are provided: - BPF_CORE_READ_BITFIELD macro for direct memory read-enabled BPF programs (e.g., typed raw tracepoints). It uses direct memory dereference to extract bitfield backing integer value. - BPF_CORE_READ_BITFIELD_PROBED macro for cases where bpf_probe_read() needs to be used to extract same backing integer value. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191101222810.1246166-3-andriin@fb.com	2019-11-05 16:00:11 -08:00
Daniel Borkmann	09cd9ff2db	bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers The current bpf_probe_read() and bpf_probe_read_str() helpers are broken in that they assume they can be used for probing memory access for kernel space addresses /as well as/ user space addresses. However, plain use of probe_kernel_read() for both cases will attempt to always access kernel space address space given access is performed under KERNEL_DS and some archs in fact have overlapping address spaces where a kernel pointer and user pointer would have the /same/ address value and therefore accessing application memory via bpf_probe_read{,_str}() would read garbage values. Let's fix BPF side by making use of recently added 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions"). Unfortunately, the only way to fix this status quo is to add dedicated bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str() helpers. The bpf_probe_read{,_str}() helpers are kept as-is to retain their current behavior. The two _user() variants attempt the access always under USER_DS set, the two _kernel() variants will -EFAULT when accessing user memory if the underlying architecture has non-overlapping address ranges, also avoiding throwing the kernel warning via 00c42373d397 ("x86-64: add warning for non-canonical user access address dereferences"). Fixes: a5e8c07059d0 ("bpf: add bpf_probe_read_str helper") Fixes: 2541517c32be ("tracing, perf: Implement BPF programs attached to kprobes") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/796ee46e948bc808d54891a1108435f8652c6ca4.1572649915.git.daniel@iogearbox.net	2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen	e7d860d2fc	libbpf: Add auto-pinning of maps when loading BPF objects This adds support to libbpf for setting map pinning information as part of the BTF map declaration, to get automatic map pinning (and reuse) on load. The pinning type currently only supports a single PIN_BY_NAME mode, where each map will be pinned by its name in a path that can be overridden, but defaults to /sys/fs/bpf. Since auto-pinning only does something if any maps actually have a 'pinning' BTF attribute set, we default the new option to enabled, on the assumption that seamless pinning is what most callers want. When a map has a pin_path set at load time, libbpf will compare the map pinned at that location (if any), and if the attributes match, will re-use that map instead of creating a new one. If no existing map is found, the newly created map will instead be pinned at the location. Programs wanting to customise the pinning can override the pinning paths using bpf_map__set_pin_path() before calling bpf_object__load() (including setting it to NULL to disable pinning of a particular map). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157269298092.394725.3966306029218559681.stgit@toke.dk	2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen	ff3d2702d8	libbpf: Move directory creation into _pin() functions The existing pin_*() functions all try to create the parent directory before pinning. Move this check into the per-object _pin() functions instead. This ensures consistent behaviour when auto-pinning is added (which doesn't go through the top-level pin_maps() function), at the cost of a few more calls to mkdir(). Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157269297985.394725.5882630952992598610.stgit@toke.dk	2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen	44f9712f79	libbpf: Store map pin path and status in struct bpf_map Support storing and setting a pin path in struct bpf_map, which can be used for automatic pinning. Also store the pin status so we can avoid attempts to re-pin a map that has already been pinned (or reused from a previous pinning). The behaviour of bpf_object__{un,}pin_maps() is changed so that if it is called with a NULL path argument (which was previously illegal), it will (un)pin only those maps that have a pin_path set. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157269297876.394725.14782206533681896279.stgit@toke.dk	2019-11-05 16:00:11 -08:00
Toke Høiland-Jørgensen	fe4cb796df	libbpf: Fix error handling in bpf_map__reuse_fd() bpf_map__reuse_fd() was calling close() in the error path before returning an error value based on errno. However, close can change errno, so that can lead to potentially misleading error messages. Instead, explicitly store errno in the err variable before each goto. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157269297769.394725.12634985106772698611.stgit@toke.dk	2019-11-05 16:00:11 -08:00
Alexei Starovoitov	15de8ad80d	libbpf: Add support for prog_tracing Cleanup libbpf from expected_attach_type == attach_btf_id hack and introduce BPF_PROG_TYPE_TRACING. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org	2019-11-05 16:00:11 -08:00
Frantisek Sumsal	d7a137510a	coverity: explicitly use bash instead of sh On Ubuntu `/bin/sh` is a symlink to `/bin/dash`, which doesn't support certain builtins used by the Coverity script (namely pushd/popd)	2019-11-05 13:28:13 -08:00
Frantisek Sumsal	91e4f27dd7	travis: use sudo during the 'install' phase	2019-11-04 15:08:38 -08:00
Frantisek Sumsal	1339ef70a3	README: add Coverity badge	2019-11-01 23:22:57 -07:00
Frantisek Sumsal	c204e3d610	travis: automate Coverity builds	2019-11-01 23:22:57 -07:00
Frantisek Sumsal	32d0a03332	README: add a LGTM badge	2019-10-29 15:45:36 -07:00
Andrii Nakryiko	05346cfd90	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 3820729160440158a014add69cc0d371061a96b2 Checkpoint bpf-next commit: a566e35f1e8b4b3be1e96a804d1cca38b578167c Baseline bpf commit: 2afd23f78f39da84937006ecd24aa664a4ab052b Checkpoint bpf commit: fc11078dd3514c65eabce166b8431a56d8a667cb Andrii Nakryiko (2): libbpf: Fix off-by-one error in ELF sanity check libbpf: Don't use kernel-side u32 type in xsk.c Magnus Karlsson (1): libbpf: Fix compatibility for kernels without need_wakeup src/libbpf.c \| 2 +- src/xsk.c \| 83 ++++++++++++++++++++++++++++++++++++++++++++-------- 2 files changed, 72 insertions(+), 13 deletions(-) -- 2.17.1	2019-10-29 09:25:36 -07:00
Andrii Nakryiko	a7a32b899c	libbpf: Don't use kernel-side u32 type in xsk.c u32 is a kernel-side typedef. User-space library is supposed to use __u32. This breaks Github's projection of libbpf. Do u32 -> __u32 fix. Fixes: 94ff9ebb49a5 ("libbpf: Fix compatibility for kernels without need_wakeup") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com	2019-10-29 09:25:36 -07:00
Andrii Nakryiko	68a051f2d2	libbpf: Fix off-by-one error in ELF sanity check libbpf's bpf_object__elf_collect() does simple sanity check after iterating over all ELF sections, it checks that .strtab index is correct. Unfortunately, due to section indices being 1-based, the check breaks for cases when .strtab ends up being the very last section in ELF. Fixes: 77ba9a5b48a7 ("tools lib bpf: Fetch map names from correct strtab") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com	2019-10-29 09:25:36 -07:00
Magnus Karlsson	8e80367637	libbpf: Fix compatibility for kernels without need_wakeup When the need_wakeup flag was added to AF_XDP, the format of the XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the kernel to take care of compatibility issues arising from running applications using any of the two formats. However, libbpf was not extended to take care of the case when the application/libbpf uses the new format but the kernel only supports the old format. This patch adds support in libbpf for parsing the old format, before the need_wakeup flag was added, and emulating a set of static need_wakeup flags that will always work for the application. v2 -> v3: * Incorporated code improvements suggested by Jonathan Lemon v1 -> v2: * Rebased to bpf-next * Rewrote the code as the previous version made you blind Fixes: a4500432c2587cb2a ("libbpf: add support for need_wakeup flag in AF_XDP part") Reported-by: Eloy Degen <degeneloy@gmail.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com	2019-10-29 09:25:36 -07:00
Andrii Nakryiko	9a5adecc62	sync: ignore test_libbpf.c Adjust sync script to ignore test_libbpf.c, not test_libbpf.cpp. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-29 09:25:36 -07:00
Frantisek Sumsal	b923d0e3c6	lgtm: fix the extraction process As this project uses only Makefile, without any configuration step, and due to a "non-standard" location of the source files, LGTM kept failing to find the respective Makefile and build the sources. By tricking LGTM's build system auto detection, that we use automake/configure, it correctly sets the source dir, thus the compilation, extraction & analysis steps now work in the src/ subdirectory, as expected.	2019-10-28 15:15:47 -07:00
Andrii Nakryiko	f02e248ae1	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 5e5b03d163e15a40b0fa57c70b4e8edd549b0b98 Checkpoint bpf-next commit: 3820729160440158a014add69cc0d371061a96b2 Baseline bpf commit: cd7455f1013ef96d5cbf5c05d2b7c06f273810a6 Checkpoint bpf commit: 2afd23f78f39da84937006ecd24aa664a4ab052b Björn Töpel (1): libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program KP Singh (1): libbpf: Fix strncat bounds error in libbpf_prog_type_by_name src/libbpf.c \| 2 +- src/xsk.c \| 42 ++++++++++++++++++++++++++++++++---------- 2 files changed, 33 insertions(+), 11 deletions(-) -- 2.17.1	2019-10-24 22:59:06 -07:00
KP Singh	e152510d72	libbpf: Fix strncat bounds error in libbpf_prog_type_by_name On compiling samples with this change, one gets an error: error: ‘strncat’ specified bound 118 equals destination size [-Werror=stringop-truncation] strncat(dst, name + section_names[i].len, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sizeof(raw_tp_btf_name) - (dst - raw_tp_btf_name)); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ strncat requires the destination to have enough space for the terminating null byte. Fixes: f75a697e09137 ("libbpf: Auto-detect btf_id of BTF-based raw_tracepoint") Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191023154038.24075-1-kpsingh@chromium.org	2019-10-24 22:59:06 -07:00
Björn Töpel	59ac1946b0	libbpf: Use implicit XSKMAP lookup from AF_XDP XDP program In commit 43e74c0267a3 ("bpf_xdp_redirect_map: Perform map lookup in eBPF helper") the bpf_redirect_map() helper learned to do map lookup, which means that the explicit lookup in the XDP program for AF_XDP is not needed for post-5.3 kernels. This commit adds the implicit map lookup with default action, which improves the performance for the "rx_drop" [1] scenario with ~4%. For pre-5.3 kernels, the bpf_redirect_map() returns XDP_ABORTED, and a fallback path for backward compatibility is entered, where explicit lookup is still performed. This means a slight regression for older kernels (an additional bpf_redirect_map() call), but I consider that a fair punishment for users not upgrading their kernels. ;-) v1->v2: Backward compatibility (Toke) [2] v2->v3: Avoid masking/zero-extension by using JMP32 [3] [1] # xdpsock -i eth0 -z -r [2] https://lore.kernel.org/bpf/87pnirb3dc.fsf@toke.dk/ [3] https://lore.kernel.org/bpf/87v9sip0i8.fsf@toke.dk/ Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191022072206.6318-1-bjorn.topel@gmail.com	2019-10-24 22:59:06 -07:00
Andrii Nakryiko	5150a4a0fb	includes: add BPF_JMP32_IMM macro to fix build Recent xsk change started using new BPF_JMP32_IMM macro. Add it to our local copy of include/linux/filter.h to fix the build. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-24 22:59:06 -07:00
Frantisek Sumsal	2a25957df6	travis: add an aarch64 Xenial job	2019-10-23 10:13:54 -07:00
Andrii Nakryiko	e441f55089	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: da927466a152a9497c05926a95c6aebba6d3ad5b Checkpoint bpf-next commit: 5e5b03d163e15a40b0fa57c70b4e8edd549b0b98 Baseline bpf commit: 9e8acd9c44a0dd52b2922eeb82398c04e356c058 Checkpoint bpf commit: cd7455f1013ef96d5cbf5c05d2b7c06f273810a6 Alexei Starovoitov (3): bpf: Add attach_btf_id attribute to program load libbpf: Auto-detect btf_id of BTF-based raw_tracepoints bpf: Check types of arguments passed into helpers Andrii Nakryiko (5): tools: Sync if_link.h libbpf: Add bpf_program__get_{type, expected_attach_type} APIs libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes libbpf: Teach bpf_object__open to guess program types libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration John Fastabend (1): bpf, libbpf: Add kernel version section parsing back Kefeng Wang (1): tools, bpf: Rename pr_warning to pr_warn to align with kernel logging include/uapi/linux/bpf.h \| 28 +- include/uapi/linux/if_link.h \| 2 + src/bpf.c \| 3 + src/btf.c \| 56 +-- src/btf_dump.c \| 18 +- src/libbpf.c \| 830 +++++++++++++++++++---------------- src/libbpf.h \| 24 +- src/libbpf.map \| 2 + src/libbpf_internal.h \| 8 +- src/xsk.c \| 4 +- 10 files changed, 539 insertions(+), 436 deletions(-) -- 2.17.1	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	beb9f88080	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	c7b5116f71	libbpf: Make DECLARE_LIBBPF_OPTS macro strictly a variable declaration LIBBPF_OPTS is implemented as a mix of field declaration and memset + assignment. This makes it neither variable declaration nor purely statements, which is a problem, because you can't mix it with either other variable declarations nor other function statements, because C90 compiler mode emits warning on mixing all that together. This patch changes LIBBPF_OPTS into a strictly declaration of variable and solves this problem, as can be seen in case of bpftool, which previously would emit compiler warning, if done this way (LIBBPF_OPTS as part of function variables declaration block). This patch also renames LIBBPF_OPTS into DECLARE_LIBBPF_OPTS to follow kernel convention for similar macros more closely. v1->v2: - rename LIBBPF_OPTS into DECLARE_LIBBPF_OPTS (Jakub Sitnicki). Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20191022172100.3281465-1-andriin@fb.com	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	2b0cd55bf5	libbpf: Teach bpf_object__open to guess program types Teach bpf_object__open how to guess program type and expected attach type from section names, similar to what bpf_prog_load() does. This seems like a really useful feature and an oversight to not have this done during bpf_object_open(). To preserve backwards compatible behavior of bpf_prog_load(), its attr->prog_type is treated as an override of bpf_object__open() decisions, if attr->prog_type is not UNSPECIFIED. There is a slight difference in behavior for bpf_prog_load(). Previously, if bpf_prog_load() was loading BPF object with more than one program, first program's guessed program type and expected attach type would determine corresponding attributes of all the subsequent program types, even if their section names suggest otherwise. That seems like a rather dubious behavior and with this change it will behave more sanely: each program's type is determined individually, unless they are forced to uniformity through attr->prog_type. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191021033902.3856966-5-andriin@fb.com	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	188276ca5f	libbpf: Add uprobe/uretprobe and tp/raw_tp section suffixes Map uprobe/uretprobe into KPROBE program type. tp/raw_tp are just an alias for more verbose tracepoint/raw_tracepoint, respectively. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191021033902.3856966-4-andriin@fb.com	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	87c4984da8	libbpf: Add bpf_program__get_{type, expected_attach_type} APIs There are bpf_program__set_type() and bpf_program__set_expected_attach_type(), but no corresponding getters, which seems rather incomplete. Fix this. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191021033902.3856966-3-andriin@fb.com	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	a5611ba6e8	tools: Sync if_link.h Sync if_link.h into tools/ and get rid of annoying libbpf Makefile warning. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191021033902.3856966-2-andriin@fb.com	2019-10-22 16:15:55 -07:00
Kefeng Wang	c6e01425b6	tools, bpf: Rename pr_warning to pr_warn to align with kernel logging For kernel logging macros, pr_warning() is completely removed and replaced by pr_warn(). By using pr_warn() in tools/lib/bpf/ for symmetry to kernel logging macros, we could eventually drop the use of pr_warning() in the whole kernel tree. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191021055532.185245-1-wangkefeng.wang@huawei.com	2019-10-22 16:15:55 -07:00
John Fastabend	58e3a8fac1	bpf, libbpf: Add kernel version section parsing back With commit "libbpf: stop enforcing kern_version,..." we removed the kernel version section parsing in favor of querying for the kernel using uname() and populating the version using the result of the query. After this any version sections were simply ignored. Unfortunately, the world of kernels is not so friendly. I've found some customized kernels where uname() does not match the in kernel version. To fix this so programs can load in this environment this patch adds back parsing the section and if it exists uses the user specified kernel version to override the uname() result. However, keep most the kernel uname() discovery bits so users are not required to insert the version except in these odd cases. Fixes: 5e61f27070292 ("libbpf: stop enforcing kern_version, populate it for users") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/157140968634.9073.6407090804163937103.stgit@john-XPS-13-9370	2019-10-22 16:15:55 -07:00
Alexei Starovoitov	1b27702c14	bpf: Check types of arguments passed into helpers Introduce new helper that reuses existing skb perf_event output implementation, but can be called from raw_tracepoint programs that receive 'struct sk_buff ' as tracepoint argument or can walk other kernel data structures to skb pointer. In order to do that teach verifier to resolve true C types of bpf helpers into in-kernel BTF ids. The type of kernel pointer passed by raw tracepoint into bpf program will be tracked by the verifier all the way until it's passed into helper function. For example: kfree_skb() kernel function calls trace_kfree_skb(skb, loc); bpf programs receives that skb pointer and may eventually pass it into bpf_skb_output() bpf helper which in-kernel is implemented via bpf_skb_event_output() kernel function. Its first argument in the kernel is 'struct sk_buff '. The verifier makes sure that types match all the way. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191016032505.2089704-11-ast@kernel.org	2019-10-22 16:15:55 -07:00
Alexei Starovoitov	39cf9fc90f	libbpf: Auto-detect btf_id of BTF-based raw_tracepoints It's a responsibility of bpf program author to annotate the program with SEC("tp_btf/name") where "name" is a valid raw tracepoint. The libbpf will try to find "name" in vmlinux BTF and error out in case vmlinux BTF is not available or "name" is not found. If "name" is indeed a valid raw tracepoint then in-kernel BTF will have "btf_trace_##name" typedef that points to function prototype of that raw tracepoint. BTF description captures exact argument the kernel C code is passing into raw tracepoint. The kernel verifier will check the types while loading bpf program. libbpf keeps BTF type id in expected_attach_type, but since kernel ignores this attribute for tracing programs copy it into attach_btf_id attribute before loading. Later the kernel will use prog->attach_btf_id to select raw tracepoint during bpf_raw_tracepoint_open syscall command. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191016032505.2089704-6-ast@kernel.org	2019-10-22 16:15:55 -07:00
Alexei Starovoitov	bc4a6e9709	bpf: Add attach_btf_id attribute to program load Add attach_btf_id attribute to prog_load command. It's similar to existing expected_attach_type attribute which is used in several cgroup based program types. Unfortunately expected_attach_type is ignored for tracing programs and cannot be reused for new purpose. Hence introduce attach_btf_id to verify bpf programs against given in-kernel BTF type id at load time. It is strictly checked to be valid for raw_tp programs only. In later patches it will become: btf_id == 0: semantics of existing raw_tp progs; btf_id > 0: raw_tp with BTF and additional type safety. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20191016032505.2089704-5-ast@kernel.org	2019-10-22 16:15:55 -07:00
Andrii Nakryiko	4a50ceb043	Makefile: back-port _FILE_OFFSET_BITS=64 and _LARGEFILE64_SOURCE to Makefile Upstream commit 71dd77fd4bf7 ("libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall") added _FILE_OFFSET_BITS=64 and _LARGEFILE64_SOURCE CFLAGS. Back-port them to Github's mirror to avoid compilation problems on ARM. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-22 14:50:23 -07:00
Andrii Nakryiko	4d86cae4f0	ci: disable GCC's -Wstringop-truncation noisy error This error is usually a false positive for us. Disable it. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	33b374395f	sync: adjust sync script for test_libbpf.c rename and bpf_helper_defs.h Accommodate changes: - test_libbpf.cpp was renamed to test_libbpf.c; - bpf_helper_defs.h should be ignored for consistency check at the end, as it's not checked in on linux side; Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	ade4409352	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f05c2001ecc98629cecd47728e4db11e5a17e58d Checkpoint bpf-next commit: da927466a152a9497c05926a95c6aebba6d3ad5b Baseline bpf commit: 106c35dda32f8b63f88cad7433f1b8bb0056958a Checkpoint bpf commit: 9e8acd9c44a0dd52b2922eeb82398c04e356c058 Andrii Nakryiko (7): libbpf: Fix struct end padding in btf_dump libbpf: Generate more efficient BPF_CORE_READ code libbpf: Handle invalid typedef emitted by old GCC libbpf: Update BTF reloc support to latest Clang format libbpf: Refactor bpf_object__open APIs to use common opts libbpf: Add support for field existance CO-RE relocation libbpf: Add BPF-side definitions of supported field relocation kinds Ilya Maximets (1): libbpf: Fix passing uninitialized bytes to setsockopt src/bpf_core_read.h \| 28 ++++++- src/btf.c \| 16 ++-- src/btf.h \| 4 +- src/btf_dump.c \| 19 ++++- src/libbpf.c \| 169 ++++++++++++++++++++++++++---------------- src/libbpf.h \| 4 +- src/libbpf_internal.h \| 25 +++++-- src/xsk.c \| 1 + 8 files changed, 180 insertions(+), 86 deletions(-) -- 2.17.1	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	2f9abb2a26	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	fca60960ea	libbpf: Add BPF-side definitions of supported field relocation kinds Add enum definition for Clang's __builtin_preserve_field_info() second argument (info_kind). Currently only byte offset and existence are supported. Corresponding Clang changes introducing this built-in can be found at [0] [0] https://reviews.llvm.org/D67980 Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191015182849.3922287-5-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	0db22b01a1	libbpf: Add support for field existance CO-RE relocation Add support for BPF_FRK_EXISTS relocation kind to detect existence of captured field in a destination BTF, allowing conditional logic to handle incompatible differences between kernels. Also introduce opt-in relaxed CO-RE relocation handling option, which makes libbpf emit warning for failed relocations, but proceed with other relocations. Instruction, for which relocation failed, is patched with (u32)-1 value. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191015182849.3922287-4-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	807b9d7be1	libbpf: Refactor bpf_object__open APIs to use common opts Refactor all the various bpf_object__open variations to ultimately specify common bpf_object_open_opts struct. This makes it easy to keep extending this common struct w/ extra parameters without having to update all the legacy APIs. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191015182849.3922287-3-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	a3d02f9ab4	libbpf: Update BTF reloc support to latest Clang format BTF offset reloc was generalized in recent Clang into field relocation, capturing extra u32 field, specifying what aspect of captured field needs to be relocated. This changes .BTF.ext's record size for this relocation from 12 bytes to 16 bytes. Given these format changes happened in Clang before official released version, it's ok to not support outdated 12-byte record size w/o breaking ABI. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191015182849.3922287-2-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	54aac21f7e	libbpf: Handle invalid typedef emitted by old GCC Old GCC versions are producing invalid typedef for __gnuc_va_list pointing to void. Special-case this and emit valid: typedef __builtin_va_list __gnuc_va_list; Reported-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20191011032901.452042-1-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	d8dd0beb98	libbpf: Generate more efficient BPF_CORE_READ code Existing BPF_CORE_READ() macro generates slightly suboptimal code. If there are intermediate pointers to be read, initial source pointer is going to be assigned into a temporary variable and then temporary variable is going to be uniformly used as a "source" pointer for all intermediate pointer reads. Schematically (ignoring all the type casts), BPF_CORE_READ(s, a, b, c) is expanded into: ({ const void *__t = src; bpf_probe_read(&__t, sizeof(*__t), &__t->a); bpf_probe_read(&__t, sizeof(*__t), &__t->b); typeof(s->a->b->c) __r; bpf_probe_read(&__r, sizeof(*__r), &__t->c); }) This initial `__t = src` makes calls more uniform, but causes slightly less optimal register usage sometimes when compiled with Clang. This can cascade into, e.g., more register spills. This patch fixes this issue by generating more optimal sequence: ({ const void *__t; bpf_probe_read(&__t, sizeof(*__t), &src->a); /* <-- src here */ bpf_probe_read(&__t, sizeof(*__t), &__t->b); typeof(s->a->b->c) __r; bpf_probe_read(&__r, sizeof(*__r), &__t->c); }) Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191011023847.275936-1-andriin@fb.com	2019-10-15 19:43:48 -07:00
Ilya Maximets	e94f57a9ab	libbpf: Fix passing uninitialized bytes to setsockopt 'struct xdp_umem_reg' has 4 bytes of padding at the end that makes valgrind complain about passing uninitialized stack memory to the syscall: Syscall param socketcall.setsockopt() points to uninitialised byte(s) at 0x4E7AB7E: setsockopt (in /usr/lib64/libc-2.29.so) by 0x4BDE035: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:172) Uninitialised value was created by a stack allocation at 0x4BDDEBA: xsk_umem__create@@LIBBPF_0.0.4 (xsk.c:140) Padding bytes appeared after introducing of a new 'flags' field. memset() is required to clear them. Fixes: 10d30e301732 ("libbpf: add flags to umem config") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191009164929.17242-1-i.maximets@ovn.org	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	bda436be4a	libbpf: Fix struct end padding in btf_dump Fix a case where explicit padding at the end of a struct is necessary due to non-standard alignment requirements of fields (which BTF doesn't capture explicitly). Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Reported-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20191008231009.2991130-2-andriin@fb.com	2019-10-15 19:43:48 -07:00
Andrii Nakryiko	a30df5c09f	makefile: install new BPF-side headers along libbpf user-land ones Install BPF-side helper headers: - bpf_helpers.h - bpf_helper_defs.h - bpf_tracing.h - bpf_endian.h - bpf_core_read.h Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	e776bf7ec7	sync: teach sync script to generate bpf_helper_defs.h Linux repo doesn't commit bpf_helper_defs.h, as it's re-generated on build every time. For Github projection, though, it's much nicer to have this header be pre-generated during sync and committed. This makes integration story easier for all the users that use libbpf as a submodule. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	46688687d5	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 02dc96ef6c25f990452c114c59d75c368a1f4c8f Checkpoint bpf-next commit: f05c2001ecc98629cecd47728e4db11e5a17e58d Baseline bpf commit: 1bd63524593b964934a33afd442df16b8f90e2b5 Checkpoint bpf commit: 106c35dda32f8b63f88cad7433f1b8bb0056958a Andrii Nakryiko (7): libbpf: Bump current version to v0.0.6 libbpf: stop enforcing kern_version, populate it for users libbpf: add bpf_object__open_{file, mem} w/ extensible opts libbpf: fix bpf_object__name() to actually return object name uapi/bpf: fix helper docs libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers include/uapi/linux/bpf.h \| 32 +++---- src/bpf_core_read.h \| 167 +++++++++++++++++++++++++++++++++ src/bpf_endian.h \| 72 +++++++++++++++ src/bpf_helpers.h \| 41 ++++++++ src/bpf_tracing.h \| 195 +++++++++++++++++++++++++++++++++++++++ src/libbpf.c \| 183 ++++++++++++++++++------------------ src/libbpf.h \| 48 +++++++++- src/libbpf.map \| 6 ++ src/libbpf_internal.h \| 32 +++++++ 9 files changed, 661 insertions(+), 115 deletions(-) create mode 100644 src/bpf_core_read.h create mode 100644 src/bpf_endian.h create mode 100644 src/bpf_helpers.h create mode 100644 src/bpf_tracing.h -- 2.17.1	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	19cbbd8f52	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	c87b3a6065	libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers Add few macros simplifying BCC-like multi-level probe reads, while also emitting CO-RE relocations for each read. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191008175942.1769476-7-andriin@fb.com	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	4c55ba2b19	libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf Move bpf_helpers.h, bpf_tracing.h, and bpf_endian.h into libbpf. Move bpf_helper_defs.h generation into libbpf's Makefile. Ensure all those headers are installed along the other libbpf headers. Also, adjust selftests and samples include path to include libbpf now. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20191008175942.1769476-6-andriin@fb.com	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	104006a054	uapi/bpf: fix helper docs Various small fixes to BPF helper documentation comments, enabling automatic header generation with a list of BPF helpers. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	bf83a95dee	libbpf: fix bpf_object__name() to actually return object name bpf_object__name() was returning file path, not name. Fix this. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	732f598282	libbpf: add bpf_object__open_{file, mem} w/ extensible opts Add new set of bpf_object__open APIs using new approach to optional parameters extensibility allowing simpler ABI compatibility approach. This patch demonstrates an approach to implementing libbpf APIs that makes it easy to extend existing APIs with extra optional parameters in such a way, that ABI compatibility is preserved without having to do symbol versioning and generating lots of boilerplate code to handle it. To facilitate succinct code for working with options, add OPTS_VALID, OPTS_HAS, and OPTS_GET macros that hide all the NULL, size, and zero checks. Additionally, newly added libbpf APIs are encouraged to follow similar pattern of having all mandatory parameters as formal function parameters and always have optional (NULL-able) xxx_opts struct, which should always have real struct size as a first field and the rest would be optional parameters added over time, which tune the behavior of existing API, if specified by user. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	de3c5a17cb	libbpf: stop enforcing kern_version, populate it for users Kernel version enforcement for kprobes/kretprobes was removed from 5.0 kernel in 6c4fc209fcf9 ("bpf: remove useless version check for prog load"). Since then, BPF programs were specifying SEC("version") just to please libbpf. We should stop enforcing this in libbpf, if even kernel doesn't care. Furthermore, libbpf now will pre-populate current kernel version of the host system, in case we are still running on old kernel. This patch also removes __bpf_object__open_xattr from libbpf.h, as nothing in libbpf is relying on having it in that header. That function was never exported as LIBBPF_API and even name suggests its internal version. So this should be safe to remove, as it doesn't break ABI. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	1a8a75037b	libbpf: Bump current version to v0.0.6 New release cycle started, let's bump to v0.0.6 proactively. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20190930222503.519782-1-andriin@fb.com	2019-10-09 14:42:45 -07:00
Andrii Nakryiko	1a26b51b1c	meson: kill meson.build as it's not used anymore Meson.build was added to facilitate systemd integration, but systemd integration went different direction and we don't need meson.build anymore. So remove it and not maintain it anymore. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-01 16:51:32 -07:00
Andrii Nakryiko	2cc0829775	ci: execute install step in CI Add simple execution of `make install` in Debian and Xenial build to catch most obvious breakages. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-01 12:56:22 -07:00
Andrii Nakryiko	92cb475558	makefile: fix install target After latest shared vs static libraries fixes, `make install` target broke as it relied on now removed $(LIBS) variable. This patch fixes issue by listing $(SHARED_LIBS) and $(STATIC_LIBS) explicitly. Tested with and without BUILD_STATIC_ONLY. Fixes: `8b2782a1f2` ("makefile: support libbpf symbol versioning in shared library mode") Reported-by: Michal Rostecki <mrostecki@opensuse.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-10-01 12:56:22 -07:00
Andrii Nakryiko	8b2782a1f2	makefile: support libbpf symbol versioning in shared library mode Similarly to Linux's 1bd63524593b ("libbpf: handle symbol versioning properly for libbpf.a"), add necessary changes to build static and shared object files separately with extra shared library flags. This allows to properly handle symbol versioning in shared library mode, while still having statically linkable library. Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-09-30 21:42:42 -07:00
Andrii Nakryiko	886e8149a0	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b41dae061bbd722b9d7fa828f35d22035b218e18 Checkpoint bpf-next commit: 02dc96ef6c25f990452c114c59d75c368a1f4c8f Baseline bpf commit: e3439af4a339acd7fddbd6d59b8ecefaac07a611 Checkpoint bpf commit: 1bd63524593b964934a33afd442df16b8f90e2b5 Yonghong Song (1): libbpf: handle symbol versioning properly for libbpf.a src/libbpf_internal.h \| 16 ++++++++++++++++ src/xsk.c \| 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) -- 2.17.1	2019-09-30 16:10:45 -07:00
Yonghong Song	d275397111	libbpf: handle symbol versioning properly for libbpf.a bcc uses libbpf repo as a submodule. It brings in libbpf source code and builds everything together to produce shared libraries. With latest libbpf, I got the following errors: /bin/ld: libbcc_bpf.so.0.10.0: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2 /bin/ld: failed to set dynamic section sizes: Bad value collect2: error: ld returned 1 exit status make[2]: *** [src/cc/libbcc_bpf.so.0.10.0] Error 1 In xsk.c, we have asm(".symver xsk_umem__create_v0_0_2, xsk_umem__create@LIBBPF_0.0.2"); asm(".symver xsk_umem__create_v0_0_4, xsk_umem__create@@LIBBPF_0.0.4"); The linker thinks the build is for LIBBPF but cannot find proper version LIBBPF_0.0.2/4, so emit errors. I also confirmed that using libbpf.a to produce a shared library also has issues: -bash-4.4$ cat t.c extern void *xsk_umem__create; void * test() { return xsk_umem__create; } -bash-4.4$ gcc -c -fPIC t.c -bash-4.4$ gcc -shared t.o libbpf.a -o t.so /bin/ld: t.so: version node not found for symbol xsk_umem__create@LIBBPF_0.0.2 /bin/ld: failed to set dynamic section sizes: Bad value collect2: error: ld returned 1 exit status -bash-4.4$ Symbol versioning does happen in commonly used libraries, e.g., elfutils and glibc. For static libraries, for a versioned symbol, the old definitions will be ignored, and the symbol will be an alias to the latest definition. For example, glibc sched_setaffinity is versioned.
-bash-4.4$ readelf -s /usr/lib64/libc.so.6 \| grep sched_setaffinity 756: 000000000013d3d0 13 FUNC GLOBAL DEFAULT 13 sched_setaffinity@GLIBC_2.3.3 757: 00000000000e2e70 455 FUNC GLOBAL DEFAULT 13 sched_setaffinity@@GLIBC_2.3.4 1800: 0000000000000000 0 FILE LOCAL DEFAULT ABS sched_setaffinity.c 4228: 00000000000e2e70 455 FUNC LOCAL DEFAULT 13 __sched_setaffinity_new 4648: 000000000013d3d0 13 FUNC LOCAL DEFAULT 13 __sched_setaffinity_old 7338: 000000000013d3d0 13 FUNC GLOBAL DEFAULT 13 sched_setaffinity@GLIBC_2 7380: 00000000000e2e70 455 FUNC GLOBAL DEFAULT 13 sched_setaffinity@@GLIBC_ -bash-4.4$ For static library, the definition of sched_setaffinity aliases to the new definition. -bash-4.4$ readelf -s /usr/lib64/libc.a \| grep sched_setaffinity File: /usr/lib64/libc.a(sched_setaffinity.o) 8: 0000000000000000 455 FUNC GLOBAL DEFAULT 1 __sched_setaffinity_new 12: 0000000000000000 455 FUNC WEAK DEFAULT 1 sched_setaffinity For both elfutils and glibc, additional macros are used to control different handling of symbol versioning w.r.t static and shared libraries. For elfutils, the macro is SYMBOL_VERSIONING (https://sourceware.org/git/?p=elfutils.git;a=blob;f=lib/eu-config.h). For glibc, the macro is SHARED (https://sourceware.org/git/?p=glibc.git;a=blob;f=include/shlib-compat.h;hb=refs/heads/master) This patch used SHARED as the macro name. After this patch, the libbpf.a has -bash-4.4$ readelf -s libbpf.a \| grep xsk_umem__create 372: 0000000000017145 1190 FUNC GLOBAL DEFAULT 1 xsk_umem__create_v0_0_4 405: 0000000000017145 1190 FUNC GLOBAL DEFAULT 1 xsk_umem__create 499: 00000000000175eb 103 FUNC GLOBAL DEFAULT 1 xsk_umem__create_v0_0_2 -bash-4.4$ No versioned symbols for xsk_umem__create. The libbpf.a can be used to build a shared library successfully.
-bash-4.4$ cat t.c extern void *xsk_umem__create; void * test() { return xsk_umem__create; } -bash-4.4$ gcc -c -fPIC t.c -bash-4.4$ gcc -shared t.o libbpf.a -o t.so -bash-4.4$ Fixes: 10d30e301732 ("libbpf: add flags to umem config") Cc: Kevin Laatz <kevin.laatz@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-09-30 16:10:45 -07:00
Andrii Nakryiko	ede18f80d8	scripts: fix empty cherry-pick handling, fix IGNORE_CONSISTENCY check Fix two issues I've encountered during latest bpf/bpf-next sync. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-09-30 15:02:52 -07:00
Andrii Nakryiko	07cd489681	libbpf: fix Github-only indentation issue When applying urgent fix to Github mirror, before it was pushed to linux repo, there were some indentation issues, which eventually got fixed upstream, but are still in Github mirror. Fix it to prevent future merge conflicts. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-09-30 15:02:43 -07:00
Andrii Nakryiko	d2f307c7f6	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0bb52b0dfc88a155688f492aba8e686147600278 Checkpoint bpf-next commit: b41dae061bbd722b9d7fa828f35d22035b218e18 Baseline bpf commit: 2c238177bd7f4b14bdf7447cc1cd9bb791f147e6 Checkpoint bpf commit: e3439af4a339acd7fddbd6d59b8ecefaac07a611 Alexei Starovoitov (1): tools/bpf: sync bpf.h Andrii Nakryiko (2): libbpf: fix false uninitialized variable warning libbpf: Teach btf_dumper to emit stand-alone anonymous enum definitions Daniel Borkmann (1): bpf: sync bpf.h to tools infrastructure Kevin Laatz (1): libbpf: add flags to umem config Toke Høiland-Jørgensen (1): libbpf: Remove getsockopt() check for XDP_OPTIONS include/uapi/linux/bpf.h \| 7 ++- include/uapi/linux/if_xdp.h \| 9 ++++ src/btf_dump.c \| 94 ++++++++++++++++++++++++++++++++++--- src/libbpf.map \| 1 + src/xsk.c \| 44 +++++++++++------ src/xsk.h \| 27 +++++++++++ 6 files changed, 160 insertions(+), 22 deletions(-) -- 2.17.1	2019-09-26 13:29:16 -07:00
Andrii Nakryiko	990cef2a0c	libbpf: Teach btf_dumper to emit stand-alone anonymous enum definitions BTF-to-C converter previously skipped anonymous enums in an assumption that those are embedded in struct's field definitions. This is not always the case and a lot of kernel constants are defined as part of anonymous enums. This change fixes the logic by eagerly marking all types as either referenced by any other type or not. This is enough to distinguish two classes of anonymous enums and emit previously omitted enum definitions. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20190925203745.3173184-1-andriin@fb.com	2019-09-26 13:29:16 -07:00
Andrii Nakryiko	4c2c521513	libbpf: fix false uninitialized variable warning Some compilers emit warning for potential uninitialized next_id usage. The code is correct, but control flow is too complicated for some compilers to figure this out. Re-initialize next_id to satisfy compiler. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-26 13:29:16 -07:00
Toke Høiland-Jørgensen	b1e911e9ba	libbpf: Remove getsockopt() check for XDP_OPTIONS The xsk_socket__create() function fails and returns an error if it cannot get the XDP_OPTIONS through getsockopt(). However, support for XDP_OPTIONS was not added until kernel 5.3, so this means that creating XSK sockets always fails on older kernels. Since the option is just used to set the zero-copy flag in the xsk struct, and that flag is not really used for anything yet, just remove the getsockopt() call until a proper use for it is introduced. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-26 13:29:16 -07:00
Kevin Laatz	ae673dc91f	libbpf: add flags to umem config This patch adds a 'flags' field to the umem_config and umem_reg structs. This will allow for more options to be added for configuring umems. The first use for the flags field is to add a flag for unaligned chunks mode. These flags can either be user-provided or filled with a default. Since we change the size of the xsk_umem_config struct, we need to version the ABI. This patch includes the ABI versioning for xsk_umem__create. The Makefile was also updated to handle multiple function versions in check-abi. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-26 13:29:16 -07:00
Alexei Starovoitov	5a256d12bf	tools/bpf: sync bpf.h sync bpf.h from kernel/ to tools/ Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-26 13:29:16 -07:00
Ondrej Mosnacek	ae8edc7624	libbpf: fix linker flags for shared library The -lelf flag needs to be specified after the object files, otherwise the output library produced by some compilers doesn't contain a link to libelf.so: (Example from Debian testing run on Travis.) $ ldd libelf.so linux-vdso.so.1 (0x00007ffcbfda9000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f75f8d24000) /lib64/ld-linux-x86-64.so.2 (0x00007f75f8f0f000) Linking against such library then produces 'undefined reference to ...' errors unless the target links against libelf as well. After this commit the built library references the libelf library correctly: $ ldd libbpf.so linux-vdso.so.1 (0x00007ffc821f1000) libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f70ea3ec000) libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70ea22c000) libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f70ea20f000) /lib64/ld-linux-x86-64.so.2 (0x00007f70ea433000) Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>	2019-09-26 00:41:27 -07:00
Ondrej Mosnacek	8f8b4a14fa	Travis CI: add sanity check for libelf dependency Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>	2019-09-26 00:41:27 -07:00
Yonghong Song	476e158b07	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b753c5a7f99f390fc100de18647ce0dcacdceafc Checkpoint bpf-next commit: 0bb52b0dfc88a155688f492aba8e686147600278 Baseline bpf commit: 91b4db5313a2c793aabc2143efb8ed0cf0fdd097 Checkpoint bpf commit: 2c238177bd7f4b14bdf7447cc1cd9bb791f147e6 Andrii Nakryiko (1): libbpf: make libbpf.map source of truth for libbpf version Ivan Khoronzhuk (1): libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall Magnus Karlsson (1): libbpf: add support for need_wakeup flag in AF_XDP part Peter Wu (1): bpf: sync bpf.h to tools/ Quentin Monnet (3): tools: bpf: synchronise BPF UAPI header with tools libbpf: refactor bpf_*_get_next_id() functions libbpf: add bpf_btf_get_next_id() to cycle through BTF objects Stanislav Fomichev (1): bpf: sync bpf.h to tools/ include/uapi/linux/bpf.h \| 12 ++++++--- include/uapi/linux/if_xdp.h \| 13 +++++++++ src/bpf.c \| 24 ++++++++--------- src/bpf.h \| 1 + src/libbpf.map \| 5 ++++ src/xsk.c \| 53 +++++++++++++------------------------ src/xsk.h \| 6 +++++ 7 files changed, 64 insertions(+), 50 deletions(-) -- 2.17.1	2019-08-27 13:58:15 -07:00
Peter Wu	13e1ee420e	bpf: sync bpf.h to tools/ Fix a 'struct pt_reg' typo and clarify when bpf_trace_printk discards lines. Affects documentation only. Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-27 13:58:15 -07:00
Ivan Khoronzhuk	3e2bab6d2c	libbpf: use LFS (_FILE_OFFSET_BITS) instead of direct mmap2 syscall Drop __NR_mmap2 fork in favor of LFS, that is _FILE_OFFSET_BITS=64 (glibc & bionic) / LARGEFILE64_SOURCE (for musl) decision. It allows mmap() to use 64bit offset that is passed to mmap2 syscall. As result pgoff is not truncated and no need to use direct access to mmap2 for 32 bits systems. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-27 13:58:15 -07:00
Quentin Monnet	9084f4cd4d	libbpf: add bpf_btf_get_next_id() to cycle through BTF objects Add an API function taking a BTF object id and providing the id of the next BTF object in the kernel. This can be used to list all BTF objects loaded on the system. v2: - Rebase on top of Andrii's changes regarding libbpf versioning. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-27 13:58:15 -07:00
Quentin Monnet	66d20edaf0	libbpf: refactor bpf_*_get_next_id() functions In preparation for the introduction of a similar function for retrieving the id of the next BTF object, consolidate the code from bpf_prog_get_next_id() and bpf_map_get_next_id() in libbpf. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-27 13:58:15 -07:00
Quentin Monnet	d8d6772ab8	tools: bpf: synchronise BPF UAPI header with tools Synchronise the bpf.h header under tools, to report the addition of the new BPF_BTF_GET_NEXT_ID syscall command for bpf(). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-27 13:58:15 -07:00
Stanislav Fomichev	4397d09cd8	bpf: sync bpf.h to tools/ Sync new sk storage clone flag. Cc: Martin KaFai Lau <kafai@fb.com> Cc: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-27 13:58:15 -07:00
Magnus Karlsson	5771dacd3d	libbpf: add support for need_wakeup flag in AF_XDP part This commit adds support for the new need_wakeup flag in AF_XDP. The xsk_socket__create function is updated to handle this and a new function is introduced called xsk_ring_prod__needs_wakeup(). This function can be used by the application to check if Rx and/or Tx processing needs to be explicitly woken up. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-27 13:58:15 -07:00
Andrii Nakryiko	d34efeeef1	libbpf: make libbpf.map source of truth for libbpf version Currently libbpf version is specified in 2 places: libbpf.map and Makefile. They easily get out of sync and it's very easy to update one, but forget to update another one. In addition, Github projection of libbpf has to maintain its own version which has to be remembered to be kept in sync manually, which is very error-prone approach. This patch makes libbpf.map a source of truth for libbpf version and uses shell invocation to parse out correct full and major libbpf version to use during build. Now we need to make sure that once new release cycle starts, we need to add (initially) empty section to libbpf.map with correct latest version. This also will make it possible to keep Github projection consistent with kernel sources version of libbpf by adopting similar parsing of version from libbpf.map. v2->v3: - grep -o + sort -rV (Andrey); v1->v2: - eager version vars evaluation (Jakub); - simplified version regex (Andrey); Cc: Andrey Ignatov <rdna@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-27 13:58:15 -07:00
Petar Penkov	db63a5aa5d	filter.h: fix BPF_LD_MAP_VALUE definition The current definition calls BPF_LD_IMM64_RAW_FULL with BPF_PSEUDO_MAP_FD but the original patch[0] invokes it with BPF_PSEUDO_MAP_VALUE. [0] https://patchwork.ozlabs.org/patch/1082785/	2019-08-22 16:33:59 -07:00
Andrii Nakryiko	d60f568961	Makefile: get libbpf version from libbpf.map Similarly to kernel-side Makefile ([0]), get libbpf version by looking at latest version in libbpf.map. [0] https://patchwork.ozlabs.org/patch/1147232/ Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-14 22:41:40 -07:00
Andrii Nakryiko	e78a36f4b0	sync: fix non-empty merge detection/handling Fix how non-empty merge detection is done. Allow to proceed despite non-empty merges (they will typically cause conflicts when applying patches, but if conflicts were handled already, it should be ok to ignore this problem). Also ensure that diff's output is in unified diff format. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-14 09:53:04 -07:00
Andrii Nakryiko	c8a7eb06bd	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: b707659213d3c70f2c704ec950df6263b4bffe84 Checkpoint bpf-next commit: b753c5a7f99f390fc100de18647ce0dcacdceafc Baseline bpf commit: f1fc7249dddc0e52d9e805e2e661caa118649509 Checkpoint bpf commit: 91b4db5313a2c793aabc2143efb8ed0cf0fdd097 Andrii Nakryiko (2): libbpf: fix missing __WORDSIZE definition libbpf: attempt to load kernel BTF from sysfs first Arnaldo Carvalho de Melo (1): tools headers UAPI: Sync if_link.h with the kernel Daniel Borkmann (1): bpf: sync bpf.h to tools infrastructure include/uapi/linux/bpf.h \| 4 +-- include/uapi/linux/if_link.h \| 5 +++ src/hashmap.h \| 5 +++ src/libbpf.c \| 64 ++++++++++++++++++++++++++++++++---- 4 files changed, 69 insertions(+), 9 deletions(-) -- 2.17.1	2019-08-14 09:52:47 -07:00
Daniel Borkmann	b48c14807b	bpf: sync bpf.h to tools infrastructure Pull in updates in BPF helper function description. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-14 09:52:47 -07:00
Andrii Nakryiko	a3b4055ec7	libbpf: attempt to load kernel BTF from sysfs first Add support for loading kernel BTF from sysfs (/sys/kernel/btf/vmlinux) as a target BTF. Also extend the list of on disk search paths for vmlinux ELF image with entries that perf is searching for. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-08-14 09:52:47 -07:00
Andrii Nakryiko	30603852f4	libbpf: fix missing __WORDSIZE definition hashmap.h depends on __WORDSIZE being defined. It is defined by glibc/musl in different headers. It's an explicit goal for musl to be "non-detectable" at compilation time, so instead include glibc header if glibc is explicitly detected and fall back to musl header otherwise. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Link: https://lkml.kernel.org/r/20190718173021.2418606-1-andriin@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-14 09:52:47 -07:00
Arnaldo Carvalho de Melo	1a28fa5dac	tools headers UAPI: Sync if_link.h with the kernel To pick the changes in: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") And silence this build warning: Kernel ABI header at 'tools/include/uapi/linux/if_link.h' differs from latest version at 'include/uapi/linux/if_link.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vincent Bernat <vincent@bernat.ch> Link: https://lkml.kernel.org/n/tip-3liw4exxh8goc0rq9xryl2kv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2019-08-14 09:52:47 -07:00
Andrii Nakryiko	def5576b37	sync: pull patches from bpf tree as well Add patches sync from both bpf and bpf-next trees at the same time. Baseline checkpoint commits are tracked independently for both. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	3e45a16621	sync: make patch applying interactive, allow to ignore consistency Give more control over patching process to allow manual intervention and fix up, after which process will continue. Also allow an option to ignore consistency check results (when having both bpf and bpf-next changes). Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	2c0e53cb08	sync: attempt to auto-resolve non-libbpf conflicts If cherry-picked commit contains non-libbpf files, chances are high that this will result in conflict, because we are generally skipping commits that didn't touch libbpf files, which means that our working copy will not be up-to-date for non-libbpf files. This change checks if conflicts are only in non-libbpf files and marks them as resolved. This will work fine as long as we don't cherry-pick some more non-libbpf changes to same set of files that happen to conflict with not-so-resolved version of non-libbpf files. But anyways, this should help in a lot of cases. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	6227c6f8dd	sync: add manual cherry-picking mode This is sometimes necessary when we did ad-hoc urgent bug fixes, which are not identical to the ones in kernel. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	00ad180d07	sync: extract cherry-picking logic for reuse Extract non-empty merge validation and cherry-picking logic so that it can be re-used for bpf and bpf-next commits. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	715a58d593	sync: improve and automate already synced patches detection Make detection more precise and automate skip/sync decision, if everything looks sane. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	11052fc1be	sync: add commit_desc() function and move things around a bit Just cleaning up a bunch of stuff before the next refactoring. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	97ecda3b25	sync: extract directory changing function Extract the logic of handling relative paths within the script into go_to() function. Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	342bcfa319	sync: centralize kernel-to-github paths mapping Use associative array (requires at least bash 4) to centralize mapping of paths between kernel's libbpf layout and the one on Github. This minimizes the chance of all those mappings getting out of sync (which happened twice before). Signed-off-by: Andrii Nakryiko <andriin@fb.com>	2019-08-09 08:41:26 -07:00
Andrii Nakryiko	c020432531	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 66b5f1c439843bcbab01cc7f3854ae2742f3d1e3 Checkpoint bpf-next commit: b707659213d3c70f2c704ec950df6263b4bffe84 Baseline bpf commit: 53db1cced401e4c65d49edf198e00daa9fc837e6 Checkpoint bpf commit: f1fc7249dddc0e52d9e805e2e661caa118649509 Andrii Nakryiko (8): libbpf: provide more helpful message on uninitialized global var libbpf: return previous print callback from libbpf_set_print libbpf: add helpers for working with BTF types libbpf: convert libbpf code to use new btf helpers libbpf: add .BTF.ext offset relocation section loading libbpf: implement BPF CO-RE offset relocation algorithm libbpf: fix SIGSEGV when BTF loading fails, but .BTF.ext exists libbpf: sanitize VAR to conservative 1-byte INT Petar Penkov (1): bpf: sync bpf.h to tools/ Stanislav Fomichev (2): tools/bpf: sync bpf_flow_keys flags bpf/flow_dissector: support ipv6 flow_label and BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL Toke Høiland-Jørgensen (2): tools/include/uapi: Add devmap_hash BPF map type tools/libbpf_probes: Add new devmap_hash type include/uapi/linux/bpf.h \| 44 +- src/btf.c \| 250 +++++----- src/btf.h \| 182 ++++++++ src/btf_dump.c \| 138 ++---- src/libbpf.c \| 972 ++++++++++++++++++++++++++++++++++++--- src/libbpf.h \| 3 +- src/libbpf_internal.h \| 105 +++++ src/libbpf_probes.c \| 1 + 8 files changed, 1405 insertions(+), 290 deletions(-) -- 2.17.1	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	99ce275b52	libbpf: implement BPF CO-RE offset relocation algorithm This patch implements the core logic for BPF CO-RE offsets relocations. Every instruction that needs to be relocated has corresponding bpf_offset_reloc as part of BTF.ext. Relocations are performed by trying to match recorded "local" relocation spec against potentially many compatible "target" types, creating corresponding spec. Details of the algorithm are noted in corresponding comments in the code. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	c0a5f7ee11	libbpf: add .BTF.ext offset relocation section loading Add support for BPF CO-RE offset relocations. Add section/record iteration macros for .BTF.ext. These macros are useful for iterating over each .BTF.ext record, either for dumping out contents or later for BPF CO-RE relocation handling. To enable other parts of libbpf to work with .BTF.ext contents, moved a bunch of type definitions into libbpf_internal.h. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	0da9ba439f	libbpf: convert libbpf code to use new btf helpers Simplify code by relying on newly added BTF helper functions. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	c4735d9e05	libbpf: add helpers for working with BTF types Add lots of frequently used helpers that simplify working with BTF types. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Petar Penkov	563f1d3fff	bpf: sync bpf.h to tools/ Sync updated documentation for bpf_redirect_map. Sync the bpf_tcp_gen_syncookie helper function definition with the one in tools/uapi. Signed-off-by: Petar Penkov <ppenkov@google.com> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Toke Høiland-Jørgensen	c5d4295fc5	tools/libbpf_probes: Add new devmap_hash type This adds the definition for BPF_MAP_TYPE_DEVMAP_HASH to libbpf_probes.c in tools/lib/bpf. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Toke Høiland-Jørgensen	b606dc725e	tools/include/uapi: Add devmap_hash BPF map type This adds the devmap_hash BPF map type to the uapi headers in tools/. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	f615047aa0	libbpf: return previous print callback from libbpf_set_print By returning previously set print callback from libbpf_set_print, it's possible to restore it, eventually. This is useful when running many independent tests with one default print function, but overriding log verbosity for a particular subset of tests. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Stanislav Fomichev	84a508a51f	bpf/flow_dissector: support ipv6 flow_label and BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL Add support for exporting ipv6 flow label via bpf_flow_keys. Export flow label from bpf_flow.c and also return early when BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL is passed. Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Petar Penkov <ppenkov@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Stanislav Fomichev	c59016e100	tools/bpf: sync bpf_flow_keys flags Export bpf_flow_keys flags to tools/libbpf/selftests. Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Petar Penkov <ppenkov@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	509ef92905	libbpf: provide more helpful message on uninitialized global var When BPF program defines uninitialized global variable, it's put into a special COMMON section. Libbpf will reject such programs, but will provide very unhelpful message with garbage-looking section index. This patch detects special section cases and gives more explicit error message. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-09 08:40:44 -07:00
Andrii Nakryiko	2c9394f2a3	libbpf: set BTF FD for prog only when there is supported .BTF.ext data 5d01ab7bac46 ("libbpf: fix erroneous multi-closing of BTF FD") introduced backwards-compatibility issue, manifesting itself as -E2BIG error returned on program load due to unknown non-zero btf_fd attribute value for BPF_PROG_LOAD sys_bpf() sub-command. This patch fixes bug by ensuring that we only ever associate BTF FD with program if there is a BTF.ext data that was successfully loaded into kernel, which automatically means kernel supports func_info/line_info and associated BTF FD for progs (checked and ensured also by BTF sanitization code). Fixes: 5d01ab7bac46 ("libbpf: fix erroneous multi-closing of BTF FD") Reported-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-01 11:01:38 -07:00
Takshak Chahande	0f4d83f3ab	libbpf: make libbpf_num_possible_cpus function thread safe Having static variable `cpus` in libbpf_num_possible_cpus function without guarding it with mutex makes this function thread-unsafe. If multiple threads access this function in its current form, the static variable `cpus` gets incremented to a multiple of the total number of available CPUs. Used local stack variable to calculate the number of possible CPUs and then updated the static variable using WRITE_ONCE(). Changes since v1: * added stack variable to calculate cpus * serialized static variable update using WRITE_ONCE() * fixed Fixes tag Fixes: 6446b3155521 ("bpf: add a new API libbpf_num_possible_cpus()") Signed-off-by: Takshak Chahande <ctakshak@fb.com> Acked-by: Andrey Ignatov <rdna@fb.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-08-01 11:01:38 -07:00
hex	6a7b28b6a1	libbpf: fix extraversion in Makefile The current LIBBPF_VERSION (0.0.3) doesn't match with ABI version LIBBPF_0.0.4. Fix the EXTRAVERSION portion.	2019-07-31 21:56:11 -07:00
Andrii Nakryiko	d76d264ac0	libbpf: fix erroneous multi-closing of BTF FD Libbpf stores associated BTF FD per each instance of bpf_program. When program is unloaded, that FD is closed. This is wrong, because it leads to a race and possibly closing of unrelated files, if application simultaneously opens new files while bpf_programs are unloaded. It's also unnecessary, because struct btf "owns" that FD, and btf__free(), called from bpf_object__close() will close it. Thus the fix is to never have per-program BTF FD and fetch it from obj->btf, when necessary. Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections") Reported-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-07-29 16:32:58 -07:00
Andrii Nakryiko	63a3bdf23a	libbpf: silence GCC8 warning about string truncation Despite a proper NULL-termination after strncpy(..., ..., IFNAMSIZ - 1), GCC8 still complains about expected string truncation: xsk.c:330:2: error: 'strncpy' output may be truncated copying 15 bytes from a string of length 15 [-Werror=stringop-truncation] strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1); This patch gets rid of the issue altogether by using memcpy instead. There is no performance regression, as strncpy will still copy and fill all of the bytes anyway. v1->v2: - rebase against bpf tree. Cc: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-07-29 16:32:58 -07:00
Ilya Maximets	12fa15e89a	libbpf: fix using uninitialized ioctl results 'channels.max_combined' initialized only on ioctl success and errno is only valid on ioctl failure. The code doesn't produce any runtime issues, but makes memory sanitizers angry: Conditional jump or move depends on uninitialised value(s) at 0x55C056F: xsk_get_max_queues (xsk.c:336) by 0x55C05B2: xsk_create_bpf_maps (xsk.c:354) by 0x55C089F: xsk_setup_xdp_prog (xsk.c:447) by 0x55C0E57: xsk_socket__create (xsk.c:601) Uninitialised value was created by a stack allocation at 0x55C04CD: xsk_get_max_queues (xsk.c:318) Additionally fixed warning on uninitialized bytes in ioctl arguments: Syscall param ioctl(SIOCETHTOOL) points to uninitialised byte(s) at 0x648D45B: ioctl (in /usr/lib64/libc-2.28.so) by 0x55C0546: xsk_get_max_queues (xsk.c:330) by 0x55C05B2: xsk_create_bpf_maps (xsk.c:354) by 0x55C089F: xsk_setup_xdp_prog (xsk.c:447) by 0x55C0E57: xsk_socket__create (xsk.c:601) Address 0x1ffefff378 is on thread 1's stack in frame #1, created by xsk_get_max_queues (xsk.c:318) Uninitialised value was created by a stack allocation at 0x55C04CD: xsk_get_max_queues (xsk.c:318) CC: Magnus Karlsson <magnus.karlsson@intel.com> Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2019-07-29 16:32:58 -07:00
Arnaldo Carvalho de Melo	b987dcfecb	libbpf: Avoid designated initializers for unnamed union members As it fails to build in some systems with: libbpf.c: In function 'perf_buffer__new': libbpf.c:4515: error: unknown field 'sample_period' specified in initializer libbpf.c:4516: error: unknown field 'wakeup_events' specified in initializer Doing as: attr.sample_period = 1; I.e. not as a designated initializer makes it build everywhere. Cc: Andrii Nakryiko <andriin@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: fb84b8224655 ("libbpf: add perf buffer API") Link: https://lkml.kernel.org/n/tip-hnlmch8qit1ieksfppmr32si@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-29 16:32:58 -07:00
Arnaldo Carvalho de Melo	9c1ab4d070	libbpf: Fix endianness macro usage for some compilers Using endian.h and its endianness macros makes this code build in a wider range of compilers, as some don't have those macros (__BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__, __ORDER_BIG_ENDIAN__), so use instead endian.h's macros (__BYTE_ORDER, __LITTLE_ENDIAN, __BIG_ENDIAN) which makes this code even shorter :-) Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 12ef5634a855 ("libbpf: simplify endianness check") Fixes: e6c64855fd7a ("libbpf: add btf__parse_elf API to load .BTF and .BTF.ext") Link: https://lkml.kernel.org/n/tip-eep5n8vgwcdphw3uc058k03u@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-07-29 16:32:58 -07:00
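
The commit "libbpf: make libbpf.map source of truth for libbpf version" in the list above describes deriving the full and major version by parsing section names out of libbpf.map (the message mentions `grep -o` plus `sort -rV`). A minimal Python sketch of that idea follows. The sample map text, its symbol names, and the function names are hypothetical and chosen only for illustration; the real Makefile does this in shell.

```python
import re

# Hypothetical excerpt in the style of a linker version script such as
# libbpf.map. The symbol names are placeholders, not real exports.
SAMPLE_MAP = """\
LIBBPF_0.0.1 {
	global:
		example_symbol_a;
	local:
		*;
};

LIBBPF_0.0.2 {
	global:
		example_symbol_b;
} LIBBPF_0.0.1;

LIBBPF_0.0.10 {
} LIBBPF_0.0.2;
"""

# Match section names only at the start of a line, so the
# "} LIBBPF_0.0.1;" parent references are ignored.
VERSION_RE = re.compile(r"^LIBBPF_([0-9.]+)", re.MULTILINE)


def latest_version(map_text):
    """Return the highest x.y.z version declared in the map text."""
    versions = VERSION_RE.findall(map_text)
    if not versions:
        raise ValueError("no LIBBPF_x.y.z section found")
    # Compare numerically per component, so 0.0.10 sorts above 0.0.2
    # (the same ordering a version sort gives).
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


def major_version(version):
    """Return the leading component of an x.y.z version string."""
    return version.split(".")[0]


if __name__ == "__main__":
    full = latest_version(SAMPLE_MAP)
    print(full, major_version(full))
```

Comparing components as integers matters here: a plain string sort would rank 0.0.2 above 0.0.10.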

129 changed files with 163652 additions and 7700 deletions

1

.gitattributes vendored Normal file

				`@@ -0,0 +1 @@`
				`assets/** export-ignore`

108617

.github/actions/build-selftests/vmlinux.h vendored Normal file

File diff suppressed because it is too large

									
16

.github/actions/debian/action.yml vendored Normal file

				@@ -0,0 +1,16 @@
				name: 'debian'
				description: 'Build'
				inputs:
				  target:
				    description: 'Run target'
				    required: true
				runs:
				  using: "composite"
				  steps:
				    - run: |
				        source /tmp/ci_setup
				        bash -x $CI_ROOT/managers/debian.sh SETUP
				        bash -x $CI_ROOT/managers/debian.sh ${{ inputs.target }}
				        bash -x $CI_ROOT/managers/debian.sh CLEANUP
				      shell: bash

									
23

.github/actions/setup/action.yml vendored Normal file

				@@ -0,0 +1,23 @@
				name: 'setup'
				description: 'setup env, create /tmp/ci_setup'
				runs:
				  using: "composite"
				  steps:
				    - id: variables
				      run: |
				        export REPO_ROOT=$GITHUB_WORKSPACE
				        export CI_ROOT=$REPO_ROOT/ci
				        # this is somewhat ugly, but that is the easiest way to share this code with
				        # arch specific docker
				        echo 'echo ::group::Env setup' > /tmp/ci_setup
				        echo export DEBIAN_FRONTEND=noninteractive >> /tmp/ci_setup
				        echo sudo apt-get update >> /tmp/ci_setup
				        echo sudo apt-get install -y aptitude qemu-kvm zstd binutils-dev elfutils libcap-dev libelf-dev libdw-dev libguestfs-tools >> /tmp/ci_setup
				        echo export PROJECT_NAME='libbpf' >> /tmp/ci_setup
				        echo export AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")" >> /tmp/ci_setup
				        echo export REPO_ROOT=$GITHUB_WORKSPACE >> /tmp/ci_setup
				        echo export CI_ROOT=$REPO_ROOT/ci >> /tmp/ci_setup
				        echo export VMTEST_ROOT=$CI_ROOT/vmtest >> /tmp/ci_setup
				        echo 'echo ::endgroup::' >> /tmp/ci_setup
				      shell: bash
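
The setup action above builds /tmp/ci_setup with unquoted `echo` lines, so `$GITHUB_WORKSPACE`, `$REPO_ROOT` and `$CI_ROOT` are expanded when the file is written, not when a later step sources it. The Python sketch below models only that behaviour for the path variables and the log-group markers. It leaves out the `apt-get` and `AUTHOR_EMAIL` lines, and the function name and example workspace path are assumptions made for illustration.

```python
def render_ci_setup(workspace):
    """Return the text of a sourceable script with paths already expanded.

    Mirrors the write-time expansion of the unquoted echo lines: the
    generated file contains literal paths, not variable references.
    """
    repo_root = workspace
    ci_root = f"{repo_root}/ci"
    vmtest_root = f"{ci_root}/vmtest"
    lines = [
        "echo ::group::Env setup",
        "export DEBIAN_FRONTEND=noninteractive",
        "export PROJECT_NAME=libbpf",
        f"export REPO_ROOT={repo_root}",
        f"export CI_ROOT={ci_root}",
        f"export VMTEST_ROOT={vmtest_root}",
        "echo ::endgroup::",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical workspace path, used only for the demonstration.
    print(render_ci_setup("/home/runner/work/libbpf/libbpf"), end="")
```

Because the paths are baked in at write time, the generated file can be copied into a container and sourced there only if the workspace is mounted at the same path, which is what the `-v ${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}` mount in build.yml arranges.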

									
92

.github/workflows/build.yml vendored Normal file

				@@ -0,0 +1,92 @@
				name: libbpf-build
				on:
				  pull_request:
				  push:
				  schedule:
				    - cron:  '0 18 * * *'
				concurrency:
				  group: ci-build-${{ github.head_ref }}
				  cancel-in-progress: true
				jobs:
				  debian:
				    runs-on: ubuntu-latest
				    name: Debian Build (${{ matrix.name }})
				    strategy:
				      fail-fast: false
				      matrix:
				        include:
				          - name: default
				            target: RUN
				          - name: ASan+UBSan
				            target: RUN_ASAN
				          - name: clang ASan+UBSan
				            target: RUN_CLANG_ASAN
				          - name: gcc-10 ASan+UBSan
				            target: RUN_GCC10_ASAN
				          - name: clang
				            target: RUN_CLANG
				          - name: clang-14
				            target: RUN_CLANG14
				          - name: clang-15
				            target: RUN_CLANG15
				          - name: clang-16
				            target: RUN_CLANG16
				          - name: gcc-10
				            target: RUN_GCC10
				          - name: gcc-11
				            target: RUN_GCC11
				          - name: gcc-12
				            target: RUN_GCC12
				    steps:
				      - uses: actions/checkout@v4
				        name: Checkout
				      - uses: ./.github/actions/setup
				        name: Setup
				      - uses: ./.github/actions/debian
				        name: Build
				        with:
				          target: ${{ matrix.target }}
				  ubuntu:
				    runs-on: ubuntu-latest
				    name: Ubuntu Build (${{ matrix.arch }})
				    strategy:
				      fail-fast: false
				      matrix:
				        include:
				          - arch: aarch64
				          - arch: ppc64le
				          - arch: s390x
				          - arch: amd64
				    steps:
				      - uses: actions/checkout@v4
				        name: Checkout
				      - name: Setup QEMU
				        uses: docker/setup-qemu-action@v3
				        with:
				          image: tonistiigi/binfmt:qemu-v8.1.5
				      - uses: ./.github/actions/setup
				        name: Pre-Setup
				      - run: source /tmp/ci_setup && sudo -E $CI_ROOT/managers/ubuntu.sh
				        if: matrix.arch == 'amd64'
				        name: Setup
				      - name: Build in docker
				        if: matrix.arch != 'amd64'
				        run: |
				          cp /tmp/ci_setup ${GITHUB_WORKSPACE}
				          docker run --rm \
				                 --platform linux/${{ matrix.arch }} \
				                 -v ${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE} \
				                 -e GITHUB_WORKSPACE=${GITHUB_WORKSPACE} \
				                 -w /ci/workspace \
				                 ubuntu:noble \
				                 ${GITHUB_WORKSPACE}/ci/build-in-docker.sh
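
To make the `debian` job's matrix concrete, the Python sketch below expands a few of its `include` entries into the displayed job name and the three `debian.sh` invocations that the composite `debian` action runs for each target. It is an illustration only: the list holds a subset of the entries above, and the function names are assumptions.

```python
# A subset of the `include` entries from the debian job's matrix.
DEBIAN_MATRIX = [
    {"name": "default", "target": "RUN"},
    {"name": "ASan+UBSan", "target": "RUN_ASAN"},
    {"name": "clang-16", "target": "RUN_CLANG16"},
    {"name": "gcc-12", "target": "RUN_GCC12"},
]


def job_name(entry):
    """Render the job title the way `Debian Build (${{ matrix.name }})` would."""
    return f"Debian Build ({entry['name']})"


def debian_commands(target, ci_root="$CI_ROOT"):
    """List the shell commands the debian action issues for one target.

    The action sources /tmp/ci_setup first, then calls debian.sh with
    SETUP, the matrix target, and CLEANUP, in that order.
    """
    script = f"{ci_root}/managers/debian.sh"
    commands = ["source /tmp/ci_setup"]
    commands += [f"bash -x {script} {phase}" for phase in ("SETUP", target, "CLEANUP")]
    return commands


if __name__ == "__main__":
    for entry in DEBIAN_MATRIX:
        print(job_name(entry))
        for command in debian_commands(entry["target"]):
            print("   ", command)
```

Since `fail-fast` is `false`, a failure in one of these expanded jobs does not cancel the others.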

									
40

.github/workflows/cifuzz.yml vendored Normal file

				@@ -0,0 +1,40 @@
				---
				# https://google.github.io/oss-fuzz/getting-started/continuous-integration/
				name: CIFuzz
				on:
				  push:
				    branches:
				      - master
				  pull_request:
				    branches:
				      - master
				jobs:
				  Fuzzing:
				    runs-on: ubuntu-latest
				    if: github.repository == 'libbpf/libbpf'
				    strategy:
				      fail-fast: false
				      matrix:
				        sanitizer: [address, undefined, memory]
				    steps:
				      - name: Build Fuzzers (${{ matrix.sanitizer }})
				        id: build
				        uses: google/oss-fuzz/infra/cifuzz/actions/build_fuzzers@master
				        with:
				          oss-fuzz-project-name: 'libbpf'
				          dry-run: false
				          allowed-broken-targets-percentage: 0
				          sanitizer: ${{ matrix.sanitizer }}
				      - name: Run Fuzzers (${{ matrix.sanitizer }})
				        uses: google/oss-fuzz/infra/cifuzz/actions/run_fuzzers@master
				        with:
				          oss-fuzz-project-name: 'libbpf'
				          fuzz-seconds: 300
				          dry-run: false
				          sanitizer: ${{ matrix.sanitizer }}
				      - name: Upload Crash
				        uses: actions/upload-artifact@v4
				        if: failure() && steps.build.outcome == 'success'
				        with:
				          name: ${{ matrix.sanitizer }}-artifacts
				          path: ./out/artifacts

									
52

.github/workflows/codeql.yml vendored Normal file

				@@ -0,0 +1,52 @@
				---
				# vi: ts=2 sw=2 et:
				name: "CodeQL"
				on:
				  push:
				    branches:
				      - master
				  pull_request:
				    branches:
				      - master
				permissions:
				  contents: read
				jobs:
				  analyze:
				    name: Analyze
				    runs-on: ubuntu-latest
				    concurrency:
				      group: ${{ github.workflow }}-${{ matrix.language }}-${{ github.ref }}
				      cancel-in-progress: true
				    permissions:
				      actions: read
				      security-events: write
				    strategy:
				      fail-fast: false
				      matrix:
				        language: ['cpp', 'python']
				    steps:
				      - name: Checkout repository
				        uses: actions/checkout@v4
				      - name: Initialize CodeQL
				        uses: github/codeql-action/init@v2
				        with:
				          languages: ${{ matrix.language }}
				          queries: +security-extended,security-and-quality
				      - name: Setup
				        uses: ./.github/actions/setup
				      - name: Build
				        run: |
				          source /tmp/ci_setup
				          make -C ./src
				      - name: Perform CodeQL Analysis
				        uses: github/codeql-action/analyze@v2

									
32

.github/workflows/coverity.yml vendored Normal file

				@@ -0,0 +1,32 @@

				name: libbpf-ci-coverity

				on:

				  push:

				    branches:

				      - master

				  schedule:

				    - cron:  '0 18 * * *'

				jobs:

				  coverity:

				    runs-on: ubuntu-latest

				    name: Coverity

				    env:

				      COVERITY_SCAN_TOKEN: ${{ secrets.COVERITY_SCAN_TOKEN }}

				    steps:

				      - uses: actions/checkout@v4

				      - uses: ./.github/actions/setup

				      - name: Run coverity

				        if: ${{ env.COVERITY_SCAN_TOKEN }}

				        run: |

				          source /tmp/ci_setup

				          export COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"

				          export COVERITY_SCAN_BRANCH_PATTERN=${GITHUB_REF##refs/*/}

				          export TRAVIS_BRANCH=${COVERITY_SCAN_BRANCH_PATTERN}

				          scripts/coverity.sh

				        env:

				          COVERITY_SCAN_PROJECT_NAME: libbpf

				          COVERITY_SCAN_BUILD_COMMAND_PREPEND: 'cd src/'

				          COVERITY_SCAN_BUILD_COMMAND: 'make'

				      - name: SCM log

				        run: cat /home/runner/work/libbpf/libbpf/src/cov-int/scm_log.txt
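The `COVERITY_SCAN_BRANCH_PATTERN` line above relies on bash's `${VAR##pattern}` expansion, which removes the longest prefix matching `refs/*/`. A minimal Python sketch of that behaviour, for illustration only; the helper name `strip_ref_prefix` and the sample refs are assumptions, not part of the workflow:

```python
import re


def strip_ref_prefix(ref: str) -> str:
    """Approximate bash's ${ref##refs/*/}: drop the longest prefix matching refs/*/."""
    # The greedy ".*" mirrors the longest-match semantics of "##".
    return re.sub(r"^refs/.*/", "", ref, count=1)


# Hypothetical ref values, chosen only to show the effect.
for ref in ("refs/heads/master", "refs/heads/feature/x", "master"):
    print(ref, "->", strip_ref_prefix(ref))
```

Because the match is greedy, a ref containing extra slashes keeps only its last path component, and a string without the `refs/` prefix passes through unchanged.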

									
										19

.github/workflows/lint.yml Normal file

				@@ -0,0 +1,19 @@

				name: "lint"

				on:

				  pull_request:

				  push:

				    branches:

				      - master

				jobs:

				  shellcheck:

				    name: ShellCheck

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout repository

				        uses: actions/checkout@v4

				      - name: Run ShellCheck

				        uses: ludeeus/action-shellcheck@master

				        env:

				          SHELLCHECK_OPTS: --severity=error

									
										31

.github/workflows/ondemand.yml Normal file

				@@ -0,0 +1,31 @@

				name: ondemand

				on:

				  workflow_dispatch:

				    inputs:

				      arch:

				        default: 'x86_64'

				        required: true

				      llvm-version:

				        default: '18'

				        required: true

				      kernel:

				        default: 'LATEST'

				        required: true

				      pahole:

				        default: "master"

				        required: true

				      runs-on:

				        default: 'ubuntu-24.04'

				        required: true

				jobs:

				  vmtest:

				    name: ${{ inputs.kernel }} kernel llvm-${{ inputs.llvm-version }} pahole@${{ inputs.pahole }}

				    uses: ./.github/workflows/vmtest.yml

				    with:

				      runs_on: ${{ inputs.runs-on }}

				      kernel: ${{ inputs.kernel }}

				      arch: ${{ inputs.arch }}

				      llvm-version: ${{ inputs.llvm-version }}

				      pahole: ${{ inputs.pahole }}

									
										36

.github/workflows/test.yml Normal file

				@@ -0,0 +1,36 @@

				name: libbpf-ci

				on:

				  pull_request:

				  push:

				  schedule:

				    - cron:  '0 18 * * *'

				concurrency:

				  group: ci-test-${{ github.head_ref }}

				  cancel-in-progress: true

				jobs:

				  vmtest:

				    strategy:

				      fail-fast: false

				      matrix:

				        include:

				          - kernel: 'LATEST'

				            runs_on: 'ubuntu-24.04'

				            arch: 'x86_64'

				            llvm-version: '18'

				            pahole: 'master'

				          - kernel: 'LATEST'

				            runs_on: 'ubuntu-24.04'

				            arch: 'x86_64'

				            llvm-version: '18'

				            pahole: 'tmp.master'

				    name: Linux ${{ matrix.kernel }} llvm-${{ matrix.llvm-version }}

				    uses: ./.github/workflows/vmtest.yml

				    with:

				      runs_on: ${{ matrix.runs_on }}

				      kernel: ${{ matrix.kernel }}

				      arch: ${{ matrix.arch }}

				      llvm-version: ${{ matrix.llvm-version }}

				      pahole: ${{ matrix.pahole }}
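The matrix above can be read as a list of parameter sets, each passed to the reusable `vmtest.yml` workflow. The following Python sketch, written under that reading, shows the job names each entry would produce. The function names are hypothetical, and the name templates are copied from the `name:` fields of `test.yml` and `vmtest.yml`:

```python
# Matrix entries copied from strategy.matrix.include in test.yml.
MATRIX_INCLUDE = [
    {"kernel": "LATEST", "runs_on": "ubuntu-24.04", "arch": "x86_64",
     "llvm-version": "18", "pahole": "master"},
    {"kernel": "LATEST", "runs_on": "ubuntu-24.04", "arch": "x86_64",
     "llvm-version": "18", "pahole": "tmp.master"},
]


def caller_job_name(entry):
    # Mirrors test.yml: name: Linux ${{ matrix.kernel }} llvm-${{ matrix.llvm-version }}
    return f"Linux {entry['kernel']} llvm-{entry['llvm-version']}"


def called_job_name(entry):
    # Mirrors vmtest.yml: name: pahole@${{ inputs.pahole }}
    return f"pahole@{entry['pahole']}"


def expand(matrix):
    """Return (caller name, called name) pairs, one per matrix entry."""
    return [(caller_job_name(e), called_job_name(e)) for e in matrix]


for caller, called in expand(MATRIX_INCLUDE):
    print(caller, "/", called)
```

Both entries share the same caller-side name, so only the `pahole@...` name from the called workflow tells the two runs apart.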

									
										117

.github/workflows/vmtest.yml Normal file

				@@ -0,0 +1,117 @@

				name: 'Build kernel and selftests/bpf, run selftests via vmtest'

				on:

				  workflow_call:

				    inputs:

				      runs_on:

				        required: true

				        default: 'ubuntu-24.04'

				        type: string

				      arch:

				        description: 'what arch to test'

				        required: true

				        default: 'x86_64'

				        type: string

				      kernel:

				        description: 'kernel version or LATEST'

				        required: true

				        default: 'LATEST'

				        type: string

				      pahole:

				        description: 'pahole rev or branch'

				        required: false

				        default: 'master'

				        type: string

				      llvm-version:

				        description: 'llvm version'

				        required: false

				        default: '18'

				        type: string

				jobs:

				  vmtest:

				    name: pahole@${{ inputs.pahole }}

				    runs-on: ${{ inputs.runs_on }}

				    steps:

				      - uses: actions/checkout@v4

				      - name: Setup environment

				        uses: libbpf/ci/setup-build-env@v3

				        with:

				          pahole: ${{ inputs.pahole }}

				          arch: ${{ inputs.arch }}

				          llvm-version: ${{ inputs.llvm-version }}

				      - name: Get checkpoint commit

				        shell: bash

				        run: |

				          cat CHECKPOINT-COMMIT

				          echo "CHECKPOINT=$(cat CHECKPOINT-COMMIT)" >> $GITHUB_ENV

				      - name: Get kernel source at checkpoint

				        uses: libbpf/ci/get-linux-source@v3

				        with:

				          repo: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'

				          rev: ${{ env.CHECKPOINT }}

				          dest: '${{ github.workspace }}/.kernel'

				      - name: Patch kernel source

				        uses: libbpf/ci/patch-kernel@v3

				        with:

				          patches-root: '${{ github.workspace }}/ci/diffs'

				          repo-root: '.kernel'

				      - name: Configure kernel build

				        shell: bash

				        run: |

				          cd .kernel

				          cat tools/testing/selftests/bpf/config \

				              tools/testing/selftests/bpf/config.${{ inputs.arch }} > .config

				          # this file might or might not exist depending on kernel version

				          cat tools/testing/selftests/bpf/config.vm >> .config || :

				          make olddefconfig && make prepare

				          cd -

				      - name: Build kernel image

				        if: ${{ inputs.kernel == 'LATEST' }}

				        shell: bash

				        run: |

				          cd .kernel

				          make -j $((4*$(nproc))) all

				          cp vmlinux ${{ github.workspace }}

				          cd -

				      - name: Download prebuilt kernel

				        if: ${{ inputs.kernel != 'LATEST' }}

				        uses: libbpf/ci/download-vmlinux@v3

				        with:

				          kernel: ${{ inputs.kernel }}

				          arch: ${{ inputs.arch }}

				      - name: Build selftests/bpf

				        uses: libbpf/ci/build-selftests@v3

				        env:

				          MAX_MAKE_JOBS: 32

				          VMLINUX_BTF: ${{ github.workspace }}/vmlinux

				          VMLINUX_H: ${{ inputs.kernel != 'LATEST' && format('{0}/.github/actions/build-selftests/vmlinux.h', github.workspace) || '' }}

				        with:

				          arch: ${{ inputs.arch }}

				          kernel-root: ${{ github.workspace }}/.kernel

				          llvm-version: ${{ inputs.llvm-version }}

				      - name: Run selftests

				        env:

				          ALLOWLIST_FILE: /tmp/allowlist

				          DENYLIST_FILE: /tmp/denylist

				          KERNEL: ${{ inputs.kernel }}

				          VMLINUX: ${{ github.workspace }}/vmlinux

				          LLVM_VERSION: ${{ inputs.llvm-version }}

				          SELFTESTS_BPF: ${{ github.workspace }}/.kernel/tools/testing/selftests/bpf

				          VMTEST_CONFIGS: ${{ github.workspace }}/ci/vmtest/configs

				        uses: libbpf/ci/run-vmtest@v3

				        with:

				          arch: ${{ inputs.arch }}

				          kbuild-output: ${{ github.workspace }}/.kernel

				          kernel-root: ${{ github.workspace }}/.kernel

				          vmlinuz: ${{ inputs.arch }}/vmlinuz-${{ inputs.kernel }}
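The workflow above branches on `inputs.kernel` in two places: the `if:` conditions decide whether the kernel is built or downloaded, and the `VMLINUX_H` expression uses the `cond && value || ''` idiom as a ternary. A small Python sketch of that logic, offered as an approximation only; the function names and the sample kernel version are hypothetical:

```python
def kernel_step(kernel: str) -> str:
    """Name of the step whose `if:` condition holds for this kernel input."""
    # GitHub expressions compare strings case-insensitively; upper() approximates that.
    if kernel.upper() == "LATEST":
        return "Build kernel image"
    return "Download prebuilt kernel"


def vmlinux_h(kernel: str, workspace: str) -> str:
    """Mirror: inputs.kernel != 'LATEST' && format('{0}/...', workspace) || ''."""
    if kernel.upper() != "LATEST":
        return f"{workspace}/.github/actions/build-selftests/vmlinux.h"
    return ""


# "5.5.0" and "/w" are made-up inputs used only for illustration.
for kernel in ("LATEST", "5.5.0"):
    print(kernel, "->", kernel_step(kernel), "|", repr(vmlinux_h(kernel, "/w")))
```

The `&& ... ||` idiom behaves like a ternary here only because the `format(...)` result is a non-empty string; a falsy middle operand would fall through to the `||` branch.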

22

.mailmap Normal file

@@ -0,0 +1,22 @@
 Alexei Starovoitov <ast@kernel.org> <alexei.starovoitov@gmail.com>
 Antoine Tenart <atenart@kernel.org> <antoine.tenart@bootlin.com>
 Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@redhat.com>
 Björn Töpel <bjorn@kernel.org> <bjorn.topel@intel.com>
 Changbin Du <changbin.du@intel.com> <changbin.du@gmail.com>
 Colin Ian King <colin.i.king@gmail.com> <colin.king@canonical.com>
 Dan Carpenter <error27@gmail.com> <dan.carpenter@oracle.com>
 Geliang Tang <geliang@kernel.org> <geliang.tang@suse.com>
 Herbert Xu <herbert@gondor.apana.org.au>
 Jakub Kicinski <kuba@kernel.org> <jakub.kicinski@netronome.com>
 Jesper Dangaard Brouer <hawk@kernel.org> <brouer@redhat.com>
 Kees Cook <kees@kernel.org> <keescook@chromium.org>
 Leo Yan <leo.yan@linux.dev> <leo.yan@linaro.org>
 Mark Starovoytov <mstarovo@pm.me> <mstarovoitov@marvell.com>
 Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@mellanox.com>
 Maxim Mikityanskiy <maxtram95@gmail.com> <maximmi@nvidia.com>
 Puranjay Mohan <puranjay@kernel.org> <puranjay12@gmail.com>
 Quentin Monnet <qmo@kernel.org> <quentin@isovalent.com>
 Quentin Monnet <qmo@kernel.org> <quentin.monnet@netronome.com>
 Stanislav Fomichev <sdf@fomichev.me> <sdf@google.com>
 Vadim Fedorenko <vadim.fedorenko@linux.dev> <vadfed@meta.com>
 Vadim Fedorenko <vadim.fedorenko@linux.dev> <vfedorenko@novek.ru>
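Each entry above has the form `Proper Name <proper@email> <commit@email>`, or just `Proper Name <email>` when only the name is normalized. The sketch below shows one way such lines could be turned into a lookup table. It is a simplified reading of the format rather than git's own parser; the helper `parse_mailmap` is hypothetical, and lowercasing the key is a simplification:

```python
import re

# Name, canonical address, and an optional second (commit) address.
MAILMAP_LINE = re.compile(
    r"^(?P<name>[^<]+?)\s*<(?P<proper>[^>]+)>(?:\s*<(?P<commit>[^>]+)>)?\s*$"
)


def parse_mailmap(text):
    """Map each commit email to its canonical (name, email) pair."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = MAILMAP_LINE.match(line)
        if match is None:
            continue  # other mailmap forms are not handled in this sketch
        proper = match.group("proper")
        commit = match.group("commit") or proper
        mapping[commit.lower()] = (match.group("name"), proper)
    return mapping


# A few entries copied from the .mailmap hunk above.
SAMPLE = """\
Alexei Starovoitov <ast@kernel.org> <alexei.starovoitov@gmail.com>
Herbert Xu <herbert@gondor.apana.org.au>
Quentin Monnet <qmo@kernel.org> <quentin@isovalent.com>
Quentin Monnet <qmo@kernel.org> <quentin.monnet@netronome.com>
"""

canonical = parse_mailmap(SAMPLE)
print(canonical["quentin.monnet@netronome.com"])
```

Several commit addresses can point at the same canonical identity, which is why the table is keyed by commit email rather than by name.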

									
										26

.readthedocs.yaml Normal file

				@@ -0,0 +1,26 @@

				# .readthedocs.yaml

				# Read the Docs configuration file

				# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details

				# Required

				version: 2

				build:

				  os: "ubuntu-22.04"

				  tools:

				    python: "3.11"

				# Build documentation in the docs/ directory with Sphinx

				sphinx:

				  builder: html

				  configuration: docs/conf.py

				formats:

				  - htmlzip

				  - pdf

				  - epub

				# Optionally set the version of Python and requirements required to build your docs

				python:

				   install:

				   - requirements: docs/sphinx/requirements.txt

									
										122

.travis.yml
									
				@@ -1,122 +0,0 @@

				sudo: required

				dist: xenial

				services:

				    - docker

				env:

				    global:

				        - AUTHOR_EMAIL="$(git log -1 $TRAVIS_COMMIT --pretty=\"%aE\")"

				        - CI_MANAGERS="$TRAVIS_BUILD_DIR/travis-ci/managers"

				        - REPO_ROOT="$TRAVIS_BUILD_DIR"

				jobs:

				    include:

				        - stage: Build & test

				          name: Debian Testing

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Debian Testing (ASan+UBSan)

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN_ASAN

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Debian Testing (clang)

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN_CLANG

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Debian Testing (clang ASan+UBSan)

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN_CLANG_ASAN

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Debian Testing (gcc-8)

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN_GCC8

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Debian Testing (gcc-8 ASan+UBSan)

				          language: bash

				          env:

				              - DEBIAN_RELEASE="testing"

				              - CONT_NAME="libbpf-debian-$DEBIAN_RELEASE"

				          before_install:

				              - sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce

				              - docker --version

				          install:

				              - $CI_MANAGERS/debian.sh SETUP

				          script:

				              - set -e

				              - $CI_MANAGERS/debian.sh RUN_GCC8_ASAN

				              - set +e

				          after_script:

				              - $CI_MANAGERS/debian.sh CLEANUP

				        - name: Ubuntu Xenial

				          language: bash

				          script:

				              - set -e

				              - sudo $CI_MANAGERS/xenial.sh

				              - set +e

1

BPF-CHECKPOINT-COMMIT Normal file

				`@@ -0,0 +1 @@`
				`b4432656b36e5cc1d50a1f2dc15357543add530e`

2

CHECKPOINT-COMMIT

@@ -1 +1 @@
 b5f1c439843bcbab01cc7f3854ae2742f3d1e3
 9325d53fe9adff354b6a93fda5f38c165947da0f

1

LICENSE Normal file

				`@@ -0,0 +1 @@`
				`LGPL-2.1 OR BSD-2-Clause`

32

LICENSE.BSD-2-Clause Normal file

@@ -0,0 +1,32 @@
 Valid-License-Identifier: BSD-2-Clause
 SPDX-URL: https://spdx.org/licenses/BSD-2-Clause.html
 Usage-Guide:
   To use the BSD 2-clause "Simplified" License put the following SPDX
   tag/value pair into a comment according to the placement guidelines in
   the licensing rules documentation:
     SPDX-License-Identifier: BSD-2-Clause
 License-Text:
 Copyright (c) 2015 The Libbpf Authors. All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
 1. Redistributions of source code must retain the above copyright notice,
    this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
 LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
 SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
 ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 POSSIBILITY OF SUCH DAMAGE.

503

LICENSE.LGPL-2.1 Normal file

@@ -0,0 +1,503 @@
 Valid-License-Identifier: LGPL-2.1
 Valid-License-Identifier: LGPL-2.1+
 SPDX-URL: https://spdx.org/licenses/LGPL-2.1.html
 Usage-Guide:
   To use this license in source code, put one of the following SPDX
   tag/value pairs into a comment according to the placement
   guidelines in the licensing rules documentation.
   For 'GNU Lesser General Public License (LGPL) version 2.1 only' use:
     SPDX-License-Identifier: LGPL-2.1
   For 'GNU Lesser General Public License (LGPL) version 2.1 or any later
   version' use:
     SPDX-License-Identifier: LGPL-2.1+
 License-Text:
 GNU LESSER GENERAL PUBLIC LICENSE
 Version 2.1, February 1999
 Copyright (C) 1991, 1999 Free Software Foundation, Inc.
 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Everyone is permitted to copy and distribute verbatim copies of this
 license document, but changing it is not allowed.
 [This is the first released version of the Lesser GPL. It also counts as
 the successor of the GNU Library Public License, version 2, hence the
 version number 2.1.]
 Preamble
 The licenses for most software are designed to take away your freedom to
 share and change it. By contrast, the GNU General Public Licenses are
 intended to guarantee your freedom to share and change free software--to
 make sure the software is free for all its users.
 This license, the Lesser General Public License, applies to some specially
 designated software packages--typically libraries--of the Free Software
 Foundation and other authors who decide to use it. You can use it too, but
 we suggest you first think carefully about whether this license or the
 ordinary General Public License is the better strategy to use in any
 particular case, based on the explanations below.
 When we speak of free software, we are referring to freedom of use, not
 price. Our General Public Licenses are designed to make sure that you have
 the freedom to distribute copies of free software (and charge for this
 service if you wish); that you receive source code or can get it if you
 want it; that you can change the software and use pieces of it in new free
 programs; and that you are informed that you can do these things.
 To protect your rights, we need to make restrictions that forbid
 distributors to deny you these rights or to ask you to surrender these
 rights. These restrictions translate to certain responsibilities for you if
 you distribute copies of the library or if you modify it.
 For example, if you distribute copies of the library, whether gratis or for
 a fee, you must give the recipients all the rights that we gave you. You
 must make sure that they, too, receive or can get the source code. If you
 link other code with the library, you must provide complete object files to
 the recipients, so that they can relink them with the library after making
 changes to the library and recompiling it. And you must show them these
 terms so they know their rights.
 We protect your rights with a two-step method: (1) we copyright the
 library, and (2) we offer you this license, which gives you legal
 permission to copy, distribute and/or modify the library.
 To protect each distributor, we want to make it very clear that there is no
 warranty for the free library. Also, if the library is modified by someone
 else and passed on, the recipients should know that what they have is not
 the original version, so that the original author's reputation will not be
 affected by problems that might be introduced by others.
 Finally, software patents pose a constant threat to the existence of any
 free program. We wish to make sure that a company cannot effectively
 restrict the users of a free program by obtaining a restrictive license
 from a patent holder. Therefore, we insist that any patent license obtained
 for a version of the library must be consistent with the full freedom of
 use specified in this license.
 Most GNU software, including some libraries, is covered by the ordinary GNU
 General Public License. This license, the GNU Lesser General Public
 License, applies to certain designated libraries, and is quite different
 from the ordinary General Public License. We use this license for certain
 libraries in order to permit linking those libraries into non-free
 programs.
 When a program is linked with a library, whether statically or using a
 shared library, the combination of the two is legally speaking a combined
 work, a derivative of the original library. The ordinary General Public
 License therefore permits such linking only if the entire combination fits
 its criteria of freedom. The Lesser General Public License permits more lax
 criteria for linking other code with the library.
 We call this license the "Lesser" General Public License because it does
 Less to protect the user's freedom than the ordinary General Public
 License. It also provides other free software developers Less of an
 advantage over competing non-free programs. These disadvantages are the
 reason we use the ordinary General Public License for many
 libraries. However, the Lesser license provides advantages in certain
 special circumstances.
 For example, on rare occasions, there may be a special need to encourage
 the widest possible use of a certain library, so that it becomes a de-facto
 standard. To achieve this, non-free programs must be allowed to use the
 library. A more frequent case is that a free library does the same job as
 widely used non-free libraries. In this case, there is little to gain by
 limiting the free library to free software only, so we use the Lesser
 General Public License.
 In other cases, permission to use a particular library in non-free programs
 enables a greater number of people to use a large body of free
 software. For example, permission to use the GNU C Library in non-free
 programs enables many more people to use the whole GNU operating system, as
 well as its variant, the GNU/Linux operating system.
 Although the Lesser General Public License is Less protective of the users'
 freedom, it does ensure that the user of a program that is linked with the
 Library has the freedom and the wherewithal to run that program using a
 modified version of the Library.
 The precise terms and conditions for copying, distribution and modification
 follow. Pay close attention to the difference between a "work based on the
 library" and a "work that uses the library". The former contains code
 derived from the library, whereas the latter must be combined with the
 library in order to run.
 TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 0. This License Agreement applies to any software library or other program
    which contains a notice placed by the copyright holder or other
    authorized party saying it may be distributed under the terms of this
    Lesser General Public License (also called "this License"). Each
    licensee is addressed as "you".
    A "library" means a collection of software functions and/or data
    prepared so as to be conveniently linked with application programs
    (which use some of those functions and data) to form executables.
    The "Library", below, refers to any such software library or work which
    has been distributed under these terms. A "work based on the Library"
    means either the Library or any derivative work under copyright law:
    that is to say, a work containing the Library or a portion of it, either
    verbatim or with modifications and/or translated straightforwardly into
    another language. (Hereinafter, translation is included without
    limitation in the term "modification".)
    "Source code" for a work means the preferred form of the work for making
    modifications to it. For a library, complete source code means all the
    source code for all modules it contains, plus any associated interface
    definition files, plus the scripts used to control compilation and
    installation of the library.
     Activities other than copying, distribution and modification are not
     covered by this License; they are outside its scope. The act of running
     a program using the Library is not restricted, and output from such a
     program is covered only if its contents constitute a work based on the
     Library (independent of the use of the Library in a tool for writing
     it). Whether that is true depends on what the Library does and what the
     program that uses the Library does.
 1. You may copy and distribute verbatim copies of the Library's complete
    source code as you receive it, in any medium, provided that you
    conspicuously and appropriately publish on each copy an appropriate
    copyright notice and disclaimer of warranty; keep intact all the notices
    that refer to this License and to the absence of any warranty; and
    distribute a copy of this License along with the Library.
    You may charge a fee for the physical act of transferring a copy, and
    you may at your option offer warranty protection in exchange for a fee.
 2. You may modify your copy or copies of the Library or any portion of it,
    thus forming a work based on the Library, and copy and distribute such
    modifications or work under the terms of Section 1 above, provided that
    you also meet all of these conditions:
    a) The modified work must itself be a software library.
    b) You must cause the files modified to carry prominent notices stating
       that you changed the files and the date of any change.
    c) You must cause the whole of the work to be licensed at no charge to
       all third parties under the terms of this License.
    d) If a facility in the modified Library refers to a function or a table
       of data to be supplied by an application program that uses the
       facility, other than as an argument passed when the facility is
       invoked, then you must make a good faith effort to ensure that, in
       the event an application does not supply such function or table, the
       facility still operates, and performs whatever part of its purpose
       remains meaningful.
    (For example, a function in a library to compute square roots has a
     purpose that is entirely well-defined independent of the
     application. Therefore, Subsection 2d requires that any
     application-supplied function or table used by this function must be
     optional: if the application does not supply it, the square root
     function must still compute square roots.)
    These requirements apply to the modified work as a whole. If
    identifiable sections of that work are not derived from the Library, and
    can be reasonably considered independent and separate works in
    themselves, then this License, and its terms, do not apply to those
    sections when you distribute them as separate works. But when you
    distribute the same sections as part of a whole which is a work based on
    the Library, the distribution of the whole must be on the terms of this
    License, whose permissions for other licensees extend to the entire
    whole, and thus to each and every part regardless of who wrote it.
    Thus, it is not the intent of this section to claim rights or contest
    your rights to work written entirely by you; rather, the intent is to
    exercise the right to control the distribution of derivative or
    collective works based on the Library.
    In addition, mere aggregation of another work not based on the Library
    with the Library (or with a work based on the Library) on a volume of a
    storage or distribution medium does not bring the other work under the
    scope of this License.
 3. You may opt to apply the terms of the ordinary GNU General Public
    License instead of this License to a given copy of the Library. To do
    this, you must alter all the notices that refer to this License, so that
    they refer to the ordinary GNU General Public License, version 2,
    instead of to this License. (If a newer version than version 2 of the
    ordinary GNU General Public License has appeared, then you can specify
    that version instead if you wish.) Do not make any other change in these
    notices.
    Once this change is made in a given copy, it is irreversible for that
    copy, so the ordinary GNU General Public License applies to all
    subsequent copies and derivative works made from that copy.
    This option is useful when you wish to copy part of the code of the
    Library into a program that is not a library.
 4. You may copy and distribute the Library (or a portion or derivative of
    it, under Section 2) in object code or executable form under the terms
    of Sections 1 and 2 above provided that you accompany it with the
    complete corresponding machine-readable source code, which must be
    distributed under the terms of Sections 1 and 2 above on a medium
    customarily used for software interchange.
    If distribution of object code is made by offering access to copy from a
    designated place, then offering equivalent access to copy the source
    code from the same place satisfies the requirement to distribute the
    source code, even though third parties are not compelled to copy the
    source along with the object code.
 5. A program that contains no derivative of any portion of the Library, but
    is designed to work with the Library by being compiled or linked with
    it, is called a "work that uses the Library". Such a work, in isolation,
    is not a derivative work of the Library, and therefore falls outside the
    scope of this License.
    However, linking a "work that uses the Library" with the Library creates
    an executable that is a derivative of the Library (because it contains
    portions of the Library), rather than a "work that uses the
    library". The executable is therefore covered by this License. Section 6
    states terms for distribution of such executables.
    When a "work that uses the Library" uses material from a header file
    that is part of the Library, the object code for the work may be a
    derivative work of the Library even though the source code is
    not. Whether this is true is especially significant if the work can be
    linked without the Library, or if the work is itself a library. The
    threshold for this to be true is not precisely defined by law.
    If such an object file uses only numerical parameters, data structure
    layouts and accessors, and small macros and small inline functions (ten
    lines or less in length), then the use of the object file is
    unrestricted, regardless of whether it is legally a derivative
    work. (Executables containing this object code plus portions of the
    Library will still fall under Section 6.)
    Otherwise, if the work is a derivative of the Library, you may
    distribute the object code for the work under the terms of Section
    6. Any executables containing that work also fall under Section 6,
    whether or not they are linked directly with the Library itself.
 6. As an exception to the Sections above, you may also combine or link a
    "work that uses the Library" with the Library to produce a work
    containing portions of the Library, and distribute that work under terms
    of your choice, provided that the terms permit modification of the work
    for the customer's own use and reverse engineering for debugging such
    modifications.
    You must give prominent notice with each copy of the work that the
    Library is used in it and that the Library and its use are covered by
    this License. You must supply a copy of this License. If the work during
    execution displays copyright notices, you must include the copyright
    notice for the Library among them, as well as a reference directing the
    user to the copy of this License. Also, you must do one of these things:
    a) Accompany the work with the complete corresponding machine-readable
       source code for the Library including whatever changes were used in
       the work (which must be distributed under Sections 1 and 2 above);
       and, if the work is an executable linked with the Library, with the
       complete machine-readable "work that uses the Library", as object
       code and/or source code, so that the user can modify the Library and
       then relink to produce a modified executable containing the modified
       Library. (It is understood that the user who changes the contents of
       definitions files in the Library will not necessarily be able to
       recompile the application to use the modified definitions.)
    b) Use a suitable shared library mechanism for linking with the
       Library. A suitable mechanism is one that (1) uses at run time a copy
       of the library already present on the user's computer system, rather
       than copying library functions into the executable, and (2) will
       operate properly with a modified version of the library, if the user
       installs one, as long as the modified version is interface-compatible
       with the version that the work was made with.
    c) Accompany the work with a written offer, valid for at least three
       years, to give the same user the materials specified in Subsection 6a,
       above, for a charge no more than the cost of performing this
       distribution.
    d) If distribution of the work is made by offering access to copy from a
       designated place, offer equivalent access to copy the above specified
       materials from the same place.
    e) Verify that the user has already received a copy of these materials
       or that you have already sent this user a copy.
    For an executable, the required form of the "work that uses the Library"
    must include any data and utility programs needed for reproducing the
    executable from it. However, as a special exception, the materials to be
    distributed need not include anything that is normally distributed (in
    either source or binary form) with the major components (compiler,
    kernel, and so on) of the operating system on which the executable runs,
    unless that component itself accompanies the executable.
    It may happen that this requirement contradicts the license restrictions
    of other proprietary libraries that do not normally accompany the
    operating system. Such a contradiction means you cannot use both them
    and the Library together in an executable that you distribute.
 7. You may place library facilities that are a work based on the Library
    side-by-side in a single library together with other library facilities
    not covered by this License, and distribute such a combined library,
    provided that the separate distribution of the work based on the Library
    and of the other library facilities is otherwise permitted, and provided
    that you do these two things:
    a) Accompany the combined library with a copy of the same work based on
       the Library, uncombined with any other library facilities. This must
       be distributed under the terms of the Sections above.
    b) Give prominent notice with the combined library of the fact that part
       of it is a work based on the Library, and explaining where to find
       the accompanying uncombined form of the same work.
 8. You may not copy, modify, sublicense, link with, or distribute the
    Library except as expressly provided under this License. Any attempt
    otherwise to copy, modify, sublicense, link with, or distribute the
    Library is void, and will automatically terminate your rights under this
    License. However, parties who have received copies, or rights, from you
    under this License will not have their licenses terminated so long as
    such parties remain in full compliance.
 9. You are not required to accept this License, since you have not signed
    it. However, nothing else grants you permission to modify or distribute
    the Library or its derivative works. These actions are prohibited by law
    if you do not accept this License. Therefore, by modifying or
    distributing the Library (or any work based on the Library), you
    indicate your acceptance of this License to do so, and all its terms and
    conditions for copying, distributing or modifying the Library or works
    based on it.
 10. Each time you redistribute the Library (or any work based on the
     Library), the recipient automatically receives a license from the
     original licensor to copy, distribute, link with or modify the Library
     subject to these terms and conditions. You may not impose any further
     restrictions on the recipients' exercise of the rights granted
     herein. You are not responsible for enforcing compliance by third
     parties with this License.
 11. If, as a consequence of a court judgment or allegation of patent
     infringement or for any other reason (not limited to patent issues),
     conditions are imposed on you (whether by court order, agreement or
     otherwise) that contradict the conditions of this License, they do not
     excuse you from the conditions of this License. If you cannot
     distribute so as to satisfy simultaneously your obligations under this
     License and any other pertinent obligations, then as a consequence you
     may not distribute the Library at all. For example, if a patent license
     would not permit royalty-free redistribution of the Library by all
     those who receive copies directly or indirectly through you, then the
     only way you could satisfy both it and this License would be to refrain
     entirely from distribution of the Library.
     If any portion of this section is held invalid or unenforceable under
     any particular circumstance, the balance of the section is intended to
     apply, and the section as a whole is intended to apply in other
     circumstances.
     It is not the purpose of this section to induce you to infringe any
     patents or other property right claims or to contest validity of any
     such claims; this section has the sole purpose of protecting the
     integrity of the free software distribution system which is implemented
     by public license practices. Many people have made generous
     contributions to the wide range of software distributed through that
     system in reliance on consistent application of that system; it is up
     to the author/donor to decide if he or she is willing to distribute
     software through any other system and a licensee cannot impose that
     choice.
     This section is intended to make thoroughly clear what is believed to
     be a consequence of the rest of this License.
 12. If the distribution and/or use of the Library is restricted in certain
     countries either by patents or by copyrighted interfaces, the original
     copyright holder who places the Library under this License may add an
     explicit geographical distribution limitation excluding those
     countries, so that distribution is permitted only in or among countries
     not thus excluded. In such case, this License incorporates the
     limitation as if written in the body of this License.
 13. The Free Software Foundation may publish revised and/or new versions of
     the Lesser General Public License from time to time. Such new versions
     will be similar in spirit to the present version, but may differ in
     detail to address new problems or concerns.
     Each version is given a distinguishing version number. If the Library
     specifies a version number of this License which applies to it and "any
     later version", you have the option of following the terms and
     conditions either of that version or of any later version published by
     the Free Software Foundation. If the Library does not specify a license
     version number, you may choose any version ever published by the Free
     Software Foundation.
 14. If you wish to incorporate parts of the Library into other free
     programs whose distribution conditions are incompatible with these,
     write to the author to ask for permission. For software which is
     copyrighted by the Free Software Foundation, write to the Free Software
     Foundation; we sometimes make exceptions for this. Our decision will be
     guided by the two goals of preserving the free status of all
     derivatives of our free software and of promoting the sharing and reuse
     of software generally.
 NO WARRANTY
 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
     FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
     OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
     PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
     EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
     WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
     ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH
     YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
     NECESSARY SERVICING, REPAIR OR CORRECTION.
 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
     WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
     REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR
     DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL
     DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY
     (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED
     INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF
     THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR
     OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
 END OF TERMS AND CONDITIONS
 How to Apply These Terms to Your New Libraries
 If you develop a new library, and you want it to be of the greatest
 possible use to the public, we recommend making it free software that
 everyone can redistribute and change. You can do so by permitting
 redistribution under these terms (or, alternatively, under the terms of the
 ordinary General Public License).
 To apply these terms, attach the following notices to the library. It is
 safest to attach them to the start of each source file to most effectively
 convey the exclusion of warranty; and each file should have at least the
 "copyright" line and a pointer to where the full notice is found.
 one line to give the library's name and an idea of what it does.
 Copyright (C) year name of author
 This library is free software; you can redistribute it and/or modify it
 under the terms of the GNU Lesser General Public License as published by
 the Free Software Foundation; either version 2.1 of the License, or (at
 your option) any later version.
 This library is distributed in the hope that it will be useful, but WITHOUT
 ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
 for more details.
 You should have received a copy of the GNU Lesser General Public License
 along with this library; if not, write to the Free Software Foundation,
 Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 Also add information on how to contact you by electronic and paper mail.
 You should also get your employer (if you work as a programmer) or your
 school, if any, to sign a "copyright disclaimer" for the library, if
 necessary. Here is a sample; alter the names:
 Yoyodyne, Inc., hereby disclaims all copyright interest in
 the library `Frob' (a library for tweaking knobs) written
 by James Random Hacker.
 signature of Ty Coon, 1 April 1990
 Ty Coon, President of Vice
 That's all there is to it!

									

README.md
									
												
				@@ -1,29 +1,66 @@

				<picture>

				  <source media="(prefers-color-scheme: dark)" srcset="assets/libbpf-logo-sideways-darkbg.png" width="40%">

				  <img src="assets/libbpf-logo-sideways.png" width="40%">

				</picture>

				This is a mirror of [bpf-next linux tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s

				`tools/lib/bpf` directory plus its supporting header files.

				libbpf

				[![Github Actions Builds & Tests](https://github.com/libbpf/libbpf/actions/workflows/test.yml/badge.svg)](https://github.com/libbpf/libbpf/actions/workflows/test.yml)

				[![Coverity](https://img.shields.io/coverity/scan/18195.svg)](https://scan.coverity.com/projects/libbpf)

				[![CodeQL](https://github.com/libbpf/libbpf/workflows/CodeQL/badge.svg?branch=master)](https://github.com/libbpf/libbpf/actions?query=workflow%3ACodeQL+branch%3Amaster)

				[![OSS-Fuzz Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/libbpf.svg)](https://oss-fuzz-build-logs.storage.googleapis.com/index.html#libbpf)

				[![Read the Docs](https://readthedocs.org/projects/libbpf/badge/?version=latest)](https://libbpf.readthedocs.io/en/latest/)

				======

				The following files will be sync'ed with bpf-next repo:

				  - `src/` <-> `bpf-next/tools/lib/bpf/`

				  - `include/uapi/linux/bpf_common.h` <-> `bpf-next/tools/include/uapi/linux/bpf_common.h`

				  - `include/uapi/linux/bpf.h` <-> `bpf-next/tools/include/uapi/linux/bpf.h`

				  - `include/uapi/linux/btf.h` <-> `bpf-next/tools/include/uapi/linux/btf.h`

				  - `include/uapi/linux/if_link.h` <-> `bpf-next/tools/include/uapi/linux/if_link.h`

				  - `include/uapi/linux/if_xdp.h` <-> `bpf-next/tools/include/uapi/linux/if_xdp.h`

				  - `include/uapi/linux/netlink.h` <-> `bpf-next/tools/include/uapi/linux/netlink.h`

				  - `include/tools/libc_compat.h` <-> `bpf-next/tools/include/tools/libc_compat.h`

				**This is the official home of the libbpf library.**

				Other header files at this repo (`include/linux/*.h`) are reduced versions of

				their counterpart files at bpf-next's `tools/include/linux/*.h` to make compilation

				successful.

				*Please use this Github repository for building and packaging libbpf

				and when using it in your projects through Git submodule.*

				Build [![Build Status](https://travis-ci.org/libbpf/libbpf.svg?branch=master)](https://travis-ci.org/libbpf/libbpf)

				=====

				Libbpf *authoritative source code* is developed as part of [bpf-next Linux source

				tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next) under

				`tools/lib/bpf` subdirectory and is periodically synced to Github. As such, all the

				libbpf changes should be sent to [BPF mailing list](http://vger.kernel.org/vger-lists.html#bpf),

				please don't open PRs here unless you are changing Github-specific parts of libbpf

				(e.g., Github-specific Makefile).

				Libbpf and general BPF usage questions

				======================================

				Libbpf documentation can be found [here](https://libbpf.readthedocs.io/en/latest/api.html).

				It's an ongoing effort and has ways to go, but please take a look and consider contributing as well.

				Please check out [libbpf-bootstrap](https://github.com/libbpf/libbpf-bootstrap)

				and [the companion blog post](https://nakryiko.com/posts/libbpf-bootstrap/) for

				the examples of building BPF applications with libbpf.

				[libbpf-tools](https://github.com/iovisor/bcc/tree/master/libbpf-tools) are also

				a good source of the real-world libbpf-based tracing tools.

				See also ["BPF CO-RE reference guide"](https://nakryiko.com/posts/bpf-core-reference-guide/)

				for the coverage of practical aspects of building BPF CO-RE applications and

				["BPF CO-RE"](https://nakryiko.com/posts/bpf-portability-and-co-re/) for

				general introduction into BPF portability issues and BPF CO-RE origins.

				All general BPF questions, including kernel functionality, libbpf APIs and

				their application, should be sent to bpf@vger.kernel.org mailing list. You can

				subscribe to it [here](http://vger.kernel.org/vger-lists.html#bpf) and search

				its archive [here](https://lore.kernel.org/bpf/). Please search the archive

				before asking new questions. It very well might be that this was already

				addressed or answered before.

				bpf@vger.kernel.org is monitored by many more people and they will happily try

				to help you with whatever issue you have. This repository's PRs and issues

				should be opened only for dealing with issues pertaining to specific way this

				libbpf mirror repo is set up and organized.

				Building libbpf

				===============

				libelf is an internal dependency of libbpf and thus it is required to link

				against and must be installed on the system for applications to work.

				pkg-config is used by default to find libelf, and the program called can be

				overridden with `PKG_CONFIG`.

				If using `pkg-config` at build time is not desired, it can be disabled by

				setting `NO_PKG_CONFIG=1` when calling make.

				To build both static libbpf.a and shared libbpf.so:

				```bash

				@@ -48,23 +85,105 @@ $ cd src

				$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install

				```

				To integrate libbpf into a project which uses the Meson build system, define a

				`[wrap-git]` file in the `subprojects` folder.

				To add the libbpf dependency to the parent project, e.g. for

				libbpf_static_dep:

				```

				libbpf_proj = subproject('libbpf', required : true)

				libbpf_static_dep = libbpf_proj.get_variable('libbpf_static_dep')

				```

				To validate changes to meson.build

				```bash

				$ python3 meson.py build

				$ ninja -C build/

				```

				To install headers, libs and pkgconfig

				```bash

				$ cd build

				$ ninja install

				```

				BPF CO-RE (Compile Once – Run Everywhere)

				=========================================

				Libbpf supports building BPF CO-RE-enabled applications, which, in contrast to

				[BCC](https://github.com/iovisor/bcc/), do not require the Clang/LLVM runtime

				to be deployed to target servers and do not rely on kernel-devel headers

				being available.

				It does rely on the kernel being built with [BTF type

				information](https://www.kernel.org/doc/html/latest/bpf/btf.html), though.

				Some major Linux distributions come with kernel BTF already built in:

				  - Fedora 31+

				  - RHEL 8.2+

				  - OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)

				  - Arch Linux (from kernel 5.7.1.arch1-1)

				  - Manjaro (from kernel 5.4 if compiled after 2021-06-18)

				  - Ubuntu 20.10

				  - Debian 11 (amd64/arm64)

				If your kernel doesn't come with BTF built-in, you'll need to build a custom

				kernel. You'll need:

				  - `pahole` 1.16+ tool (part of `dwarves` package), which performs DWARF to

				    BTF conversion;

				  - kernel built with `CONFIG_DEBUG_INFO_BTF=y` option;

				  - you can check if your kernel has BTF built-in by looking for

				    `/sys/kernel/btf/vmlinux` file:

				```shell

				$ ls -la /sys/kernel/btf/vmlinux

				-r--r--r--. 1 root root 3541561 Jun  2 18:16 /sys/kernel/btf/vmlinux

				```
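The check above can be wrapped in a small helper for use in build or install scripts. This is only a sketch: the function name `has_kernel_btf` is hypothetical, and the default path is the one shown in the listing above.

```shell
# Succeed if kernel BTF is readable at the given path.
# Defaults to /sys/kernel/btf/vmlinux, the file named in the text above.
# The function name is an illustrative assumption, not part of libbpf.
has_kernel_btf() {
    btf_path="${1:-/sys/kernel/btf/vmlinux}"
    [ -r "$btf_path" ]
}
```

A script could then call `has_kernel_btf || echo "kernel BTF not found" >&2` before attempting to run a CO-RE application.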

				To develop and build BPF programs, you'll need Clang/LLVM 10+. The following

				distributions have Clang/LLVM 10+ packaged by default:

				  - Fedora 32+

				  - Ubuntu 20.04+

				  - Arch Linux

				  - Ubuntu 20.10 (LLVM 11)

				  - Debian 11 (LLVM 11)

				  - Alpine 3.13+

				Otherwise, please make sure to update it on your system.
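A minimal sketch of checking the Clang requirement from a script follows. It assumes the version banner contains the words `clang version` followed by a dotted version number, which is what common Linux builds of Clang print; the helper names `clang_major` and `clang_is_new_enough` are hypothetical.

```shell
# Print the major version found in a `clang --version` banner.
# Assumes the banner contains "clang version <major>.<minor>...".
clang_major() {
    printf '%s\n' "$1" | sed -nE 's/.*clang version ([0-9]+)\..*/\1/p' | head -n 1
}

# Succeed if the banner reports Clang 10 or newer, the minimum named above.
clang_is_new_enough() {
    major="$(clang_major "$1")"
    [ -n "$major" ] && [ "$major" -ge 10 ]
}
```

It could be used as `clang_is_new_enough "$(clang --version)" || echo "please update Clang/LLVM" >&2`.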

				The following resources are useful to understand what BPF CO-RE is and how to

				use it:

				- [BPF CO-RE reference guide](https://nakryiko.com/posts/bpf-core-reference-guide/)

				- [BPF Portability and CO-RE](https://nakryiko.com/posts/bpf-portability-and-co-re/)

				- [HOWTO: BCC to libbpf conversion](https://nakryiko.com/posts/bcc-to-libbpf-howto-guide/)

				- [libbpf-tools in BCC repo](https://github.com/iovisor/bcc/tree/master/libbpf-tools)

				  contain lots of real-world tools converted from BCC to BPF CO-RE. Consider

				  converting some more to both contribute to the BPF community and gain some

				  more experience with it.

				Distributions

				=============

				Distributions packaging libbpf from this mirror:

				  - [Fedora](https://src.fedoraproject.org/rpms/libbpf)

				  - [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)

				  - [Debian](https://packages.debian.org/source/sid/libbpf)

				  - [Arch](https://archlinux.org/packages/core/x86_64/libbpf/)

				  - [Ubuntu](https://packages.ubuntu.com/source/jammy/libbpf)

				  - [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)

				Benefits of packaging from the mirror over packaging from kernel sources:

				  - Consistent versioning across distributions.

				  - No ties to any specific kernel, transparent handling of older kernels.

				    Libbpf is designed to be kernel-agnostic and work across a multitude of

				    kernel versions. It has built-in mechanisms to gracefully handle older

				    kernels, that are missing some of the features, by working around or

				    gracefully degrading functionality. Thus libbpf is not tied to a specific

				    kernel version and can/should be packaged and versioned independently.

				  - Continuous integration testing via

				    [GitHub Actions](https://github.com/libbpf/libbpf/actions).

				  - Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf)

				    and [Coverity](https://scan.coverity.com/projects/libbpf).

				Package dependencies of libbpf, package names may vary across distros:

				  - zlib

				  - libelf

				[![libbpf distro packaging status](https://repology.org/badge/vertical-allrepos/libbpf.svg)](https://repology.org/project/libbpf/versions)

				bpf-next to Github sync

				=======================

				All the gory details of syncing can be found in `scripts/sync-kernel.sh`

				script. See [SYNC.md](SYNC.md) for instructions.

				Some header files in this repo (`include/linux/*.h`) are reduced versions of

				their counterpart files at

				[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s

				`tools/include/linux/*.h` to make compilation successful.

				License

				=======

				This work is dual-licensed under BSD 2-clause license and GNU LGPL v2.1 license.

				You can choose between one of them if you use this work.

				`SPDX-License-Identifier: BSD-2-Clause OR LGPL-2.1`

									

SYNC.md
									
												
				@@ -0,0 +1,281 @@

				<picture>

				  <source media="(prefers-color-scheme: dark)" srcset="assets/libbpf-logo-sideways-darkbg.png" width="40%">

				  <img src="assets/libbpf-logo-sideways.png" width="40%">

				</picture>

				Libbpf sync

				===========

				Libbpf *authoritative source code* is developed as part of [bpf-next Linux source

				tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next) under

				`tools/lib/bpf` subdirectory and is periodically synced to Github.

				Most of the mundane mechanical things like bpf and bpf-next tree merge, Git

				history transformation, cherry-picking relevant commits, re-generating

				auto-generated headers, etc. are taken care of by

				[sync-kernel.sh script](https://github.com/libbpf/libbpf/blob/master/scripts/sync-kernel.sh).

				But occasionally a human needs to do a few extra things to make everything work

				nicely.

				This document goes over the process of syncing libbpf sources from Linux repo

				to this Github repository. Feel free to contribute fixes and additions if you

				run into new problems not outlined here.

				Setup expectations

				------------------

				The sync script has particular expectations of the upstream Linux repo setup. It

				expects that current HEAD of that repo points to bpf-next's master branch and

				that there is a separate local branch pointing to bpf tree's master branch.

				This is important, as the script will automatically merge their histories for

				the purpose of libbpf sync.

				Below, we assume that Linux repo is located at `~/linux`, its current HEAD is

				at latest `bpf-next/master`, and libbpf's Github repo is located at

				`~/libbpf`, checked out to latest commit on `master` branch. It doesn't matter

				from where to run `sync-kernel.sh` script, but we'll be running it from inside

				`~/libbpf`.

				```

				$ cd ~/linux && git remote -v | grep -E '^(bpf|bpf-next)'

				bpf     https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git (fetch)

				bpf     ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git (push)

				bpf-next        https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git (fetch)

				bpf-next        ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git (push)

				$ git branch -vv | grep -E '^? (master|bpf-master)'

				* bpf-master                               2d311f480b52 [bpf/master] riscv, bpf: Fix patch_text implicit declaration

				  master                                   c8ee37bde402 [bpf-next/master] libbpf: Fix bpf_xdp_query() in old kernels

				$ git checkout bpf-master && git pull && git checkout master && git pull

				...

				$ git log --oneline -n1

				c8ee37bde402 (HEAD -> master, bpf-next/master) libbpf: Fix bpf_xdp_query() in old kernels

				$ cd ~/libbpf && git checkout master && git pull

				Your branch is up to date with 'libbpf/master'.

				Already up to date.

				```

				Running setup script

				--------------------

				The first step is always to run the `sync-kernel.sh` script. It expects three arguments:

				```

				$ scripts/sync-kernel.sh <libbpf-repo> <kernel-repo> <bpf-branch>

				```

				Note that we'll store the script's entire output in `/tmp/libbpf-sync.txt` and

				put it into the PR summary later on. **Please store the script's output and include it

				in the PR summary for others to check for anything unexpected and suspicious.**

				```

				$ scripts/sync-kernel.sh ~/libbpf ~/linux bpf-master | tee /tmp/libbpf-sync.txt

				Dumping existing libbpf commit signatures...

				WORKDIR:          /home/andriin/libbpf

				LINUX REPO:       /home/andriin/linux

				LIBBPF REPO:      /home/andriin/libbpf

				...

				```

				Most of the time this will be uneventful. One expected case when the sync

				script might require user intervention is if `bpf` tree has some libbpf fixes,

				which is nowadays not a very frequent occurrence. But if that happens, the script

				will show you a diff between expected state as of latest bpf-next and synced

				Github repo state, and will ask if these changes look good. Please use your

				best judgement to verify that differences are indeed from expected `bpf` tree

				fixes. E.g., it might look like below:

				```

				Comparing list of files...

				Comparing file contents...

				--- /home/andriin/linux/include/uapi/linux/netdev.h     2023-02-27 16:54:42.270583372 -0800

				+++ /home/andriin/libbpf/include/uapi/linux/netdev.h    2023-02-27 16:54:34.615530796 -0800

				@@ -19,7 +19,7 @@

				  * @NETDEV_XDP_ACT_XSK_ZEROCOPY: This feature informs if netdev supports AF_XDP

				  *   in zero copy mode.

				  * @NETDEV_XDP_ACT_HW_OFFLOAD: This feature informs if netdev supports XDP hw

				- *   oflloading.

				+ *   offloading.

				  * @NETDEV_XDP_ACT_RX_SG: This feature informs if netdev implements non-linear

				  *   XDP buffer support in the driver napi callback.

				  * @NETDEV_XDP_ACT_NDO_XMIT_SG: This feature informs if netdev implements

				/home/andriin/linux/include/uapi/linux/netdev.h and /home/andriin/libbpf/include/uapi/linux/netdev.h are different!

				Unfortunately, there are some inconsistencies, please double check.

				Does everything look good? [y/N]:

				```

				If it looks sensible and expected, type `y` and the script will proceed.

				If sync is successful, your `~/linux` repo will be left in original state on

				the original HEAD commit. `~/libbpf` repo will now be on a new branch, named

				`libbpf-sync-<timestamp>` (e.g., `libbpf-sync-2023-02-28T00-53-40.072Z`).

				Push this branch into your fork of `libbpf/libbpf` Github repo and create a PR:

				```

				$ git push --set-upstream origin libbpf-sync-2023-02-28T00-53-40.072Z

				Enumerating objects: 130, done.

				Counting objects: 100% (115/115), done.

				Delta compression using up to 80 threads

				Compressing objects: 100% (28/28), done.

				Writing objects: 100% (32/32), 5.57 KiB | 1.86 MiB/s, done.

				Total 32 (delta 21), reused 0 (delta 0), pack-reused 0

				remote: Resolving deltas: 100% (21/21), completed with 9 local objects.

				remote:

				remote: Create a pull request for 'libbpf-sync-2023-02-28T00-53-40.072Z' on GitHub by visiting:

				remote:      https://github.com/anakryiko/libbpf/pull/new/libbpf-sync-2023-02-28T00-53-40.072Z

				remote:

				To github.com:anakryiko/libbpf.git

				 * [new branch]                libbpf-sync-2023-02-28T00-53-40.072Z -> libbpf-sync-2023-02-28T00-53-40.072Z

				Branch 'libbpf-sync-2023-02-28T00-53-40.072Z' set up to track remote branch 'libbpf-sync-2023-02-28T00-53-40.072Z' from 'origin'.

				```

				**Please adjust the PR name to have a proper-looking timestamp. Libbpf

				maintainers will be very thankful for that!**

				By default Github will turn above branch name into PR with subject "Libbpf sync

				2023 02 28 t00 53 40.072 z". Please fix this into a proper timestamp, e.g.:

				"Libbpf sync 2023-02-28T00:53:40.072Z". Thank you!

				**Please don't forget to paste contents of /tmp/libbpf-sync.txt into PR

				summary!**
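The PR-title fix-up described above is mechanical, so it can be scripted. The sketch below derives the suggested title from the sync branch name; the helper name `pr_title_from_branch` is hypothetical, and it assumes the branch follows the `libbpf-sync-<timestamp>` pattern shown earlier.

```shell
# Turn "libbpf-sync-2023-02-28T00-53-40.072Z" into
# "Libbpf sync 2023-02-28T00:53:40.072Z".
# Only the dashes in the time-of-day part are turned back into colons.
pr_title_from_branch() {
    printf '%s\n' "$1" | sed -E \
        -e 's/^libbpf-sync-/Libbpf sync /' \
        -e 's/T([0-9]{2})-([0-9]{2})-([0-9]{2})/T\1:\2:\3/'
}
```

For the currently checked-out sync branch, `pr_title_from_branch "$(git rev-parse --abbrev-ref HEAD)"` prints a title that can be pasted into the PR form.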

				Once PR is created, libbpf CI will run a bunch of tests to check that

				everything is good. In simple cases that would be all you'd need to do. In more

				complicated cases some extra adjustments might be necessary.

				**Please keep naming and style consistent.** Prefix CI-related fixes with `ci: `.

				If you had to modify the sync script, prefix the commit with `sync: `. Also make

				sure that each such commit has `Signed-off-by: Your Full Name <your@email.com>`,

				just like you'd do for a Linux upstream patch. Libbpf closely follows kernel

				conventions and styling, so please help maintain that.

				Including new sources

				---------------------

				If entirely new source files (typically `*.c`) were added to the library in the

				kernel repository, it may be necessary to add these to the build system

				manually (you may notice linker errors otherwise), because the script cannot

				handle such changes automatically. To that end, edit `src/Makefile` as

				necessary. Commit

				[c2495832ced4](https://github.com/libbpf/libbpf/commit/c2495832ced4239bcd376b9954db38a6addd89ca)

				is an example of how to go about doing that.

				Similarly, if new public API header files were added, the `Makefile` will need

				to be adjusted as well.
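To spot new source files before the linker does, a quick check along the following lines may help. It is a sketch under an assumption: that the Makefile names each object as a bare `<name>.o` word, so a `.c` file whose object name never appears is probably not built. The helper name `list_unbuilt_sources` is hypothetical.

```shell
# Print every *.c file in the given directory whose object name
# ("<name>.o") does not appear as a whole word in that directory's Makefile.
# Assumption: the Makefile lists objects by bare file name.
list_unbuilt_sources() {
    src_dir="$1"
    for c_file in "$src_dir"/*.c; do
        [ -e "$c_file" ] || continue
        obj="$(basename "$c_file" .c).o"
        grep -qwF -- "$obj" "$src_dir/Makefile" || printf '%s\n' "$c_file"
    done
}
```

Running `list_unbuilt_sources src` from the repository root after a sync lists candidates to add to `src/Makefile`; an empty result means every source file is at least mentioned there.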

				Updating allow/deny lists

				-------------------------

				Libbpf CI intentionally runs a subset of latest BPF selftests on old kernels

				(4.9 and 5.5, currently). It happens from time to time that some tests that

				previously were successfully running on old kernels now don't, typically due to

				reliance on some freshly added kernel feature. It might look something like this in [CI logs](https://github.com/libbpf/libbpf/actions/runs/4206303272/jobs/7299609578#step:4:2733):

				```

				  All error logs:

				  serial_test_xdp_info:FAIL:get_xdp_none errno=2

				  #283     xdp_info:FAIL

				  Summary: 49/166 PASSED, 5 SKIPPED, 1 FAILED

				```

				In such a case we can either work with upstream to make the test compatible with

				old kernels, or add the test to a denylist (or remove it from an

				allowlist, as was [done](https://github.com/libbpf/libbpf/commit/ea284299025bf85b85b4923191de6463cd43ccd6)

				for the case above).

				```

				$ find . -name '*LIST*'

				./ci/vmtest/configs/ALLOWLIST-4.9.0

				./ci/vmtest/configs/DENYLIST-5.5.0

				./ci/vmtest/configs/DENYLIST-latest.s390x

				./ci/vmtest/configs/DENYLIST-latest

				./ci/vmtest/configs/ALLOWLIST-5.5.0

				```

				Please determine which tests need to be added/removed from which list. And then

				add that as a separate commit. **Please keep using the same branch name, so

				that the same PR can be updated.** There is no need to open new PRs for each

				such fix.
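As a rough sketch of that workflow, the snippet below appends a test to a denylist only if it is not already listed, using the `name # reason` layout seen in the existing lists. The helper name `denylist_add` and the scratch directory are assumptions made for illustration; in the real repository you would point it at one of the files under `ci/vmtest/configs/` and follow up with a signed-off commit using the `ci: ` prefix.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: add "<test> # <reason>" to a denylist file unless
# an entry for that test is already present.
denylist_add() {
    local list="$1" test_name="$2" reason="$3"
    # Match the test name at line start, followed by whitespace or end of line.
    if ! grep -qE "^${test_name}([[:space:]]|\$)" "$list" 2>/dev/null; then
        printf '%s # %s\n' "$test_name" "$reason" >> "$list"
    fi
}

# Demonstrate on a scratch file rather than the real ci/vmtest/configs tree.
workdir="$(mktemp -d)"
list="$workdir/DENYLIST-5.5.0"
denylist_add "$list" "xdp_info" "relies on a kernel feature missing in 5.5"
denylist_add "$list" "xdp_info" "second call is a no-op"
cat "$list"
```

Keeping the reason next to the entry makes it easier to revisit the list later and drop entries once the underlying issue is resolved.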

				Regenerating vmlinux.h header

				-----------------------------

				To compile latest BPF selftests against old kernels, we check in pre-generated

				[vmlinux.h](https://github.com/libbpf/libbpf/blob/master/.github/actions/build-selftests/vmlinux.h)

				header file, located at `.github/actions/build-selftests/vmlinux.h`, which

				contains type definitions from the latest upstream kernel. If, after a libbpf

				sync, upstream BPF selftests require new kernel types, we need to regenerate

				`vmlinux.h` and check it in as well.

				A missing type will look something like this in [CI logs](https://github.com/libbpf/libbpf/actions/runs/4198939244/jobs/7283214243#step:4:1903):

				```

				  In file included from progs/test_spin_lock_fail.c:5:

				  /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/bpf_experimental.h:73:53: error: declaration of 'struct bpf_rb_root' will not be visible outside of this function [-Werror,-Wvisibility]

				  extern struct bpf_rb_node *bpf_rbtree_remove(struct bpf_rb_root *root,

				                                                      ^

				  /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/bpf_experimental.h:81:35: error: declaration of 'struct bpf_rb_root' will not be visible outside of this function [-Werror,-Wvisibility]

				  extern void bpf_rbtree_add(struct bpf_rb_root *root, struct bpf_rb_node *node,

				                                    ^

				  /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/bpf_experimental.h:90:52: error: declaration of 'struct bpf_rb_root' will not be visible outside of this function [-Werror,-Wvisibility]

				  extern struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root) __ksym;

				                                                     ^

				  3 errors generated.

				  make: *** [Makefile:572: /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/test_spin_lock_fail.bpf.o] Error 1

				  make: *** Waiting for unfinished jobs....

				  Error: Process completed with exit code 2.

				```

				You'll need to build the latest upstream kernel from the `bpf-next` tree, using the BPF

				selftest configs. Concatenate the arch-agnostic and arch-specific configs, build the kernel,

				then use bpftool to dump `vmlinux.h`:

				```

				$ cd ~/linux

				$ cat tools/testing/selftests/bpf/config \

				   tools/testing/selftests/bpf/config.x86_64 > .config

				$ make -j$(nproc) olddefconfig all

				...

				$ bpftool btf dump file ~/linux/vmlinux format c > ~/libbpf/.github/actions/build-selftests/vmlinux.h

				$ cd ~/libbpf && git add . && git commit -s

				```

				Check in the generated `vmlinux.h` on top of the sync commits, and don't forget

				to use the `ci: ` commit prefix. Push to GitHub and let libbpf CI do the checking for

				you. See [this commit](https://github.com/libbpf/libbpf/commit/34212c94a64df8eeb1dd5d064630a65e1dfd4c20)

				for reference.

				Troubleshooting

				---------------

				If something goes wrong and the sync script exits early or is terminated early by

				the user, you might end up with the `~/linux` repo on a temporary sync-related branch.

				Don't worry, though: the sync script never destroys repo state. It follows a

				"copy-on-write" philosophy and creates new branches where necessary, so it's

				easy to restore the previous state and start fresh:

				```

				$ git branch | grep -E 'libbpf-.*Z'

				  libbpf-baseline-2023-02-28T00-43-35.146Z

				  libbpf-bpf-baseline-2023-02-28T00-43-35.146Z

				  libbpf-bpf-tip-2023-02-28T00-43-35.146Z

				  libbpf-squash-base-2023-02-28T00-43-35.146Z

				* libbpf-squash-tip-2023-02-28T00-43-35.146Z

				$ git cherry-pick --abort

				$ git checkout master && git branch | grep -E 'libbpf-.*Z' | xargs git branch -D

				Switched to branch 'master'

				Your branch is up to date with 'bpf-next/master'.

				Deleted branch libbpf-baseline-2023-02-28T00-43-35.146Z (was 951bce29c898).

				Deleted branch libbpf-bpf-baseline-2023-02-28T00-43-35.146Z (was 3a70e0d4c9d7).

				Deleted branch libbpf-bpf-tip-2023-02-28T00-43-35.146Z (was 2d311f480b52).

				Deleted branch libbpf-squash-base-2023-02-28T00-43-35.146Z (was 957f109ef883).

				Deleted branch libbpf-squash-tip-2023-02-28T00-43-35.146Z (was be66130d2339).

				Deleted branch libbpf-tip-2023-02-28T00-43-35.146Z (was 2d311f480b52).

				```

				You might need to do the same for your `~/libbpf` repo sometimes, depending on

				the stage at which the sync script was terminated.
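The cleanup can be wrapped in a small function that works for either repository. This is only a sketch: the function name `cleanup_sync_branches` is hypothetical, and the demo runs against a throwaway repository created with `mktemp` rather than a real `~/linux` or `~/libbpf` checkout.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: delete leftover sync branches (names matching
# 'libbpf-.*Z') in the repository given as the first argument.
# The branch that is currently checked out cannot be deleted, so switch
# to your regular branch (e.g. master) before calling this.
cleanup_sync_branches() {
    local repo="$1" branch
    git -C "$repo" branch --format='%(refname:short)' |
        grep -E '^libbpf-.*Z$' |
        while read -r branch; do
            git -C "$repo" branch -D "$branch"
        done
}

# Demo on a scratch repository with two fake sync branches.
repo="$(mktemp -d)"
git -C "$repo" init -q
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$repo" branch libbpf-baseline-2023-02-28T00-43-35.146Z
git -C "$repo" branch libbpf-squash-tip-2023-02-28T00-43-35.146Z
cleanup_sync_branches "$repo"
git -C "$repo" branch --format='%(refname:short)'
```

If a cherry-pick was in progress when the script stopped, run `git cherry-pick --abort` first, as in the transcript above.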

Binary files added (contents not shown):

assets/libbpf-logo-compact-darkbg.png (262 KiB)
assets/libbpf-logo-compact-mono.png (128 KiB)
assets/libbpf-logo-compact.png (116 KiB)
assets/libbpf-logo-sideways-darkbg.png (284 KiB)
assets/libbpf-logo-sideways-mono.png (142 KiB)
assets/libbpf-logo-sideways.png (140 KiB)
assets/libbpf-logo-sparse-darkbg.png (352 KiB)
assets/libbpf-logo-sparse-mono.png (206 KiB)
assets/libbpf-logo-sparse.png (236 KiB)

									
ci/build-in-docker.sh (executable file, 14 lines added)
												
				@@ -0,0 +1,14 @@

				#!/bin/bash

				set -euo pipefail

				export DEBIAN_FRONTEND=noninteractive

				export TZ="America/Los_Angeles"

				apt-get update -y

				apt-get install -y tzdata build-essential sudo

				source ${GITHUB_WORKSPACE}/ci_setup

				$CI_ROOT/managers/ubuntu.sh

				exit 0

ci/diffs/.keep (normal file, empty)

									
ci/diffs/0001-selftests-bpf-set-test-path-for-token-obj_priv_impli.patch (normal file, 85 lines added)
												
				@@ -0,0 +1,85 @@

				From e3a4f5092e847ec00e2b66c060f2cef52b8d0177 Mon Sep 17 00:00:00 2001

				From: Ihor Solodrai <ihor.solodrai@pm.me>

				Date: Thu, 14 Nov 2024 12:49:34 -0800

				Subject: [PATCH bpf-next] selftests/bpf: set test path for

				 token/obj_priv_implicit_token_envvar

				token/obj_priv_implicit_token_envvar test may fail in an environment

				where the process executing tests can not write to the root path.

				Example:

				https://github.com/libbpf/libbpf/actions/runs/11844507007/job/33007897936

				Change default path used by the test to /tmp/bpf-token-fs, and make it

				runtime configurable via an environment variable.

				Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>

				---

				 tools/testing/selftests/bpf/prog_tests/token.c | 18 +++++++++++-------

				 1 file changed, 11 insertions(+), 7 deletions(-)

				diff --git a/tools/testing/selftests/bpf/prog_tests/token.c b/tools/testing/selftests/bpf/prog_tests/token.c

				index fe86e4fdb89c..39f5414b674b 100644

				--- a/tools/testing/selftests/bpf/prog_tests/token.c

				+++ b/tools/testing/selftests/bpf/prog_tests/token.c

				@@ -828,8 +828,11 @@ static int userns_obj_priv_btf_success(int mnt_fd, struct token_lsm *lsm_skel)

				 	return validate_struct_ops_load(mnt_fd, true /* should succeed */);

				 }

				+static const char* token_bpffs_custom_dir() {

				+	return getenv("BPF_SELFTESTS_BPF_TOKEN_DIR") ? : "/tmp/bpf-token-fs";

				+}

				+

				 #define TOKEN_ENVVAR "LIBBPF_BPF_TOKEN_PATH"

				-#define TOKEN_BPFFS_CUSTOM "/bpf-token-fs"

				 static int userns_obj_priv_implicit_token(int mnt_fd, struct token_lsm *lsm_skel)

				 {

				@@ -892,6 +895,7 @@ static int userns_obj_priv_implicit_token(int mnt_fd, struct token_lsm *lsm_skel

				 static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *lsm_skel)

				 {

				+	const char *custom_dir = token_bpffs_custom_dir();

				 	LIBBPF_OPTS(bpf_object_open_opts, opts);

				 	struct dummy_st_ops_success *skel;

				 	int err;

				@@ -909,10 +913,10 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l

				 	 * BPF token implicitly, unless pointed to it through

				 	 * LIBBPF_BPF_TOKEN_PATH envvar

				 	 */

				-	rmdir(TOKEN_BPFFS_CUSTOM);

				-	if (!ASSERT_OK(mkdir(TOKEN_BPFFS_CUSTOM, 0777), "mkdir_bpffs_custom"))

				+	rmdir(custom_dir);

				+	if (!ASSERT_OK(mkdir(custom_dir, 0777), "mkdir_bpffs_custom"))

				 		goto err_out;

				-	err = sys_move_mount(mnt_fd, "", AT_FDCWD, TOKEN_BPFFS_CUSTOM, MOVE_MOUNT_F_EMPTY_PATH);

				+	err = sys_move_mount(mnt_fd, "", AT_FDCWD, custom_dir, MOVE_MOUNT_F_EMPTY_PATH);

				 	if (!ASSERT_OK(err, "move_mount_bpffs"))

				 		goto err_out;

				@@ -925,7 +929,7 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l

				 		goto err_out;

				 	}

				-	err = setenv(TOKEN_ENVVAR, TOKEN_BPFFS_CUSTOM, 1 /*overwrite*/);

				+	err = setenv(TOKEN_ENVVAR, custom_dir, 1 /*overwrite*/);

				 	if (!ASSERT_OK(err, "setenv_token_path"))

				 		goto err_out;

				@@ -951,11 +955,11 @@ static int userns_obj_priv_implicit_token_envvar(int mnt_fd, struct token_lsm *l

				 	if (!ASSERT_ERR(err, "obj_empty_token_path_load"))

				 		goto err_out;

				-	rmdir(TOKEN_BPFFS_CUSTOM);

				+	rmdir(custom_dir);

				 	unsetenv(TOKEN_ENVVAR);

				 	return 0;

				 err_out:

				-	rmdir(TOKEN_BPFFS_CUSTOM);

				+	rmdir(custom_dir);

				 	unsetenv(TOKEN_ENVVAR);

				 	return -EINVAL;

				 }

				-- 

				2.47.0

									
ci/diffs/4000-selftests-bpf-Fix-tests-after-fields-reorder-in-stru.patch (normal file, 69 lines added)
												
				@@ -0,0 +1,69 @@

				From bd06a13f44e15e2e83561ea165061c445a15bd9e Mon Sep 17 00:00:00 2001

				From: Song Liu <song@kernel.org>

				Date: Thu, 27 Mar 2025 11:55:28 -0700

				Subject: [PATCH 4000/4002] selftests/bpf: Fix tests after fields reorder in

				 struct file

				The change in struct file [1] moved f_ref to the 3rd cache line.

				It made *(u64 *)file dereference invalid from the verifier point of view,

				because btf_struct_walk() walks into f_lock field, which is 4-byte long.

				Fix the selftests to deference the file pointer as a 4-byte access.

				[1] commit e249056c91a2 ("fs: place f_ref to 3rd cache line in struct file to resolve false sharing")

				Reported-by: Jakub Kicinski <kuba@kernel.org>

				Signed-off-by: Song Liu <song@kernel.org>

				Link: https://lore.kernel.org/r/20250327185528.1740787-1-song@kernel.org

				Signed-off-by: Alexei Starovoitov <ast@kernel.org>

				---

				 tools/testing/selftests/bpf/progs/test_module_attach.c    | 2 +-

				 tools/testing/selftests/bpf/progs/test_subprogs_extable.c | 6 +++---

				 2 files changed, 4 insertions(+), 4 deletions(-)

				diff --git a/tools/testing/selftests/bpf/progs/test_module_attach.c b/tools/testing/selftests/bpf/progs/test_module_attach.c

				index fb07f5773888..7f3c233943b3 100644

				--- a/tools/testing/selftests/bpf/progs/test_module_attach.c

				+++ b/tools/testing/selftests/bpf/progs/test_module_attach.c

				@@ -117,7 +117,7 @@ int BPF_PROG(handle_fexit_ret, int arg, struct file *ret)

				 	bpf_probe_read_kernel(&buf, 8, ret);

				 	bpf_probe_read_kernel(&buf, 8, (char *)ret + 256);

				-	*(volatile long long *)ret;

				+	*(volatile int *)ret;

				 	*(volatile int *)&ret->f_mode;

				 	return 0;

				 }

				diff --git a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c

				index e2a21fbd4e44..dcac69f5928a 100644

				--- a/tools/testing/selftests/bpf/progs/test_subprogs_extable.c

				+++ b/tools/testing/selftests/bpf/progs/test_subprogs_extable.c

				@@ -21,7 +21,7 @@ static __u64 test_cb(struct bpf_map *map, __u32 *key, __u64 *val, void *data)

				 SEC("fexit/bpf_testmod_return_ptr")

				 int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)

				 {

				-	*(volatile long *)ret;

				+	*(volatile int *)ret;

				 	*(volatile int *)&ret->f_mode;

				 	bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);

				 	triggered++;

				@@ -31,7 +31,7 @@ int BPF_PROG(handle_fexit_ret_subprogs, int arg, struct file *ret)

				 SEC("fexit/bpf_testmod_return_ptr")

				 int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)

				 {

				-	*(volatile long *)ret;

				+	*(volatile int *)ret;

				 	*(volatile int *)&ret->f_mode;

				 	bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);

				 	triggered++;

				@@ -41,7 +41,7 @@ int BPF_PROG(handle_fexit_ret_subprogs2, int arg, struct file *ret)

				 SEC("fexit/bpf_testmod_return_ptr")

				 int BPF_PROG(handle_fexit_ret_subprogs3, int arg, struct file *ret)

				 {

				-	*(volatile long *)ret;

				+	*(volatile int *)ret;

				 	*(volatile int *)&ret->f_mode;

				 	bpf_for_each_map_elem(&test_array, test_cb, NULL, 0);

				 	triggered++;

				-- 

				2.49.0

									
ci/diffs/4001-selftests-bpf-Fix-verifier_bpf_fastcall-test.patch (normal file, 71 lines added)
												
				@@ -0,0 +1,71 @@

				From 8be3a12f9f266aaf3f06f0cfe0e90cfe4d956f3d Mon Sep 17 00:00:00 2001

				From: Song Liu <song@kernel.org>

				Date: Fri, 28 Mar 2025 12:31:24 -0700

				Subject: [PATCH 4001/4002] selftests/bpf: Fix verifier_bpf_fastcall test

				Commit [1] moves percpu data on x86 from address 0x000... to address

				0xfff...

				Before [1]:

				159020: 0000000000030700     0 OBJECT  GLOBAL DEFAULT   23 pcpu_hot

				After [1]:

				152602: ffffffff83a3e034     4 OBJECT  GLOBAL DEFAULT   35 pcpu_hot

				As a result, verifier_bpf_fastcall tests should now expect a negative

				value for pcpu_hot, IOW, the disassemble should show "r=" instead of

				"w=".

				Fix this in the test.

				Note that, a later change created a new variable "cpu_number" for

				bpf_get_smp_processor_id() [2]. The inlining logic is updated properly

				as part of this change, so there is no need to fix anything on the

				kernel side.

				[1] commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")

				[2] commit 01c7bc5198e9 ("x86/smp: Move cpu number to percpu hot section")

				Reported-by: Jakub Kicinski <kuba@kernel.org>

				Signed-off-by: Song Liu <song@kernel.org>

				Link: https://lore.kernel.org/r/20250328193124.808784-1-song@kernel.org

				Signed-off-by: Alexei Starovoitov <ast@kernel.org>

				---

				 tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c | 6 +++---

				 1 file changed, 3 insertions(+), 3 deletions(-)

				diff --git a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c

				index a9be6ae49454..c258b0722e04 100644

				--- a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c

				+++ b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c

				@@ -12,7 +12,7 @@ SEC("raw_tp")

				 __arch_x86_64

				 __log_level(4) __msg("stack depth 8")

				 __xlated("4: r5 = 5")

				-__xlated("5: w0 = ")

				+__xlated("5: r0 = ")

				 __xlated("6: r0 = &(void __percpu *)(r0)")

				 __xlated("7: r0 = *(u32 *)(r0 +0)")

				 __xlated("8: exit")

				@@ -704,7 +704,7 @@ SEC("raw_tp")

				 __arch_x86_64

				 __log_level(4) __msg("stack depth 32+0")

				 __xlated("2: r1 = 1")

				-__xlated("3: w0 =")

				+__xlated("3: r0 =")

				 __xlated("4: r0 = &(void __percpu *)(r0)")

				 __xlated("5: r0 = *(u32 *)(r0 +0)")

				 /* bpf_loop params setup */

				@@ -753,7 +753,7 @@ __arch_x86_64

				 __log_level(4) __msg("stack depth 40+0")

				 /* call bpf_get_smp_processor_id */

				 __xlated("2: r1 = 42")

				-__xlated("3: w0 =")

				+__xlated("3: r0 =")

				 __xlated("4: r0 = &(void __percpu *)(r0)")

				 __xlated("5: r0 = *(u32 *)(r0 +0)")

				 /* call bpf_get_prandom_u32 */

				-- 

				2.49.0

									
ci/diffs/4002-selftests-bpf-Fix-verifier_private_stack-test-failur.patch (normal file, 71 lines added)
												
				@@ -0,0 +1,71 @@

				From 07be1f644ff9eeb842fd0490ddd824df0828cb0e Mon Sep 17 00:00:00 2001

				From: Yonghong Song <yonghong.song@linux.dev>

				Date: Sun, 30 Mar 2025 20:38:28 -0700

				Subject: [PATCH 4002/4002] selftests/bpf: Fix verifier_private_stack test

				 failure

				Several verifier_private_stack tests failed with latest bpf-next.

				For example, for 'Private stack, single prog' subtest, the

				jitted code:

				  func #0:

				  0:      f3 0f 1e fa                             endbr64

				  4:      0f 1f 44 00 00                          nopl    (%rax,%rax)

				  9:      0f 1f 00                                nopl    (%rax)

				  c:      55                                      pushq   %rbp

				  d:      48 89 e5                                movq    %rsp, %rbp

				  10:     f3 0f 1e fa                             endbr64

				  14:     49 b9 58 74 8a 8f 7d 60 00 00           movabsq $0x607d8f8a7458, %r9

				  1e:     65 4c 03 0c 25 28 c0 48 87              addq    %gs:-0x78b73fd8, %r9

				  27:     bf 2a 00 00 00                          movl    $0x2a, %edi

				  2c:     49 89 b9 00 ff ff ff                    movq    %rdi, -0x100(%r9)

				  33:     31 c0                                   xorl    %eax, %eax

				  35:     c9                                      leave

				  36:     e9 20 5d 0f e1                          jmp     0xffffffffe10f5d5b

				The insn 'addq %gs:-0x78b73fd8, %r9' does not match the expected

				regex 'addq %gs:0x{{.*}}, %r9' and this caused test failure.

				Fix it by changing '%gs:0x{{.*}}' to '%gs:{{.*}}' to accommodate the

				possible negative offset. A few other subtests are fixed in a similar way.

				Signed-off-by: Yonghong Song <yonghong.song@linux.dev>

				Link: https://lore.kernel.org/r/20250331033828.365077-1-yonghong.song@linux.dev

				Signed-off-by: Alexei Starovoitov <ast@kernel.org>

				---

				 tools/testing/selftests/bpf/progs/verifier_private_stack.c | 6 +++---

				 1 file changed, 3 insertions(+), 3 deletions(-)

				diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c

				index b1fbdf119553..fc91b414364e 100644

				--- a/tools/testing/selftests/bpf/progs/verifier_private_stack.c

				+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c

				@@ -27,7 +27,7 @@ __description("Private stack, single prog")

				 __success

				 __arch_x86_64

				 __jited("	movabsq	$0x{{.*}}, %r9")

				-__jited("	addq	%gs:0x{{.*}}, %r9")

				+__jited("	addq	%gs:{{.*}}, %r9")

				 __jited("	movl	$0x2a, %edi")

				 __jited("	movq	%rdi, -0x100(%r9)")

				 __naked void private_stack_single_prog(void)

				@@ -74,7 +74,7 @@ __success

				 __arch_x86_64

				 /* private stack fp for the main prog */

				 __jited("	movabsq	$0x{{.*}}, %r9")

				-__jited("	addq	%gs:0x{{.*}}, %r9")

				+__jited("	addq	%gs:{{.*}}, %r9")

				 __jited("	movl	$0x2a, %edi")

				 __jited("	movq	%rdi, -0x200(%r9)")

				 __jited("	pushq	%r9")

				@@ -122,7 +122,7 @@ __jited("	pushq	%rbp")

				 __jited("	movq	%rsp, %rbp")

				 __jited("	endbr64")

				 __jited("	movabsq	$0x{{.*}}, %r9")

				-__jited("	addq	%gs:0x{{.*}}, %r9")

				+__jited("	addq	%gs:{{.*}}, %r9")

				 __jited("	pushq	%r9")

				 __jited("	callq")

				 __jited("	popq	%r9")

				-- 

				2.49.0

									
ci/managers/debian.sh (executable file, 95 lines added)
												
				@@ -0,0 +1,95 @@

				#!/bin/bash

				PHASES=(${@:-SETUP RUN RUN_ASAN CLEANUP})

				DEBIAN_RELEASE="${DEBIAN_RELEASE:-testing}"

				CONT_NAME="${CONT_NAME:-libbpf-debian-$DEBIAN_RELEASE}"

				ENV_VARS="${ENV_VARS:-}"

				DOCKER_RUN="${DOCKER_RUN:-docker run}"

				REPO_ROOT="${REPO_ROOT:-$PWD}"

				ADDITIONAL_DEPS=(pkgconf)

				EXTRA_CFLAGS=""

				EXTRA_LDFLAGS=""

				function info() {

				    echo -e "\033[33;1m$1\033[0m"

				}

				function error() {

				    echo -e "\033[31;1m$1\033[0m"

				}

				function docker_exec() {

				    docker exec $ENV_VARS $CONT_NAME "$@"

				}

				set -eu

				source "$(dirname $0)/travis_wait.bash"

				for phase in "${PHASES[@]}"; do

				    case $phase in

				        SETUP)

				            info "Setup phase"

				            info "Using Debian $DEBIAN_RELEASE"

				            docker --version

				            docker pull debian:$DEBIAN_RELEASE

				            info "Starting container $CONT_NAME"

				            $DOCKER_RUN -v $REPO_ROOT:/build:rw \

				                        -w /build --privileged=true --name $CONT_NAME \

				                        -dit --net=host debian:$DEBIAN_RELEASE /bin/bash

				            echo -e "::group::Build Env Setup"

				            docker_exec bash -c "echo deb-src http://deb.debian.org/debian $DEBIAN_RELEASE main >>/etc/apt/sources.list"

				            docker_exec apt-get -y update

				            docker_exec apt-get -y install aptitude

				            docker_exec aptitude -y install make libz-dev libelf-dev

				            docker_exec aptitude -y install "${ADDITIONAL_DEPS[@]}"

				            echo -e "::endgroup::"

				            ;;

				        RUN|RUN_CLANG|RUN_CLANG14|RUN_CLANG15|RUN_CLANG16|RUN_GCC10|RUN_GCC11|RUN_GCC12|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC10_ASAN)

				            CC="cc"

				            if [[ "$phase" =~ ^RUN_CLANG([0-9]+)(_ASAN)?$ ]]; then

				                ENV_VARS="-e CC=clang-${BASH_REMATCH[1]} -e CXX=clang++-${BASH_REMATCH[1]}"

				                CC="clang-${BASH_REMATCH[1]}"

				            elif [[ "$phase" = *"CLANG"* ]]; then

				                ENV_VARS="-e CC=clang -e CXX=clang++"

				                CC="clang"

				            elif [[ "$phase" =~ ^RUN_GCC([0-9]+)(_ASAN)?$ ]]; then

				                ENV_VARS="-e CC=gcc-${BASH_REMATCH[1]} -e CXX=g++-${BASH_REMATCH[1]}"

				                CC="gcc-${BASH_REMATCH[1]}"

				            fi

				            if [[ "$phase" = *"ASAN"* ]]; then

				                EXTRA_CFLAGS="${EXTRA_CFLAGS} -fsanitize=address,undefined"

				                EXTRA_LDFLAGS="${EXTRA_LDFLAGS} -fsanitize=address,undefined"

				            fi

				            if [[ "$CC" != "cc" ]]; then

				                docker_exec aptitude -y install "$CC"

				            else

				                docker_exec aptitude -y install gcc

				            fi

				            docker_exec mkdir build install

				            docker_exec ${CC} --version

				            info "build"

				            docker_exec make -j$((4*$(nproc))) EXTRA_CFLAGS="${EXTRA_CFLAGS}" EXTRA_LDFLAGS="${EXTRA_LDFLAGS}" -C ./src -B OBJDIR=../build

				            info "ldd build/libbpf.so:"

				            docker_exec ldd build/libbpf.so

				            if ! docker_exec ldd build/libbpf.so | grep -q libelf; then

				                error "No reference to libelf.so in libbpf.so!"

				                exit 1

				            fi

				            info "install"

				            docker_exec make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install

				            info "link binary"

				            docker_exec bash -c "EXTRA_CFLAGS=\"${EXTRA_CFLAGS}\" EXTRA_LDFLAGS=\"${EXTRA_LDFLAGS}\" ./ci/managers/test_compile.sh"

				            ;;

				        CLEANUP)

				            info "Cleanup phase"

				            docker stop $CONT_NAME

				            docker rm -f $CONT_NAME

				            ;;

				        *)

				            echo >&2 "Unknown phase '$phase'"

				            exit 1

				    esac

				done
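The compiler selection in the `RUN*` branch depends on Bash's `=~` operator and the `BASH_REMATCH` array. The sketch below isolates that mapping in a standalone function so it can be tried without Docker; the name `phase_to_cc` is hypothetical and not part of the script. Note that the pattern is written unquoted and uses `[0-9]`, since Bash matches a quoted right-hand side literally and POSIX extended regular expressions have no `\d`.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: map a CI phase name to the compiler it implies.
phase_to_cc() {
    local phase="$1"
    if [[ "$phase" =~ ^RUN_CLANG([0-9]+)(_ASAN)?$ ]]; then
        # BASH_REMATCH[1] holds the first capture group (the version).
        echo "clang-${BASH_REMATCH[1]}"
    elif [[ "$phase" == *CLANG* ]]; then
        echo "clang"
    elif [[ "$phase" =~ ^RUN_GCC([0-9]+)(_ASAN)?$ ]]; then
        echo "gcc-${BASH_REMATCH[1]}"
    else
        echo "cc"
    fi
}

for phase in RUN RUN_CLANG RUN_CLANG16 RUN_GCC12 RUN_GCC10_ASAN; do
    printf '%s -> %s\n' "$phase" "$(phase_to_cc "$phase")"
done
```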

									
ci/managers/test_compile.sh (executable file, 15 lines added)
												
				@@ -0,0 +1,15 @@

				#!/bin/bash

				set -euox pipefail

				EXTRA_CFLAGS=${EXTRA_CFLAGS:-}

				EXTRA_LDFLAGS=${EXTRA_LDFLAGS:-}

				cat << EOF > main.c

				#include <bpf/libbpf.h>

				int main() {

				  return bpf_object__open(0) < 0;

				}

				EOF

				# static linking

				${CC:-cc} ${EXTRA_CFLAGS} ${EXTRA_LDFLAGS} -o main -I./include/uapi -I./install/usr/include main.c ./build/libbpf.a -lelf -lz

travis-ci/managers/travis_wait.bash → ci/managers/travis_wait.bash (renamed, no content changes)

									
ci/managers/ubuntu.sh (executable file, 24 lines added)
												
				@@ -0,0 +1,24 @@

				#!/bin/bash

				set -eux

				RELEASE="focal"

				apt-get update

				apt-get install -y pkg-config

				source "$(dirname $0)/travis_wait.bash"

				cd $REPO_ROOT

				EXTRA_CFLAGS="-Werror -Wall -fsanitize=address,undefined"

				EXTRA_LDFLAGS="-Werror -Wall -fsanitize=address,undefined"

				mkdir build install

				cc --version

				make -j$((4*$(nproc))) EXTRA_CFLAGS="${EXTRA_CFLAGS}" EXTRA_LDFLAGS="${EXTRA_LDFLAGS}" -C ./src -B OBJDIR=../build

				ldd build/libbpf.so

				if ! ldd build/libbpf.so | grep -q libelf; then

				    echo "FAIL: No reference to libelf.so in libbpf.so!"

				    exit 1

				fi

				make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install

				EXTRA_CFLAGS=${EXTRA_CFLAGS} EXTRA_LDFLAGS=${EXTRA_LDFLAGS} $(dirname $0)/test_compile.sh

ci/vmtest/configs/DENYLIST (normal file, 15 lines added)

@@ -0,0 +1,15 @@
 # TEMPORARY
 btf_dump/btf_dump: syntax
 kprobe_multi_bench_attach
 core_reloc/enum64val
 core_reloc/size___diff_sz
 core_reloc/type_based___diff_sz
 test_ima	# All of CI is broken on it following 6.3-rc1 merge
 lwt_reroute      # crashes kernel after netnext merge from 2ab1efad60ad "net/sched: cls_api: complement tcf_tfilter_dump_policy"
 tc_links_ingress # started failing after net-next merge from 2ab1efad60ad "net/sched: cls_api: complement tcf_tfilter_dump_policy"
 xdp_bonding/xdp_bonding_features     # started failing after net merge from 359e54a93ab4 "l2tp: pass correct message length to ip6_append_data"
 tc_redirect/tc_redirect_dtime # uapi breakage after net-next commit 885c36e59f46 ("net: Re-use and set mono_delivery_time bit for userspace tstamp packets")
 migrate_reuseport/IPv4 TCP_NEW_SYN_RECV reqsk_timer_handler # flaky, under investigation
 migrate_reuseport/IPv6 TCP_NEW_SYN_RECV reqsk_timer_handler # flaky, under investigation
 verify_pkcs7_sig # keeps failing

ci/vmtest/configs/DENYLIST-latest (normal file, 13 lines added)

@@ -0,0 +1,13 @@
 decap_sanity  # weird failure with decap_sanity_ns netns already existing, TBD
 empty_skb # waiting the fix in bpf tree to make it to bpf-next
 bpf_nf/tc-bpf-ct # test consistently failing on x86: https://github.com/libbpf/libbpf/pull/698#issuecomment-1590341200
 bpf_nf/xdp-ct   # test consistently failing on x86: https://github.com/libbpf/libbpf/pull/698#issuecomment-1590341200
 kprobe_multi_bench_attach # suspected to cause crashes in CI
 find_vma # test consistently fails on latest kernel, see https://github.com/libbpf/libbpf/issues/754 for details
 bpf_cookie/perf_event
 send_signal/send_signal_nmi
 send_signal/send_signal_nmi_thread
 lwt_reroute # crashes kernel, fix pending upstream
 tc_links_ingress # fails, same fix is pending upstream
 tc_redirect		  # enough is enough, banned for life for flakiness

ci/vmtest/configs/DENYLIST-latest.s390x (normal file, 17 lines added)

@@ -0,0 +1,17 @@
 # TEMPORARY
 sockmap_listen/sockhash VSOCK test_vsock_redir
 usdt/basic                               # failing verifier due to bounds check after LLVM update
 usdt/multispec                           # same as above
 deny_namespace                           # not yet in bpf denylist
 tc_redirect/tc_redirect_dtime            # very flaky
 lru_bug                                  # not yet in bpf-next denylist
 # Disabled temporarily for a crash.
 # https://lore.kernel.org/bpf/c9923c1d-971d-4022-8dc8-1364e929d34c@gmail.com/
 dummy_st_ops/dummy_init_ptr_arg
 fexit_bpf2bpf
 tailcalls
 trace_ext
 xdp_bpf2bpf
 xdp_metadata

									
ci/vmtest/configs/run-vmtest.env (normal file, 37 lines added)
												
				@@ -0,0 +1,37 @@

				#!/bin/bash

				# This file is sourced by libbpf/ci/run-vmtest Github Action scripts.

				# $SELFTESTS_BPF and $VMTEST_CONFIGS are set in the workflow, before

				# libbpf/ci/run-vmtest action is called

				# See .github/workflows/kernel-test.yml

				ALLOWLIST_FILES=(

				    "${SELFTESTS_BPF}/ALLOWLIST"

				    "${SELFTESTS_BPF}/ALLOWLIST.${ARCH}"

				    "${VMTEST_CONFIGS}/ALLOWLIST"

				    "${VMTEST_CONFIGS}/ALLOWLIST-${KERNEL}"

				    "${VMTEST_CONFIGS}/ALLOWLIST-${KERNEL}.${ARCH}"

				)

				DENYLIST_FILES=(

				    "${SELFTESTS_BPF}/DENYLIST"

				    "${SELFTESTS_BPF}/DENYLIST.${ARCH}"

				    "${VMTEST_CONFIGS}/DENYLIST"

				    "${VMTEST_CONFIGS}/DENYLIST-${KERNEL}"

				    "${VMTEST_CONFIGS}/DENYLIST-${KERNEL}.${ARCH}"

				)

				# Export pipe-separated strings, because bash doesn't support array export

				export SELFTESTS_BPF_ALLOWLIST_FILES=$(IFS="|"; echo "${ALLOWLIST_FILES[*]}")

				export SELFTESTS_BPF_DENYLIST_FILES=$(IFS="|"; echo "${DENYLIST_FILES[*]}")

				if [[ "${LLVM_VERSION}" -lt 18 ]]; then

				    echo "KERNEL_TEST=test_progs test_progs_no_alu32 test_maps test_verifier" >> $GITHUB_ENV

				else # all

				    echo "KERNEL_TEST=test_progs test_progs_cpuv4 test_progs_no_alu32 test_maps test_verifier" >> $GITHUB_ENV

				fi

				echo "cp -R ${SELFTESTS_BPF} ${GITHUB_WORKSPACE}/selftests"

				mkdir -p "${GITHUB_WORKSPACE}/selftests"

				cp -R "${SELFTESTS_BPF}" "${GITHUB_WORKSPACE}/selftests"
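Because Bash cannot export arrays, the list files above travel as a single pipe-separated string. A consumer has to split that string again; the sketch below shows one way to do it with `read -a`. The consumer side and the example paths are assumptions for illustration, not code taken from the run-vmtest action, and the approach only works while no path contains a `|` character.

```shell
#!/bin/bash
set -eu

# Producer side: join an array into one pipe-separated string.
# The paths are placeholders, not real CI locations.
DENYLIST_FILES=(
    "/tmp/selftests/DENYLIST"
    "/tmp/configs/DENYLIST"
    "/tmp/configs/DENYLIST-latest"
)
joined=$(IFS="|"; echo "${DENYLIST_FILES[*]}")

# Consumer side (assumed): split the string back into an array.
IFS="|" read -r -a files <<< "$joined"

for f in "${files[@]}"; do
    echo "denylist candidate: $f"
done
```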

docs/.gitignore (vendored, normal file, 2 lines added)

@@ -0,0 +1,2 @@
 sphinx/build
 sphinx/doxygen/build

									
docs/api.rst (normal file, 93 lines added)
												
				@@ -0,0 +1,93 @@

				.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				.. _api:

				.. toctree:: Table of Contents

				LIBBPF API

				==========

				Error Handling

				--------------

				When libbpf is used in "libbpf 1.0 mode", API functions can return errors in one of two ways.

				You can set "libbpf 1.0 mode" with the following line:

				.. code-block::

				    libbpf_set_strict_mode(LIBBPF_STRICT_DIRECT_ERRS | LIBBPF_STRICT_CLEAN_PTRS);

				If the function returns an error code directly, it uses 0 to indicate success

				and a negative error code to indicate what caused the error. In this case the

				error code should be checked directly from the return value; you do not need to

				check errno.

				For example:

				.. code-block::

				    err = some_libbpf_api_with_error_return(...);

				    if (err < 0) {

				        /* Handle error accordingly */

				    }

				If the function returns a pointer, it will return NULL to indicate there was

				an error. In this case errno should be checked for the error code.

				For example:

				.. code-block::

				    ptr = some_libbpf_api_returning_ptr();

				    if (!ptr) {

				        /* note no minus sign for EINVAL and E2BIG below */

				        if (errno == EINVAL) {

				           /* handle EINVAL error */

				        } else if (errno == E2BIG) {

				           /* handle E2BIG error */

				        }

				    }
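
At a call site, the two conventions can be folded into one negative-error-code style. The sketch below is a minimal illustration that does not link against libbpf: ``demo_api_with_error_return()`` and ``demo_api_returning_ptr()`` are hypothetical stand-ins that only mimic the return conventions described above, and ``demo_ptr_err()`` is an assumed helper name, not a libbpf function.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: returns 0 on success or a negative error code. */
static int demo_api_with_error_return(int arg)
{
	return arg < 0 ? -EINVAL : 0;
}

/* Hypothetical stand-in: returns NULL on error and sets errno (positive). */
static const char *demo_api_returning_ptr(int arg)
{
	if (arg < 0) {
		errno = EINVAL;
		return NULL;
	}
	if (arg > 100) {
		errno = E2BIG;
		return NULL;
	}
	return "ok";
}

/*
 * Assumed helper (not part of libbpf): turn the NULL-plus-errno convention
 * into the same negative error code used by direct-return APIs. errno is
 * read immediately, before any other call can overwrite it.
 */
static int demo_ptr_err(const void *ptr)
{
	return ptr ? 0 : -errno;
}
```

With such a helper, a caller can test ``err < 0`` after either kind of call; note that the value taken from errno is positive and is negated here.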

				libbpf.h

				--------

				.. doxygenfile:: libbpf.h

				   :project: libbpf

				   :sections: func define public-type enum

				bpf.h

				-----

				.. doxygenfile:: bpf.h

				   :project: libbpf

				   :sections: func define public-type enum

				btf.h

				-----

				.. doxygenfile:: btf.h

				   :project: libbpf

				   :sections: func define public-type enum

				xsk.h

				-----

				.. doxygenfile:: xsk.h

				   :project: libbpf

				   :sections: func define public-type enum

				bpf_tracing.h

				-------------

				.. doxygenfile:: bpf_tracing.h

				   :project: libbpf

				   :sections: func define public-type enum

				bpf_core_read.h

				---------------

				.. doxygenfile:: bpf_core_read.h

				   :project: libbpf

				   :sections: func define public-type enum

				bpf_endian.h

				------------

				.. doxygenfile:: bpf_endian.h

				   :project: libbpf

				   :sections: func define public-type enum

									
										41

docs/conf.py
									
										Normal file
									
												
				@@ -0,0 +1,41 @@

				#!/usr/bin/env python3

				# SPDX-License-Identifier: GPL-2.0

				# Configuration file for the Sphinx documentation builder.

				#

				# This file only contains a selection of the most common options. For a full

				# list see the documentation:

				# https://www.sphinx-doc.org/en/master/usage/configuration.html

				import os

				import subprocess

				project = "libbpf"

				extensions = [

				    'sphinx.ext.autodoc',

				    'sphinx.ext.doctest',

				    'sphinx.ext.mathjax',

				    'sphinx.ext.viewcode',

				    'sphinx.ext.imgmath',

				    'sphinx.ext.todo',

				    'sphinx_rtd_theme',

				    'breathe',

				]

				# List of patterns, relative to source directory, that match files and

				# directories to ignore when looking for source files.

				# This pattern also affects html_static_path and html_extra_path.

				exclude_patterns = []

				read_the_docs_build = os.environ.get('READTHEDOCS', None) == 'True'

				if read_the_docs_build:

				    subprocess.call('cd sphinx ; make clean', shell=True)

				    subprocess.call('cd sphinx/doxygen ; doxygen', shell=True)

				html_theme = 'sphinx_rtd_theme'

				breathe_projects = { "libbpf": "./sphinx/doxygen/build/xml/" }

				breathe_default_project = "libbpf"

				breathe_show_define_initializer = True

				breathe_show_enumvalue_initializer = True

									
										33

docs/index.rst
									
										Normal file
									
												
				@@ -0,0 +1,33 @@

				.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				.. _libbpf:

				======

				libbpf

				======

				If you are looking to develop BPF applications using the libbpf library, this

				directory contains important documentation that you should read.

				To get started, it is recommended to begin with the :doc:`libbpf Overview

				<libbpf_overview>` document, which provides a high-level understanding of the

				libbpf APIs and their usage. This will give you a solid foundation to start

				exploring and utilizing the various features of libbpf to develop your BPF

				applications.

				.. toctree::

				   :maxdepth: 1

				   libbpf_overview

				   API Documentation <https://libbpf.readthedocs.io/en/latest/api.html>

				   program_types

				   libbpf_naming_convention

				   libbpf_build

				All general BPF questions, including kernel functionality, libbpf APIs and their

				application, should be sent to bpf@vger.kernel.org mailing list.  You can

				`subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the mailing list and

				search its `archive <https://lore.kernel.org/bpf/>`_.  Please search the archive

				before asking new questions. It may be that this was already addressed or

				answered before.

									
										37

docs/libbpf_build.rst
									
										Normal file
									
												
				@@ -0,0 +1,37 @@

				.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				Building libbpf

				===============

				libelf and zlib are internal dependencies of libbpf and thus are required to link

				against and must be installed on the system for applications to work.

				pkg-config is used by default to find libelf, and the program called

				can be overridden with PKG_CONFIG.

				If using pkg-config at build time is not desired, it can be disabled by

				setting NO_PKG_CONFIG=1 when calling make.

				To build both static libbpf.a and shared libbpf.so:

				.. code-block:: bash

				    $ cd src

				    $ make

				To build only the static libbpf.a library in directory build/ and install it

				together with libbpf headers in a staging directory root/:

				.. code-block:: bash

				    $ cd src

				    $ mkdir build root

				    $ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install

				To build both static libbpf.a and shared libbpf.so against a custom libelf

				dependency installed in /build/root/ and install them together with libbpf

				headers in a build directory /build/root/:

				.. code-block:: bash

				    $ cd src

				    $ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make

									
										87

src/README.rst → docs/libbpf_naming_convention.rst
									
												
				@@ -1,7 +1,7 @@

				.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				libbpf API naming convention

				============================

				API naming convention

				=====================

				libbpf API provides access to a few logically separated groups of

				functions and types. Every group has its own naming convention

				@@ -9,15 +9,15 @@ described here. It's recommended to follow these conventions whenever a

				new function or type is added to keep libbpf API clean and consistent.

				All types and functions provided by libbpf API should have one of the

				following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,

				``perf_buffer_``.

				following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``btf_dump_``,

				``ring_buffer_``, ``perf_buffer_``.

				System call wrappers

				--------------------

				System call wrappers are simple wrappers for commands supported by

				sys_bpf system call. These wrappers should go to ``bpf.h`` header file

				and map one-on-one to corresponding commands.

				and map one to one to corresponding commands.

				For example ``bpf_map_lookup_elem`` wraps ``BPF_MAP_LOOKUP_ELEM``

				command of sys_bpf, ``bpf_prog_attach`` wraps ``BPF_PROG_ATTACH``, etc.

				@@ -49,10 +49,6 @@ object, ``bpf_object``, double underscore and ``open`` that defines the

				purpose of the function to open ELF file and create ``bpf_object`` from

				it.

				Another example: ``bpf_program__load`` is named for corresponding

				object, ``bpf_program``, that is separated from other part of the name

				by double underscore.

				All objects and corresponding functions other than BTF related should go

				to ``libbpf.h``. BTF types and functions should go to ``btf.h``.

				@@ -63,21 +59,8 @@ Auxiliary functions and types that don't fit well in any of categories

				described above should have ``libbpf_`` prefix, e.g.

				``libbpf_get_error`` or ``libbpf_prog_type_by_name``.

				AF_XDP functions

				-------------------

				AF_XDP functions should have an ``xsk_`` prefix, e.g.

				``xsk_umem__get_data`` or ``xsk_umem__create``. The interface consists

				of both low-level ring access functions and high-level configuration

				functions. These can be mixed and matched. Note that these functions

				are not reentrant for performance reasons.

				Please take a look at Documentation/networking/af_xdp.rst in the Linux

				kernel source tree on how to use XDP sockets and for some common

				mistakes in case you do not get any traffic up to user space.

				libbpf ABI

				==========

				ABI

				---

				libbpf can be both linked statically or used as DSO. To avoid possible

				conflicts with other libraries an application is linked with, all

				@@ -100,8 +83,8 @@ This prevents from accidentally exporting a symbol, that is not supposed

				to be a part of ABI what, in turn, improves both libbpf developer- and

				user-experiences.

				ABI versionning

				---------------

				ABI versioning

				--------------

				To make future ABI extensions possible libbpf ABI is versioned.

				Versioning is implemented by ``libbpf.map`` version script that is

				@@ -116,7 +99,8 @@ This bump in ABI version is at most once per kernel development cycle.

				For example, if current state of ``libbpf.map`` is:

				.. code-block::

				.. code-block:: none

				        LIBBPF_0.0.1 {

				        	global:

				                        bpf_func_a;

				@@ -128,7 +112,8 @@ For example, if current state of ``libbpf.map`` is:

				, and a new symbol ``bpf_func_c`` is being introduced, then

				``libbpf.map`` should be changed like this:

				.. code-block::

				.. code-block:: none

				        LIBBPF_0.0.1 {

				        	global:

				                        bpf_func_a;

				@@ -148,7 +133,7 @@ Format of version script and ways to handle ABI changes, including

				incompatible ones, described in details in [1].

				Stand-alone build

				=================

				-------------------

				Under https://github.com/libbpf/libbpf there is a (semi-)automated

				mirror of the mainline's version of libbpf for a stand-alone build.

				@@ -156,13 +141,53 @@ mirror of the mainline's version of libbpf for a stand-alone build.

				However, all changes to libbpf's code base must be upstreamed through

				the mainline kernel tree.

				API documentation convention

				============================

				The libbpf API is documented via comments above definitions in

				header files. These comments can be rendered by doxygen and sphinx

				for well organized html output. This section describes the

				convention in which these comments should be formatted.

				Here is an example from btf.h:

				.. code-block:: c

				        /**

				         * @brief **btf__new()** creates a new instance of a BTF object from the raw

				         * bytes of an ELF's BTF section

				         * @param data raw bytes

				         * @param size number of bytes passed in `data`

				         * @return new BTF object instance which has to be eventually freed with

				         * **btf__free()**

				         *

				         * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

				         * error code from such a pointer `libbpf_get_error()` should be used. If

				         * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is

				         * returned on error instead. In both cases thread-local `errno` variable is

				         * always set to error code as well.

				         */

				The comment must start with a block comment of the form '/\*\*'.

				The documentation always starts with a @brief directive. This line is a short

				description about this API. It starts with the name of the API, denoted in bold

				like so: **api_name**. Please include an open and close parenthesis if this is a

				function. Follow with the short description of the API. A longer form description

				can be added below the last directive, at the bottom of the comment.

				Parameters are denoted with the @param directive, there should be one for each

				parameter. If this is a function with a non-void return, use the @return directive

				to document it.

				License

				=======

				-------------------

				libbpf is dual-licensed under LGPL 2.1 and BSD 2-Clause.

				Links

				=====

				-------------------

				[1] https://www.akkadia.org/drepper/dsohowto.pdf

				    (Chapter 3. Maintaining APIs and ABIs).

									
										236

docs/libbpf_overview.rst
									
										Normal file
									
												
				@@ -0,0 +1,236 @@

				.. SPDX-License-Identifier: GPL-2.0

				===============

				libbpf Overview

				===============

				libbpf is a C-based library containing a BPF loader that takes compiled BPF

				object files and prepares and loads them into the Linux kernel. libbpf takes on the

				heavy lifting of loading, verifying, and attaching BPF programs to various

				kernel hooks, allowing BPF application developers to focus only on BPF program

				correctness and performance.

				The following are the high-level features supported by libbpf:

				* Provides high-level and low-level APIs for user space programs to interact

				  with BPF programs. The low-level APIs wrap all the bpf system call

				  functionality, which is useful when users need more fine-grained control

				  over the interactions between user space and BPF programs.

				* Provides overall support for the BPF object skeleton generated by bpftool.

				  The skeleton file simplifies the process for the user space programs to access

				  global variables and work with BPF programs.

				* Provides BPF-side APIs, including BPF helper definitions, BPF maps support,

				  and tracing helpers, allowing developers to simplify BPF code writing.

				* Supports BPF CO-RE mechanism, enabling BPF developers to write portable

				  BPF programs that can be compiled once and run across different kernel

				  versions.

				This document will delve into the above concepts in detail, providing a deeper

				understanding of the capabilities and advantages of libbpf and how it can help

				you develop BPF applications efficiently.

				BPF App Lifecycle and libbpf APIs

				==================================

				A BPF application consists of one or more BPF programs (either cooperating or

				completely independent), BPF maps, and global variables. The global

				variables are shared between all BPF programs, which allows them to cooperate on

				a common set of data. libbpf provides APIs that user space programs can use to

				manipulate the BPF programs by triggering different phases of a BPF application

				lifecycle.

				The following list provides a brief overview of each phase in the BPF

				application lifecycle:

				* **Open phase**: In this phase, libbpf parses the BPF

				  object file and discovers BPF maps, BPF programs, and global variables. After

				  a BPF app is opened, user space apps can make additional adjustments

				  (setting BPF program types, if necessary; pre-setting initial values for

				  global variables, etc.) before all the entities are created and loaded.

				* **Load phase**: In the load phase, libbpf creates BPF

				  maps, resolves various relocations, and verifies and loads BPF programs into

				  the kernel. At this point, libbpf validates all the parts of a BPF application

				  and loads the BPF program into the kernel, but no BPF program has yet been

				  executed. After the load phase, it’s possible to set up the initial BPF map

				  state without racing with the BPF program code execution.

				* **Attachment phase**: In this phase, libbpf

				  attaches BPF programs to various BPF hook points (e.g., tracepoints, kprobes,

				  cgroup hooks, network packet processing pipeline, etc.). During this

				  phase, BPF programs perform useful work such as processing

				  packets, or updating BPF maps and global variables that can be read from user

				  space.

				* **Tear down phase**: In the tear down phase,

				  libbpf detaches BPF programs and unloads them from the kernel. BPF maps are

				  destroyed, and all the resources used by the BPF app are freed.

				BPF Object Skeleton File

				========================

				BPF skeleton is an alternative interface to libbpf APIs for working with BPF

				objects. Skeleton code abstracts away generic libbpf APIs to significantly

				simplify code for manipulating BPF programs from user space. Skeleton code

				includes a bytecode representation of the BPF object file, simplifying the

				process of distributing your BPF code. With BPF bytecode embedded, there are no

				extra files to deploy along with your application binary.

				You can generate the skeleton header file ``(.skel.h)`` for a specific object

				file by passing the BPF object to bpftool. The generated BPF skeleton

				provides the following custom functions that correspond to the BPF lifecycle,

				each of them prefixed with the specific object name:

				* ``<name>__open()`` – creates and opens BPF application (``<name>`` stands for

				  the specific bpf object name)

				* ``<name>__load()`` – instantiates, loads, and verifies BPF application parts

				* ``<name>__attach()`` – attaches all auto-attachable BPF programs (it’s

				  optional, you can have more control by using libbpf APIs directly)

				* ``<name>__destroy()`` – detaches all BPF programs and

				  frees up all used resources

				Using the skeleton code is the recommended way to work with BPF programs. Keep

				in mind, BPF skeleton provides access to the underlying BPF object, so whatever

				was possible to do with generic libbpf APIs is still possible even when the BPF

				skeleton is used. It's an additive convenience feature, with no syscalls, and no

				cumbersome code.

				Other Advantages of Using Skeleton File

				---------------------------------------

				* BPF skeleton provides an interface for user space programs to work with BPF

				  global variables. The skeleton code memory maps global variables as a struct

				  into user space. The struct interface allows user space programs to initialize

				  BPF programs before the BPF load phase and fetch and update data from user

				  space afterward.

				* The ``skel.h`` file reflects the object file structure by listing out the

				  available maps, programs, etc. BPF skeleton provides direct access to all the

				  BPF maps and BPF programs as struct fields. This eliminates the need for

				  string-based lookups with ``bpf_object__find_map_by_name()`` and

				  ``bpf_object__find_program_by_name()`` APIs, reducing errors due to BPF source

				  code and user-space code getting out of sync.

				* The embedded bytecode representation of the object file ensures that the

				  skeleton and the BPF object file are always in sync.

				BPF Helpers

				===========

				libbpf provides BPF-side APIs that BPF programs can use to interact with the

				system. The BPF helpers definition allows developers to use them in BPF code as

				any other plain C function. For example, there are helper functions to print

				debugging messages, get the time since the system was booted, interact with BPF

				maps, manipulate network packets, etc.

				For a complete description of what the helpers do, the arguments they take, and

				the return value, see the `bpf-helpers

				<https://man7.org/linux/man-pages/man7/bpf-helpers.7.html>`_ man page.

				BPF CO-RE (Compile Once – Run Everywhere)

				=========================================

				BPF programs work in the kernel space and have access to kernel memory and data

				structures. One limitation that BPF applications come across is the lack of

				portability across different kernel versions and configurations. `BCC

				<https://github.com/iovisor/bcc/>`_ is one of the solutions for BPF

				portability. However, it comes with runtime overhead and a large binary size

				from embedding the compiler with the application.

				libbpf steps up the BPF program portability by supporting the BPF CO-RE concept.

				BPF CO-RE brings together BTF type information, libbpf, and the compiler to

				produce a single executable binary that you can run on multiple kernel versions

				and configurations.

				To make BPF programs portable, libbpf relies on the BTF type information of the

				running kernel. The kernel also exposes this self-describing authoritative BTF

				information through ``sysfs`` at ``/sys/kernel/btf/vmlinux``.

				You can generate the BTF information for the running kernel with the following

				command:

				::

				  $ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

				The command generates a ``vmlinux.h`` header file with all kernel types

				(:doc:`BTF types <../btf>`) that the running kernel uses. Including

				``vmlinux.h`` in your BPF program eliminates dependency on system-wide kernel

				headers.

				libbpf enables portability of BPF programs by looking at the BPF program’s

				recorded BTF type and relocation information and matching them to BTF

				information (vmlinux) provided by the running kernel. libbpf then resolves and

				matches all the types and fields, and updates necessary offsets and other

				relocatable data to ensure that BPF program’s logic functions correctly for a

				specific kernel on the host. BPF CO-RE concept thus eliminates overhead

				associated with BPF development and allows developers to write portable BPF

				applications without modifications and runtime source code compilation on the

				target machine.

				The following code snippet shows how to read the parent field of a kernel

				``task_struct`` using BPF CO-RE and libbpf. The basic helper to read a field in a

				CO-RE relocatable manner is ``bpf_core_read(dst, sz, src)``, which will read

				``sz`` bytes from the field referenced by ``src`` into the memory pointed to by

				``dst``.

				.. code-block:: C

				   :emphasize-lines: 6

				    //...

				    struct task_struct *task = (void *)bpf_get_current_task();

				    struct task_struct *parent_task;

				    int err;

				    err = bpf_core_read(&parent_task, sizeof(void *), &task->parent);

				    if (err) {

				      /* handle error */

				    }

				    /* parent_task contains the value of task->parent pointer */

				In the code snippet, we first get a pointer to the current ``task_struct`` using

				``bpf_get_current_task()``.  We then use ``bpf_core_read()`` to read the parent

				field of task struct into the ``parent_task`` variable. ``bpf_core_read()`` is

				just like ``bpf_probe_read_kernel()`` BPF helper, except it records information

				about the field that should be relocated on the target kernel. That is, if the

				``parent`` field gets shifted to a different offset within

				``struct task_struct`` due to some new field added in front of it, libbpf will

				automatically adjust the actual offset to the proper value.
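
The effect of such an offset adjustment can be imitated in plain user-space C. The sketch below is only an analogy under stated assumptions, not how libbpf or the kernel implement CO-RE: ``demo_task_v1`` and ``demo_task_v2`` are hypothetical layouts standing in for ``struct task_struct`` on the build and target kernels, and ``demo_read_field()`` is an assumed helper that copies ``sz`` bytes from a given offset and returns 0 on success.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout the program was compiled against. */
struct demo_task_v1 {
	int pid;
	void *parent;
};

/* Hypothetical target layout: a new field was added in front of parent. */
struct demo_task_v2 {
	int pid;
	long long new_field;
	void *parent;
};

/* Assumed helper: copy sz bytes found at byte offset off inside obj. */
static int demo_read_field(void *dst, size_t sz, const void *obj, size_t off)
{
	if (!dst || !obj)
		return -1;
	memcpy(dst, (const char *)obj + off, sz);
	return 0;
}

/*
 * Read parent from a target-layout object. The offset is taken from the
 * target layout, which is the adjustment a relocation performs; using
 * offsetof(struct demo_task_v1, parent) here would read the wrong bytes.
 */
static void *demo_read_parent_v2(const struct demo_task_v2 *task)
{
	void *parent = NULL;

	demo_read_field(&parent, sizeof(parent), task,
			offsetof(struct demo_task_v2, parent));
	return parent;
}
```

The source expression stays the same in both cases; only the numeric offset behind it changes, which is why the BPF program itself does not need to be recompiled.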

				Getting Started with libbpf

				===========================

				Check out the `libbpf-bootstrap <https://github.com/libbpf/libbpf-bootstrap>`_

				repository with simple examples of using libbpf to build various BPF

				applications.

				See also `libbpf API documentation

				<https://libbpf.readthedocs.io/en/latest/api.html>`_.

				libbpf and Rust

				===============

				If you are building BPF applications in Rust, it is recommended to use the

				`Libbpf-rs <https://github.com/libbpf/libbpf-rs>`_ library instead of bindgen

				bindings directly to libbpf. Libbpf-rs wraps libbpf functionality in

				Rust-idiomatic interfaces and provides libbpf-cargo plugin to handle BPF code

				compilation and skeleton generation. Using Libbpf-rs will make building user

				space part of the BPF application easier. Note that the BPF programs themselves

				must still be written in plain C.

				libbpf logging

				==============

				By default, libbpf logs informational and warning messages to stderr. The

				verbosity of these messages can be controlled by setting the environment

				variable LIBBPF_LOG_LEVEL to either warn, info, or debug. A custom log

				callback can be set using ``libbpf_set_print()``.
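
As a rough illustration of the verbosity setting, the sketch below maps the documented values ``warn``, ``info``, and ``debug`` to an ordered level and decides whether a message would be printed. It does not use libbpf: ``demo_level``, ``demo_parse_level()``, and ``demo_should_print()`` are hypothetical names, and falling back to ``info`` for a missing or unknown value is an assumption of this sketch rather than documented behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical levels, ordered from least to most verbose. */
enum demo_level { DEMO_WARN, DEMO_INFO, DEMO_DEBUG };

/* Map a LIBBPF_LOG_LEVEL-style string to a level; assumed default: info. */
static enum demo_level demo_parse_level(const char *value)
{
	if (value && strcmp(value, "warn") == 0)
		return DEMO_WARN;
	if (value && strcmp(value, "debug") == 0)
		return DEMO_DEBUG;
	return DEMO_INFO;
}

/* A message is printed when it is no more verbose than the configured level. */
static bool demo_should_print(enum demo_level msg, enum demo_level configured)
{
	return msg <= configured;
}
```

In a real program the string would typically come from ``getenv("LIBBPF_LOG_LEVEL")``, and a filter of this kind could sit inside a custom callback registered with ``libbpf_set_print()``.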

				Additional Documentation

				========================

				* `Program types and ELF Sections <https://libbpf.readthedocs.io/en/latest/program_types.html>`_

				* `API naming convention <https://libbpf.readthedocs.io/en/latest/libbpf_naming_convention.html>`_

				* `Building libbpf <https://libbpf.readthedocs.io/en/latest/libbpf_build.html>`_

				* `API documentation Convention <https://libbpf.readthedocs.io/en/latest/libbpf_naming_convention.html#api-documentation-convention>`_

									
										235

docs/program_types.rst
									
										Normal file
									
												
				@@ -0,0 +1,235 @@

				.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				.. _program_types_and_elf:

				Program Types and ELF Sections

				==============================

				The table below lists the program types, their attach types where relevant and the ELF section

				names supported by libbpf for them. The ELF section names follow these rules:

				- ``type`` is an exact match, e.g. ``SEC("socket")``

				- ``type+`` means it can be either exact ``SEC("type")`` or well-formed ``SEC("type/extras")``

				  with a '``/``' separator between ``type`` and ``extras``.

				When ``extras`` are specified, they provide details of how to auto-attach the BPF program.  The

				format of ``extras`` depends on the program type, e.g. ``SEC("tracepoint/<category>/<name>")``

				for tracepoints or ``SEC("usdt/<path>:<provider>:<name>")`` for USDT probes. The extras are

				described in more detail in the footnotes.
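
The two matching rules can be sketched as a small C function. This illustrates the rules as stated above and is not libbpf's actual matching code; ``demo_sec_matches()`` and its pattern syntax (a trailing ``+`` on the pattern string) are assumptions made for the example.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical matcher for the rules above.
 *   pattern "type"  : sec must equal "type" exactly
 *   pattern "type+" : sec is "type" or starts with "type/"
 */
static bool demo_sec_matches(const char *sec, const char *pattern)
{
	size_t len = strlen(pattern);

	if (len == 0 || pattern[len - 1] != '+')
		return strcmp(sec, pattern) == 0;

	len--; /* length of "type" without the trailing '+' */
	if (strncmp(sec, pattern, len) != 0)
		return false;
	if (sec[len] == '\0')
		return true; /* exact SEC("type") */
	/* SEC("type/extras"): require the '/' separator right after "type" */
	return sec[len] == '/';
}
```

Under these rules ``freplace/my_func`` matches the pattern ``freplace+``, while a name such as ``freplacement`` does not, because the character after ``type`` must be the ``/`` separator.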

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| Program Type                              | Attach Type                            | ELF Section Name                 | Sleepable |

				+===========================================+========================================+==================================+===========+

				| ``BPF_PROG_TYPE_CGROUP_DEVICE``           | ``BPF_CGROUP_DEVICE``                  | ``cgroup/dev``                   |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_CGROUP_SKB``              |                                        | ``cgroup/skb``                   |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET_EGRESS``             | ``cgroup_skb/egress``            |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET_INGRESS``            | ``cgroup_skb/ingress``           |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_CGROUP_SOCKOPT``          | ``BPF_CGROUP_GETSOCKOPT``              | ``cgroup/getsockopt``            |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_SETSOCKOPT``              | ``cgroup/setsockopt``            |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_CGROUP_SOCK_ADDR``        | ``BPF_CGROUP_INET4_BIND``              | ``cgroup/bind4``                 |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET4_CONNECT``           | ``cgroup/connect4``              |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET4_GETPEERNAME``       | ``cgroup/getpeername4``          |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET4_GETSOCKNAME``       | ``cgroup/getsockname4``          |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET6_BIND``              | ``cgroup/bind6``                 |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET6_CONNECT``           | ``cgroup/connect6``              |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET6_GETPEERNAME``       | ``cgroup/getpeername6``          |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET6_GETSOCKNAME``       | ``cgroup/getsockname6``          |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UDP4_RECVMSG``            | ``cgroup/recvmsg4``              |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UDP4_SENDMSG``            | ``cgroup/sendmsg4``              |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UDP6_RECVMSG``            | ``cgroup/recvmsg6``              |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UDP6_SENDMSG``            | ``cgroup/sendmsg6``              |           |

				|                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UNIX_CONNECT``            | ``cgroup/connect_unix``          |           |

				|                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UNIX_SENDMSG``            | ``cgroup/sendmsg_unix``          |           |

				|                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UNIX_RECVMSG``            | ``cgroup/recvmsg_unix``          |           |

				|                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UNIX_GETPEERNAME``        | ``cgroup/getpeername_unix``      |           |

				|                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_UNIX_GETSOCKNAME``        | ``cgroup/getsockname_unix``      |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_CGROUP_SOCK``             | ``BPF_CGROUP_INET4_POST_BIND``         | ``cgroup/post_bind4``            |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET6_POST_BIND``         | ``cgroup/post_bind6``            |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET_SOCK_CREATE``        | ``cgroup/sock_create``           |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``cgroup/sock``                  |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_CGROUP_INET_SOCK_RELEASE``       | ``cgroup/sock_release``          |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_CGROUP_SYSCTL``           | ``BPF_CGROUP_SYSCTL``                  | ``cgroup/sysctl``                |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_EXT``                     |                                        | ``freplace+`` [#fentry]_         |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_FLOW_DISSECTOR``          | ``BPF_FLOW_DISSECTOR``                 | ``flow_dissector``               |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_KPROBE``                  |                                        | ``kprobe+`` [#kprobe]_           |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``kretprobe+`` [#kprobe]_        |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``ksyscall+`` [#ksyscall]_       |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``kretsyscall+`` [#ksyscall]_    |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``uprobe+`` [#uprobe]_           |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``uprobe.s+`` [#uprobe]_         | Yes       |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``uretprobe+`` [#uprobe]_        |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``uretprobe.s+`` [#uprobe]_      | Yes       |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``usdt+`` [#usdt]_               |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TRACE_KPROBE_MULTI``             | ``kprobe.multi+`` [#kpmulti]_    |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``kretprobe.multi+`` [#kpmulti]_ |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LIRC_MODE2``              | ``BPF_LIRC_MODE2``                     | ``lirc_mode2``                   |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LSM``                     | ``BPF_LSM_CGROUP``                     | ``lsm_cgroup+``                  |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_LSM_MAC``                        | ``lsm+`` [#lsm]_                 |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``lsm.s+`` [#lsm]_               | Yes       |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LWT_IN``                  |                                        | ``lwt_in``                       |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LWT_OUT``                 |                                        | ``lwt_out``                      |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LWT_SEG6LOCAL``           |                                        | ``lwt_seg6local``                |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_LWT_XMIT``                |                                        | ``lwt_xmit``                     |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_NETFILTER``               |                                        | ``netfilter``                    |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_PERF_EVENT``              |                                        | ``perf_event``                   |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE`` |                                        | ``raw_tp.w+`` [#rawtp]_          |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``raw_tracepoint.w+``            |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_RAW_TRACEPOINT``          |                                        | ``raw_tp+`` [#rawtp]_            |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``raw_tracepoint+``              |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SCHED_ACT``               |                                        | ``action`` [#tc_legacy]_         |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SCHED_CLS``               |                                        | ``classifier`` [#tc_legacy]_     |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``tc`` [#tc_legacy]_             |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_NETKIT_PRIMARY``                 | ``netkit/primary``               |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_NETKIT_PEER``                    | ``netkit/peer``                  |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TCX_INGRESS``                    | ``tc/ingress``                   |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TCX_EGRESS``                     | ``tc/egress``                    |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TCX_INGRESS``                    | ``tcx/ingress``                  |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TCX_EGRESS``                     | ``tcx/egress``                   |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SK_LOOKUP``               | ``BPF_SK_LOOKUP``                      | ``sk_lookup``                    |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SK_MSG``                  | ``BPF_SK_MSG_VERDICT``                 | ``sk_msg``                       |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SK_REUSEPORT``            | ``BPF_SK_REUSEPORT_SELECT_OR_MIGRATE`` | ``sk_reuseport/migrate``         |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_SK_REUSEPORT_SELECT``            | ``sk_reuseport``                 |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SK_SKB``                  |                                        | ``sk_skb``                       |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_SK_SKB_STREAM_PARSER``           | ``sk_skb/stream_parser``         |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_SK_SKB_STREAM_VERDICT``          | ``sk_skb/stream_verdict``        |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SOCKET_FILTER``           |                                        | ``socket``                       |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SOCK_OPS``                | ``BPF_CGROUP_SOCK_OPS``                | ``sockops``                      |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_STRUCT_OPS``              |                                        | ``struct_ops+`` [#struct_ops]_   |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``struct_ops.s+`` [#struct_ops]_ | Yes       |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_SYSCALL``                 |                                        | ``syscall``                      | Yes       |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_TRACEPOINT``              |                                        | ``tp+`` [#tp]_                   |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``tracepoint+`` [#tp]_           |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_TRACING``                 | ``BPF_MODIFY_RETURN``                  | ``fmod_ret+`` [#fentry]_         |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``fmod_ret.s+`` [#fentry]_       | Yes       |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TRACE_FENTRY``                   | ``fentry+`` [#fentry]_           |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``fentry.s+`` [#fentry]_         | Yes       |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TRACE_FEXIT``                    | ``fexit+`` [#fentry]_            |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``fexit.s+`` [#fentry]_          | Yes       |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TRACE_ITER``                     | ``iter+`` [#iter]_               |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``iter.s+`` [#iter]_             | Yes       |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_TRACE_RAW_TP``                   | ``tp_btf+`` [#fentry]_           |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				| ``BPF_PROG_TYPE_XDP``                     | ``BPF_XDP_CPUMAP``                     | ``xdp.frags/cpumap``             |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``xdp/cpumap``                   |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_XDP_DEVMAP``                     | ``xdp.frags/devmap``             |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``xdp/devmap``                   |           |

				+                                           +----------------------------------------+----------------------------------+-----------+

				|                                           | ``BPF_XDP``                            | ``xdp.frags``                    |           |

				+                                           +                                        +----------------------------------+-----------+

				|                                           |                                        | ``xdp``                          |           |

				+-------------------------------------------+----------------------------------------+----------------------------------+-----------+

				.. rubric:: Footnotes

				.. [#fentry] The ``fentry`` attach format is ``fentry[.s]/<function>``.

				.. [#kprobe] The ``kprobe`` attach format is ``kprobe/<function>[+<offset>]``. Valid

				             characters for ``function`` are ``a-zA-Z0-9_.`` and ``offset`` must be a valid

				             non-negative integer.

				.. [#ksyscall] The ``ksyscall`` attach format is ``ksyscall/<syscall>``.

				.. [#uprobe] The ``uprobe`` attach format is ``uprobe[.s]/<path>:<function>[+<offset>]``.

				.. [#usdt] The ``usdt`` attach format is ``usdt/<path>:<provider>:<name>``.

				.. [#kpmulti] The ``kprobe.multi`` attach format is ``kprobe.multi/<pattern>`` where ``pattern``

				              supports ``*`` and ``?`` wildcards. Valid characters for pattern are

				              ``a-zA-Z0-9_.*?``.

				.. [#lsm] The ``lsm`` attach format is ``lsm[.s]/<hook>``.

				.. [#rawtp] The ``raw_tp`` attach format is ``raw_tracepoint[.w]/<tracepoint>``.

				.. [#tc_legacy] The ``tc``, ``classifier`` and ``action`` attach types are deprecated; use

				                ``tcx/*`` instead.

				.. [#struct_ops] The ``struct_ops`` attach format supports the ``struct_ops[.s]/<name>`` convention,

				                 but ``name`` is ignored and it is recommended to just use plain

				                 ``SEC("struct_ops[.s]")``. The attachments are defined in a struct initializer

				                 that is tagged with ``SEC(".struct_ops[.link]")``.

				.. [#tp] The ``tracepoint`` attach format is ``tracepoint/<category>/<name>``.

				.. [#iter] The ``iter`` attach format is ``iter[.s]/<struct-name>``.
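The ``kprobe`` footnote above describes a small grammar: ``kprobe/<function>[+<offset>]``, with a restricted character set for the function name and a non-negative integer offset. The sketch below shows one way such a section name could be split and validated in user space. ``parse_kprobe_sec`` is a hypothetical helper written for illustration only; it is not a libbpf function, and libbpf's own parsing may differ in detail.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libbpf: split "kprobe/<function>[+<offset>]"
 * into a function name and an offset, applying the rules from the footnote.
 * Returns 0 on success and -EINVAL on any malformed input. */
static int parse_kprobe_sec(const char *sec, char *func, size_t func_sz,
			    unsigned long *offset)
{
	static const char prefix[] = "kprobe/";
	const char *name, *plus;
	char *end;
	size_t len, i;

	assert(sec && func && offset);

	if (strncmp(sec, prefix, sizeof(prefix) - 1) != 0)
		return -EINVAL;
	name = sec + sizeof(prefix) - 1;

	/* Everything before an optional '+' is the function name. */
	plus = strchr(name, '+');
	len = plus ? (size_t)(plus - name) : strlen(name);
	if (len == 0 || len >= func_sz)
		return -EINVAL;

	/* Function names may only contain a-zA-Z0-9_. characters. */
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '_' && c != '.')
			return -EINVAL;
	}
	memcpy(func, name, len);
	func[len] = '\0';

	*offset = 0;
	if (plus) {
		/* Require a leading digit so that signs and blanks are rejected. */
		if (!isdigit((unsigned char)plus[1]))
			return -EINVAL;
		errno = 0;
		*offset = strtoul(plus + 1, &end, 0);
		if (errno || *end != '\0')
			return -EINVAL;
	}
	return 0;
}
```

With this sketch, ``kprobe/do_sys_open+8`` yields the function ``do_sys_open`` and offset 8, while ``kprobe/bad-name`` and ``kprobe/foo+-1`` are rejected with ``-EINVAL``.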

									
										9

docs/sphinx/Makefile
									
										Normal file
									
				@@ -0,0 +1,9 @@

				SPHINXBUILD   ?= sphinx-build

				SOURCEDIR     = ../src

				BUILDDIR      = build

				help:

					@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)"

				%:

					@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)"

277

docs/sphinx/doxygen/Doxyfile Normal file

@@ -0,0 +1,277 @@
 DOXYFILE_ENCODING      = UTF-8
 PROJECT_NAME           = "libbpf"
 PROJECT_NUMBER         =
 PROJECT_BRIEF          =
 PROJECT_LOGO           =
 OUTPUT_DIRECTORY       = ./build
 CREATE_SUBDIRS         = NO
 ALLOW_UNICODE_NAMES    = NO
 OUTPUT_LANGUAGE        = English
 OUTPUT_TEXT_DIRECTION  = None
 BRIEF_MEMBER_DESC      = YES
 REPEAT_BRIEF           = YES
 ALWAYS_DETAILED_SEC    = NO
 INLINE_INHERITED_MEMB  = NO
 FULL_PATH_NAMES        = YES
 STRIP_FROM_PATH        =
 STRIP_FROM_INC_PATH    =
 SHORT_NAMES            = NO
 JAVADOC_AUTOBRIEF      = NO
 JAVADOC_BANNER         = NO
 QT_AUTOBRIEF           = NO
 MULTILINE_CPP_IS_BRIEF = NO
 PYTHON_DOCSTRING       = NO
 INHERIT_DOCS           = YES
 SEPARATE_MEMBER_PAGES  = NO
 TAB_SIZE               = 4
 ALIASES                =
 OPTIMIZE_OUTPUT_FOR_C  = YES
 OPTIMIZE_OUTPUT_JAVA   = NO
 OPTIMIZE_FOR_FORTRAN   = NO
 OPTIMIZE_OUTPUT_VHDL   = NO
 OPTIMIZE_OUTPUT_SLICE  = NO
 EXTENSION_MAPPING      =
 MARKDOWN_SUPPORT       = YES
 TOC_INCLUDE_HEADINGS   = 5
 AUTOLINK_SUPPORT       = YES
 BUILTIN_STL_SUPPORT    = NO
 CPP_CLI_SUPPORT        = NO
 SIP_SUPPORT            = NO
 IDL_PROPERTY_SUPPORT   = YES
 DISTRIBUTE_GROUP_DOC   = NO
 GROUP_NESTED_COMPOUNDS = NO
 SUBGROUPING            = YES
 INLINE_GROUPED_CLASSES = NO
 INLINE_SIMPLE_STRUCTS  = NO
 TYPEDEF_HIDES_STRUCT   = NO
 LOOKUP_CACHE_SIZE      = 0
 NUM_PROC_THREADS       = 1
 EXTRACT_ALL            = NO
 EXTRACT_PRIVATE        = NO
 EXTRACT_PRIV_VIRTUAL   = NO
 EXTRACT_PACKAGE        = NO
 EXTRACT_STATIC         = NO
 EXTRACT_LOCAL_CLASSES  = YES
 EXTRACT_LOCAL_METHODS  = NO
 EXTRACT_ANON_NSPACES   = NO
 RESOLVE_UNNAMED_PARAMS = YES
 HIDE_UNDOC_MEMBERS     = NO
 HIDE_UNDOC_CLASSES     = NO
 HIDE_FRIEND_COMPOUNDS  = NO
 HIDE_IN_BODY_DOCS      = NO
 INTERNAL_DOCS          = NO
 CASE_SENSE_NAMES       = YES
 HIDE_SCOPE_NAMES       = NO
 HIDE_COMPOUND_REFERENCE= NO
 SHOW_INCLUDE_FILES     = YES
 SHOW_GROUPED_MEMB_INC  = NO
 FORCE_LOCAL_INCLUDES   = NO
 INLINE_INFO            = YES
 SORT_MEMBER_DOCS       = YES
 SORT_BRIEF_DOCS        = NO
 SORT_MEMBERS_CTORS_1ST = NO
 SORT_GROUP_NAMES       = NO
 SORT_BY_SCOPE_NAME     = NO
 STRICT_PROTO_MATCHING  = NO
 GENERATE_TODOLIST      = YES
 GENERATE_TESTLIST      = YES
 GENERATE_BUGLIST       = YES
 GENERATE_DEPRECATEDLIST= YES
 ENABLED_SECTIONS       =
 MAX_INITIALIZER_LINES  = 30
 SHOW_USED_FILES        = YES
 SHOW_FILES             = YES
 SHOW_NAMESPACES        = YES
 FILE_VERSION_FILTER    =
 LAYOUT_FILE            =
 CITE_BIB_FILES         =
 QUIET                  = NO
 WARNINGS               = YES
 WARN_IF_UNDOCUMENTED   = YES
 WARN_IF_DOC_ERROR      = YES
 WARN_NO_PARAMDOC       = NO
 WARN_AS_ERROR          = NO
 WARN_FORMAT            = "$file:$line: $text"
 WARN_LOGFILE           =
 INPUT                  = ../../../src
 INPUT_ENCODING         = UTF-8
 FILE_PATTERNS          = *.c \
                          *.h
 RECURSIVE              = NO
 EXCLUDE                =
 EXCLUDE_SYMLINKS       = NO
 EXCLUDE_PATTERNS       =
 EXCLUDE_SYMBOLS        = ___*
 EXAMPLE_PATH           =
 EXAMPLE_PATTERNS       = *
 EXAMPLE_RECURSIVE      = NO
 IMAGE_PATH             =
 INPUT_FILTER           =
 FILTER_PATTERNS        =
 FILTER_SOURCE_FILES    = NO
 FILTER_SOURCE_PATTERNS =
 USE_MDFILE_AS_MAINPAGE = YES
 SOURCE_BROWSER         = NO
 INLINE_SOURCES         = NO
 STRIP_CODE_COMMENTS    = YES
 REFERENCED_BY_RELATION = NO
 REFERENCES_RELATION    = NO
 REFERENCES_LINK_SOURCE = YES
 SOURCE_TOOLTIPS        = YES
 USE_HTAGS              = NO
 VERBATIM_HEADERS       = YES
 ALPHABETICAL_INDEX     = YES
 IGNORE_PREFIX          =
 GENERATE_HTML          = NO
 HTML_OUTPUT            = html
 HTML_FILE_EXTENSION    = .html
 HTML_HEADER            =
 HTML_FOOTER            =
 HTML_STYLESHEET        =
 HTML_EXTRA_STYLESHEET  =
 HTML_EXTRA_FILES       =
 HTML_COLORSTYLE_HUE    = 220
 HTML_COLORSTYLE_SAT    = 100
 HTML_COLORSTYLE_GAMMA  = 80
 HTML_TIMESTAMP         = NO
 HTML_DYNAMIC_MENUS     = YES
 HTML_DYNAMIC_SECTIONS  = NO
 HTML_INDEX_NUM_ENTRIES = 100
 GENERATE_DOCSET        = NO
 DOCSET_FEEDNAME        = "Doxygen generated docs"
 DOCSET_BUNDLE_ID       = org.doxygen.Project
 DOCSET_PUBLISHER_ID    = org.doxygen.Publisher
 DOCSET_PUBLISHER_NAME  = Publisher
 GENERATE_HTMLHELP      = NO
 CHM_FILE               =
 HHC_LOCATION           =
 GENERATE_CHI           = NO
 CHM_INDEX_ENCODING     =
 BINARY_TOC             = NO
 TOC_EXPAND             = NO
 GENERATE_QHP           = NO
 QCH_FILE               =
 QHP_NAMESPACE          = org.doxygen.Project
 QHP_VIRTUAL_FOLDER     = doc
 QHP_CUST_FILTER_NAME   =
 QHP_CUST_FILTER_ATTRS  =
 QHP_SECT_FILTER_ATTRS  =
 QHG_LOCATION           =
 GENERATE_ECLIPSEHELP   = NO
 ECLIPSE_DOC_ID         = org.doxygen.Project
 DISABLE_INDEX          = NO
 GENERATE_TREEVIEW      = NO
 ENUM_VALUES_PER_LINE   = 4
 TREEVIEW_WIDTH         = 250
 EXT_LINKS_IN_WINDOW    = NO
 HTML_FORMULA_FORMAT    = png
 FORMULA_FONTSIZE       = 10
 FORMULA_TRANSPARENT    = YES
 FORMULA_MACROFILE      =
 USE_MATHJAX            = NO
 MATHJAX_FORMAT         = HTML-CSS
 MATHJAX_RELPATH        = https://cdn.jsdelivr.net/npm/mathjax@2
 MATHJAX_EXTENSIONS     =
 MATHJAX_CODEFILE       =
 SEARCHENGINE           = YES
 SERVER_BASED_SEARCH    = NO
 EXTERNAL_SEARCH        = NO
 SEARCHENGINE_URL       =
 SEARCHDATA_FILE        = searchdata.xml
 EXTERNAL_SEARCH_ID     =
 EXTRA_SEARCH_MAPPINGS  =
 GENERATE_LATEX         = NO
 LATEX_OUTPUT           = latex
 LATEX_CMD_NAME         =
 MAKEINDEX_CMD_NAME     = makeindex
 LATEX_MAKEINDEX_CMD    = makeindex
 COMPACT_LATEX          = NO
 PAPER_TYPE             = a4
 EXTRA_PACKAGES         =
 LATEX_HEADER           =
 LATEX_FOOTER           =
 LATEX_EXTRA_STYLESHEET =
 LATEX_EXTRA_FILES      =
 PDF_HYPERLINKS         = YES
 USE_PDFLATEX           = YES
 LATEX_BATCHMODE        = NO
 LATEX_HIDE_INDICES     = NO
 LATEX_SOURCE_CODE      = NO
 LATEX_BIB_STYLE        = plain
 LATEX_TIMESTAMP        = NO
 LATEX_EMOJI_DIRECTORY  =
 GENERATE_RTF           = NO
 RTF_OUTPUT             = rtf
 COMPACT_RTF            = NO
 RTF_HYPERLINKS         = NO
 RTF_STYLESHEET_FILE    =
 RTF_EXTENSIONS_FILE    =
 RTF_SOURCE_CODE        = NO
 GENERATE_MAN           = NO
 MAN_OUTPUT             = man
 MAN_EXTENSION          = .3
 MAN_SUBDIR             =
 MAN_LINKS              = NO
 GENERATE_XML           = YES
 XML_OUTPUT             = xml
 XML_PROGRAMLISTING     = YES
 XML_NS_MEMB_FILE_SCOPE = NO
 GENERATE_DOCBOOK       = NO
 DOCBOOK_OUTPUT         = docbook
 DOCBOOK_PROGRAMLISTING = NO
 GENERATE_AUTOGEN_DEF   = NO
 GENERATE_PERLMOD       = NO
 PERLMOD_LATEX          = NO
 PERLMOD_PRETTY         = YES
 PERLMOD_MAKEVAR_PREFIX =
 ENABLE_PREPROCESSING   = YES
 MACRO_EXPANSION        = NO
 EXPAND_ONLY_PREDEF     = YES
 SEARCH_INCLUDES        = YES
 INCLUDE_PATH           =
 INCLUDE_FILE_PATTERNS  =
 PREDEFINED             =
 EXPAND_AS_DEFINED      =
 SKIP_FUNCTION_MACROS   = NO
 TAGFILES               =
 GENERATE_TAGFILE       =
 ALLEXTERNALS           = NO
 EXTERNAL_GROUPS        = YES
 EXTERNAL_PAGES         = YES
 CLASS_DIAGRAMS         = YES
 DIA_PATH               =
 HIDE_UNDOC_RELATIONS   = YES
 HAVE_DOT               = NO
 DOT_NUM_THREADS        = 0
 DOT_FONTNAME           = Helvetica
 DOT_FONTSIZE           = 10
 DOT_FONTPATH           =
 CLASS_GRAPH            = YES
 COLLABORATION_GRAPH    = YES
 GROUP_GRAPHS           = YES
 UML_LOOK               = NO
 UML_LIMIT_NUM_FIELDS   = 10
 DOT_UML_DETAILS        = NO
 DOT_WRAP_THRESHOLD     = 17
 TEMPLATE_RELATIONS     = NO
 INCLUDE_GRAPH          = YES
 INCLUDED_BY_GRAPH      = YES
 CALL_GRAPH             = NO
 CALLER_GRAPH           = NO
 GRAPHICAL_HIERARCHY    = YES
 DIRECTORY_GRAPH        = YES
 DOT_IMAGE_FORMAT       = png
 INTERACTIVE_SVG        = NO
 DOT_PATH               =
 DOTFILE_DIRS           =
 MSCFILE_DIRS           =
 DIAFILE_DIRS           =
 PLANTUML_JAR_PATH      =
 PLANTUML_CFG_FILE      =
 PLANTUML_INCLUDE_PATH  =
 DOT_GRAPH_MAX_NODES    = 50
 MAX_DOT_GRAPH_DEPTH    = 0
 DOT_TRANSPARENT        = NO
 DOT_MULTI_TARGETS      = NO
 GENERATE_LEGEND        = YES
 DOT_CLEANUP            = YES

2

docs/sphinx/requirements.txt Normal file

@@ -0,0 +1,2 @@
 breathe
 sphinx_rtd_theme

									
										23

fuzz/bpf-object-fuzzer.c
									
										Normal file
									
				@@ -0,0 +1,23 @@

				#include "libbpf.h"

				static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)

				{

					return 0;

				}

				int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {

					struct bpf_object *obj = NULL;

					DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts);

					int err;

					libbpf_set_print(libbpf_print_fn);

					opts.object_name = "fuzz-object";

					obj = bpf_object__open_mem(data, size, &opts);

					err = libbpf_get_error(obj);

					if (err)

						return 0;

					bpf_object__close(obj);

					return 0;

				}
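The fuzz target above only defines ``LLVMFuzzerTestOneInput``; libFuzzer normally supplies the driver that feeds it inputs. To replay a single saved input without the fuzzing engine, a small file-reading wrapper can call any function with the same signature. The sketch below is an assumption-laden illustration: ``replay_input`` and ``record_size_target`` are hypothetical names, and the stand-in target exists only so the helper can be tried without linking libbpf.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Signature shared by libFuzzer entry points such as LLVMFuzzerTestOneInput. */
typedef int (*fuzz_target_fn)(const uint8_t *data, size_t size);

/* Hypothetical helper: read one input file and pass it to the target once.
 * Returns 0 on success, -1 if the file could not be read. */
static int replay_input(const char *path, fuzz_target_fn target)
{
	FILE *f;
	long len;
	uint8_t *buf;
	int ret = -1;

	assert(path && target);

	f = fopen(path, "rb");
	if (!f)
		return -1;

	/* Determine the file size, then rewind to the start. */
	if (fseek(f, 0, SEEK_END) != 0 || (len = ftell(f)) < 0 ||
	    fseek(f, 0, SEEK_SET) != 0)
		goto out;

	/* Allocate at least one byte so an empty file still gets a valid pointer. */
	buf = malloc(len ? (size_t)len : 1);
	if (!buf)
		goto out;

	if (fread(buf, 1, (size_t)len, f) == (size_t)len) {
		target(buf, (size_t)len);
		ret = 0;
	}
	free(buf);
out:
	fclose(f);
	return ret;
}

/* Stand-in target that records the size it was given; used only to try the
 * helper without libbpf linked in. */
static size_t last_size;

static int record_size_target(const uint8_t *data, size_t size)
{
	(void)data;
	last_size = size;
	return 0;
}
```

In a build that links the fuzzer source, passing ``LLVMFuzzerTestOneInput`` as the ``target`` argument would replay a crash file against ``bpf_object__open_mem`` directly.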

BIN
fuzz/bpf-object-fuzzer_seed_corpus.zip Normal file

Binary file not shown.

									
										33

include/linux/filter.h
									
				@@ -5,6 +5,22 @@

				#include <linux/bpf.h>

				#define BPF_RAW_INSN(CODE, DST, SRC, OFF, IMM)			\

					((struct bpf_insn) {					\

						.code = CODE,					\

						.dst_reg = DST,					\

						.src_reg = SRC,					\

						.off = OFF,					\

						.imm = IMM })

				#define BPF_ALU32_IMM(OP, DST, IMM)				\

					((struct bpf_insn) {					\

						.code  = BPF_ALU | BPF_OP(OP) | BPF_K,		\

						.dst_reg = DST,					\

						.src_reg = 0,					\

						.off   = 0,					\

						.imm   = IMM })

				#define BPF_ALU64_IMM(OP, DST, IMM)				\

					((struct bpf_insn) {					\

						.code  = BPF_ALU64 | BPF_OP(OP) | BPF_K,	\

				@@ -21,6 +37,14 @@

						.off   = 0,					\

						.imm   = IMM })

				#define BPF_CALL_REL(DST)					\

					((struct bpf_insn) {					\

						.code = BPF_JMP | BPF_CALL,			\

						.dst_reg = 0,					\

						.src_reg = BPF_PSEUDO_CALL,			\

						.off = 0,					\

						.imm = DST })

				#define BPF_EXIT_INSN()						\

					((struct bpf_insn) {					\

						.code  = BPF_JMP | BPF_EXIT,			\

				@@ -96,7 +120,7 @@

							      MAP_FD, 0)

				#define BPF_LD_MAP_VALUE(DST, MAP_FD, VALUE_OFF)		\

				-	BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_FD, 0, 0,	\

				+	BPF_LD_IMM64_RAW_FULL(DST, BPF_PSEUDO_MAP_VALUE, 0, 0,	\

							      MAP_FD, VALUE_OFF)

				#define BPF_JMP_IMM(OP, DST, IMM, OFF)				\

				@@ -107,5 +131,12 @@

						.off  = OFF,					\

						.imm  = IMM })

				#define BPF_JMP32_IMM(OP, DST, IMM, OFF)			\

					((struct bpf_insn) {					\

						.code = BPF_JMP32 | BPF_OP(OP) | BPF_K,		\

						.dst_reg = DST,					\

						.src_reg = 0,					\

						.off  = OFF,					\

						.imm  = IMM })

				#endif
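
The instruction macros in this header expand to `struct bpf_insn` compound literals, so a program is just an array of them. Below is a minimal sketch of the two-instruction program `r0 = 0; exit`. To stay self-contained it declares a local mirror of `struct bpf_insn` and of the opcode constants that normally come from `linux/bpf.h`; real code would include that header and this one instead. The function name `sketch_build_ret0` is hypothetical and not part of libbpf.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct bpf_insn (assumption: normally from <linux/bpf.h>). */
struct bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

/* Opcode building blocks, with the values used by the BPF UAPI. */
#define BPF_JMP		0x05
#define BPF_ALU64	0x07
#define BPF_K		0x00
#define BPF_MOV		0xb0
#define BPF_EXIT	0x90
#define BPF_OP(code)	((code) & 0xf0)

/* Same shape as the macros shown in the diff above. */
#define BPF_ALU64_IMM(OP, DST, IMM)				\
	((struct bpf_insn) {					\
		.code  = BPF_ALU64 | BPF_OP(OP) | BPF_K,	\
		.dst_reg = DST,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = IMM })

#define BPF_EXIT_INSN()						\
	((struct bpf_insn) {					\
		.code  = BPF_JMP | BPF_EXIT,			\
		.dst_reg = 0,					\
		.src_reg = 0,					\
		.off   = 0,					\
		.imm   = 0 })

/* Hypothetical helper: writes "r0 = 0; exit" into buf.
 * Returns the number of instructions written, or 0 if buf is too small. */
static inline size_t sketch_build_ret0(struct bpf_insn *buf, size_t cap)
{
	if (cap < 2)
		return 0;
	buf[0] = BPF_ALU64_IMM(BPF_MOV, 0, 0);	/* r0 = 0 */
	buf[1] = BPF_EXIT_INSN();		/* return r0 */
	return 2;
}
```

The resulting array is what a loader would pass to the kernel as the program's instruction buffer; loading itself is out of scope for this sketch.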

									
2

include/linux/kernel.h
												
				@@ -3,6 +3,8 @@

				#ifndef __LINUX_KERNEL_H

				#define __LINUX_KERNEL_H

				#include <linux/compiler.h>

				#ifndef offsetof

				#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)

				#endif

									
9

include/linux/list.h
												
				@@ -72,11 +72,20 @@ static inline void list_del(struct list_head *entry)

				        entry->prev = LIST_POISON2;

				}

				static inline int list_empty(const struct list_head *head)

				{

					return head->next == head;

				}

				#define list_entry(ptr, type, member) \

				        container_of(ptr, type, member)

				#define list_first_entry(ptr, type, member) \

				        list_entry((ptr)->next, type, member)

				#define list_next_entry(pos, member) \

				        list_entry((pos)->member.next, typeof(*(pos)), member)

				#define list_for_each_entry(pos, head, member) \

					for (pos = list_first_entry(head, typeof(*pos), member); \

					     &pos->member != (head); \

					     pos = list_next_entry(pos, member))

				#endif
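
The helpers added in this hunk (`list_empty`, `list_first_entry`, `list_next_entry`, `list_for_each_entry`) can be exercised as in the following sketch. It re-declares a minimal `struct list_head` and `container_of` so that it compiles on its own, and spells `typeof` as `__typeof__` so it also builds in strict ISO modes. `sketch_list_init`, `sketch_list_add_tail`, `struct item` and `sketch_sum_items` are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_entry(ptr, type, member) \
	container_of(ptr, type, member)
#define list_first_entry(ptr, type, member) \
	list_entry((ptr)->next, type, member)
#define list_next_entry(pos, member) \
	list_entry((pos)->member.next, __typeof__(*(pos)), member)
#define list_for_each_entry(pos, head, member) \
	for (pos = list_first_entry(head, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = list_next_entry(pos, member))

static inline int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Hypothetical helper: make head an empty circular list. */
static inline void sketch_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical helper: link entry just before head, i.e. at the tail. */
static inline void sketch_list_add_tail(struct list_head *entry,
					struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Example payload embedding a list node. */
struct item {
	int value;
	struct list_head node;
};

/* Walk the list and add up every item's value. */
static inline int sketch_sum_items(struct list_head *head)
{
	struct item *pos;
	int sum = 0;

	list_for_each_entry(pos, head, node)
		sum += pos->value;
	return sum;
}
```

The loop ends when the embedded node of `pos` is the list head itself, which is why an empty list (where `head->next == head`) yields zero iterations.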

									
2

include/linux/types.h
												
				@@ -7,6 +7,8 @@

				#include <stddef.h>

				#include <stdint.h>

				#include <linux/stddef.h>

				#include <asm/types.h>

				#include <asm/posix_types.h>

									
20

include/tools/libc_compat.h
											
				@@ -1,20 +0,0 @@

				// SPDX-License-Identifier: (LGPL-2.0+ OR BSD-2-Clause)

				/* Copyright (C) 2018 Netronome Systems, Inc. */

				#ifndef __TOOLS_LIBC_COMPAT_H

				#define __TOOLS_LIBC_COMPAT_H

				#include <stdlib.h>

				#include <linux/overflow.h>

				#ifdef COMPAT_NEED_REALLOCARRAY

				static inline void *reallocarray(void *ptr, size_t nmemb, size_t size)

				{

					size_t bytes;

					if (unlikely(check_mul_overflow(nmemb, size, &bytes)))

						return NULL;

					return realloc(ptr, bytes);

				}

				#endif

				#endif

4833

include/uapi/linux/bpf.h

File diff suppressed because it is too large

									
90

include/uapi/linux/btf.h
												
				@@ -22,9 +22,9 @@ struct btf_header {

				};

				/* Max # of type identifier */

				-#define BTF_MAX_TYPE	0x0000ffff

				+#define BTF_MAX_TYPE	0x000fffff

				/* Max offset into the string section */

				-#define BTF_MAX_NAME_OFFSET	0x0000ffff

				+#define BTF_MAX_NAME_OFFSET	0x00ffffff

				/* Max # of struct/union/enum members or func args */

				#define BTF_MAX_VLEN	0xffff

				@@ -33,17 +33,18 @@ struct btf_type {

					/* "info" bits arrangement

					 * bits  0-15: vlen (e.g. # of struct's members)

					 * bits 16-23: unused

				-	 * bits 24-27: kind (e.g. int, ptr, array...etc)

				-	 * bits 28-30: unused

				+	 * bits 24-28: kind (e.g. int, ptr, array...etc)

				+	 * bits 29-30: unused

					 * bit     31: kind_flag, currently used by

				-	 *             struct, union and fwd

				+	 *             struct, union, enum, fwd, enum64,

				+	 *             decl_tag and type_tag

					 */

					__u32 info;

				-	/* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.

				+	/* "size" is used by INT, ENUM, STRUCT, UNION, DATASEC and ENUM64.

					 * "size" tells the size of the type it is describing.

					 *

					 * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,

				-	 * FUNC, FUNC_PROTO and VAR.

				+	 * FUNC, FUNC_PROTO, VAR, DECL_TAG and TYPE_TAG.

					 * "type" is a type_id referring to another type.

					 */

					union {

				@@ -52,28 +53,35 @@ struct btf_type {

					};

				};

				-#define BTF_INFO_KIND(info)	(((info) >> 24) & 0x0f)

				+#define BTF_INFO_KIND(info)	(((info) >> 24) & 0x1f)

				#define BTF_INFO_VLEN(info)	((info) & 0xffff)

				#define BTF_INFO_KFLAG(info)	((info) >> 31)

				-#define BTF_KIND_UNKN		0	/* Unknown	*/

				-#define BTF_KIND_INT		1	/* Integer	*/

				-#define BTF_KIND_PTR		2	/* Pointer	*/

				-#define BTF_KIND_ARRAY		3	/* Array	*/

				-#define BTF_KIND_STRUCT		4	/* Struct	*/

				-#define BTF_KIND_UNION		5	/* Union	*/

				-#define BTF_KIND_ENUM		6	/* Enumeration	*/

				-#define BTF_KIND_FWD		7	/* Forward	*/

				-#define BTF_KIND_TYPEDEF	8	/* Typedef	*/

				-#define BTF_KIND_VOLATILE	9	/* Volatile	*/

				-#define BTF_KIND_CONST		10	/* Const	*/

				-#define BTF_KIND_RESTRICT	11	/* Restrict	*/

				-#define BTF_KIND_FUNC		12	/* Function	*/

				-#define BTF_KIND_FUNC_PROTO	13	/* Function Proto	*/

				-#define BTF_KIND_VAR		14	/* Variable	*/

				-#define BTF_KIND_DATASEC	15	/* Section	*/

				-#define BTF_KIND_MAX		BTF_KIND_DATASEC

				-#define NR_BTF_KINDS		(BTF_KIND_MAX + 1)

				+enum {

				+	BTF_KIND_UNKN		= 0,	/* Unknown	*/

				+	BTF_KIND_INT		= 1,	/* Integer	*/

				+	BTF_KIND_PTR		= 2,	/* Pointer	*/

				+	BTF_KIND_ARRAY		= 3,	/* Array	*/

				+	BTF_KIND_STRUCT		= 4,	/* Struct	*/

				+	BTF_KIND_UNION		= 5,	/* Union	*/

				+	BTF_KIND_ENUM		= 6,	/* Enumeration up to 32-bit values */

				+	BTF_KIND_FWD		= 7,	/* Forward	*/

				+	BTF_KIND_TYPEDEF	= 8,	/* Typedef	*/

				+	BTF_KIND_VOLATILE	= 9,	/* Volatile	*/

				+	BTF_KIND_CONST		= 10,	/* Const	*/

				+	BTF_KIND_RESTRICT	= 11,	/* Restrict	*/

				+	BTF_KIND_FUNC		= 12,	/* Function	*/

				+	BTF_KIND_FUNC_PROTO	= 13,	/* Function Proto	*/

				+	BTF_KIND_VAR		= 14,	/* Variable	*/

				+	BTF_KIND_DATASEC	= 15,	/* Section	*/

				+	BTF_KIND_FLOAT		= 16,	/* Floating point	*/

				+	BTF_KIND_DECL_TAG	= 17,	/* Decl Tag */

				+	BTF_KIND_TYPE_TAG	= 18,	/* Type Tag */

				+	BTF_KIND_ENUM64		= 19,	/* Enumeration up to 64-bit values */

				+	NR_BTF_KINDS,

				+	BTF_KIND_MAX		= NR_BTF_KINDS - 1,

				+};

				/* For some specific BTF_KIND, "struct btf_type" is immediately

				 * followed by extra data.

				@@ -142,7 +150,14 @@ struct btf_param {

				enum {

					BTF_VAR_STATIC = 0,

				-	BTF_VAR_GLOBAL_ALLOCATED,

				+	BTF_VAR_GLOBAL_ALLOCATED = 1,

				+	BTF_VAR_GLOBAL_EXTERN = 2,

				};

				enum btf_func_linkage {

					BTF_FUNC_STATIC = 0,

					BTF_FUNC_GLOBAL = 1,

					BTF_FUNC_EXTERN = 2,

				};

				/* BTF_KIND_VAR is followed by a single "struct btf_var" to describe

				@@ -162,4 +177,25 @@ struct btf_var_secinfo {

					__u32	size;

				};

				/* BTF_KIND_DECL_TAG is followed by a single "struct btf_decl_tag" to describe

				 * additional information related to the tag applied location.

				 * If component_idx == -1, the tag is applied to a struct, union,

				 * variable or function. Otherwise, it is applied to a struct/union

				 * member or a func argument, and component_idx indicates which member

				 * or argument (0 ... vlen-1).

				 */

				struct btf_decl_tag {

				       __s32   component_idx;

				};

				/* BTF_KIND_ENUM64 is followed by multiple "struct btf_enum64".

				 * The exact number of btf_enum64 is stored in the vlen (of the

				 * info in "struct btf_type").

				 */

				struct btf_enum64 {

					__u32	name_off;

					__u32	val_lo32;

					__u32	val_hi32;

				};

				#endif /* _UAPI__LINUX_BTF_H__ */
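
The widening of the kind field from 4 to 5 bits matters for every kind numbered 16 and above. The sketch below packs and unpacks the `info` word with the new macros and shows how the old `0x0f` mask would misread `BTF_KIND_ENUM64`. It also joins the two 32-bit halves of a `struct btf_enum64` value. The macros and struct are copied locally so the snippet builds without kernel headers; `sketch_make_info` and `sketch_enum64_value` are hypothetical helper names, not libbpf API.

```c
#include <stdint.h>

/* Accessors as they appear after this change. */
#define BTF_INFO_KIND(info)	(((info) >> 24) & 0x1f)
#define BTF_INFO_VLEN(info)	((info) & 0xffff)
#define BTF_INFO_KFLAG(info)	((info) >> 31)

enum {
	BTF_KIND_STRUCT	= 4,
	BTF_KIND_ENUM64	= 19,
};

/* Local copy of the trailing record used by BTF_KIND_ENUM64. */
struct btf_enum64 {
	uint32_t name_off;
	uint32_t val_lo32;
	uint32_t val_hi32;
};

/* Hypothetical helper: pack kind, vlen and kind_flag into one info word
 * following the bit layout in the struct btf_type comment. */
static inline uint32_t sketch_make_info(uint32_t kind, uint32_t vlen,
					uint32_t kflag)
{
	return ((kflag & 1u) << 31) | ((kind & 0x1f) << 24) | (vlen & 0xffff);
}

/* Hypothetical helper: rebuild the 64-bit enumerator value. */
static inline uint64_t sketch_enum64_value(const struct btf_enum64 *e)
{
	return ((uint64_t)e->val_hi32 << 32) | e->val_lo32;
}
```

With the previous 4-bit mask, kind 19 (binary 10011) would be read back as 3, which is `BTF_KIND_ARRAY`; that is the reason the mask had to grow together with the kind list.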

1022

include/uapi/linux/if_link.h

File diff suppressed because it is too large

									
99

include/uapi/linux/if_xdp.h
												
				@@ -16,6 +16,34 @@

				#define XDP_SHARED_UMEM	(1 << 0)

				#define XDP_COPY	(1 << 1) /* Force copy-mode */

				#define XDP_ZEROCOPY	(1 << 2) /* Force zero-copy mode */

				/* If this option is set, the driver might go sleep and in that case

				 * the XDP_RING_NEED_WAKEUP flag in the fill and/or Tx rings will be

				 * set. If it is set, the application need to explicitly wake up the

				 * driver with a poll() (Rx and Tx) or sendto() (Tx only). If you are

				 * running the driver and the application on the same core, you should

				 * use this option so that the kernel will yield to the user space

				 * application.

				 */

				#define XDP_USE_NEED_WAKEUP (1 << 3)

				/* By setting this option, userspace application indicates that it can

				 * handle multiple descriptors per packet thus enabling AF_XDP to split

				 * multi-buffer XDP frames into multiple Rx descriptors. Without this set

				 * such frames will be dropped.

				 */

				#define XDP_USE_SG	(1 << 4)

				/* Flags for xsk_umem_config flags */

				#define XDP_UMEM_UNALIGNED_CHUNK_FLAG	(1 << 0)

				/* Force checksum calculation in software. Can be used for testing or

				 * working around potential HW issues. This option causes performance

				 * degradation and only works in XDP_COPY mode.

				 */

				#define XDP_UMEM_TX_SW_CSUM		(1 << 1)

				/* Request to reserve tx_metadata_len bytes of per-chunk metadata.

				 */

				#define XDP_UMEM_TX_METADATA_LEN	(1 << 2)

				struct sockaddr_xdp {

					__u16 sxdp_family;

				@@ -25,10 +53,14 @@ struct sockaddr_xdp {

					__u32 sxdp_shared_umem_fd;

				};

				/* XDP_RING flags */

				#define XDP_RING_NEED_WAKEUP (1 << 0)

				struct xdp_ring_offset {

					__u64 producer;

					__u64 consumer;

					__u64 desc;

					__u64 flags;

				};

				struct xdp_mmap_offsets {

				@@ -53,12 +85,17 @@ struct xdp_umem_reg {

					__u64 len; /* Length of packet data area */

					__u32 chunk_size;

					__u32 headroom;

					__u32 flags;

					__u32 tx_metadata_len;

				};

				struct xdp_statistics {

				-	__u64 rx_dropped; /* Dropped for reasons other than invalid desc */

				+	__u64 rx_dropped; /* Dropped for other reasons */

					__u64 rx_invalid_descs; /* Dropped due to invalid descriptor */

					__u64 tx_invalid_descs; /* Dropped due to invalid descriptor */

					__u64 rx_ring_full; /* Dropped due to rx ring being full */

					__u64 rx_fill_ring_empty_descs; /* Failed to retrieve item from fill ring */

					__u64 tx_ring_empty_descs; /* Failed to retrieve item from tx ring */

				};

				struct xdp_options {

				@@ -74,6 +111,56 @@ struct xdp_options {

				#define XDP_UMEM_PGOFF_FILL_RING	0x100000000ULL

				#define XDP_UMEM_PGOFF_COMPLETION_RING	0x180000000ULL

				/* Masks for unaligned chunks mode */

				#define XSK_UNALIGNED_BUF_OFFSET_SHIFT 48

				#define XSK_UNALIGNED_BUF_ADDR_MASK \

					((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)

				/* Request transmit timestamp. Upon completion, put it into tx_timestamp

				 * field of struct xsk_tx_metadata.

				 */

				#define XDP_TXMD_FLAGS_TIMESTAMP		(1 << 0)

				/* Request transmit checksum offload. Checksum start position and offset

				 * are communicated via csum_start and csum_offset fields of struct

				 * xsk_tx_metadata.

				 */

				#define XDP_TXMD_FLAGS_CHECKSUM			(1 << 1)

				/* Request launch time hardware offload. The device will schedule the packet for

				 * transmission at a pre-determined time called launch time. The value of

				 * launch time is communicated via launch_time field of struct xsk_tx_metadata.

				 */

				#define XDP_TXMD_FLAGS_LAUNCH_TIME		(1 << 2)

				/* AF_XDP offloads request. 'request' union member is consumed by the driver

				 * when the packet is being transmitted. 'completion' union member is

				 * filled by the driver when the transmit completion arrives.

				 */

				struct xsk_tx_metadata {

					__u64 flags;

					union {

						struct {

							/* XDP_TXMD_FLAGS_CHECKSUM */

							/* Offset from desc->addr where checksumming should start. */

							__u16 csum_start;

							/* Offset from csum_start where checksum should be stored. */

							__u16 csum_offset;

							/* XDP_TXMD_FLAGS_LAUNCH_TIME */

							/* Launch time in nanosecond against the PTP HW Clock */

							__u64 launch_time;

						} request;

						struct {

							/* XDP_TXMD_FLAGS_TIMESTAMP */

							__u64 tx_timestamp;

						} completion;

					};

				};

				/* Rx/Tx descriptor */

				struct xdp_desc {

					__u64 addr;

				@@ -83,4 +170,14 @@ struct xdp_desc {

				/* UMEM descriptor is __u64 */

				/* Flag indicating that the packet continues with the buffer pointed out by the

				 * next frame in the ring. The end of the packet is signalled by setting this

				 * bit to zero. For single buffer packets, every descriptor has 'options' set

				 * to 0 and this maintains backward compatibility.

				 */

				#define XDP_PKT_CONTD (1 << 0)

				/* TX packet carries valid metadata. */

				#define XDP_TX_METADATA (1 << 1)

				#endif /* _LINUX_IF_XDP_H */
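
Two of the additions above lend themselves to a short illustration: the unaligned-chunk masks and the `XDP_PKT_CONTD` descriptor flag. The sketch assumes, as the AF_XDP documentation describes for unaligned chunk mode, that the low 48 bits of a descriptor address hold the chunk base and the upper 16 bits hold an offset into it. The constants are copied locally so the snippet is self-contained, and the `sketch_*` helpers are hypothetical names rather than part of this header.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the hunk above. */
#define XSK_UNALIGNED_BUF_OFFSET_SHIFT 48
#define XSK_UNALIGNED_BUF_ADDR_MASK \
	((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1)
#define XDP_PKT_CONTD (1 << 0)

/* Base address of the chunk: the low 48 bits. */
static inline uint64_t sketch_extract_addr(uint64_t addr)
{
	return addr & XSK_UNALIGNED_BUF_ADDR_MASK;
}

/* Offset into the chunk: the upper 16 bits. */
static inline uint64_t sketch_extract_offset(uint64_t addr)
{
	return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT;
}

/* Given the 'options' words of consecutive descriptors, return how many
 * descriptors make up the first packet, or 0 if its final descriptor
 * (the one with XDP_PKT_CONTD cleared) has not been seen yet. */
static inline size_t sketch_packet_frags(const uint32_t *options, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!(options[i] & XDP_PKT_CONTD))
			return i + 1;
	}
	return 0;
}
```

A single-buffer packet has `options` equal to 0 on its only descriptor, so `sketch_packet_frags` returns 1 for it, which matches the backward-compatibility note in the header comment.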

									
230

include/uapi/linux/netdev.h Normal file
												
				@@ -0,0 +1,230 @@

				/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */

				/* Do not edit directly, auto-generated from: */

				/*	Documentation/netlink/specs/netdev.yaml */

				/* YNL-GEN uapi header */

				#ifndef _UAPI_LINUX_NETDEV_H

				#define _UAPI_LINUX_NETDEV_H

				#define NETDEV_FAMILY_NAME	"netdev"

				#define NETDEV_FAMILY_VERSION	1

				/**

				 * enum netdev_xdp_act

				 * @NETDEV_XDP_ACT_BASIC: XDP features set supported by all drivers

				 *   (XDP_ABORTED, XDP_DROP, XDP_PASS, XDP_TX)

				 * @NETDEV_XDP_ACT_REDIRECT: The netdev supports XDP_REDIRECT

				 * @NETDEV_XDP_ACT_NDO_XMIT: This feature informs if netdev implements

				 *   ndo_xdp_xmit callback.

				 * @NETDEV_XDP_ACT_XSK_ZEROCOPY: This feature informs if netdev supports AF_XDP

				 *   in zero copy mode.

				 * @NETDEV_XDP_ACT_HW_OFFLOAD: This feature informs if netdev supports XDP hw

				 *   offloading.

				 * @NETDEV_XDP_ACT_RX_SG: This feature informs if netdev implements non-linear

				 *   XDP buffer support in the driver napi callback.

				 * @NETDEV_XDP_ACT_NDO_XMIT_SG: This feature informs if netdev implements

				 *   non-linear XDP buffer support in ndo_xdp_xmit callback.

				 */

				enum netdev_xdp_act {

					NETDEV_XDP_ACT_BASIC = 1,

					NETDEV_XDP_ACT_REDIRECT = 2,

					NETDEV_XDP_ACT_NDO_XMIT = 4,

					NETDEV_XDP_ACT_XSK_ZEROCOPY = 8,

					NETDEV_XDP_ACT_HW_OFFLOAD = 16,

					NETDEV_XDP_ACT_RX_SG = 32,

					NETDEV_XDP_ACT_NDO_XMIT_SG = 64,

					/* private: */

					NETDEV_XDP_ACT_MASK = 127,

				};

				/**

				 * enum netdev_xdp_rx_metadata

				 * @NETDEV_XDP_RX_METADATA_TIMESTAMP: Device is capable of exposing receive HW

				 *   timestamp via bpf_xdp_metadata_rx_timestamp().

				 * @NETDEV_XDP_RX_METADATA_HASH: Device is capable of exposing receive packet

				 *   hash via bpf_xdp_metadata_rx_hash().

				 * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive

				 *   packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().

				 */

				enum netdev_xdp_rx_metadata {

					NETDEV_XDP_RX_METADATA_TIMESTAMP = 1,

					NETDEV_XDP_RX_METADATA_HASH = 2,

					NETDEV_XDP_RX_METADATA_VLAN_TAG = 4,

				};

				/**

				 * enum netdev_xsk_flags

				 * @NETDEV_XSK_FLAGS_TX_TIMESTAMP: HW timestamping egress packets is supported

				 *   by the driver.

				 * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the

				 *   driver.

				 * @NETDEV_XSK_FLAGS_TX_LAUNCH_TIME_FIFO: Launch time HW offload is supported

				 *   by the driver.

				 */

				enum netdev_xsk_flags {

					NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1,

					NETDEV_XSK_FLAGS_TX_CHECKSUM = 2,

					NETDEV_XSK_FLAGS_TX_LAUNCH_TIME_FIFO = 4,

				};

				enum netdev_queue_type {

					NETDEV_QUEUE_TYPE_RX,

					NETDEV_QUEUE_TYPE_TX,

				};

				enum netdev_qstats_scope {

					NETDEV_QSTATS_SCOPE_QUEUE = 1,

				};

				enum {

					NETDEV_A_DEV_IFINDEX = 1,

					NETDEV_A_DEV_PAD,

					NETDEV_A_DEV_XDP_FEATURES,

					NETDEV_A_DEV_XDP_ZC_MAX_SEGS,

					NETDEV_A_DEV_XDP_RX_METADATA_FEATURES,

					NETDEV_A_DEV_XSK_FEATURES,

					__NETDEV_A_DEV_MAX,

					NETDEV_A_DEV_MAX = (__NETDEV_A_DEV_MAX - 1)

				};

				enum {

					__NETDEV_A_IO_URING_PROVIDER_INFO_MAX,

					NETDEV_A_IO_URING_PROVIDER_INFO_MAX = (__NETDEV_A_IO_URING_PROVIDER_INFO_MAX - 1)

				};

				enum {

					NETDEV_A_PAGE_POOL_ID = 1,

					NETDEV_A_PAGE_POOL_IFINDEX,

					NETDEV_A_PAGE_POOL_NAPI_ID,

					NETDEV_A_PAGE_POOL_INFLIGHT,

					NETDEV_A_PAGE_POOL_INFLIGHT_MEM,

					NETDEV_A_PAGE_POOL_DETACH_TIME,

					NETDEV_A_PAGE_POOL_DMABUF,

					NETDEV_A_PAGE_POOL_IO_URING,

					__NETDEV_A_PAGE_POOL_MAX,

					NETDEV_A_PAGE_POOL_MAX = (__NETDEV_A_PAGE_POOL_MAX - 1)

				};

				enum {

					NETDEV_A_PAGE_POOL_STATS_INFO = 1,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_FAST = 8,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_SLOW_HIGH_ORDER,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_EMPTY,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_REFILL,

					NETDEV_A_PAGE_POOL_STATS_ALLOC_WAIVE,

					NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHED,

					NETDEV_A_PAGE_POOL_STATS_RECYCLE_CACHE_FULL,

					NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING,

					NETDEV_A_PAGE_POOL_STATS_RECYCLE_RING_FULL,

					NETDEV_A_PAGE_POOL_STATS_RECYCLE_RELEASED_REFCNT,

					__NETDEV_A_PAGE_POOL_STATS_MAX,

					NETDEV_A_PAGE_POOL_STATS_MAX = (__NETDEV_A_PAGE_POOL_STATS_MAX - 1)

				};

				enum {

					NETDEV_A_NAPI_IFINDEX = 1,

					NETDEV_A_NAPI_ID,

					NETDEV_A_NAPI_IRQ,

					NETDEV_A_NAPI_PID,

					NETDEV_A_NAPI_DEFER_HARD_IRQS,

					NETDEV_A_NAPI_GRO_FLUSH_TIMEOUT,

					NETDEV_A_NAPI_IRQ_SUSPEND_TIMEOUT,

					__NETDEV_A_NAPI_MAX,

					NETDEV_A_NAPI_MAX = (__NETDEV_A_NAPI_MAX - 1)

				};

				enum {

					__NETDEV_A_XSK_INFO_MAX,

					NETDEV_A_XSK_INFO_MAX = (__NETDEV_A_XSK_INFO_MAX - 1)

				};

				enum {

					NETDEV_A_QUEUE_ID = 1,

					NETDEV_A_QUEUE_IFINDEX,

					NETDEV_A_QUEUE_TYPE,

					NETDEV_A_QUEUE_NAPI_ID,

					NETDEV_A_QUEUE_DMABUF,

					NETDEV_A_QUEUE_IO_URING,

					NETDEV_A_QUEUE_XSK,

					__NETDEV_A_QUEUE_MAX,

					NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)

				};

				enum {

					NETDEV_A_QSTATS_IFINDEX = 1,

					NETDEV_A_QSTATS_QUEUE_TYPE,

					NETDEV_A_QSTATS_QUEUE_ID,

					NETDEV_A_QSTATS_SCOPE,

					NETDEV_A_QSTATS_RX_PACKETS = 8,

					NETDEV_A_QSTATS_RX_BYTES,

					NETDEV_A_QSTATS_TX_PACKETS,

					NETDEV_A_QSTATS_TX_BYTES,

					NETDEV_A_QSTATS_RX_ALLOC_FAIL,

					NETDEV_A_QSTATS_RX_HW_DROPS,

					NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS,

					NETDEV_A_QSTATS_RX_CSUM_COMPLETE,

					NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY,

					NETDEV_A_QSTATS_RX_CSUM_NONE,

					NETDEV_A_QSTATS_RX_CSUM_BAD,

					NETDEV_A_QSTATS_RX_HW_GRO_PACKETS,

					NETDEV_A_QSTATS_RX_HW_GRO_BYTES,

					NETDEV_A_QSTATS_RX_HW_GRO_WIRE_PACKETS,

					NETDEV_A_QSTATS_RX_HW_GRO_WIRE_BYTES,

					NETDEV_A_QSTATS_RX_HW_DROP_RATELIMITS,

					NETDEV_A_QSTATS_TX_HW_DROPS,

					NETDEV_A_QSTATS_TX_HW_DROP_ERRORS,

					NETDEV_A_QSTATS_TX_CSUM_NONE,

					NETDEV_A_QSTATS_TX_NEEDS_CSUM,

					NETDEV_A_QSTATS_TX_HW_GSO_PACKETS,

					NETDEV_A_QSTATS_TX_HW_GSO_BYTES,

					NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS,

					NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES,

					NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS,

					NETDEV_A_QSTATS_TX_STOP,

					NETDEV_A_QSTATS_TX_WAKE,

					__NETDEV_A_QSTATS_MAX,

					NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)

				};

				enum {

					NETDEV_A_DMABUF_IFINDEX = 1,

					NETDEV_A_DMABUF_QUEUES,

					NETDEV_A_DMABUF_FD,

					NETDEV_A_DMABUF_ID,

					__NETDEV_A_DMABUF_MAX,

					NETDEV_A_DMABUF_MAX = (__NETDEV_A_DMABUF_MAX - 1)

				};

				enum {

					NETDEV_CMD_DEV_GET = 1,

					NETDEV_CMD_DEV_ADD_NTF,

					NETDEV_CMD_DEV_DEL_NTF,

					NETDEV_CMD_DEV_CHANGE_NTF,

					NETDEV_CMD_PAGE_POOL_GET,

					NETDEV_CMD_PAGE_POOL_ADD_NTF,

					NETDEV_CMD_PAGE_POOL_DEL_NTF,

					NETDEV_CMD_PAGE_POOL_CHANGE_NTF,

					NETDEV_CMD_PAGE_POOL_STATS_GET,

					NETDEV_CMD_QUEUE_GET,

					NETDEV_CMD_NAPI_GET,

					NETDEV_CMD_QSTATS_GET,

					NETDEV_CMD_BIND_RX,

					NETDEV_CMD_NAPI_SET,

					__NETDEV_CMD_MAX,

					NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)

				};

				#define NETDEV_MCGRP_MGMT	"mgmt"

				#define NETDEV_MCGRP_PAGE_POOL	"page-pool"

				#endif /* _UAPI_LINUX_NETDEV_H */
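
The `enum netdev_xdp_act` values are single-bit flags, so a feature word reported for a device can be tested with plain mask arithmetic. The sketch below assumes the feature word has already been obtained (for example from the `NETDEV_A_DEV_XDP_FEATURES` attribute) and is held in a 64-bit integer; how it is fetched over netlink is not shown. The enum is copied locally and the `sketch_*` helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the flag values from the header above. */
enum netdev_xdp_act {
	NETDEV_XDP_ACT_BASIC = 1,
	NETDEV_XDP_ACT_REDIRECT = 2,
	NETDEV_XDP_ACT_NDO_XMIT = 4,
	NETDEV_XDP_ACT_XSK_ZEROCOPY = 8,
	NETDEV_XDP_ACT_HW_OFFLOAD = 16,
	NETDEV_XDP_ACT_RX_SG = 32,
	NETDEV_XDP_ACT_NDO_XMIT_SG = 64,
	NETDEV_XDP_ACT_MASK = 127,
};

/* True only if every bit in 'wanted' is present in 'features'. */
static inline bool sketch_xdp_supports(uint64_t features, uint64_t wanted)
{
	return (features & wanted) == wanted;
}

/* Bits set in 'features' that this copy of the header does not know. */
static inline uint64_t sketch_xdp_unknown_bits(uint64_t features)
{
	return features & ~(uint64_t)NETDEV_XDP_ACT_MASK;
}
```

Checking for unknown bits is a defensive choice: a newer kernel may report flags that an older copy of the header does not define, and ignoring them is usually safer than rejecting the device.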

1475

include/uapi/linux/perf_event.h Normal file

File diff suppressed because it is too large

									
565

include/uapi/linux/pkt_cls.h Normal file
												
				@@ -0,0 +1,565 @@

				/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

				#ifndef __LINUX_PKT_CLS_H

				#define __LINUX_PKT_CLS_H

				#include <linux/types.h>

				#include <linux/pkt_sched.h>

				#define TC_COOKIE_MAX_SIZE 16

				/* Action attributes */

				enum {

					TCA_ACT_UNSPEC,

					TCA_ACT_KIND,

					TCA_ACT_OPTIONS,

					TCA_ACT_INDEX,

					TCA_ACT_STATS,

					TCA_ACT_PAD,

					TCA_ACT_COOKIE,

					__TCA_ACT_MAX

				};

				#define TCA_ACT_MAX __TCA_ACT_MAX

				#define TCA_OLD_COMPAT (TCA_ACT_MAX+1)

				#define TCA_ACT_MAX_PRIO 32

				#define TCA_ACT_BIND	1

				#define TCA_ACT_NOBIND	0

				#define TCA_ACT_UNBIND	1

				#define TCA_ACT_NOUNBIND	0

				#define TCA_ACT_REPLACE		1

				#define TCA_ACT_NOREPLACE	0

				#define TC_ACT_UNSPEC	(-1)

				#define TC_ACT_OK		0

				#define TC_ACT_RECLASSIFY	1

				#define TC_ACT_SHOT		2

				#define TC_ACT_PIPE		3

				#define TC_ACT_STOLEN		4

				#define TC_ACT_QUEUED		5

				#define TC_ACT_REPEAT		6

				#define TC_ACT_REDIRECT		7

				#define TC_ACT_TRAP		8 /* For hw path, this means "trap to cpu"

								   * and don't further process the frame

								   * in hardware. For sw path, this is

								   * equivalent of TC_ACT_STOLEN - drop

								   * the skb and act like everything

								   * is alright.

								   */

				#define TC_ACT_VALUE_MAX	TC_ACT_TRAP

				/* There is a special kind of actions called "extended actions",

				 * which need a value parameter. These have a local opcode located in

				 * the highest nibble, starting from 1. The rest of the bits

				 * are used to carry the value. These two parts together make

				 * a combined opcode.

				 */

				#define __TC_ACT_EXT_SHIFT 28

				#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)

				#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)

				#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK))

				#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode)

				#define TC_ACT_JUMP __TC_ACT_EXT(1)

				#define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2)

				#define TC_ACT_EXT_OPCODE_MAX	TC_ACT_GOTO_CHAIN

				/* Action type identifiers*/

				enum {

					TCA_ID_UNSPEC=0,

					TCA_ID_POLICE=1,

					/* other actions go here */

					__TCA_ID_MAX=255

				};

				#define TCA_ID_MAX __TCA_ID_MAX

				struct tc_police {

					__u32			index;

					int			action;

				#define TC_POLICE_UNSPEC	TC_ACT_UNSPEC

				#define TC_POLICE_OK		TC_ACT_OK

				#define TC_POLICE_RECLASSIFY	TC_ACT_RECLASSIFY

				#define TC_POLICE_SHOT		TC_ACT_SHOT

				#define TC_POLICE_PIPE		TC_ACT_PIPE

					__u32			limit;

					__u32			burst;

					__u32			mtu;

					struct tc_ratespec	rate;

					struct tc_ratespec	peakrate;

					int			refcnt;

					int			bindcnt;

					__u32			capab;

				};

				struct tcf_t {

					__u64   install;

					__u64   lastuse;

					__u64   expires;

					__u64   firstuse;

				};

				struct tc_cnt {

					int                   refcnt;

					int                   bindcnt;

				};

				#define tc_gen \

					__u32                 index; \

					__u32                 capab; \

					int                   action; \

					int                   refcnt; \

					int                   bindcnt

				enum {

					TCA_POLICE_UNSPEC,

					TCA_POLICE_TBF,

					TCA_POLICE_RATE,

					TCA_POLICE_PEAKRATE,

					TCA_POLICE_AVRATE,

					TCA_POLICE_RESULT,

					TCA_POLICE_TM,

					TCA_POLICE_PAD,

					__TCA_POLICE_MAX

				#define TCA_POLICE_RESULT TCA_POLICE_RESULT

				};

				#define TCA_POLICE_MAX (__TCA_POLICE_MAX - 1)

				/* tca flags definitions */

				#define TCA_CLS_FLAGS_SKIP_HW	(1 << 0) /* don't offload filter to HW */

				#define TCA_CLS_FLAGS_SKIP_SW	(1 << 1) /* don't use filter in SW */

				#define TCA_CLS_FLAGS_IN_HW	(1 << 2) /* filter is offloaded to HW */

				#define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */

				#define TCA_CLS_FLAGS_VERBOSE	(1 << 4) /* verbose logging */

				/* U32 filters */

				#define TC_U32_HTID(h) ((h)&0xFFF00000)

				#define TC_U32_USERHTID(h) (TC_U32_HTID(h)>>20)

				#define TC_U32_HASH(h) (((h)>>12)&0xFF)

				#define TC_U32_NODE(h) ((h)&0xFFF)

				#define TC_U32_KEY(h) ((h)&0xFFFFF)

				#define TC_U32_UNSPEC	0

				#define TC_U32_ROOT	(0xFFF00000)

				enum {

					TCA_U32_UNSPEC,

					TCA_U32_CLASSID,

					TCA_U32_HASH,

					TCA_U32_LINK,

					TCA_U32_DIVISOR,

					TCA_U32_SEL,

					TCA_U32_POLICE,

					TCA_U32_ACT,

					TCA_U32_INDEV,

					TCA_U32_PCNT,

					TCA_U32_MARK,

					TCA_U32_FLAGS,

					TCA_U32_PAD,

					__TCA_U32_MAX

				};

				#define TCA_U32_MAX (__TCA_U32_MAX - 1)

				struct tc_u32_key {

					__be32		mask;

					__be32		val;

					int		off;

					int		offmask;

				};

				struct tc_u32_sel {

					unsigned char		flags;

					unsigned char		offshift;

					unsigned char		nkeys;

					__be16			offmask;

					__u16			off;

					short			offoff;

					short			hoff;

					__be32			hmask;

					struct tc_u32_key	keys[];

				};

				struct tc_u32_mark {

					__u32		val;

					__u32		mask;

					__u32		success;

				};

				struct tc_u32_pcnt {

					__u64 rcnt;

					__u64 rhit;

					__u64 kcnts[];

				};

				/* Flags */

				#define TC_U32_TERMINAL		1

				#define TC_U32_OFFSET		2

				#define TC_U32_VAROFFSET	4

				#define TC_U32_EAT		8

				#define TC_U32_MAXDEPTH 8

				/* ROUTE filter */

				enum {

					TCA_ROUTE4_UNSPEC,

					TCA_ROUTE4_CLASSID,

					TCA_ROUTE4_TO,

					TCA_ROUTE4_FROM,

					TCA_ROUTE4_IIF,

					TCA_ROUTE4_POLICE,

					TCA_ROUTE4_ACT,

					__TCA_ROUTE4_MAX

				};

				#define TCA_ROUTE4_MAX (__TCA_ROUTE4_MAX - 1)

				/* FW filter */

				enum {

					TCA_FW_UNSPEC,

					TCA_FW_CLASSID,

					TCA_FW_POLICE,

					TCA_FW_INDEV,

					TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */

					TCA_FW_MASK,

					__TCA_FW_MAX

				};

				#define TCA_FW_MAX (__TCA_FW_MAX - 1)

				/* Flow filter */

				enum {

					FLOW_KEY_SRC,

					FLOW_KEY_DST,

					FLOW_KEY_PROTO,

					FLOW_KEY_PROTO_SRC,

					FLOW_KEY_PROTO_DST,

					FLOW_KEY_IIF,

					FLOW_KEY_PRIORITY,

					FLOW_KEY_MARK,

					FLOW_KEY_NFCT,

					FLOW_KEY_NFCT_SRC,

					FLOW_KEY_NFCT_DST,

					FLOW_KEY_NFCT_PROTO_SRC,

					FLOW_KEY_NFCT_PROTO_DST,

					FLOW_KEY_RTCLASSID,

					FLOW_KEY_SKUID,

					FLOW_KEY_SKGID,

					FLOW_KEY_VLAN_TAG,

					FLOW_KEY_RXHASH,

					__FLOW_KEY_MAX,

				};

				#define FLOW_KEY_MAX	(__FLOW_KEY_MAX - 1)

				enum {

					FLOW_MODE_MAP,

					FLOW_MODE_HASH,

				};

				enum {

					TCA_FLOW_UNSPEC,

					TCA_FLOW_KEYS,

					TCA_FLOW_MODE,

					TCA_FLOW_BASECLASS,

					TCA_FLOW_RSHIFT,

					TCA_FLOW_ADDEND,

					TCA_FLOW_MASK,

					TCA_FLOW_XOR,

					TCA_FLOW_DIVISOR,

					TCA_FLOW_ACT,

					TCA_FLOW_POLICE,

					TCA_FLOW_EMATCHES,

					TCA_FLOW_PERTURB,

					__TCA_FLOW_MAX

				};

				#define TCA_FLOW_MAX	(__TCA_FLOW_MAX - 1)

				/* Basic filter */

				enum {

					TCA_BASIC_UNSPEC,

					TCA_BASIC_CLASSID,

					TCA_BASIC_EMATCHES,

					TCA_BASIC_ACT,

					TCA_BASIC_POLICE,

					__TCA_BASIC_MAX

				};

				#define TCA_BASIC_MAX (__TCA_BASIC_MAX - 1)

				/* Cgroup classifier */

				enum {

					TCA_CGROUP_UNSPEC,

					TCA_CGROUP_ACT,

					TCA_CGROUP_POLICE,

					TCA_CGROUP_EMATCHES,

					__TCA_CGROUP_MAX,

				};

				#define TCA_CGROUP_MAX (__TCA_CGROUP_MAX - 1)

				/* BPF classifier */

				#define TCA_BPF_FLAG_ACT_DIRECT		(1 << 0)

				enum {

					TCA_BPF_UNSPEC,

					TCA_BPF_ACT,

					TCA_BPF_POLICE,

					TCA_BPF_CLASSID,

					TCA_BPF_OPS_LEN,

					TCA_BPF_OPS,

					TCA_BPF_FD,

					TCA_BPF_NAME,

					TCA_BPF_FLAGS,

					TCA_BPF_FLAGS_GEN,

					TCA_BPF_TAG,

					TCA_BPF_ID,

					__TCA_BPF_MAX,

				};

				#define TCA_BPF_MAX (__TCA_BPF_MAX - 1)

				/* Flower classifier */

				enum {

					TCA_FLOWER_UNSPEC,

					TCA_FLOWER_CLASSID,

					TCA_FLOWER_INDEV,

					TCA_FLOWER_ACT,

					TCA_FLOWER_KEY_ETH_DST,		/* ETH_ALEN */

					TCA_FLOWER_KEY_ETH_DST_MASK,	/* ETH_ALEN */

					TCA_FLOWER_KEY_ETH_SRC,		/* ETH_ALEN */

					TCA_FLOWER_KEY_ETH_SRC_MASK,	/* ETH_ALEN */

					TCA_FLOWER_KEY_ETH_TYPE,	/* be16 */

					TCA_FLOWER_KEY_IP_PROTO,	/* u8 */

					TCA_FLOWER_KEY_IPV4_SRC,	/* be32 */

					TCA_FLOWER_KEY_IPV4_SRC_MASK,	/* be32 */

					TCA_FLOWER_KEY_IPV4_DST,	/* be32 */

					TCA_FLOWER_KEY_IPV4_DST_MASK,	/* be32 */

					TCA_FLOWER_KEY_IPV6_SRC,	/* struct in6_addr */

					TCA_FLOWER_KEY_IPV6_SRC_MASK,	/* struct in6_addr */

					TCA_FLOWER_KEY_IPV6_DST,	/* struct in6_addr */

					TCA_FLOWER_KEY_IPV6_DST_MASK,	/* struct in6_addr */

					TCA_FLOWER_KEY_TCP_SRC,		/* be16 */

					TCA_FLOWER_KEY_TCP_DST,		/* be16 */

					TCA_FLOWER_KEY_UDP_SRC,		/* be16 */

					TCA_FLOWER_KEY_UDP_DST,		/* be16 */

					TCA_FLOWER_FLAGS,

					TCA_FLOWER_KEY_VLAN_ID,		/* be16 */

					TCA_FLOWER_KEY_VLAN_PRIO,	/* u8   */

					TCA_FLOWER_KEY_VLAN_ETH_TYPE,	/* be16 */

					TCA_FLOWER_KEY_ENC_KEY_ID,	/* be32 */

					TCA_FLOWER_KEY_ENC_IPV4_SRC,	/* be32 */

					TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,/* be32 */

					TCA_FLOWER_KEY_ENC_IPV4_DST,	/* be32 */

					TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,/* be32 */

					TCA_FLOWER_KEY_ENC_IPV6_SRC,	/* struct in6_addr */

					TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,/* struct in6_addr */

					TCA_FLOWER_KEY_ENC_IPV6_DST,	/* struct in6_addr */

					TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,/* struct in6_addr */

					TCA_FLOWER_KEY_TCP_SRC_MASK,	/* be16 */

					TCA_FLOWER_KEY_TCP_DST_MASK,	/* be16 */

					TCA_FLOWER_KEY_UDP_SRC_MASK,	/* be16 */

					TCA_FLOWER_KEY_UDP_DST_MASK,	/* be16 */

					TCA_FLOWER_KEY_SCTP_SRC_MASK,	/* be16 */

					TCA_FLOWER_KEY_SCTP_DST_MASK,	/* be16 */

					TCA_FLOWER_KEY_SCTP_SRC,	/* be16 */

					TCA_FLOWER_KEY_SCTP_DST,	/* be16 */

					TCA_FLOWER_KEY_ENC_UDP_SRC_PORT,	/* be16 */

					TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK,	/* be16 */

					TCA_FLOWER_KEY_ENC_UDP_DST_PORT,	/* be16 */

					TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK,	/* be16 */

					TCA_FLOWER_KEY_FLAGS,		/* be32 */

					TCA_FLOWER_KEY_FLAGS_MASK,	/* be32 */

					TCA_FLOWER_KEY_ICMPV4_CODE,	/* u8 */

					TCA_FLOWER_KEY_ICMPV4_CODE_MASK,/* u8 */

					TCA_FLOWER_KEY_ICMPV4_TYPE,	/* u8 */

					TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,/* u8 */

					TCA_FLOWER_KEY_ICMPV6_CODE,	/* u8 */

					TCA_FLOWER_KEY_ICMPV6_CODE_MASK,/* u8 */

					TCA_FLOWER_KEY_ICMPV6_TYPE,	/* u8 */

					TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */

					TCA_FLOWER_KEY_ARP_SIP,		/* be32 */

					TCA_FLOWER_KEY_ARP_SIP_MASK,	/* be32 */

					TCA_FLOWER_KEY_ARP_TIP,		/* be32 */

					TCA_FLOWER_KEY_ARP_TIP_MASK,	/* be32 */

					TCA_FLOWER_KEY_ARP_OP,		/* u8 */

					TCA_FLOWER_KEY_ARP_OP_MASK,	/* u8 */

					TCA_FLOWER_KEY_ARP_SHA,		/* ETH_ALEN */

					TCA_FLOWER_KEY_ARP_SHA_MASK,	/* ETH_ALEN */

					TCA_FLOWER_KEY_ARP_THA,		/* ETH_ALEN */

					TCA_FLOWER_KEY_ARP_THA_MASK,	/* ETH_ALEN */

					TCA_FLOWER_KEY_MPLS_TTL,	/* u8 - 8 bits */

					TCA_FLOWER_KEY_MPLS_BOS,	/* u8 - 1 bit */

					TCA_FLOWER_KEY_MPLS_TC,		/* u8 - 3 bits */

					TCA_FLOWER_KEY_MPLS_LABEL,	/* be32 - 20 bits */

					TCA_FLOWER_KEY_TCP_FLAGS,	/* be16 */

					TCA_FLOWER_KEY_TCP_FLAGS_MASK,	/* be16 */

					TCA_FLOWER_KEY_IP_TOS,		/* u8 */

					TCA_FLOWER_KEY_IP_TOS_MASK,	/* u8 */

					TCA_FLOWER_KEY_IP_TTL,		/* u8 */

					TCA_FLOWER_KEY_IP_TTL_MASK,	/* u8 */

					TCA_FLOWER_KEY_CVLAN_ID,	/* be16 */

					TCA_FLOWER_KEY_CVLAN_PRIO,	/* u8   */

					TCA_FLOWER_KEY_CVLAN_ETH_TYPE,	/* be16 */

					TCA_FLOWER_KEY_ENC_IP_TOS,	/* u8 */

					TCA_FLOWER_KEY_ENC_IP_TOS_MASK,	/* u8 */

					TCA_FLOWER_KEY_ENC_IP_TTL,	/* u8 */

					TCA_FLOWER_KEY_ENC_IP_TTL_MASK,	/* u8 */

					TCA_FLOWER_KEY_ENC_OPTS,

					TCA_FLOWER_KEY_ENC_OPTS_MASK,

					TCA_FLOWER_IN_HW_COUNT,

					__TCA_FLOWER_MAX,

				};

				#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)

				enum {

					TCA_FLOWER_KEY_ENC_OPTS_UNSPEC,

					TCA_FLOWER_KEY_ENC_OPTS_GENEVE, /* Nested

									 * TCA_FLOWER_KEY_ENC_OPT_GENEVE_

									 * attributes

									 */

					__TCA_FLOWER_KEY_ENC_OPTS_MAX,

				};

				#define TCA_FLOWER_KEY_ENC_OPTS_MAX (__TCA_FLOWER_KEY_ENC_OPTS_MAX - 1)

				enum {

					TCA_FLOWER_KEY_ENC_OPT_GENEVE_UNSPEC,

					TCA_FLOWER_KEY_ENC_OPT_GENEVE_CLASS,            /* u16 */

					TCA_FLOWER_KEY_ENC_OPT_GENEVE_TYPE,             /* u8 */

					TCA_FLOWER_KEY_ENC_OPT_GENEVE_DATA,             /* 4 to 128 bytes */

					__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX,

				};

				#define TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX \

						(__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX - 1)

				enum {

					TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0),

					TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),

				};

				/* Match-all classifier */

				enum {

					TCA_MATCHALL_UNSPEC,

					TCA_MATCHALL_CLASSID,

					TCA_MATCHALL_ACT,

					TCA_MATCHALL_FLAGS,

					__TCA_MATCHALL_MAX,

				};

				#define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1)

				/* Extended Matches */

				struct tcf_ematch_tree_hdr {

					__u16		nmatches;

					__u16		progid;

				};

				enum {

					TCA_EMATCH_TREE_UNSPEC,

					TCA_EMATCH_TREE_HDR,

					TCA_EMATCH_TREE_LIST,

					__TCA_EMATCH_TREE_MAX

				};

				#define TCA_EMATCH_TREE_MAX (__TCA_EMATCH_TREE_MAX - 1)

				struct tcf_ematch_hdr {

					__u16		matchid;

					__u16		kind;

					__u16		flags;

					__u16		pad; /* currently unused */

				};

				/*  0                   1

				 *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 

				 * +-----------------------+-+-+---+

				 * |         Unused        |S|I| R |

				 * +-----------------------+-+-+---+

				 *

				 * R(2) ::= relation to next ematch

				 *          where: 0 0 END (last ematch)

				 *                 0 1 AND

				 *                 1 0 OR

				 *                 1 1 Unused (invalid)

				 * I(1) ::= invert result

				 * S(1) ::= simple payload

				 */

				#define TCF_EM_REL_END	0

				#define TCF_EM_REL_AND	(1<<0)

				#define TCF_EM_REL_OR	(1<<1)

				#define TCF_EM_INVERT	(1<<2)

				#define TCF_EM_SIMPLE	(1<<3)

				#define TCF_EM_REL_MASK	3

				#define TCF_EM_REL_VALID(v) (((v) & TCF_EM_REL_MASK) != TCF_EM_REL_MASK)

				enum {

					TCF_LAYER_LINK,

					TCF_LAYER_NETWORK,

					TCF_LAYER_TRANSPORT,

					__TCF_LAYER_MAX

				};

				#define TCF_LAYER_MAX (__TCF_LAYER_MAX - 1)

				/* Ematch type assignments

				 *   1..32767		Reserved for ematches inside kernel tree

				 *   32768..65535	Free to use, not reliable

				 */

				#define	TCF_EM_CONTAINER	0

				#define	TCF_EM_CMP		1

				#define	TCF_EM_NBYTE		2

				#define	TCF_EM_U32		3

				#define	TCF_EM_META		4

				#define	TCF_EM_TEXT		5

				#define	TCF_EM_VLAN		6

				#define	TCF_EM_CANID		7

				#define	TCF_EM_IPSET		8

				#define	TCF_EM_IPT		9

				#define	TCF_EM_MAX		9

				enum {

					TCF_EM_PROG_TC

				};

				enum {

					TCF_EM_OPND_EQ,

					TCF_EM_OPND_GT,

					TCF_EM_OPND_LT

				};

				#endif

1055

include/uapi/linux/pkt_sched.h Normal file


File diff suppressed because it is too large

									
										95

meson.build
									
											
				@@ -1,95 +0,0 @@

				# SPDX-License-Identifier: LGPL-2.1 OR BSD-2-Clause

				project('libbpf', 'c',

				        version : '0.0.3',

				        license : 'LGPL-2.1 OR BSD-2-Clause',

				        default_options : [

				                'prefix=/usr',

				        ],

				        meson_version : '>= 0.46',

				        )

				patchlevel = meson.project_version().split('.')[1]

				libbpf_source_dir = './'

				libbpf_sources = files(run_command('find',

				        [

				        '@0@/src'.format(libbpf_source_dir),

				        '-type',

				        'f',

				        '-name',

				        '*.[hc]']).stdout().split())

				libbpf_headers = files(

				        join_paths(libbpf_source_dir, 'src/bpf.h'),

				        join_paths(libbpf_source_dir, 'src/btf.h'),

				        join_paths(libbpf_source_dir, 'src/libbpf.h'))

				feature_rellocarray = run_command(join_paths(libbpf_source_dir, 'scripts/check-reallocarray.sh'))

				libbpf_c_args = ['-g',

				                '-O2',

				                '-Werror',

				                '-Wall',

				]

				if feature_rellocarray.stdout().strip() != ''

				        libbpf_c_args += '-DCOMPAT_NEED_REALLOCARRAY'

				endif

				# bpf_includes are required to include bpf.h, btf.h, libbpf.h

				bpf_includes = include_directories(

				        join_paths(libbpf_source_dir, 'src'))

				libbpf_includes = include_directories(

				        join_paths(libbpf_source_dir, 'include'),

				        join_paths(libbpf_source_dir, 'include/uapi'))

				libelf = dependency('libelf')

				libelf = dependency('libelf', required: false)

				if not libelf.found()

				        libelf = cc.find_library('elf', required: true)

				endif

				deps = [libelf]

				libbpf_static = static_library(

				        'bpf',

				        libbpf_sources,

				        c_args : libbpf_c_args,

				        dependencies : deps,

				        include_directories : libbpf_includes,

				        install : true)

				libbpf_static_dep = declare_dependency(link_with : libbpf_static)

				libbpf_map_source_path = join_paths(libbpf_source_dir, 'src/libbpf.map')

				libbpf_map_abs_path = join_paths(meson.current_source_dir(), libbpf_map_source_path)

				libbpf_c_args += ['-fPIC', '-fvisibility=hidden']

				libbpf_link_args = ['-Wl,--version-script=@0@'.format(libbpf_map_abs_path)]

				libbpf_shared = shared_library(

				        'bpf',

				        libbpf_sources,

				        c_args : libbpf_c_args,

				        dependencies : deps,

				        include_directories : libbpf_includes,

				        install : true,

				        link_args : libbpf_link_args,

				        link_depends : libbpf_map_source_path,

				        soversion : patchlevel,

				        version : meson.project_version())

				libbpf_shared_dep = declare_dependency(link_with : libbpf_shared)

				install_headers(libbpf_headers, subdir : 'bpf')

				pkg = import('pkgconfig')

				pkg.generate(

				        name: meson.project_name(),

				        version: meson.project_version(),

				        libraries: libbpf_shared,

				        requires_private: ['libelf'],

				        description: '''BPF library''')

									
										82

scripts/build-fuzzers.sh
									
										Executable file
									
												
				@@ -0,0 +1,82 @@

				#!/bin/bash

				set -eux

				SANITIZER=${SANITIZER:-address}

				flags="-O1 -fno-omit-frame-pointer -g -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=$SANITIZER -fsanitize=fuzzer-no-link"

				export CC=${CC:-clang}

				export CFLAGS=${CFLAGS:-$flags}

				export CXX=${CXX:-clang++}

				export CXXFLAGS=${CXXFLAGS:-$flags}

				cd "$(dirname -- "$0")/.."

				export OUT=${OUT:-"$(pwd)/out"}

				mkdir -p "$OUT"

				export LIB_FUZZING_ENGINE=${LIB_FUZZING_ENGINE:--fsanitize=fuzzer}

				# libelf is compiled with _FORTIFY_SOURCE by default and it

				# isn't compatible with MSan. It was borrowed

				# from https://github.com/google/oss-fuzz/pull/7422

				if [[ "$SANITIZER" == memory ]]; then

				    CFLAGS+=" -U_FORTIFY_SOURCE"

				    CXXFLAGS+=" -U_FORTIFY_SOURCE"

				fi

				# The alignment check is turned off by default on OSS-Fuzz/CFLite so it should be

				# turned on explicitly there. It was borrowed from

				# https://github.com/google/oss-fuzz/pull/7092

				if [[ "$SANITIZER" == undefined ]]; then

				    additional_ubsan_checks=alignment

				    UBSAN_FLAGS="-fsanitize=$additional_ubsan_checks -fno-sanitize-recover=$additional_ubsan_checks"

				    CFLAGS+=" $UBSAN_FLAGS"

				    CXXFLAGS+=" $UBSAN_FLAGS"

				fi

				# Ideally libelf should be built using release tarballs available

				# at https://sourceware.org/elfutils/ftp/. Unfortunately sometimes they

				# fail to compile (for example, elfutils-0.185 fails to compile with LDFLAGS enabled

				# due to https://bugs.gentoo.org/794601) so let's just point the script to

				# commits referring to versions of libelf that actually can be built

				rm -rf elfutils

				git clone https://sourceware.org/git/elfutils.git

				(

				cd elfutils

				git checkout 67a187d4c1790058fc7fd218317851cb68bb087c

				git log --oneline -1

				# ASan isn't compatible with -Wl,--no-undefined: https://github.com/google/sanitizers/issues/380

				sed -i 's/^\(NO_UNDEFINED=\).*/\1/' configure.ac

				# ASan isn't compatible with -Wl,-z,defs either:

				# https://clang.llvm.org/docs/AddressSanitizer.html#usage

				sed -i 's/^\(ZDEFS_LDFLAGS=\).*/\1/' configure.ac

				if [[ "$SANITIZER" == undefined ]]; then

				    # That's basically what --enable-sanitize-undefined does to turn off the unaligned access

				    # that elfutils relies on heavily on i386/x86_64, but without changing compiler flags along the way

				    sed -i 's/\(check_undefined_val\)=[0-9]/\1=1/' configure.ac

				fi

				autoreconf -i -f

				if ! ./configure --enable-maintainer-mode --disable-debuginfod --disable-libdebuginfod \

				            --disable-demangler --without-bzlib --without-lzma --without-zstd \

					    CC="$CC" CFLAGS="-Wno-error $CFLAGS" CXX="$CXX" CXXFLAGS="-Wno-error $CXXFLAGS" LDFLAGS="$CFLAGS"; then

				    cat config.log

				    exit 1

				fi

				make -C config -j$(nproc) V=1

				make -C lib -j$(nproc) V=1

				make -C libelf -j$(nproc) V=1

				)

				make -C src BUILD_STATIC_ONLY=y V=1 clean

				make -C src -j$(nproc) CFLAGS="-I$(pwd)/elfutils/libelf $CFLAGS" BUILD_STATIC_ONLY=y V=1

				$CC $CFLAGS -Isrc -Iinclude -Iinclude/uapi -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64  -c fuzz/bpf-object-fuzzer.c -o bpf-object-fuzzer.o

				$CXX $CXXFLAGS $LIB_FUZZING_ENGINE bpf-object-fuzzer.o src/libbpf.a "$(pwd)/elfutils/libelf/libelf.a" -l:libz.a -o "$OUT/bpf-object-fuzzer"

				cp fuzz/bpf-object-fuzzer_seed_corpus.zip "$OUT"
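
build-fuzzers.sh takes most of its settings from the environment and falls back to defaults through `${VAR:-default}`, so a CI harness can override the sanitizer or the whole flag set. Below is a minimal sketch of that pattern. The `fuzz_flags` helper is hypothetical and the flag list is abbreviated for illustration; it is not the script's actual logic.

```shell
#!/bin/bash
set -eu

# Hypothetical helper that mirrors how the script derives its compiler flags.
fuzz_flags() {
    # Default to ASan unless the caller set SANITIZER.
    local sanitizer=${SANITIZER:-address}
    # Abbreviated flag set; the real script passes more options.
    local flags="-O1 -g -fsanitize=${sanitizer} -fsanitize=fuzzer-no-link"
    # A caller-provided CFLAGS wins over the computed default.
    echo "${CFLAGS:-$flags}"
}

# Each call runs in a subshell so the overrides do not leak into the others.
default_flags=$(unset SANITIZER CFLAGS; fuzz_flags)
msan_flags=$(unset CFLAGS; SANITIZER=memory; fuzz_flags)
custom_flags=$(CFLAGS="-O0"; fuzz_flags)

echo "default: ${default_flags}"
echo "memory:  ${msan_flags}"
echo "custom:  ${custom_flags}"
```

Because the defaults are applied with `:-`, an empty `CFLAGS=""` is treated the same as an unset one and the computed flags are used.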

									
										18

scripts/check-reallocarray.sh
									
											
				@@ -1,18 +0,0 @@

				#!/bin/sh

				tfile=$(mktemp /tmp/test_reallocarray_XXXXXXXX.c)

				ofile=${tfile%.c}.o

				cat > $tfile <<EOL

				#define _GNU_SOURCE

				#include <stdlib.h>

				int main(void)

				{

					return !!reallocarray(NULL, 1, 1);

				}

				EOL

				gcc $tfile -o $ofile >/dev/null 2>&1

				if [ $? -ne 0 ]; then echo "FAIL"; fi

				/bin/rm -f $tfile $ofile

									
										105

scripts/coverity.sh
									
										Executable file
									
												
				@@ -0,0 +1,105 @@

				#!/bin/bash

				# Taken from: https://scan.coverity.com/scripts/travisci_build_coverity_scan.sh

				# Local changes are annotated with "#[local]"

				set -e

				# Environment check

				echo -e "\033[33;1mNote: COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN are available on Project Settings page on scan.coverity.com\033[0m"

				[ -z "$COVERITY_SCAN_PROJECT_NAME" ] && echo "ERROR: COVERITY_SCAN_PROJECT_NAME must be set" && exit 1

				[ -z "$COVERITY_SCAN_NOTIFICATION_EMAIL" ] && echo "ERROR: COVERITY_SCAN_NOTIFICATION_EMAIL must be set" && exit 1

				[ -z "$COVERITY_SCAN_BRANCH_PATTERN" ] && echo "ERROR: COVERITY_SCAN_BRANCH_PATTERN must be set" && exit 1

				[ -z "$COVERITY_SCAN_BUILD_COMMAND" ] && echo "ERROR: COVERITY_SCAN_BUILD_COMMAND must be set" && exit 1

				[ -z "$COVERITY_SCAN_TOKEN" ] && echo "ERROR: COVERITY_SCAN_TOKEN must be set" && exit 1

				PLATFORM=`uname`

				#[local] Use /var/tmp for TOOL_ARCHIVE and TOOL_BASE, as on certain systems

				# /tmp is tmpfs and is sometimes too small to handle all necessary tooling

				TOOL_ARCHIVE=/var/tmp/cov-analysis-${PLATFORM}.tgz

				TOOL_URL=https://scan.coverity.com/download/${PLATFORM}

				TOOL_BASE=/var/tmp/coverity-scan-analysis

				UPLOAD_URL="https://scan.coverity.com/builds"

				SCAN_URL="https://scan.coverity.com"

				# Do not run on pull requests

				if [ "${TRAVIS_PULL_REQUEST}" = "true" ]; then

				  echo -e "\033[33;1mINFO: Skipping Coverity Analysis: branch is a pull request.\033[0m"

				  exit 0

				fi

				# Verify this branch should run

				IS_COVERITY_SCAN_BRANCH=`ruby -e "puts '${TRAVIS_BRANCH}' =~ /\\A$COVERITY_SCAN_BRANCH_PATTERN\\z/ ? 1 : 0"`

				if [ "$IS_COVERITY_SCAN_BRANCH" = "1" ]; then

				  echo -e "\033[33;1mCoverity Scan configured to run on branch ${TRAVIS_BRANCH}\033[0m"

				else

				  echo -e "\033[33;1mCoverity Scan NOT configured to run on branch ${TRAVIS_BRANCH}\033[0m"

				  exit 1

				fi

				# Verify upload is permitted

				AUTH_RES=`curl -s --form project="$COVERITY_SCAN_PROJECT_NAME" --form token="$COVERITY_SCAN_TOKEN" $SCAN_URL/api/upload_permitted`

				if [ "$AUTH_RES" = "Access denied" ]; then

				  echo -e "\033[33;1mCoverity Scan API access denied. Check COVERITY_SCAN_PROJECT_NAME and COVERITY_SCAN_TOKEN.\033[0m"

				  exit 1

				else

				  AUTH=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['upload_permitted']"`

				  if [ "$AUTH" = "true" ]; then

				    echo -e "\033[33;1mCoverity Scan analysis authorized per quota.\033[0m"

				  else

				    WHEN=`echo $AUTH_RES | ruby -e "require 'rubygems'; require 'json'; puts JSON[STDIN.read]['next_upload_permitted_at']"`

				    echo -e "\033[33;1mCoverity Scan analysis NOT authorized until $WHEN.\033[0m"

				    exit 0

				  fi

				fi

				if [ ! -d $TOOL_BASE ]; then

				  # Download Coverity Scan Analysis Tool

				  if [ ! -e $TOOL_ARCHIVE ]; then

				    echo -e "\033[33;1mDownloading Coverity Scan Analysis Tool...\033[0m"

				    wget -nv -O $TOOL_ARCHIVE $TOOL_URL --post-data "project=$COVERITY_SCAN_PROJECT_NAME&token=$COVERITY_SCAN_TOKEN"

				  fi

				  # Extract Coverity Scan Analysis Tool

				  echo -e "\033[33;1mExtracting Coverity Scan Analysis Tool...\033[0m"

				  mkdir -p $TOOL_BASE

				  pushd $TOOL_BASE

				  tar xzf $TOOL_ARCHIVE

				  popd

				fi

				TOOL_DIR=`find $TOOL_BASE -type d -name 'cov-analysis*'`

				export PATH=$TOOL_DIR/bin:$PATH

				# Build

				echo -e "\033[33;1mRunning Coverity Scan Analysis Tool...\033[0m"

				COV_BUILD_OPTIONS=""

				#COV_BUILD_OPTIONS="--return-emit-failures 8 --parse-error-threshold 85"

				RESULTS_DIR="cov-int"

				eval "${COVERITY_SCAN_BUILD_COMMAND_PREPEND}"

				COVERITY_UNSUPPORTED=1 cov-build --dir $RESULTS_DIR $COV_BUILD_OPTIONS $COVERITY_SCAN_BUILD_COMMAND

				cov-import-scm --dir $RESULTS_DIR --scm git --log $RESULTS_DIR/scm_log.txt 2>&1

				# Upload results

				echo -e "\033[33;1mTarring Coverity Scan Analysis results...\033[0m"

				RESULTS_ARCHIVE=analysis-results.tgz

				tar czf $RESULTS_ARCHIVE $RESULTS_DIR

				SHA=`git rev-parse --short HEAD`

				echo -e "\033[33;1mUploading Coverity Scan Analysis results...\033[0m"

				response=$(curl \

				  --silent --write-out "\n%{http_code}\n" \

				  --form project=$COVERITY_SCAN_PROJECT_NAME \

				  --form token=$COVERITY_SCAN_TOKEN \

				  --form email=$COVERITY_SCAN_NOTIFICATION_EMAIL \

				  --form file=@$RESULTS_ARCHIVE \

				  --form version=$SHA \

				  --form description="Travis CI build" \

				  $UPLOAD_URL)

				status_code=$(echo "$response" | sed -n '$p')

				#[local] Coverity used to return 201 on success, but it's 200 now

				# See https://github.com/systemd/systemd/blob/master/tools/coverity.sh#L145

				if [ "$status_code" != "200" ]; then

				  TEXT=$(echo "$response" | sed '$d')

				  echo -e "\033[33;1mCoverity Scan upload failed: $TEXT.\033[0m"

				  exit 1

				fi
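
The upload step asks curl to print the HTTP status code on its own final line (`--write-out "\n%{http_code}\n"`) and then splits the captured response with `sed`. Here is a minimal sketch of that split. The response string is made up for illustration and is not a real Coverity reply.

```shell
#!/bin/bash
set -eu

# Hypothetical response: a body followed by the status code on the last line,
# in the shape produced by curl --write-out "\n%{http_code}\n" once command
# substitution has stripped the trailing newline.
response=$'Build successfully submitted.\n200'

# sed -n '$p' prints only the last line, i.e. the status code.
status_code=$(echo "$response" | sed -n '$p')

# sed '$d' deletes the last line, leaving only the body.
body=$(echo "$response" | sed '$d')

echo "status=${status_code}"
echo "body=${body}"
```

Keeping the status code on a separate trailing line means the body can contain any text without confusing the check.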

									
										37

scripts/mailmap-update.sh
									
										Executable file
									
												
				@@ -0,0 +1,37 @@

				#!/usr/bin/env bash

				set -eu

				usage () {

				    echo "USAGE: ./mailmap-update.sh <libbpf-repo> <linux-repo>"

				    exit 1

				}

				LIBBPF_REPO="${1-""}"

				LINUX_REPO="${2-""}"

				if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ]; then

				    echo "Error: libbpf or linux repos are not specified"

				    usage

				fi

				LIBBPF_MAILMAP="${LIBBPF_REPO}/.mailmap"

				LINUX_MAILMAP="${LINUX_REPO}/.mailmap"

				tmpfile="$(mktemp)"

				cleanup() {

				    rm -f "${tmpfile}"

				}

				trap cleanup EXIT

				grep_lines() {

				    local pattern="$1"

				    local file="$2"

				    grep "${pattern}" "${file}" || true

				}

				while read -r email; do

				    grep_lines "${email}$" "${LINUX_MAILMAP}" >> "${tmpfile}"

				done < <(git log --format='<%ae>' | sort -u)

				sort -u "${tmpfile}" > "${LIBBPF_MAILMAP}"
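
mailmap-update.sh runs under `set -eu`, and `grep` exits with status 1 when nothing matches, which would abort the script at the first author missing from the Linux `.mailmap`. The `grep_lines` wrapper swallows that status with `|| true`. Below is a small sketch of the pattern. The mailmap entry and addresses are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Same helper as in the script: a failed match must not trip `set -e`.
grep_lines() {
    local pattern="$1"
    local file="$2"
    grep "${pattern}" "${file}" || true
}

# Hypothetical mailmap contents, for illustration only.
mailmap="$(mktemp)"
trap 'rm -f "${mailmap}"' EXIT
printf '%s\n' 'Jane Doe <jane@example.org> <jd@old.example.org>' > "${mailmap}"

# One address that is present at the end of a line, and one that is absent.
hit="$(grep_lines '<jd@old.example.org>$' "${mailmap}")"
miss="$(grep_lines '<nobody@example.org>$' "${mailmap}")"

echo "hit:  ${hit}"
echo "miss: ${miss:-<none>}"
```

Without the `|| true`, the second lookup would end the script before the final `sort -u` could write the result.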

									
										417

scripts/sync-kernel.sh
									
												
				@@ -1,40 +1,214 @@

				#!/bin/bash

				usage () {

				    echo "USAGE: ./sync-kernel.sh <kernel-repo> <libbpf-repo> [<baseline-commit>]"

				    echo ""

				    echo "If <baseline-commit> is not specified, it's read from <libbpf-repo>/CHECKPOINT-COMMIT"

				    exit 1

					echo "USAGE: ./sync-kernel.sh <libbpf-repo> <kernel-repo> <bpf-branch>"

					echo ""

					echo "Set BPF_NEXT_BASELINE to override bpf-next tree commit, otherwise read from <libbpf-repo>/CHECKPOINT-COMMIT."

					echo "Set BPF_BASELINE to override bpf tree commit, otherwise read from <libbpf-repo>/BPF-CHECKPOINT-COMMIT."

					echo "Set MANUAL_MODE to 1 to manually control every cherry-picked commit."

					exit 1

				}

				LINUX_REPO=${1-""}

				LIBBPF_REPO=${2-""}

				if [ -z "${LINUX_REPO}" ]; then

				    usage

				fi

				if [ -z "${LIBBPF_REPO}" ]; then

				    usage

				fi

				set -eu

				WORKDIR=$(pwd)

				trap "cd ${WORKDIR}; exit" INT TERM EXIT

				LIBBPF_REPO=${1-""}

				LINUX_REPO=${2-""}

				BPF_BRANCH=${3-""}

				BASELINE_COMMIT=${BPF_NEXT_BASELINE:-$(cat ${LIBBPF_REPO}/CHECKPOINT-COMMIT)}

				BPF_BASELINE_COMMIT=${BPF_BASELINE:-$(cat ${LIBBPF_REPO}/BPF-CHECKPOINT-COMMIT)}

				echo "WORKDIR:         ${WORKDIR}"

				echo "LINUX REPO:      ${LINUX_REPO}"

				echo "LIBBPF REPO:     ${LIBBPF_REPO}"

				if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ]; then

					echo "Error: libbpf or linux repos are not specified"

					usage

				fi

				if [ -z "${BPF_BRANCH}" ]; then

					echo "Error: linux's bpf tree branch is not specified"

					usage

				fi

				if [ -z "${BASELINE_COMMIT}" ] || [ -z "${BPF_BASELINE_COMMIT}" ]; then

					echo "Error: bpf or bpf-next baseline commits are not provided"

					usage

				fi

				SUFFIX=$(date --utc +%Y-%m-%dT%H-%M-%S.%3NZ)

				BASELINE_COMMIT=${3-$(cat ${LIBBPF_REPO}/CHECKPOINT-COMMIT)}

				WORKDIR=$(pwd)

				TMP_DIR=$(mktemp -d)

				trap "cd ${WORKDIR}; exit" INT TERM EXIT

				declare -A PATH_MAP

				PATH_MAP=(									\

					[tools/lib/bpf]=src							\

					[tools/include/uapi/linux/bpf_common.h]=include/uapi/linux/bpf_common.h	\

					[tools/include/uapi/linux/bpf.h]=include/uapi/linux/bpf.h		\

					[tools/include/uapi/linux/btf.h]=include/uapi/linux/btf.h		\

					[tools/include/uapi/linux/fcntl.h]=include/uapi/linux/fcntl.h		\

					[tools/include/uapi/linux/openat2.h]=include/uapi/linux/openat2.h	\

					[tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h	\

					[tools/include/uapi/linux/if_xdp.h]=include/uapi/linux/if_xdp.h		\

					[tools/include/uapi/linux/netdev.h]=include/uapi/linux/netdev.h		\

					[tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h	\

					[tools/include/uapi/linux/pkt_cls.h]=include/uapi/linux/pkt_cls.h	\

					[tools/include/uapi/linux/pkt_sched.h]=include/uapi/linux/pkt_sched.h	\

					[include/uapi/linux/perf_event.h]=include/uapi/linux/perf_event.h	\

					[Documentation/bpf/libbpf]=docs						\

				)

				LIBBPF_PATHS=("${!PATH_MAP[@]}" ":^tools/lib/bpf/Makefile" ":^tools/lib/bpf/Build" ":^tools/lib/bpf/.gitignore" ":^tools/include/tools/libc_compat.h")

				LIBBPF_VIEW_PATHS=("${PATH_MAP[@]}")

				LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$|^docs/(\.gitignore|api\.rst|conf\.py)$|^docs/sphinx/.*'

				LINUX_VIEW_EXCLUDE_REGEX='^include/tools/libc_compat.h$'

				LIBBPF_TREE_FILTER="mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools && "$'\\\n'

				for p in "${!PATH_MAP[@]}"; do

					LIBBPF_TREE_FILTER+="git mv -kf ${p} __libbpf/${PATH_MAP[${p}]} && "$'\\\n'

				done

				LIBBPF_TREE_FILTER+="git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.c,.gitignore} >/dev/null"

				cd_to()

				{

					cd ${WORKDIR} && cd "$1"

				}

				# Output brief single-line commit description

				# $1 - commit ref

				commit_desc()

				{

					git log -n1 --pretty='%h ("%s")' $1

				}

				# Create commit single-line signature, which consists of:

				# - full commit subject

				# - author date in ISO8601 format

				# - full commit body with newlines replaced with vertical bars (|)

				# - shortstat appended at the end

				# The idea is that this single-line signature is good enough to make final

				# decision about whether two commits are the same, across different repos.

				# $1 - commit ref

				# $2 - paths filter

				commit_signature()

				{

					local ref=$1

					shift

					git show --pretty='("%s")|%aI|%b' --shortstat $ref -- "${@-.}" | tr '\n' '|'

				}

				# Cherry-pick commits touching libbpf-related files

				# $1 - baseline_tag

				# $2 - tip_tag

				cherry_pick_commits()

				{

					local manual_mode=${MANUAL_MODE:-0}

					local baseline_tag=$1

					local tip_tag=$2

					local new_commits

					local signature

					local should_skip

					local synced_cnt

					local manual_check

					local libbpf_conflict_cnt

					local desc

					new_commits=$(git rev-list --no-merges --topo-order --reverse ${baseline_tag}..${tip_tag} -- "${LIBBPF_PATHS[@]}")

					for new_commit in ${new_commits}; do

						desc="$(commit_desc ${new_commit})"

						signature="$(commit_signature ${new_commit} "${LIBBPF_PATHS[@]}")"

						synced_cnt=$(grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | wc -l)

						manual_check=0

						if ((${synced_cnt} > 0)); then

							# commit with the same subject is already in libbpf, but it's

							# not 100% the same commit, so check with user

							echo "Commit '${desc}' is synced into libbpf as:"

							grep -F "${signature}" ${TMP_DIR}/libbpf_commits.txt | \

								cut -d'|' -f1 | sed -e 's/^/- /'

							if ((${manual_mode} != 1 && ${synced_cnt} == 1)); then

								echo "Skipping '${desc}' due to unique match..."

								continue

							fi

							if ((${synced_cnt} > 1)); then

								echo "'${desc}' matches multiple commits, please, double-check!"

								manual_check=1

							fi

						fi

						if ((${manual_mode} == 1 || ${manual_check} == 1)); then

							read -p "Do you want to skip '${desc}'? [y/N]: " should_skip

							case "${should_skip}" in

								"y" | "Y")

									echo "Skipping '${desc}'..."

									continue

									;;

							esac

						fi

						# commit hasn't been synced into libbpf yet

						echo "Picking '${desc}'..."

						if ! git cherry-pick ${new_commit} &>/dev/null; then

							echo "Warning! Cherry-picking '${desc}' failed, checking if it's non-libbpf files causing problems..."

							libbpf_conflict_cnt=$(git diff --name-only --diff-filter=U -- "${LIBBPF_PATHS[@]}" | wc -l)

							conflict_cnt=$(git diff --name-only | wc -l)

							prompt_resolution=1

							if ((${libbpf_conflict_cnt} == 0)); then

								echo "Looks like only non-libbpf files have conflicts, ignoring..."

								if ((${conflict_cnt} == 0)); then

									echo "Empty cherry-pick, skipping it..."

									git cherry-pick --abort

									continue

								fi

								git add .

								# GIT_EDITOR=true to avoid editor popping up to edit commit message

								if ! GIT_EDITOR=true git cherry-pick --continue &>/dev/null; then

									echo "Error! That still failed! Please resolve manually."

								else

									echo "Success! All cherry-pick conflicts were resolved for '${desc}'!"

									prompt_resolution=0

								fi

							fi

							if ((${prompt_resolution} == 1)); then

								read -p "Error! Cherry-picking '${desc}' failed, please fix manually and press <return> to proceed..."

							fi

						fi

						# Append signature of just cherry-picked commit to avoid

						# potentially cherry-picking the same commit twice later when

						# processing bpf tree commits. At this point we don't know yet

						# the final commit sha in libbpf repo, so we record Linux SHA

						# instead as LINUX_<sha>.

						echo LINUX_$(git log --pretty='%h' -n1) "${signature}" >> ${TMP_DIR}/libbpf_commits.txt

					done

				}

				cleanup()

				{

					echo "Cleaning up..."

					rm -r ${TMP_DIR}

					cd_to ${LINUX_REPO}

					git checkout ${TIP_SYM_REF}

					git branch -D ${BASELINE_TAG} ${TIP_TAG} ${BPF_BASELINE_TAG} ${BPF_TIP_TAG} \

						      ${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG} || true

					cd_to .

					echo "DONE."

				}

				cd_to ${LIBBPF_REPO}

				GITHUB_ABS_DIR=$(pwd)

				echo "Dumping existing libbpf commit signatures..."

				for h in $(git log --pretty='%h' -n500); do

					echo $h "$(commit_signature $h)" >> ${TMP_DIR}/libbpf_commits.txt

				done

				# Use current kernel repo HEAD as a source of patches

				cd ${LINUX_REPO}

				cd_to ${LINUX_REPO}

				LINUX_ABS_DIR=$(pwd)

				TIP_SYM_REF=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)

				TIP_COMMIT=$(git rev-parse HEAD)

				BPF_TIP_COMMIT=$(git rev-parse ${BPF_BRANCH})

				BASELINE_TAG=libbpf-baseline-${SUFFIX}

				TIP_TAG=libbpf-tip-${SUFFIX}

				BPF_BASELINE_TAG=libbpf-bpf-baseline-${SUFFIX}

				BPF_TIP_TAG=libbpf-bpf-tip-${SUFFIX}

				VIEW_TAG=libbpf-view-${SUFFIX}

				LIBBPF_SYNC_TAG=libbpf-sync-${SUFFIX}

				@@ -43,142 +217,155 @@ SQUASH_BASE_TAG=libbpf-squash-base-${SUFFIX}

				SQUASH_TIP_TAG=libbpf-squash-tip-${SUFFIX}

				SQUASH_COMMIT=$(git commit-tree ${BASELINE_COMMIT}^{tree} -m "BASELINE SQUASH ${BASELINE_COMMIT}")

				echo "SUFFIX:          ${SUFFIX}"

				echo "BASELINE COMMIT: $(git log --pretty=oneline --no-walk ${BASELINE_COMMIT})"

				echo "TIP COMMIT:      $(git log --pretty=oneline --no-walk ${TIP_COMMIT})"

				echo "SQUASH COMMIT:   ${SQUASH_COMMIT}"

				echo "BASELINE TAG:    ${BASELINE_TAG}"

				echo "TIP TAG:         ${TIP_TAG}"

				echo "SQUASH BASE TAG: ${SQUASH_BASE_TAG}"

				echo "SQUASH TIP TAG:  ${SQUASH_TIP_TAG}"

				echo "VIEW TAG:        ${VIEW_TAG}"

				echo "LIBBPF SYNC TAG: ${LIBBPF_SYNC_TAG}"

				TMP_DIR=$(mktemp -d)

				echo "TEMP DIR:        ${TMP_DIR}"

				echo "PATCHES+COVER:   ${TMP_DIR}/patches"

				echo "PATCHSET:        ${TMP_DIR}/patchset.patch"

				echo "WORKDIR:          ${WORKDIR}"

				echo "LINUX REPO:       ${LINUX_REPO}"

				echo "LIBBPF REPO:      ${LIBBPF_REPO}"

				echo "TEMP DIR:         ${TMP_DIR}"

				echo "SUFFIX:           ${SUFFIX}"

				echo "BASE COMMIT:      '$(commit_desc ${BASELINE_COMMIT})'"

				echo "TIP COMMIT:       '$(commit_desc ${TIP_COMMIT})'"

				echo "BPF BASE COMMIT:  '$(commit_desc ${BPF_BASELINE_COMMIT})'"

				echo "BPF TIP COMMIT:   '$(commit_desc ${BPF_TIP_COMMIT})'"

				echo "SQUASH COMMIT:    ${SQUASH_COMMIT}"

				echo "BASELINE TAG:     ${BASELINE_TAG}"

				echo "TIP TAG:          ${TIP_TAG}"

				echo "BPF BASELINE TAG: ${BPF_BASELINE_TAG}"

				echo "BPF TIP TAG:      ${BPF_TIP_TAG}"

				echo "SQUASH BASE TAG:  ${SQUASH_BASE_TAG}"

				echo "SQUASH TIP TAG:   ${SQUASH_TIP_TAG}"

				echo "VIEW TAG:         ${VIEW_TAG}"

				echo "LIBBPF SYNC TAG:  ${LIBBPF_SYNC_TAG}"

				echo "PATCHES:          ${TMP_DIR}/patches"

				git branch ${BASELINE_TAG} ${BASELINE_COMMIT}

				git branch ${TIP_TAG} ${TIP_COMMIT}

				git branch ${BPF_BASELINE_TAG} ${BPF_BASELINE_COMMIT}

				git branch ${BPF_TIP_TAG} ${BPF_TIP_COMMIT}

				git branch ${SQUASH_BASE_TAG} ${SQUASH_COMMIT}

				git checkout -b ${SQUASH_TIP_TAG} ${SQUASH_COMMIT}

				# Cherry-pick new commits onto squashed baseline commit

				LIBBPF_PATHS=(tools/lib/bpf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h} tools/include/tools/libc_compat.h)

				cherry_pick_commits ${BASELINE_TAG} ${TIP_TAG}

				cherry_pick_commits ${BPF_BASELINE_TAG} ${BPF_TIP_TAG}

				LIBBPF_NEW_MERGES=$(git rev-list --merges --topo-order --reverse ${BASELINE_TAG}..${TIP_TAG} ${LIBBPF_PATHS[@]})

				for LIBBPF_NEW_MERGE in ${LIBBPF_NEW_MERGES}; do

					printf "MERGE:\t" && git log --oneline -n1 ${LIBBPF_NEW_MERGE}

					MERGE_CHANGES=$(git log --format='' -n1 ${LIBBPF_NEW_MERGE} | wc -l)

					if ((${MERGE_CHANGES} > 0)); then

						echo "Merge is non empty, aborting!.."

						exit 3

					fi

				done

				cd ${WORKDIR} && cd ${LIBBPF_REPO}

				git log --oneline -n500 > ${TMP_DIR}/libbpf_commits.txt

				cd ${WORKDIR} && cd ${LINUX_REPO}

				LIBBPF_NEW_COMMITS=$(git rev-list --no-merges --topo-order --reverse ${BASELINE_TAG}..${TIP_TAG} ${LIBBPF_PATHS[@]})

				for LIBBPF_NEW_COMMIT in ${LIBBPF_NEW_COMMITS}; do

					echo "Checking commit '${LIBBPF_NEW_COMMIT}'"

					SYNCED_COMMITS=$(grep -F "$(git log -n1 --pretty=format:%s ${LIBBPF_NEW_COMMIT})" ${TMP_DIR}/libbpf_commits.txt || echo "")

					if [ -n "${SYNCED_COMMITS}" ]; then

						# commit with the same subject is already in libbpf, but it's not 100% the same commit, so check with user

						echo "Commit '$(git log -n1 --oneline ${LIBBPF_NEW_COMMIT})' appears to be already synced into libbpf..."

						echo "Corresponding libbpf commit(s):"

						echo "${SYNCED_COMMITS}"

						read -p "Do you want to skip it? [y/N]: " SHOULD_SKIP

						case "${SHOULD_SKIP}" in 

							"y" | "Y")

								echo "Skipping '$(git log -n1 --oneline ${LIBBPF_NEW_COMMIT})'..."

								continue

								;;

						esac

					fi

					# commit hasn't been synced into libbpf yet

					if ! git cherry-pick ${LIBBPF_NEW_COMMIT}; then 

						read -p "Cherry-picking '$(git log --oneline -n1 ${LIBBPF_NEW_COMMIT})' failed, please fix manually and press <return> to proceed..."

					fi

				done

				LIBBPF_TREE_FILTER='												\

				    mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools &&						\

				    git mv -kf tools/lib/bpf __libbpf/src &&									\

				    git mv -kf tools/include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h}			\

					       __libbpf/include/uapi/linux &&									\

				    git mv -kf tools/include/tools/libc_compat.h __libbpf/include/tools &&					\

				    git rm --ignore-unmatch -f __libbpf/src/{Makefile,Build,test_libbpf.cpp,.gitignore}				\

				'

				# Move all libbpf files into __libbpf directory.

				git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}

				FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --tree-filter "${LIBBPF_TREE_FILTER}" ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}

				# Make __libbpf a new root directory

				git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}

				FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --prune-empty -f --subdirectory-filter __libbpf ${SQUASH_TIP_TAG} ${SQUASH_BASE_TAG}

				# If there are no new commits with  libbpf-related changes, bail out

				COMMIT_CNT=$(git rev-list --count ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG})

				if ((${COMMIT_CNT} <= 0)); then

				    echo "No new changes to apply, we are done!"

				    cleanup

				    exit 2

				fi

				# Exclude baseline commit and generate nice cover letter with summary

				git format-patch ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG} --cover-letter -o ${TMP_DIR}/patches

				# Now generate single-file patchset w/o cover to apply on top of libbpf repo

				git format-patch ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG} --stdout > ${TMP_DIR}/patchset.patch

				git format-patch --no-signature ${SQUASH_BASE_TAG}..${SQUASH_TIP_TAG} --cover-letter -o ${TMP_DIR}/patches

				# Now is time to re-apply libbpf-related linux  patches to libbpf repo

				cd ${WORKDIR} && cd ${LIBBPF_REPO}

				# Now is time to re-apply libbpf-related linux patches to libbpf repo

				cd_to ${LIBBPF_REPO}

				git checkout -b ${LIBBPF_SYNC_TAG}

				git am --committer-date-is-author-date ${TMP_DIR}/patchset.patch

				for patch in $(ls -1 ${TMP_DIR}/patches | tail -n +2); do

					if ! git am -3 --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then

						if ! patch -p1 --merge < "${TMP_DIR}/patches/${patch}"; then

							read -p "Applying ${TMP_DIR}/patches/${patch} failed, please resolve manually and press <return> to proceed..."

						fi

						git am --continue

					fi

				done

				# Generate bpf_helper_defs.h and commit, if anything changed

				# restore Linux tip to use bpf_doc.py

				cd_to ${LINUX_REPO}

				git checkout ${TIP_TAG}

				# re-generate bpf_helper_defs.h

				cd_to ${LIBBPF_REPO}

				"${LINUX_ABS_DIR}/scripts/bpf_doc.py" --header					      \

					--file include/uapi/linux/bpf.h > src/bpf_helper_defs.h

				# if anything changed, commit it

				helpers_changes=$(git status --porcelain src/bpf_helper_defs.h | wc -l)

				if ((${helpers_changes} == 1)); then

					git add src/bpf_helper_defs.h

					git commit -s -m "sync: auto-generate latest BPF helpers

				Latest changes to BPF helper definitions.

				" -- src/bpf_helper_defs.h

				fi

				echo "Regenerating .mailmap..."

				cd_to "${LINUX_REPO}"

				git checkout "${TIP_SYM_REF}"

				cd_to "${LIBBPF_REPO}"

				"${LIBBPF_REPO}"/scripts/mailmap-update.sh "${LIBBPF_REPO}" "${LINUX_REPO}"

				# if anything changed, commit it

				mailmap_changes=$(git status --porcelain .mailmap | wc -l)

				if ((${mailmap_changes} == 1)); then

					git add .mailmap

					git commit -s -m "sync: update .mailmap

				Update .mailmap based on libbpf's list of contributors and on the latest

				.mailmap version in the upstream repository.

				" -- .mailmap

				fi

				# Use generated cover-letter as a template for "sync commit" with

				# baseline and checkpoint commits from kernel repo (and leave summary

				# from cover letter intact, of course)

				echo ${TIP_COMMIT} > CHECKPOINT-COMMIT &&					      \

				echo ${BPF_TIP_COMMIT} > BPF-CHECKPOINT-COMMIT &&				      \

				git add CHECKPOINT-COMMIT &&							      \

				git add BPF-CHECKPOINT-COMMIT &&						      \

				awk '/\*\*\* BLURB HERE \*\*\*/ {p=1} p' ${TMP_DIR}/patches/0000-cover-letter.patch | \

				sed "s/\*\*\* BLURB HERE \*\*\*/\

				sync: latest libbpf changes from kernel\n\

				\n\

				Syncing latest libbpf commits from kernel repository.\n\

				Baseline commit:   ${BASELINE_COMMIT}\n\

				Checkpoint commit: ${TIP_COMMIT}/" |						      \

				git commit --file=-

				Baseline bpf-next commit:   ${BASELINE_COMMIT}\n\

				Checkpoint bpf-next commit: ${TIP_COMMIT}\n\

				Baseline bpf commit:        ${BPF_BASELINE_COMMIT}\n\

				Checkpoint bpf commit:      ${BPF_TIP_COMMIT}/" |				      \

				git commit -s --file=-

				echo "SUCCESS! ${COMMIT_CNT} commits synced."

				echo "Verifying Linux's and Github's libbpf state"

				LIBBPF_VIEW_PATHS=(src include/uapi/linux/{bpf_common.h,bpf.h,btf.h,if_link.h,if_xdp.h,netlink.h} include/tools/libc_compat.h)

				LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf.cpp|\.gitignore)$'

				cd ${WORKDIR} && cd ${LINUX_REPO}

				LINUX_ABS_DIR=$(pwd)

				cd_to ${LINUX_REPO}

				git checkout -b ${VIEW_TAG} ${TIP_COMMIT}

				git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}

				git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}

				git ls-files -- ${LIBBPF_VIEW_PATHS[@]} > ${TMP_DIR}/linux-view.ls

				FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter "${LIBBPF_TREE_FILTER}" ${VIEW_TAG}^..${VIEW_TAG}

				FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --subdirectory-filter __libbpf ${VIEW_TAG}^..${VIEW_TAG}

				git ls-files -- "${LIBBPF_VIEW_PATHS[@]}" | grep -v -E "${LINUX_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/linux-view.ls

				cd ${WORKDIR} && cd ${LIBBPF_REPO}

				GITHUB_ABS_DIR=$(pwd)

				git ls-files -- ${LIBBPF_VIEW_PATHS[@]} | grep -v -E "${LIBBPF_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/github-view.ls

				cd_to ${LIBBPF_REPO}

				git ls-files -- "${LIBBPF_VIEW_PATHS[@]}" | grep -v -E "${LIBBPF_VIEW_EXCLUDE_REGEX}" > ${TMP_DIR}/github-view.ls

				echo "Comparing list of files..."

				diff ${TMP_DIR}/linux-view.ls ${TMP_DIR}/github-view.ls

				diff -u ${TMP_DIR}/linux-view.ls ${TMP_DIR}/github-view.ls

				echo "Comparing file contents..."

				CONSISTENT=1

				for F in $(cat ${TMP_DIR}/linux-view.ls); do

					diff "${LINUX_ABS_DIR}/${F}" "${GITHUB_ABS_DIR}/${F}"

					if ! diff -u "${LINUX_ABS_DIR}/${F}" "${GITHUB_ABS_DIR}/${F}"; then

						echo "${LINUX_ABS_DIR}/${F} and ${GITHUB_ABS_DIR}/${F} are different!"

						CONSISTENT=0

					fi

				done

				echo "Contents appear identical!"

				echo "Cleaning up..."

				rm -r ${TMP_DIR}

				cd ${WORKDIR} && cd ${LINUX_REPO}

				git checkout ${TIP_SYM_REF}

				git branch -D ${BASELINE_TAG} ${TIP_TAG} ${SQUASH_BASE_TAG} ${SQUASH_TIP_TAG} ${VIEW_TAG}

				cd ${WORKDIR}

				echo "DONE."

				if ((${CONSISTENT} == 1)); then

					echo "Great! Content is identical!"

				else

					ignore_inconsistency=n

					echo "Unfortunately, there are some inconsistencies, please double check."

					read -p "Does everything look good? [y/N]: " ignore_inconsistency

					case "${ignore_inconsistency}" in

						"y" | "Y")

							echo "Ok, proceeding..."

							;;

						*)

							echo "Oops, exiting with error..."

							exit 4

					esac

				fi

				cleanup

src/.gitignore

@@ -2,3 +2,5 @@
 *.a
 /libbpf.pc
 /libbpf.so*
 /staticobjs
 /sharedobjs

									
src/Makefile
				@@ -1,53 +1,81 @@

				# SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				VERSION = 0

				PATCHLEVEL = 0

				EXTRAVERSION = 3

				ifeq ($(V),1)

					Q =

					msg =

				else

					Q = @

					msg = @printf '  %-8s %s%s\n' "$(1)" "$(2)" "$(if $(3), $(3))";

				endif

				LIBBPF_VERSION	= $(VERSION).$(PATCHLEVEL).$(EXTRAVERSION)

				LIBBPF_MAJOR_VERSION := 1

				LIBBPF_MINOR_VERSION := 6

				LIBBPF_PATCH_VERSION := 0

				LIBBPF_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).$(LIBBPF_PATCH_VERSION)

				LIBBPF_MAJMIN_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).0

				LIBBPF_MAP_VERSION := $(shell grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | sort -rV | head -n1 | cut -d'_' -f2)

				ifneq ($(LIBBPF_MAJMIN_VERSION), $(LIBBPF_MAP_VERSION))

				$(error Libbpf release ($(LIBBPF_VERSION)) and map ($(LIBBPF_MAP_VERSION)) versions are out of sync!)

				endif

				define allow-override

				  $(if $(or $(findstring environment,$(origin $(1))),\

				            $(findstring command line,$(origin $(1)))),,\

				    $(eval $(1) = $(2)))

				endef

				$(call allow-override,CC,$(CROSS_COMPILE)cc)

				$(call allow-override,LD,$(CROSS_COMPILE)ld)

				PKG_CONFIG ?= pkg-config

				TOPDIR = ..

				INCLUDES := -I. -I$(TOPDIR)/include -I$(TOPDIR)/include/uapi

				ALL_CFLAGS := $(INCLUDES)

				FEATURE_REALLOCARRAY := $(shell $(TOPDIR)/scripts/check-reallocarray.sh)

				ifneq ($(FEATURE_REALLOCARRAY),)

					ALL_CFLAGS += -DCOMPAT_NEED_REALLOCARRAY

				endif

				SHARED_CFLAGS += -fPIC -fvisibility=hidden -DSHARED

				ifndef BUILD_STATIC_ONLY

					ALL_CFLAGS += -fPIC -fvisibility=hidden

				endif

				CFLAGS ?= -g -O2 -Werror -Wall -std=gnu89

				ALL_CFLAGS += $(CFLAGS) 						\

					      -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64		\

					      -Wno-unknown-warning-option -Wno-format-overflow		\

					      $(EXTRA_CFLAGS)

				ALL_LDFLAGS += $(LDFLAGS) $(EXTRA_LDFLAGS)

				CFLAGS ?= -g -O2 -Werror -Wall

				ALL_CFLAGS += $(CFLAGS)

				ALL_LDFLAGS += $(LDFLAGS)

				ifeq ($(shell command -v $(PKG_CONFIG) 2> /dev/null),)

				NO_PKG_CONFIG := 1

				endif

				ifdef NO_PKG_CONFIG

					ALL_LDFLAGS += -lelf

					ALL_LDFLAGS += -lelf -lz

				else

					PKG_CONFIG ?= pkg-config

					ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf)

					ALL_LDFLAGS += $(shell $(PKG_CONFIG) --libs libelf)

					ALL_CFLAGS += $(shell $(PKG_CONFIG) --cflags libelf zlib)

					ALL_LDFLAGS += $(shell $(PKG_CONFIG) --libs libelf zlib)

				endif

				OBJDIR ?= .

				SHARED_OBJDIR := $(OBJDIR)/sharedobjs

				STATIC_OBJDIR := $(OBJDIR)/staticobjs

				OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \

					nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o \

					btf_dump.o hashmap.o ringbuf.o strset.o linker.o gen_loader.o \

					relo_core.o usdt.o zip.o elf.o features.o btf_iter.o btf_relocate.o

				SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))

				STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))

				OBJS := $(addprefix $(OBJDIR)/,bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \

					nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \

					btf_dump.o hashmap.o)

				LIBS := $(OBJDIR)/libbpf.a

				STATIC_LIBS := $(OBJDIR)/libbpf.a

				ifndef BUILD_STATIC_ONLY

					LIBS += $(OBJDIR)/libbpf.so \

						$(OBJDIR)/libbpf.so.$(VERSION) \

						$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)

					SHARED_LIBS := $(OBJDIR)/libbpf.so \

						       $(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION) \

						       $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)

					VERSION_SCRIPT := libbpf.map

				endif

				HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h

				UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,bpf.h bpf_common.h \

					btf.h)

				HEADERS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h \

					   bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \

					   bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h \

					   usdt.bpf.h

				UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\

							    bpf.h bpf_common.h btf.h)

				PC_FILE := $(OBJDIR)/libbpf.pc

				@@ -55,59 +83,82 @@ INSTALL = install

				DESTDIR ?=

				ifeq ($(shell uname -m),x86_64)

				HOSTARCH = $(firstword $(subst -, ,$(shell $(CC) -dumpmachine)))

				ifeq ($(filter-out %64 %64be %64eb %64le %64el s390x, $(HOSTARCH)),)

					LIBSUBDIR := lib64

				else

					LIBSUBDIR := lib

				endif

				# By default let the pc file itself use ${prefix} in includedir/libdir so that

				# the prefix can be overridden at runtime (eg: --define-prefix)

				ifndef LIBDIR

					LIBDIR_PC := $$\{prefix\}/$(LIBSUBDIR)

				else

					LIBDIR_PC := $(LIBDIR)

				endif

				PREFIX ?= /usr

				LIBDIR ?= $(PREFIX)/$(LIBSUBDIR)

				INCLUDEDIR ?= $(PREFIX)/include

				UAPIDIR ?= $(PREFIX)/include

				all: $(LIBS) $(PC_FILE)

				TAGS_PROG := $(if $(shell which etags 2>/dev/null),etags,ctags)

				$(OBJDIR)/libbpf.a: $(OBJS)

					$(AR) rcs $@ $^

				all: $(STATIC_LIBS) $(SHARED_LIBS) $(PC_FILE)

				$(OBJDIR)/libbpf.so: $(OBJDIR)/libbpf.so.$(VERSION)

					ln -sf $(^F) $@

				$(OBJDIR)/libbpf.a: $(STATIC_OBJS)

					$(call msg,AR,$@)

					$(Q)$(AR) rcs $@ $^

				$(OBJDIR)/libbpf.so.$(VERSION): $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)

					ln -sf $(^F) $@

				$(OBJDIR)/libbpf.so: $(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION)

					$(Q)ln -sf $(^F) $@

				$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(OBJS)

					$(CC) -shared $(ALL_LDFLAGS) -Wl,--version-script=$(VERSION_SCRIPT) \

								     -Wl,-soname,libbpf.so.$(VERSION) \

								     $^ -o $@

				$(OBJDIR)/libbpf.so.$(LIBBPF_MAJOR_VERSION): $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION)

					$(Q)ln -sf $(^F) $@

				$(OBJDIR)/libbpf.pc:

					sed -e "s|@PREFIX@|$(PREFIX)|" \

						-e "s|@LIBDIR@|$(LIBDIR)|" \

				$(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)

					$(call msg,CC,$@)

					$(Q)$(CC) -shared -Wl,--version-script=$(VERSION_SCRIPT) \

						  -Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \

						  $^ $(ALL_LDFLAGS) -o $@

				$(OBJDIR)/libbpf.pc: force | $(OBJDIR)

					$(Q)sed -e "s|@PREFIX@|$(PREFIX)|" \

						-e "s|@LIBDIR@|$(LIBDIR_PC)|" \

						-e "s|@VERSION@|$(LIBBPF_VERSION)|" \

						< libbpf.pc.template > $@

				$(OBJDIR)/%.o: %.c

					$(CC) $(ALL_CFLAGS) $(CPPFLAGS) -c $< -o $@

				$(OBJDIR) $(STATIC_OBJDIR) $(SHARED_OBJDIR):

					$(call msg,MKDIR,$@)

					$(Q)mkdir -p $@

				$(STATIC_OBJDIR)/%.o: %.c | $(STATIC_OBJDIR)

					$(call msg,CC,$@)

					$(Q)$(CC) $(ALL_CFLAGS) $(CPPFLAGS) -c $< -o $@

				$(SHARED_OBJDIR)/%.o: %.c | $(SHARED_OBJDIR)

					$(call msg,CC,$@)

					$(Q)$(CC) $(ALL_CFLAGS) $(SHARED_CFLAGS) $(CPPFLAGS) -c $< -o $@

				define do_install

					if [ ! -d '$(DESTDIR)$2' ]; then		\

					$(call msg,INSTALL,$1)

					$(Q)if [ ! -d '$(DESTDIR)$2' ]; then		\

						$(INSTALL) -d -m 755 '$(DESTDIR)$2';	\

					fi;						\

					$(INSTALL) $1 $(if $3,-m $3,) '$(DESTDIR)$2'

					fi;

					$(Q)$(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR)$2'

				endef

				# Preserve symlinks at installation.

				define do_s_install

					if [ ! -d '$(DESTDIR)$2' ]; then		\

					$(call msg,INSTALL,$1)

					$(Q)if [ ! -d '$(DESTDIR)$2' ]; then		\

						$(INSTALL) -d -m 755 '$(DESTDIR)$2';	\

					fi;						\

					cp -fpR $1 '$(DESTDIR)$2'

					fi;

					$(Q)cp -fR $1 '$(DESTDIR)$2'

				endef

				install: all install_headers install_pkgconfig

					$(call do_s_install,$(LIBS),$(LIBDIR))

					$(call do_s_install,$(STATIC_LIBS) $(SHARED_LIBS),$(LIBDIR))

				install_headers:

					$(call do_install,$(HEADERS),$(INCLUDEDIR)/bpf,644)

				@@ -121,4 +172,18 @@ install_pkgconfig: $(PC_FILE)

					$(call do_install,$(PC_FILE),$(LIBDIR)/pkgconfig,644)

				clean:

					rm -f *.o *.a *.so *.so.* *.pc

					$(call msg,CLEAN)

					$(Q)rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)

				.PHONY: cscope tags force

				cscope:

					$(call msg,CSCOPE)

					$(Q)ls *.c *.h > cscope.files

					$(Q)cscope -b -q -f cscope.out

				tags:

					$(call msg,CTAGS)

					$(Q)rm -f TAGS tags

					$(Q)ls *.c *.h | xargs $(TAGS_PROG) -a

				force:

src/bpf.c

File diff suppressed because it is too large.

									
src/bpf.h
												
				@@ -1,7 +1,7 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				/*

				 * common eBPF ELF operations.

				 * Common BPF ELF operations.

				 *

				 * Copyright (C) 2013-2015 Alexei Starovoitov <ast@kernel.org>

				 * Copyright (C) 2015 Wang Nan <wangnan0@huawei.com>

				@@ -28,86 +28,125 @@

				#include <stddef.h>

				#include <stdint.h>

				#include "libbpf_common.h"

				#include "libbpf_legacy.h"

				#ifdef __cplusplus

				extern "C" {

				#endif

				#ifndef LIBBPF_API

				#define LIBBPF_API __attribute__((visibility("default")))

				#endif

				LIBBPF_API int libbpf_set_memlock_rlim(size_t memlock_bytes);

				struct bpf_map_create_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

				struct bpf_create_map_attr {

					const char *name;

					enum bpf_map_type map_type;

					__u32 map_flags;

					__u32 key_size;

					__u32 value_size;

					__u32 max_entries;

					__u32 numa_node;

					__u32 btf_fd;

					__u32 btf_key_type_id;

					__u32 btf_value_type_id;

					__u32 map_ifindex;

					__u32 btf_vmlinux_value_type_id;

					__u32 inner_map_fd;

					__u32 map_flags;

					__u64 map_extra;

					__u32 numa_node;

					__u32 map_ifindex;

					__s32 value_type_btf_obj_fd;

					__u32 token_fd;

					size_t :0;

				};

				#define bpf_map_create_opts__last_field token_fd

				LIBBPF_API int

				bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr);

				LIBBPF_API int bpf_create_map_node(enum bpf_map_type map_type, const char *name,

								   int key_size, int value_size,

								   int max_entries, __u32 map_flags, int node);

				LIBBPF_API int bpf_create_map_name(enum bpf_map_type map_type, const char *name,

								   int key_size, int value_size,

								   int max_entries, __u32 map_flags);

				LIBBPF_API int bpf_create_map(enum bpf_map_type map_type, int key_size,

							      int value_size, int max_entries, __u32 map_flags);

				LIBBPF_API int bpf_create_map_in_map_node(enum bpf_map_type map_type,

									  const char *name, int key_size,

									  int inner_map_fd, int max_entries,

									  __u32 map_flags, int node);

				LIBBPF_API int bpf_create_map_in_map(enum bpf_map_type map_type,

								     const char *name, int key_size,

								     int inner_map_fd, int max_entries,

								     __u32 map_flags);

				LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,

							      const char *map_name,

							      __u32 key_size,

							      __u32 value_size,

							      __u32 max_entries,

							      const struct bpf_map_create_opts *opts);
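The `sz` member and the `__last_field` macros above suggest a size-prefixed options convention: the caller records the size of the struct it was compiled against, so the callee can tell which fields are really present. The following is a minimal, self-contained sketch of that idea. The `demo_` names are hypothetical and are not part of libbpf; this is not libbpf's actual implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct following the "sz comes first" convention. */
struct demo_opts {
	size_t sz;             /* size of this struct as seen by the caller */
	unsigned int flags;
	unsigned int token_fd; /* newest field known to this sketch */
};

/* Offset of the first byte past `field` inside `type`. */
#define DEMO_OFFSETOFEND(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* True if the caller's struct is large enough to contain `field`. */
#define DEMO_HAS_FIELD(opts, field) \
	((opts)->sz >= DEMO_OFFSETOFEND(struct demo_opts, field))

/* NULL opts are accepted; otherwise sz must at least cover itself. */
bool demo_opts_valid(const struct demo_opts *opts)
{
	return !opts || opts->sz >= sizeof(size_t);
}

/* Read token_fd, falling back to 0 when opts is NULL or too small. */
unsigned int demo_get_token_fd(const struct demo_opts *opts)
{
	if (!opts || !DEMO_HAS_FIELD(opts, token_fd))
		return 0;
	return opts->token_fd;
}
```

A caller built against an older, shorter struct would pass a smaller `sz`, and the callee would fall back to the default instead of reading a field the caller never set.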

				struct bpf_prog_load_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					/* libbpf can retry BPF_PROG_LOAD command if bpf() syscall returns

					 * -EAGAIN. This field determines how many attempts libbpf has to

					 *  make. If not specified, libbpf will use default value of 5.

					 */

					int attempts;

				struct bpf_load_program_attr {

					enum bpf_prog_type prog_type;

					enum bpf_attach_type expected_attach_type;

					const char *name;

					const struct bpf_insn *insns;

					size_t insns_cnt;

					const char *license;

					__u32 kern_version;

					__u32 prog_ifindex;

					__u32 prog_btf_fd;

					__u32 func_info_rec_size;

					__u32 prog_flags;

					__u32 prog_ifindex;

					__u32 kern_version;

					__u32 attach_btf_id;

					__u32 attach_prog_fd;

					__u32 attach_btf_obj_fd;

					const int *fd_array;

					/* .BTF.ext func info data */

					const void *func_info;

					__u32 func_info_cnt;

					__u32 line_info_rec_size;

					__u32 func_info_rec_size;

					/* .BTF.ext line info data */

					const void *line_info;

					__u32 line_info_cnt;

					__u32 line_info_rec_size;

					/* verifier log options */

					__u32 log_level;

					__u32 prog_flags;

					__u32 log_size;

					char *log_buf;

					/* output: actual total log contents size (including terminating zero).

					 * It could be both larger than original log_size (if log was

					 * truncated), or smaller (if log buffer wasn't filled completely).

					 * If kernel doesn't support this feature, log_size is left unchanged.

					 */

					__u32 log_true_size;

					__u32 token_fd;

					/* if set, provides the length of fd_array */

					__u32 fd_array_cnt;

					size_t :0;

				};

				#define bpf_prog_load_opts__last_field fd_array_cnt

				LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,

							     const char *prog_name, const char *license,

							     const struct bpf_insn *insns, size_t insn_cnt,

							     struct bpf_prog_load_opts *opts);

				/* Flags to direct loading requirements */

				#define MAPS_RELAX_COMPAT	0x01

				/* Recommend log buffer size */

				/* Recommended log buffer size */

				#define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
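For reference, the expression `UINT32_MAX >> 8` works out to 16777215 bytes, one byte short of 16 MiB. The short sketch below reproduces the arithmetic under a local, hypothetical name (`DEMO_LOG_BUF_SIZE`) so it does not depend on the libbpf headers.

```c
#include <stdint.h>
#include <stdio.h>

/* Same expression as the macro above, under a local name. */
#define DEMO_LOG_BUF_SIZE (UINT32_MAX >> 8)

/* Returns the recommended buffer size in bytes. */
uint32_t demo_log_buf_size(void)
{
	return DEMO_LOG_BUF_SIZE;
}

/* Prints the size in bytes and, approximately, in MiB. */
void demo_print_log_buf_size(void)
{
	printf("%u bytes (~%.2f MiB)\n",
	       (unsigned int)demo_log_buf_size(),
	       demo_log_buf_size() / (1024.0 * 1024.0));
}
```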

				LIBBPF_API int

				bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,

						       char *log_buf, size_t log_buf_sz);

				LIBBPF_API int bpf_load_program(enum bpf_prog_type type,

								const struct bpf_insn *insns, size_t insns_cnt,

								const char *license, __u32 kern_version,

								char *log_buf, size_t log_buf_sz);

				LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,

								  const struct bpf_insn *insns,

								  size_t insns_cnt, __u32 prog_flags,

								  const char *license, __u32 kern_version,

								  char *log_buf, size_t log_buf_sz,

								  int log_level);

				struct bpf_btf_load_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					/* kernel log options */

					char *log_buf;

					__u32 log_level;

					__u32 log_size;

					/* output: actual total log contents size (including terminating zero).

					 * It could be both larger than original log_size (if log was

					 * truncated), or smaller (if log buffer wasn't filled completely).

					 * If kernel doesn't support this feature, log_size is left unchanged.

					 */

					__u32 log_true_size;

					__u32 btf_flags;

					__u32 token_fd;

					size_t :0;

				};

				#define bpf_btf_load_opts__last_field token_fd

				LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,

							    struct bpf_btf_load_opts *opts);

				LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,

								   __u64 flags);

				@@ -117,17 +156,312 @@ LIBBPF_API int bpf_map_lookup_elem_flags(int fd, const void *key, void *value,

									 __u64 flags);

				LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,

									      void *value);

				LIBBPF_API int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key,

										    void *value, __u64 flags);

				LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);

				LIBBPF_API int bpf_map_delete_elem_flags(int fd, const void *key, __u64 flags);

				LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);

				LIBBPF_API int bpf_map_freeze(int fd);

				struct bpf_map_batch_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u64 elem_flags;

					__u64 flags;

				};

				#define bpf_map_batch_opts__last_field flags

				/**

				 * @brief **bpf_map_delete_batch()** allows for batch deletion of multiple

				 * elements in a BPF map.

				 *

				 * @param fd BPF map file descriptor

				 * @param keys pointer to an array of *count* keys

				 * @param count input and output parameter; on input **count** represents the

				 * number of elements in the map to delete in batch;

				 * on output if a non-EFAULT error is returned, **count** represents the number of deleted

				 * elements if the output **count** value is not equal to the input **count** value

				 * If EFAULT is returned, **count** should not be trusted to be correct.

				 * @param opts options for configuring the way the batch deletion works

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_map_delete_batch(int fd, const void *keys,

								    __u32 *count,

								    const struct bpf_map_batch_opts *opts);

				/**

				 * @brief **bpf_map_lookup_batch()** allows for batch lookup of BPF map elements.

				 *

				 * The parameter *in_batch* is the address of the first element in the batch to

				 * read. *out_batch* is an output parameter that should be passed as *in_batch*

				 * to subsequent calls to **bpf_map_lookup_batch()**. NULL can be passed for

				 * *in_batch* to indicate that the batched lookup starts from the beginning of

				 * the map. Both *in_batch* and *out_batch* must point to memory large enough to

				 * hold a single key, except for maps of type **BPF_MAP_TYPE_{HASH, PERCPU_HASH,

				 * LRU_HASH, LRU_PERCPU_HASH}**, for which the memory size must be at

				 * least 4 bytes wide regardless of key size.

				 *

				 * The *keys* and *values* are output parameters which must point to memory large enough to

				 * hold *count* items based on the key and value size of the map *map_fd*. The *keys*

				 * buffer must be of *key_size* * *count*. The *values* buffer must be of

				 * *value_size* * *count*.

				 *

				 * @param fd BPF map file descriptor

				 * @param in_batch address of the first element in batch to read, can pass NULL to

				 * indicate that the batched lookup starts from the beginning of the map.

				 * @param out_batch output parameter that should be passed to next call as *in_batch*

				 * @param keys pointer to an array large enough for *count* keys

				 * @param values pointer to an array large enough for *count* values

				 * @param count input and output parameter; on input it's the number of elements

				 * in the map to read in batch; on output it's the number of elements that were

				 * successfully read.

				 * If a non-EFAULT error is returned, count will be set as the number of elements

				 * that were read before the error occurred.

				 * If EFAULT is returned, **count** should not be trusted to be correct.

				 * @param opts options for configuring the way the batch lookup works

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,

								    void *keys, void *values, __u32 *count,

								    const struct bpf_map_batch_opts *opts);

				/**

				 * @brief **bpf_map_lookup_and_delete_batch()** allows for batch lookup and deletion

				 * of BPF map elements where each element is deleted after being retrieved.

				 *

				 * @param fd BPF map file descriptor

				 * @param in_batch address of the first element in batch to read, can pass NULL to

				 * get address of the first element in *out_batch*. If not NULL, must be large

				 * enough to hold a key. For **BPF_MAP_TYPE_{HASH, PERCPU_HASH, LRU_HASH,

				 * LRU_PERCPU_HASH}**, the memory size must be at least 4 bytes wide regardless

				 * of key size.

				 * @param out_batch output parameter that should be passed to next call as *in_batch*

				 * @param keys pointer to an array of *count* keys

				 * @param values pointer to an array large enough for *count* values

				 * @param count input and output parameter; on input it's the number of elements

				 * in the map to read and delete in batch; on output it represents the number of

				 * elements that were successfully read and deleted.

				 * If a non-**EFAULT** error code is returned and if the output **count** value

				 * is not equal to the input **count** value, up to **count** elements may

				 * have been deleted.

				 * If **EFAULT** is returned, up to *count* elements may have been deleted without

				 * being returned via the *keys* and *values* output parameters.

				 * @param opts options for configuring the way the batch lookup and delete works

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,

									void *out_batch, void *keys,

									void *values, __u32 *count,

									const struct bpf_map_batch_opts *opts);

				/**

				 * @brief **bpf_map_update_batch()** updates multiple elements in a map

				 * by specifying keys and their corresponding values.

				 *

				 * The *keys* and *values* parameters must point to memory large enough

				 * to hold *count* items based on the key and value size of the map.

				 *

				 * The *opts* parameter can be used to control how *bpf_map_update_batch()*

				 * should handle keys that either do or do not already exist in the map.

				 * In particular the *flags* parameter of *bpf_map_batch_opts* can be

				 * one of the following:

				 *

				 * Note that *count* is an input and output parameter, where on output it

				 * represents how many elements were successfully updated. Also note that if

				 * **EFAULT** is returned, *count* should not be trusted to be correct.

				 *

				 * **BPF_ANY**

				 *    Create new elements or update existing.

				 *

				 * **BPF_NOEXIST**

				 *    Create new elements only if they do not exist.

				 *

				 * **BPF_EXIST**

				 *    Update existing elements.

				 *

				 * **BPF_F_LOCK**

				 *    Update spin_lock-ed map elements. This must be

				 *    specified if the map value contains a spinlock.

				 *

				 * @param fd BPF map file descriptor

				 * @param keys pointer to an array of *count* keys

				 * @param values pointer to an array of *count* values

				 * @param count input and output parameter; on input it's the number of elements

				 * in the map to update in batch; on output if a non-EFAULT error is returned,

				 * **count** represents the number of updated elements if the output **count**

				 * value is not equal to the input **count** value.

				 * If EFAULT is returned, **count** should not be trusted to be correct.

				 * @param opts options for configuring the way the batch update works

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_map_update_batch(int fd, const void *keys, const void *values,

								    __u32 *count,

								    const struct bpf_map_batch_opts *opts);

				struct bpf_obj_pin_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 file_flags;

					int path_fd;

					size_t :0;

				};

				#define bpf_obj_pin_opts__last_field path_fd

				LIBBPF_API int bpf_obj_pin(int fd, const char *pathname);

				LIBBPF_API int bpf_obj_pin_opts(int fd, const char *pathname,

								const struct bpf_obj_pin_opts *opts);

				struct bpf_obj_get_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 file_flags;

					int path_fd;

					size_t :0;

				};

				#define bpf_obj_get_opts__last_field path_fd

				LIBBPF_API int bpf_obj_get(const char *pathname);

				LIBBPF_API int bpf_obj_get_opts(const char *pathname,

								const struct bpf_obj_get_opts *opts);

				LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,

							       enum bpf_attach_type type, unsigned int flags);

				LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);

				LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,

								enum bpf_attach_type type);

				struct bpf_prog_attach_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;

					union {

						int replace_prog_fd;

						int replace_fd;

					};

					int relative_fd;

					__u32 relative_id;

					__u64 expected_revision;

					size_t :0;

				};

				#define bpf_prog_attach_opts__last_field expected_revision

				struct bpf_prog_detach_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;

					int relative_fd;

					__u32 relative_id;

					__u64 expected_revision;

					size_t :0;

				};

				#define bpf_prog_detach_opts__last_field expected_revision

				/**

				 * @brief **bpf_prog_attach_opts()** attaches the BPF program corresponding to

				 * *prog_fd* to a *target* which can represent a file descriptor or netdevice

				 * ifindex.

				 *

				 * @param prog_fd BPF program file descriptor

				 * @param target attach location file descriptor or ifindex

				 * @param type attach type for the BPF program

				 * @param opts options for configuring the attachment

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int target,

								    enum bpf_attach_type type,

								    const struct bpf_prog_attach_opts *opts);

				/**

				 * @brief **bpf_prog_detach_opts()** detaches the BPF program corresponding to

				 * *prog_fd* from a *target* which can represent a file descriptor or netdevice

				 * ifindex.

				 *

				 * @param prog_fd BPF program file descriptor

				 * @param target detach location file descriptor or ifindex

				 * @param type detach type for the BPF program

				 * @param opts options for configuring the detachment

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_prog_detach_opts(int prog_fd, int target,

								    enum bpf_attach_type type,

								    const struct bpf_prog_detach_opts *opts);

				union bpf_iter_link_info; /* defined in up-to-date linux/bpf.h */

				struct bpf_link_create_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;

					union bpf_iter_link_info *iter_info;

					__u32 iter_info_len;

					__u32 target_btf_id;

					union {

						struct {

							__u64 bpf_cookie;

						} perf_event;

						struct {

							__u32 flags;

							__u32 cnt;

							const char **syms;

							const unsigned long *addrs;

							const __u64 *cookies;

						} kprobe_multi;

						struct {

							__u32 flags;

							__u32 cnt;

							const char *path;

							const unsigned long *offsets;

							const unsigned long *ref_ctr_offsets;

							const __u64 *cookies;

							__u32 pid;

						} uprobe_multi;

						struct {

							__u64 cookie;

						} tracing;

						struct {

							__u32 pf;

							__u32 hooknum;

							__s32 priority;

							__u32 flags;

						} netfilter;

						struct {

							__u32 relative_fd;

							__u32 relative_id;

							__u64 expected_revision;

						} tcx;

						struct {

							__u32 relative_fd;

							__u32 relative_id;

							__u64 expected_revision;

						} netkit;

					};

					size_t :0;

				};

				#define bpf_link_create_opts__last_field uprobe_multi.pid

				LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,

							       enum bpf_attach_type attach_type,

							       const struct bpf_link_create_opts *opts);

				LIBBPF_API int bpf_link_detach(int link_fd);

				struct bpf_link_update_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;	   /* extra flags */

					__u32 old_prog_fd; /* expected old program FD */

					__u32 old_map_fd;  /* expected old map FD */

				};

				#define bpf_link_update_opts__last_field old_map_fd

				LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,

							       const struct bpf_link_update_opts *opts);

				LIBBPF_API int bpf_iter_create(int link_fd);

				struct bpf_prog_test_run_attr {

					int prog_fd;

					int repeat;

				@@ -145,31 +479,231 @@ struct bpf_prog_test_run_attr {

							     * out: length of ctx_out */

				};

				LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr);

				/*

				 * bpf_prog_test_run does not check that data_out is large enough. Consider

				 * using bpf_prog_test_run_xattr instead.

				 */

				LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,

								 __u32 size, void *data_out, __u32 *size_out,

								 __u32 *retval, __u32 *duration);

				LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);

				LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);

				LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);

				LIBBPF_API int bpf_link_get_next_id(__u32 start_id, __u32 *next_id);

				struct bpf_get_fd_by_id_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 open_flags; /* permissions requested for the operation on fd */

					__u32 token_fd;

					size_t :0;

				};

				#define bpf_get_fd_by_id_opts__last_field token_fd

				LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);

				LIBBPF_API int bpf_prog_get_fd_by_id_opts(__u32 id,

								const struct bpf_get_fd_by_id_opts *opts);

				LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);

				LIBBPF_API int bpf_map_get_fd_by_id_opts(__u32 id,

								const struct bpf_get_fd_by_id_opts *opts);

				LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);

				LIBBPF_API int bpf_obj_get_info_by_fd(int prog_fd, void *info, __u32 *info_len);

				LIBBPF_API int bpf_btf_get_fd_by_id_opts(__u32 id,

								const struct bpf_get_fd_by_id_opts *opts);

				LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);

				LIBBPF_API int bpf_link_get_fd_by_id_opts(__u32 id,

								const struct bpf_get_fd_by_id_opts *opts);

				LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);

				/**

				 * @brief **bpf_prog_get_info_by_fd()** obtains information about the BPF

				 * program corresponding to *prog_fd*.

				 *

				 * Populates up to *info_len* bytes of *info* and updates *info_len* with the

				 * actual number of bytes written to *info*. Note that *info* should be

				 * zero-initialized or initialized as expected by the requested *info*

				 * type. Failing to (zero-)initialize *info* under certain circumstances can

				 * result in this helper returning an error.

				 *

				 * @param prog_fd BPF program file descriptor

				 * @param info pointer to **struct bpf_prog_info** that will be populated with

				 * BPF program information

				 * @param info_len pointer to the size of *info*; on success updated with the

				 * number of bytes written to *info*

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_prog_get_info_by_fd(int prog_fd, struct bpf_prog_info *info, __u32 *info_len);

				/**

				 * @brief **bpf_map_get_info_by_fd()** obtains information about the BPF

				 * map corresponding to *map_fd*.

				 *

				 * Populates up to *info_len* bytes of *info* and updates *info_len* with the

				 * actual number of bytes written to *info*. Note that *info* should be

				 * zero-initialized or initialized as expected by the requested *info*

				 * type. Failing to (zero-)initialize *info* under certain circumstances can

				 * result in this helper returning an error.

				 *

				 * @param map_fd BPF map file descriptor

				 * @param info pointer to **struct bpf_map_info** that will be populated with

				 * BPF map information

				 * @param info_len pointer to the size of *info*; on success updated with the

				 * number of bytes written to *info*

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_map_get_info_by_fd(int map_fd, struct bpf_map_info *info, __u32 *info_len);

				/**

				 * @brief **bpf_btf_get_info_by_fd()** obtains information about the

				 * BTF object corresponding to *btf_fd*.

				 *

				 * Populates up to *info_len* bytes of *info* and updates *info_len* with the

				 * actual number of bytes written to *info*. Note that *info* should be

				 * zero-initialized or initialized as expected by the requested *info*

				 * type. Failing to (zero-)initialize *info* under certain circumstances can

				 * result in this helper returning an error.

				 *

				 * @param btf_fd BTF object file descriptor

				 * @param info pointer to **struct bpf_btf_info** that will be populated with

				 * BTF object information

				 * @param info_len pointer to the size of *info*; on success updated with the

				 * number of bytes written to *info*

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_btf_get_info_by_fd(int btf_fd, struct bpf_btf_info *info, __u32 *info_len);

				/**

				 * @brief **bpf_link_get_info_by_fd()** obtains information about the BPF

				 * link corresponding to *link_fd*.

				 *

				 * Populates up to *info_len* bytes of *info* and updates *info_len* with the

				 * actual number of bytes written to *info*. Note that *info* should be

				 * zero-initialized or initialized as expected by the requested *info*

				 * type. Failing to (zero-)initialize *info* under certain circumstances can

				 * result in this helper returning an error.

				 *

				 * @param link_fd BPF link file descriptor

				 * @param info pointer to **struct bpf_link_info** that will be populated with

				 * BPF link information

				 * @param info_len pointer to the size of *info*; on success updated with the

				 * number of bytes written to *info*

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_link_get_info_by_fd(int link_fd, struct bpf_link_info *info, __u32 *info_len);

				struct bpf_prog_query_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 query_flags;

					__u32 attach_flags; /* output argument */

					__u32 *prog_ids;

					union {

						/* input+output argument */

						__u32 prog_cnt;

						__u32 count;

					};

					__u32 *prog_attach_flags;

					__u32 *link_ids;

					__u32 *link_attach_flags;

					__u64 revision;

					size_t :0;

				};

				#define bpf_prog_query_opts__last_field revision

				/**

				 * @brief **bpf_prog_query_opts()** queries the BPF programs and BPF links

				 * which are attached to *target* which can represent a file descriptor or

				 * netdevice ifindex.

				 *

				 * @param target query location file descriptor or ifindex

				 * @param type attach type for the BPF program

				 * @param opts options for configuring the query

				 * @return 0, on success; negative error code, otherwise (errno is also set to

				 * the error code)

				 */

				LIBBPF_API int bpf_prog_query_opts(int target, enum bpf_attach_type type,

								   struct bpf_prog_query_opts *opts);

				LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,

							      __u32 query_flags, __u32 *attach_flags,

							      __u32 *prog_ids, __u32 *prog_cnt);

				struct bpf_raw_tp_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					const char *tp_name;

					__u64 cookie;

					size_t :0;

				};

				#define bpf_raw_tp_opts__last_field cookie

				LIBBPF_API int bpf_raw_tracepoint_open_opts(int prog_fd, struct bpf_raw_tp_opts *opts);

				LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);

				LIBBPF_API int bpf_load_btf(void *btf, __u32 btf_size, char *log_buf,

							    __u32 log_buf_size, bool do_log);

				LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,

								 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,

								 __u64 *probe_offset, __u64 *probe_addr);

				#ifdef __cplusplus

				/* forward-declaring enums in C++ isn't compatible with pure C enums, so

				 * instead define bpf_enable_stats() as accepting int as an input

				 */

				LIBBPF_API int bpf_enable_stats(int type);

				#else

				enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */

				LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type);

				#endif

				struct bpf_prog_bind_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;

				};

				#define bpf_prog_bind_opts__last_field flags

				LIBBPF_API int bpf_prog_bind_map(int prog_fd, int map_fd,

								 const struct bpf_prog_bind_opts *opts);

				struct bpf_test_run_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					const void *data_in; /* optional */

					void *data_out;      /* optional */

					__u32 data_size_in;

					__u32 data_size_out; /* in: max length of data_out

							      * out: length of data_out

							      */

					const void *ctx_in; /* optional */

					void *ctx_out;      /* optional */

					__u32 ctx_size_in;

					__u32 ctx_size_out; /* in: max length of ctx_out

							     * out: length of ctx_out

							     */

					__u32 retval;        /* out: return code of the BPF program */

					int repeat;

					__u32 duration;      /* out: average per repetition in ns */

					__u32 flags;

					__u32 cpu;

					__u32 batch_size;

				};

				#define bpf_test_run_opts__last_field batch_size

				LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,

								      struct bpf_test_run_opts *opts);

				struct bpf_token_create_opts {

					size_t sz; /* size of this struct for forward/backward compatibility */

					__u32 flags;

					size_t :0;

				};

				#define bpf_token_create_opts__last_field flags

				/**

				 * @brief **bpf_token_create()** creates a new instance of BPF token derived

				 * from specified BPF FS mount point.

				 *

				 * BPF token created with this API can be passed to bpf() syscall for

				 * commands like BPF_PROG_LOAD, BPF_MAP_CREATE, etc.

				 *

				 * @param bpffs_fd FD for BPF FS instance from which to derive a BPF token

				 * instance.

				 * @param opts optional BPF token creation options, can be NULL

				 *

				 * @return BPF token FD > 0, on success; negative error code, otherwise (errno

				 * is also set to the error code)

				 */

				LIBBPF_API int bpf_token_create(int bpffs_fd,

								struct bpf_token_create_opts *opts);

				#ifdef __cplusplus

				} /* extern "C" */

				#endif

									

src/bpf_core_read.h
									
												
				@@ -0,0 +1,567 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				#ifndef __BPF_CORE_READ_H__

				#define __BPF_CORE_READ_H__

				#include "bpf_helpers.h"

				/*

				 * enum bpf_field_info_kind is passed as a second argument into

				 * __builtin_preserve_field_info() built-in to get a specific aspect of

				 * a field, captured as a first argument. __builtin_preserve_field_info(field,

				 * info_kind) returns __u32 integer and produces BTF field relocation, which

				 * is understood and processed by libbpf during BPF object loading. See

				 * selftests/bpf for examples.

				 */

				enum bpf_field_info_kind {

					BPF_FIELD_BYTE_OFFSET = 0,	/* field byte offset */

					BPF_FIELD_BYTE_SIZE = 1,

					BPF_FIELD_EXISTS = 2,		/* field existence in target kernel */

					BPF_FIELD_SIGNED = 3,

					BPF_FIELD_LSHIFT_U64 = 4,

					BPF_FIELD_RSHIFT_U64 = 5,

				};

				/* second argument to __builtin_btf_type_id() built-in */

				enum bpf_type_id_kind {

					BPF_TYPE_ID_LOCAL = 0,		/* BTF type ID in local program */

					BPF_TYPE_ID_TARGET = 1,		/* BTF type ID in target kernel */

				};

				/* second argument to __builtin_preserve_type_info() built-in */

				enum bpf_type_info_kind {

					BPF_TYPE_EXISTS = 0,		/* type existence in target kernel */

					BPF_TYPE_SIZE = 1,		/* type size in target kernel */

					BPF_TYPE_MATCHES = 2,		/* type match in target kernel */

				};

				/* second argument to __builtin_preserve_enum_value() built-in */

				enum bpf_enum_value_kind {

					BPF_ENUMVAL_EXISTS = 0,		/* enum value existence in kernel */

					BPF_ENUMVAL_VALUE = 1,		/* enum value value relocation */

				};

				#define __CORE_RELO(src, field, info)					      \

					__builtin_preserve_field_info((src)->field, BPF_FIELD_##info)

				#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__

				#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)			      \

					bpf_probe_read_kernel(						      \

							(void *)dst,					      \

							__CORE_RELO(src, fld, BYTE_SIZE),		      \

							(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))

				#else

				/* semantics of LSHIFT_U64 assume loading values into low-ordered bytes, so

				 * for big-endian we need to adjust destination pointer accordingly, based on

				 * field byte size

				 */

				#define __CORE_BITFIELD_PROBE_READ(dst, src, fld)			      \

					bpf_probe_read_kernel(						      \

							(void *)dst + (8 - __CORE_RELO(src, fld, BYTE_SIZE)), \

							__CORE_RELO(src, fld, BYTE_SIZE),		      \

							(const void *)src + __CORE_RELO(src, fld, BYTE_OFFSET))

				#endif

				/*

				 * Extract bitfield, identified by s->field, and return its value as u64.

				 * All this is done in a relocatable manner, so changes to the bitfield's

				 * signedness, bit size, or offset are handled automatically.

				 * This version of macro is using bpf_probe_read_kernel() to read underlying

				 * integer storage. Macro functions as an expression and evaluates to the

				 * extracted bitfield value; bpf_probe_read_kernel()'s result is not checked.

				 */

				#define BPF_CORE_READ_BITFIELD_PROBED(s, field) ({			      \

					unsigned long long val = 0;					      \

													      \

					__CORE_BITFIELD_PROBE_READ(&val, s, field);			      \

					val <<= __CORE_RELO(s, field, LSHIFT_U64);			      \

					if (__CORE_RELO(s, field, SIGNED))				      \

						val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64);  \

					else								      \

						val = val >> __CORE_RELO(s, field, RSHIFT_U64);		      \

					val;								      \

				})

				/*

				 * Extract bitfield, identified by s->field, and return its value as u64.

				 * This version of macro is using direct memory reads and should be used from

				 * BPF program types that support such functionality (e.g., typed raw

				 * tracepoints).

				 */

				#define BPF_CORE_READ_BITFIELD(s, field) ({				      \

					const void *p = (const void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \

					unsigned long long val;						      \

													      \

					/* This is a so-called barrier_var() operation that makes specified   \

					 * variable "a black box" for optimizing compiler.		      \

					 * It forces compiler to perform BYTE_OFFSET relocation on p and use  \

					 * its calculated value in the switch below, instead of applying      \

					 * the same relocation 4 times for each individual memory load.       \

					 */								      \

					asm volatile("" : "=r"(p) : "0"(p));				      \

													      \

					switch (__CORE_RELO(s, field, BYTE_SIZE)) {			      \

					case 1: val = *(const unsigned char *)p; break;			      \

					case 2: val = *(const unsigned short *)p; break;		      \

					case 4: val = *(const unsigned int *)p; break;			      \

					case 8: val = *(const unsigned long long *)p; break;		      \

					default: val = 0; break;					      \

					}								      \

					val <<= __CORE_RELO(s, field, LSHIFT_U64);			      \

					if (__CORE_RELO(s, field, SIGNED))				      \

						val = ((long long)val) >> __CORE_RELO(s, field, RSHIFT_U64);  \

					else								      \

						val = val >> __CORE_RELO(s, field, RSHIFT_U64);		      \

					val;								      \

				})
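The shift sequence used by the two read macros above can be tried out in plain user-space C. This is a sketch under stated assumptions: `demo_extract_bitfield()` is a hypothetical helper, the field position is given as a bit offset counted from the least significant bit of an already-loaded storage value (a little-endian view), and the two shift formulas are chosen for this sketch rather than taken from the relocation logic. It assumes `bit_sz` is between 1 and 64 and `bit_off + bit_sz` does not exceed 64.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract a bitfield from an already-loaded storage value.
 * lshift pushes the field up against bit 63, discarding the bits above it;
 * rshift brings it back down to bit 0, discarding the bits below it.
 */
static uint64_t demo_extract_bitfield(uint64_t storage, unsigned int bit_off,
				      unsigned int bit_sz, bool is_signed)
{
	unsigned int lshift = 64 - (bit_off + bit_sz); /* assumption for this sketch */
	unsigned int rshift = 64 - bit_sz;             /* assumption for this sketch */
	uint64_t val = storage << lshift;

	if (is_signed)
		/* relies on arithmetic right shift of negative values, as GCC and Clang do */
		val = (uint64_t)((int64_t)val >> rshift);
	else
		val >>= rshift;
	return val;
}
```

For example, with a storage byte of `0xB4` and a 4-bit field at bit offset 2, the unsigned result is 13 and the signed result is -3, because the top bit of the field is set and gets replicated by the arithmetic shift.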

				/*

				 * Write to a bitfield, identified by s->field.

				 * This is the inverse of BPF_CORE_READ_BITFIELD().

				 */

				#define BPF_CORE_WRITE_BITFIELD(s, field, new_val) ({			\

					void *p = (void *)s + __CORE_RELO(s, field, BYTE_OFFSET);	\

					unsigned int byte_size = __CORE_RELO(s, field, BYTE_SIZE);	\

					unsigned int lshift = __CORE_RELO(s, field, LSHIFT_U64);	\

					unsigned int rshift = __CORE_RELO(s, field, RSHIFT_U64);	\

					unsigned long long mask, val, nval = new_val;			\

					unsigned int rpad = rshift - lshift;				\

													\

					asm volatile("" : "+r"(p));					\

													\

					switch (byte_size) {						\

					case 1: val = *(unsigned char *)p; break;			\

					case 2: val = *(unsigned short *)p; break;			\

					case 4: val = *(unsigned int *)p; break;			\

					case 8: val = *(unsigned long long *)p; break;			\

					}								\

													\

					mask = (~0ULL << rshift) >> lshift;				\

					val = (val & ~mask) | ((nval << rpad) & mask);			\

													\

					switch (byte_size) {						\

					case 1: *(unsigned char *)p      = val; break;			\

					case 2: *(unsigned short *)p     = val; break;			\

					case 4: *(unsigned int *)p       = val; break;			\

					case 8: *(unsigned long long *)p = val; break;			\

					}								\

				})
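The mask arithmetic in the write macro above can be checked in isolation. A minimal sketch, assuming a hypothetical helper `demo_write_bitfield()` that receives the storage value and both shift amounts directly instead of obtaining them from CO-RE relocations:

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the mask-and-merge step of the write macro:
 * lshift and rshift are the same two shift amounts the read path would use.
 */
static uint64_t demo_write_bitfield(uint64_t storage, unsigned int lshift,
				    unsigned int rshift, uint64_t new_val)
{
	unsigned int rpad = rshift - lshift;         /* bits below the field */
	uint64_t mask = (~0ULL << rshift) >> lshift; /* ones exactly over the field */

	/* clear the old field bits, then merge the new value, truncated to the field */
	return (storage & ~mask) | ((new_val << rpad) & mask);
}
```

With `lshift = 58` and `rshift = 60` (a 4-bit field at bit offset 2), the mask is `0x3C`; writing `0x5` into `0xB4` yields `0x94`, and a value wider than the field is truncated to its low 4 bits.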

				/* Differentiator between compilers' builtin implementations. This is a

				 * requirement due to the compiler parsing differences where GCC optimizes

				 * early in parsing those constructs of type pointers to the builtin specific

				 * type, resulting in not being possible to collect the required type

				 * information in the builtin expansion.

				 */

				#ifdef __clang__

				#define ___bpf_typeof(type) ((typeof(type) *) 0)

				#else

				#define ___bpf_typeof1(type, NR) ({					    \

					extern typeof(type) *___concat(bpf_type_tmp_, NR);		    \

					___concat(bpf_type_tmp_, NR);					    \

				})

				#define ___bpf_typeof(type) ___bpf_typeof1(type, __COUNTER__)

				#endif

				#ifdef __clang__

				#define ___bpf_field_ref1(field)	(field)

				#define ___bpf_field_ref2(type, field)	(___bpf_typeof(type)->field)

				#else

				#define ___bpf_field_ref1(field)	(&(field))

				#define ___bpf_field_ref2(type, field)	(&(___bpf_typeof(type)->field))

				#endif

				#define ___bpf_field_ref(args...)					    \

					___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args)

				/*

				 * Convenience macro to check that field actually exists in target kernel.

				 * Returns:

				 *    1, if matching field is present in target kernel;

				 *    0, if no matching field found.

				 *

				 * Supports two forms:

				 *   - field reference through variable access:

				 *     bpf_core_field_exists(p->my_field);

				 *   - field reference through type and field names:

				 *     bpf_core_field_exists(struct my_type, my_field).

				 */

				#define bpf_core_field_exists(field...)					    \

					__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_EXISTS)

				/*

				 * Convenience macro to get the byte size of a field. Works for integers,

				 * struct/unions, pointers, arrays, and enums.

				 *

				 * Supports two forms:

				 *   - field reference through variable access:

				 *     bpf_core_field_size(p->my_field);

				 *   - field reference through type and field names:

				 *     bpf_core_field_size(struct my_type, my_field).

				 */

				#define bpf_core_field_size(field...)					    \

					__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_BYTE_SIZE)

				/*

				 * Convenience macro to get field's byte offset.

				 *

				 * Supports two forms:

				 *   - field reference through variable access:

				 *     bpf_core_field_offset(p->my_field);

				 *   - field reference through type and field names:

				 *     bpf_core_field_offset(struct my_type, my_field).

				 */

				#define bpf_core_field_offset(field...)					    \

					__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_BYTE_OFFSET)

				/*

				 * Convenience macro to get BTF type ID of a specified type, using local BTF

				 * information. Returns a 32-bit unsigned integer with the type ID from the

				 * program's own BTF. Always succeeds.

				 */

				#define bpf_core_type_id_local(type)					    \

					__builtin_btf_type_id(*___bpf_typeof(type), BPF_TYPE_ID_LOCAL)

				/*

				 * Convenience macro to get BTF type ID of a target kernel's type that matches

				 * specified local type.

				 * Returns:

				 *    - valid 32-bit unsigned type ID in kernel BTF;

				 *    - 0, if no matching type was found in a target kernel BTF.

				 */

				#define bpf_core_type_id_kernel(type)					    \

					__builtin_btf_type_id(*___bpf_typeof(type), BPF_TYPE_ID_TARGET)

				/*

				 * Convenience macro to check that provided named type

				 * (struct/union/enum/typedef) exists in a target kernel.

				 * Returns:

				 *    1, if such type is present in target kernel's BTF;

				 *    0, if no matching type is found.

				 */

				#define bpf_core_type_exists(type)					    \

					__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_EXISTS)

				/*

				 * Convenience macro to check that provided named type

				 * (struct/union/enum/typedef) "matches" that in a target kernel.

				 * Returns:

				 *    1, if the type matches in the target kernel's BTF;

				 *    0, if the type does not match any in the target kernel

				 */

				#define bpf_core_type_matches(type)					    \

					__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_MATCHES)

				/*

				 * Convenience macro to get the byte size of a provided named type

				 * (struct/union/enum/typedef) in a target kernel.

				 * Returns:

				 *    >= 0 size (in bytes), if type is present in target kernel's BTF;

				 *    0, if no matching type is found.

				 */

				#define bpf_core_type_size(type)					    \

					__builtin_preserve_type_info(*___bpf_typeof(type), BPF_TYPE_SIZE)

				/*

				 * Convenience macro to check that provided enumerator value is defined in

				 * a target kernel.

				 * Returns:

				 *    1, if specified enum type and its enumerator value are present in target

				 *    kernel's BTF;

				 *    0, if no matching enum and/or enum value within that enum is found.

				 */

				#ifdef __clang__

				#define bpf_core_enum_value_exists(enum_type, enum_value)		    \

					__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_EXISTS)

				#else

				#define bpf_core_enum_value_exists(enum_type, enum_value)		    \

					__builtin_preserve_enum_value(___bpf_typeof(enum_type), enum_value, BPF_ENUMVAL_EXISTS)

				#endif

				/*

				 * Convenience macro to get the integer value of an enumerator value in

				 * a target kernel.

				 * Returns:

				 *    64-bit value, if specified enum type and its enumerator value are

				 *    present in target kernel's BTF;

				 *    0, if no matching enum and/or enum value within that enum is found.

				 */

				#ifdef __clang__

				#define bpf_core_enum_value(enum_type, enum_value)			    \

					__builtin_preserve_enum_value(*(typeof(enum_type) *)enum_value, BPF_ENUMVAL_VALUE)

				#else

				#define bpf_core_enum_value(enum_type, enum_value)			    \

					__builtin_preserve_enum_value(___bpf_typeof(enum_type), enum_value, BPF_ENUMVAL_VALUE)

				#endif

				/*

				 * bpf_core_read() abstracts away bpf_probe_read_kernel() call and captures

				 * offset relocation for source address using __builtin_preserve_access_index()

				 * built-in, provided by Clang.

				 *

				 * __builtin_preserve_access_index() takes as an argument an expression that

				 * takes the address of a field within a struct/union. It makes the compiler emit

				 * a relocation, which records BTF type ID describing root struct/union and an

				 * accessor string which describes exact embedded field that was used to take

				 * an address. See detailed description of this relocation format and

				 * semantics in comments to struct bpf_core_relo in include/uapi/linux/bpf.h.

				 *

				 * This relocation allows libbpf to adjust BPF instruction to use correct

				 * actual field offset, based on target kernel BTF type that matches original

				 * (local) BTF, used to record relocation.

				 */

				#define bpf_core_read(dst, sz, src)					    \

					bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))

				/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */

				#define bpf_core_read_user(dst, sz, src)				    \

					bpf_probe_read_user(dst, sz, (const void *)__builtin_preserve_access_index(src))

				/*

				 * bpf_core_read_str() is a thin wrapper around bpf_probe_read_kernel_str()

				 * additionally emitting BPF CO-RE field relocation for specified source

				 * argument.

				 */

				#define bpf_core_read_str(dst, sz, src)					    \

					bpf_probe_read_kernel_str(dst, sz, (const void *)__builtin_preserve_access_index(src))

				/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */

				#define bpf_core_read_user_str(dst, sz, src)				    \

					bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))

				extern void *bpf_rdonly_cast(const void *obj, __u32 btf_id) __ksym __weak;

				/*

				 * Cast provided pointer *ptr* into a pointer to a specified *type* in such

				 * a way that BPF verifier will become aware of associated kernel-side BTF

				 * type. This allows accessing members of kernel types directly without the

				 * need to use BPF_CORE_READ() macros.

				 */

				#define bpf_core_cast(ptr, type)					    \

					((typeof(type) *)bpf_rdonly_cast((ptr), bpf_core_type_id_kernel(type)))

				#define ___concat(a, b) a ## b

				#define ___apply(fn, n) ___concat(fn, n)

				#define ___nth(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, __11, N, ...) N

				/*

				 * return number of provided arguments; used for switch-based variadic macro

				 * definitions (see ___last, ___arrow, etc below)

				 */

				#define ___narg(...) ___nth(_, ##__VA_ARGS__, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

				/*

				 * return 0 if no arguments are passed, N - otherwise; used for

				 * recursively-defined macros to specify termination (0) case, and generic

				 * (N) case (e.g., ___read_ptrs, ___core_read)

				 */

				#define ___empty(...) ___nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0)

				#define ___last1(x) x

				#define ___last2(a, x) x

				#define ___last3(a, b, x) x

				#define ___last4(a, b, c, x) x

				#define ___last5(a, b, c, d, x) x

				#define ___last6(a, b, c, d, e, x) x

				#define ___last7(a, b, c, d, e, f, x) x

				#define ___last8(a, b, c, d, e, f, g, x) x

				#define ___last9(a, b, c, d, e, f, g, h, x) x

				#define ___last10(a, b, c, d, e, f, g, h, i, x) x

				#define ___last(...) ___apply(___last, ___narg(__VA_ARGS__))(__VA_ARGS__)

				#define ___nolast2(a, _) a

				#define ___nolast3(a, b, _) a, b

				#define ___nolast4(a, b, c, _) a, b, c

				#define ___nolast5(a, b, c, d, _) a, b, c, d

				#define ___nolast6(a, b, c, d, e, _) a, b, c, d, e

				#define ___nolast7(a, b, c, d, e, f, _) a, b, c, d, e, f

				#define ___nolast8(a, b, c, d, e, f, g, _) a, b, c, d, e, f, g

				#define ___nolast9(a, b, c, d, e, f, g, h, _) a, b, c, d, e, f, g, h

				#define ___nolast10(a, b, c, d, e, f, g, h, i, _) a, b, c, d, e, f, g, h, i

				#define ___nolast(...) ___apply(___nolast, ___narg(__VA_ARGS__))(__VA_ARGS__)

				#define ___arrow1(a) a

				#define ___arrow2(a, b) a->b

				#define ___arrow3(a, b, c) a->b->c

				#define ___arrow4(a, b, c, d) a->b->c->d

				#define ___arrow5(a, b, c, d, e) a->b->c->d->e

				#define ___arrow6(a, b, c, d, e, f) a->b->c->d->e->f

				#define ___arrow7(a, b, c, d, e, f, g) a->b->c->d->e->f->g

				#define ___arrow8(a, b, c, d, e, f, g, h) a->b->c->d->e->f->g->h

				#define ___arrow9(a, b, c, d, e, f, g, h, i) a->b->c->d->e->f->g->h->i

				#define ___arrow10(a, b, c, d, e, f, g, h, i, j) a->b->c->d->e->f->g->h->i->j

				#define ___arrow(...) ___apply(___arrow, ___narg(__VA_ARGS__))(__VA_ARGS__)
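The argument-counting and dispatch pattern used by `___narg`, `___last` and `___arrow` can be tried in a reduced form. The sketch below uses hypothetical `DEMO_*` names and supports at most three arguments; like the macros above, it relies on the GNU `##__VA_ARGS__` extension supported by GCC and Clang.

```c
/* Hypothetical DEMO_* names; a three-argument miniature of the pattern above. */
#define DEMO_CONCAT(a, b) a ## b
#define DEMO_APPLY(fn, n) DEMO_CONCAT(fn, n)

/* Picks the fifth argument; the caller's arguments shift the counter into place. */
#define DEMO_NTH(_0, _1, _2, _3, N, ...) N
#define DEMO_NARG(...) DEMO_NTH(_, ##__VA_ARGS__, 3, 2, 1, 0)

/* Select the last argument by dispatching on the argument count. */
#define DEMO_LAST1(x) x
#define DEMO_LAST2(a, x) x
#define DEMO_LAST3(a, b, x) x
#define DEMO_LAST(...) DEMO_APPLY(DEMO_LAST, DEMO_NARG(__VA_ARGS__))(__VA_ARGS__)

/* Chain the arguments with "->" by dispatching on the argument count. */
#define DEMO_ARROW1(a) a
#define DEMO_ARROW2(a, b) a->b
#define DEMO_ARROW3(a, b, c) a->b->c
#define DEMO_ARROW(...) DEMO_APPLY(DEMO_ARROW, DEMO_NARG(__VA_ARGS__))(__VA_ARGS__)

/* Small hypothetical types to exercise DEMO_ARROW. */
struct demo_inner { int value; };
struct demo_outer { struct demo_inner *inner; };
```

The extra `DEMO_APPLY` level matters: it lets `DEMO_NARG(...)` expand to a number before the token paste, so `DEMO_LAST(10, 20, 30)` becomes `DEMO_LAST3(10, 20, 30)` rather than a pasted, unexpanded name.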

				#if defined(__clang__) && (__clang_major__ >= 19)

				#define ___type(...) __typeof_unqual__(___arrow(__VA_ARGS__))

				#elif defined(__GNUC__) && (__GNUC__ >= 14)

				#define ___type(...) __typeof_unqual__(___arrow(__VA_ARGS__))

				#else

				#define ___type(...) typeof(___arrow(__VA_ARGS__))

				#endif

				#define ___read(read_fn, dst, src_type, src, accessor)			    \

					read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)

				/* "recursively" read a sequence of inner pointers using local __t var */

				#define ___rd_first(fn, src, a) ___read(fn, &__t, ___type(src), src, a);

				#define ___rd_last(fn, ...)						    \

					___read(fn, &__t, ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));

				#define ___rd_p1(fn, ...) const void *__t; ___rd_first(fn, __VA_ARGS__)

				#define ___rd_p2(fn, ...) ___rd_p1(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p3(fn, ...) ___rd_p2(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p4(fn, ...) ___rd_p3(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p5(fn, ...) ___rd_p4(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p6(fn, ...) ___rd_p5(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p7(fn, ...) ___rd_p6(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p8(fn, ...) ___rd_p7(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___rd_p9(fn, ...) ___rd_p8(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)

				#define ___read_ptrs(fn, src, ...)					    \

					___apply(___rd_p, ___narg(__VA_ARGS__))(fn, src, __VA_ARGS__)

				#define ___core_read0(fn, fn_ptr, dst, src, a)				    \

					___read(fn, dst, ___type(src), src, a);

				#define ___core_readN(fn, fn_ptr, dst, src, ...)			    \

					___read_ptrs(fn_ptr, src, ___nolast(__VA_ARGS__))		    \

					___read(fn, dst, ___type(src, ___nolast(__VA_ARGS__)), __t,	    \

						___last(__VA_ARGS__));

				#define ___core_read(fn, fn_ptr, dst, src, a, ...)			    \

					___apply(___core_read, ___empty(__VA_ARGS__))(fn, fn_ptr, dst,	    \

										      src, a, ##__VA_ARGS__)

				/*

				 * BPF_CORE_READ_INTO() is a more performance-conscious variant of

				 * BPF_CORE_READ(), in which final field is read into user-provided storage.

				 * See BPF_CORE_READ() below for more details on general usage.

				 */

				#define BPF_CORE_READ_INTO(dst, src, a, ...) ({				    \

					___core_read(bpf_core_read, bpf_core_read,			    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/*

				 * Variant of BPF_CORE_READ_INTO() for reading from user-space memory.

				 *

				 * NOTE: see comments for BPF_CORE_READ_USER() about proper type usage.

				 */

				#define BPF_CORE_READ_USER_INTO(dst, src, a, ...) ({			    \

					___core_read(bpf_core_read_user, bpf_core_read_user,		    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/* Non-CO-RE variant of BPF_CORE_READ_INTO() */

				#define BPF_PROBE_READ_INTO(dst, src, a, ...) ({			    \

					___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel,	    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO().

				 *

				 * As no CO-RE relocations are emitted, source types can be arbitrary and are

				 * not restricted to kernel types only.

				 */

				#define BPF_PROBE_READ_USER_INTO(dst, src, a, ...) ({			    \

					___core_read(bpf_probe_read_user, bpf_probe_read_user,		    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/*

				 * BPF_CORE_READ_STR_INTO() does same "pointer chasing" as

				 * BPF_CORE_READ() for intermediate pointers, but then executes (and returns

				 * corresponding error code) bpf_core_read_str() for final string read.

				 */

				#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) ({			    \

					___core_read(bpf_core_read_str, bpf_core_read,			    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/*

				 * Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory.

				 *

				 * NOTE: see comments for BPF_CORE_READ_USER() about proper type usage.

				 */

				#define BPF_CORE_READ_USER_STR_INTO(dst, src, a, ...) ({		    \

					___core_read(bpf_core_read_user_str, bpf_core_read_user,	    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */

				#define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({			    \

					___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel,	    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/*

				 * Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO().

				 *

				 * As no CO-RE relocations are emitted, source types can be arbitrary and are

				 * not restricted to kernel types only.

				 */

				#define BPF_PROBE_READ_USER_STR_INTO(dst, src, a, ...) ({		    \

					___core_read(bpf_probe_read_user_str, bpf_probe_read_user,	    \

						     dst, (src), a, ##__VA_ARGS__)			    \

				})

				/*

				 * BPF_CORE_READ() simplifies BPF CO-RE relocatable reads, especially

				 * when there are a few pointer chasing steps.

				 * E.g., what in non-BPF world (or in BPF w/ BCC) would be something like:

				 *	int x = s->a.b.c->d.e->f->g;

				 * can be succinctly achieved using BPF_CORE_READ as:

				 *	int x = BPF_CORE_READ(s, a.b.c, d.e, f, g);

				 *

				 * BPF_CORE_READ will decompose above statement into 4 bpf_core_read (BPF

				 * CO-RE relocatable bpf_probe_read_kernel() wrapper) calls, logically

				 * equivalent to:

				 * 1. const void *__t = s->a.b.c;

				 * 2. __t = __t->d.e;

				 * 3. __t = __t->f;

				 * 4. return __t->g;

				 *

				 * The equivalence is only logical: heavy type casting/preservation is

				 * involved, and all the reads happen through

				 * bpf_probe_read_kernel() calls using __builtin_preserve_access_index() to

				 * emit CO-RE relocations.

				 *

				 * N.B. Only up to 9 "field accessors" are supported, which should be more

				 * than enough for any practical purpose.

				 */

				#define BPF_CORE_READ(src, a, ...) ({					    \

					___type((src), a, ##__VA_ARGS__) __r;				    \

					BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);		    \

					__r;								    \

				})

				/*

				 * Variant of BPF_CORE_READ() for reading from user-space memory.

				 *

				 * NOTE: all the source types involved are still *kernel types* and need to

				 * exist in kernel (or kernel module) BTF, otherwise CO-RE relocation will

				 * fail. Custom user types are not relocatable with CO-RE.

				 * The typical situation in which BPF_CORE_READ_USER() might be used is to

				 * read kernel UAPI types from the user-space memory passed in as a syscall

				 * input argument.

				 */

				#define BPF_CORE_READ_USER(src, a, ...) ({				    \

					___type((src), a, ##__VA_ARGS__) __r;				    \

					BPF_CORE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__);		    \

					__r;								    \

				})

				/* Non-CO-RE variant of BPF_CORE_READ() */

				#define BPF_PROBE_READ(src, a, ...) ({					    \

					___type((src), a, ##__VA_ARGS__) __r;				    \

					BPF_PROBE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);		    \

					__r;								    \

				})

				/*

				 * Non-CO-RE variant of BPF_CORE_READ_USER().

				 *

				 * As no CO-RE relocations are emitted, source types can be arbitrary and are

				 * not restricted to kernel types only.

				 */

				#define BPF_PROBE_READ_USER(src, a, ...) ({				    \

					___type((src), a, ##__VA_ARGS__) __r;				    \

					BPF_PROBE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__);	    \

					__r;								    \

				})

				#endif
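The four-step decomposition described in the BPF_CORE_READ() comment can be sketched in plain user-space C. This is only an illustration under stated assumptions: `fake_read()` is a hypothetical stand-in for bpf_core_read() (a plain memcpy with no CO-RE relocation), and the struct names and `read_g()` are invented for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical types, invented for this sketch only. */
struct leaf  { int g; };
struct mid   { struct leaf *f; };
struct inner { struct { struct mid *e; } d; };
struct outer { struct { struct { struct inner *c; } b; } a; };

/* Stand-in for bpf_core_read(): a plain copy with no CO-RE relocation. */
static int fake_read(void *dst, size_t sz, const void *src)
{
	memcpy(dst, src, sz);
	return 0;
}

/* Mirrors BPF_CORE_READ(s, a.b.c, d.e, f, g): one read per accessor. */
static int read_g(const struct outer *s)
{
	const void *t;	/* plays the role of the macro's local __t */
	int g;

	assert(s);
	fake_read(&t, sizeof(t), &s->a.b.c);				/* 1. __t = s->a.b.c  */
	fake_read(&t, sizeof(t), &((const struct inner *)t)->d.e);	/* 2. __t = __t->d.e  */
	fake_read(&t, sizeof(t), &((const struct mid *)t)->f);		/* 3. __t = __t->f    */
	fake_read(&g, sizeof(g), &((const struct leaf *)t)->g);	/* 4. result = __t->g */
	return g;
}
```

Each intermediate pointer lands in the same untyped local and is cast back to its real type for the next step, which is the role ___type() plays in the macros.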

									
src/bpf_endian.h (new file, 99 lines)
												
				@@ -0,0 +1,99 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				#ifndef __BPF_ENDIAN__

				#define __BPF_ENDIAN__

				/*

				 * Isolate byte #n and put it into byte #m, for __u##b type.

				 * E.g., moving byte #6 (nnnnnnnn) into byte #1 (mmmmmmmm) for __u64:

				 * 1) xxxxxxxx nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx

				 * 2) nnnnnnnn xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx mmmmmmmm xxxxxxxx 00000000

				 * 3) 00000000 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn

				 * 4) 00000000 00000000 00000000 00000000 00000000 00000000 nnnnnnnn 00000000

				 */

				#define ___bpf_mvb(x, b, n, m) ((__u##b)(x) << (b-(n+1)*8) >> (b-8) << (m*8))

				#define ___bpf_swab16(x) ((__u16)(			\

							  ___bpf_mvb(x, 16, 0, 1) |	\

							  ___bpf_mvb(x, 16, 1, 0)))

				#define ___bpf_swab32(x) ((__u32)(			\

							  ___bpf_mvb(x, 32, 0, 3) |	\

							  ___bpf_mvb(x, 32, 1, 2) |	\

							  ___bpf_mvb(x, 32, 2, 1) |	\

							  ___bpf_mvb(x, 32, 3, 0)))

				#define ___bpf_swab64(x) ((__u64)(			\

							  ___bpf_mvb(x, 64, 0, 7) |	\

							  ___bpf_mvb(x, 64, 1, 6) |	\

							  ___bpf_mvb(x, 64, 2, 5) |	\

							  ___bpf_mvb(x, 64, 3, 4) |	\

							  ___bpf_mvb(x, 64, 4, 3) |	\

							  ___bpf_mvb(x, 64, 5, 2) |	\

							  ___bpf_mvb(x, 64, 6, 1) |	\

							  ___bpf_mvb(x, 64, 7, 0)))
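The shift sequence in the ___bpf_mvb() comment can be followed step by step in user-space C. The sketch below is an assumption-laden rewrite, not the header's code: `move_byte64()` and `sk_swab64()` are hypothetical names, and the macro is expressed as a function over `uint64_t` only.

```c
#include <assert.h>
#include <stdint.h>

/* Same shift sequence as ___bpf_mvb(x, 64, n, m), written as a function. */
static uint64_t move_byte64(uint64_t x, unsigned n, unsigned m)
{
	assert(n < 8 && m < 8);
	x <<= 64 - (n + 1) * 8;	/* step 2: byte n becomes the top byte */
	x >>= 64 - 8;		/* step 3: top byte drops to byte 0, the rest is zero */
	x <<= m * 8;		/* step 4: byte 0 moves up to byte m */
	return x;
}

/* Full swap built like ___bpf_swab64(): byte n goes to byte 7 - n. */
static uint64_t sk_swab64(uint64_t x)
{
	uint64_t r = 0;
	unsigned n;

	for (n = 0; n < 8; n++)
		r |= move_byte64(x, n, 7 - n);
	return r;
}
```

With the comment's own example, `move_byte64(0x11AA334455667788ULL, 6, 1)` isolates byte #6 (0xAA) and returns 0xAA00.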

				/* LLVM's BPF target selects the endianness of the CPU

				 * it compiles on, or the user specifies (bpfel/bpfeb),

				 * respectively. The used __BYTE_ORDER__ is defined by

				 * the compiler, we cannot rely on __BYTE_ORDER from

				 * libc headers, since it doesn't reflect the actual

				 * requested byte order.

				 *

				 * Note, LLVM's BPF target has different __builtin_bswapX()

				 * semantics. It does map to BPF_ALU | BPF_END | BPF_TO_BE

				 * in bpfel and bpfeb case, which means below, that we map

				 * to cpu_to_be16(). We could use it unconditionally in BPF

				 * case, but better not rely on it, so that this header here

				 * can be used from application and BPF program side, which

				 * use different targets.

				 */

				#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__

				# define __bpf_ntohs(x)			__builtin_bswap16(x)

				# define __bpf_htons(x)			__builtin_bswap16(x)

				# define __bpf_constant_ntohs(x)	___bpf_swab16(x)

				# define __bpf_constant_htons(x)	___bpf_swab16(x)

				# define __bpf_ntohl(x)			__builtin_bswap32(x)

				# define __bpf_htonl(x)			__builtin_bswap32(x)

				# define __bpf_constant_ntohl(x)	___bpf_swab32(x)

				# define __bpf_constant_htonl(x)	___bpf_swab32(x)

				# define __bpf_be64_to_cpu(x)		__builtin_bswap64(x)

				# define __bpf_cpu_to_be64(x)		__builtin_bswap64(x)

				# define __bpf_constant_be64_to_cpu(x)	___bpf_swab64(x)

				# define __bpf_constant_cpu_to_be64(x)	___bpf_swab64(x)

				#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__

				# define __bpf_ntohs(x)			(x)

				# define __bpf_htons(x)			(x)

				# define __bpf_constant_ntohs(x)	(x)

				# define __bpf_constant_htons(x)	(x)

				# define __bpf_ntohl(x)			(x)

				# define __bpf_htonl(x)			(x)

				# define __bpf_constant_ntohl(x)	(x)

				# define __bpf_constant_htonl(x)	(x)

				# define __bpf_be64_to_cpu(x)		(x)

				# define __bpf_cpu_to_be64(x)		(x)

				# define __bpf_constant_be64_to_cpu(x)  (x)

				# define __bpf_constant_cpu_to_be64(x)  (x)

				#else

				# error "Fix your compiler's __BYTE_ORDER__?!"

				#endif

				#define bpf_htons(x)				\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_htons(x) : __bpf_htons(x))

				#define bpf_ntohs(x)				\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_ntohs(x) : __bpf_ntohs(x))

				#define bpf_htonl(x)				\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_htonl(x) : __bpf_htonl(x))

				#define bpf_ntohl(x)				\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_ntohl(x) : __bpf_ntohl(x))

				#define bpf_cpu_to_be64(x)			\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_cpu_to_be64(x) : __bpf_cpu_to_be64(x))

				#define bpf_be64_to_cpu(x)			\

					(__builtin_constant_p(x) ?		\

					 __bpf_constant_be64_to_cpu(x) : __bpf_be64_to_cpu(x))

				#endif /* __BPF_ENDIAN__ */
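The public macros above pick between an arithmetic swap for compile-time constants and a compiler builtin for run-time values. A minimal user-space sketch of that selection follows, assuming GCC or Clang (for `__builtin_constant_p` and `__builtin_bswap16`); the `sk_` names and `first_wire_byte()` are hypothetical and not part of libbpf.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
/* Pure arithmetic swap, foldable by the compiler for constant input. */
# define sk_swab16(x) \
	((uint16_t)((((uint16_t)(x) & 0x00ffU) << 8) | (((uint16_t)(x) & 0xff00U) >> 8)))
/* Constant input: arithmetic form. Run-time input: compiler builtin. */
# define sk_htons(x) \
	((uint16_t)(__builtin_constant_p(x) ? sk_swab16(x) : __builtin_bswap16(x)))
#else
/* Big-endian host: host order already is network order. */
# define sk_htons(x) ((uint16_t)(x))
#endif

/* Returns the first byte, in memory order, of the converted value. */
static unsigned char first_wire_byte(uint16_t host)
{
	uint16_t be = sk_htons(host);
	unsigned char b[sizeof(be)];

	assert(sizeof(be) == 2);
	memcpy(b, &be, sizeof(b));
	return b[0];
}
```

On either byte order the most significant byte ends up first in memory, which is what network order requires.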

									
src/bpf_gen_internal.h (new file, 75 lines)
												
				@@ -0,0 +1,75 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				/* Copyright (c) 2021 Facebook */

				#ifndef __BPF_GEN_INTERNAL_H

				#define __BPF_GEN_INTERNAL_H

				#include "bpf.h"

				struct ksym_relo_desc {

					const char *name;

					int kind;

					int insn_idx;

					bool is_weak;

					bool is_typeless;

					bool is_ld64;

				};

				struct ksym_desc {

					const char *name;

					int ref;

					int kind;

					union {

						/* used for kfunc */

						int off;

						/* used for typeless ksym */

						bool typeless;

					};

					int insn;

					bool is_ld64;

				};

				struct bpf_gen {

					struct gen_loader_opts *opts;

					void *data_start;

					void *data_cur;

					void *insn_start;

					void *insn_cur;

					bool swapped_endian;

					ssize_t cleanup_label;

					__u32 nr_progs;

					__u32 nr_maps;

					int log_level;

					int error;

					struct ksym_relo_desc *relos;

					int relo_cnt;

					struct bpf_core_relo *core_relos;

					int core_relo_cnt;

					char attach_target[128];

					int attach_kind;

					struct ksym_desc *ksyms;

					__u32 nr_ksyms;

					int fd_array;

					int nr_fd_array;

				};

				void bpf_gen__init(struct bpf_gen *gen, int log_level, int nr_progs, int nr_maps);

				int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps);

				void bpf_gen__free(struct bpf_gen *gen);

				void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);

				void bpf_gen__map_create(struct bpf_gen *gen,

							 enum bpf_map_type map_type, const char *map_name,

							 __u32 key_size, __u32 value_size, __u32 max_entries,

							 struct bpf_map_create_opts *map_attr, int map_idx);

				void bpf_gen__prog_load(struct bpf_gen *gen,

							enum bpf_prog_type prog_type, const char *prog_name,

							const char *license, struct bpf_insn *insns, size_t insn_cnt,

							struct bpf_prog_load_opts *load_attr, int prog_idx);

				void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);

				void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);

				void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);

				void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,

							    bool is_typeless, bool is_ld64, int kind, int insn_idx);

				void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo);

				void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int key, int inner_map_idx);

				#endif

src/bpf_helper_defs.h (new file, 4779 lines)

File diff suppressed because it is too large

									
src/bpf_helpers.h (new file, 432 lines)
												
				@@ -0,0 +1,432 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				#ifndef __BPF_HELPERS__

				#define __BPF_HELPERS__

				/*

				 * Note that bpf programs need to include either

				 * vmlinux.h (auto-generated from BTF) or linux/types.h

				 * in advance since bpf_helper_defs.h uses such types

				 * as __u64.

				 */

				#include "bpf_helper_defs.h"

				#define __uint(name, val) int (*name)[val]

				#define __type(name, val) typeof(val) *name

				#define __array(name, val) typeof(val) *name[]

				#define __ulong(name, val) enum { ___bpf_concat(__unique_value, __COUNTER__) = val } name

				#ifndef likely

				#define likely(x)      (__builtin_expect(!!(x), 1))

				#endif

				#ifndef unlikely

				#define unlikely(x)    (__builtin_expect(!!(x), 0))

				#endif

				/*

				 * Helper macro to place programs, maps, license in

				 * different sections in elf_bpf file. Section names

				 * are interpreted by libbpf depending on the context (BPF programs, BPF maps,

				 * extern variables, etc).

				 * To allow use of SEC() with externs (e.g., for extern .maps declarations),

				 * make sure __attribute__((unused)) doesn't trigger compilation warning.

				 */

				#if __GNUC__ && !__clang__

				/*

				 * Pragma macros are broken on GCC

				 * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578

				 * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400

				 */

				#define SEC(name) __attribute__((section(name), used))

				#else

				#define SEC(name) \

					_Pragma("GCC diagnostic push")					    \

					_Pragma("GCC diagnostic ignored \"-Wignored-attributes\"")	    \

					__attribute__((section(name), used))				    \

					_Pragma("GCC diagnostic pop")					    \

				#endif

				/* Avoid 'linux/stddef.h' definition of '__always_inline'. */

				#undef __always_inline

				#define __always_inline inline __attribute__((always_inline))

				#ifndef __noinline

				#define __noinline __attribute__((noinline))

				#endif

				#ifndef __weak

				#define __weak __attribute__((weak))

				#endif

				/*

				 * Use __hidden attribute to mark a non-static BPF subprogram effectively

				 * static for BPF verifier's verification algorithm purposes, allowing more

				 * extensive and permissive BPF verification process, taking into account

				 * subprogram's caller context.

				 */

				#define __hidden __attribute__((visibility("hidden")))

				/* When utilizing vmlinux.h with BPF CO-RE, user BPF programs can't include

				 * any system-level headers (such as stddef.h, linux/version.h, etc), and

				 * commonly-used macros like NULL and KERNEL_VERSION aren't available through

				 * vmlinux.h. This just adds unnecessary hurdles and forces users to re-define

				 * them on their own. So as a convenience, provide such definitions here.

				 */

				#ifndef NULL

				#define NULL ((void *)0)

				#endif

				#ifndef KERNEL_VERSION

				#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

				#endif
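As an illustration of the KERNEL_VERSION() encoding, the sketch below copies the same arithmetic under a sketch-local name and compares version codes. `SK_KERNEL_VERSION` and `kernel_at_least()` are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* Same arithmetic as the KERNEL_VERSION() fallback, under a sketch-local name. */
#define SK_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/* Nonzero if the running version code is at least a.b.c. */
static int kernel_at_least(int running, int a, int b, int c)
{
	assert(a >= 0 && b >= 0 && b <= 255 && c >= 0);
	return running >= SK_KERNEL_VERSION(a, b, c);
}
```

Because the sublevel is clamped to 255, a stable release such as x.y.300 encodes the same as x.y.255 and never spills into the minor-version byte.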

				/*

				 * Helper macros to manipulate data structures

				 */

				/* offsetof() definition that uses __builtin_offsetof() might not preserve field

				 * offset CO-RE relocation properly, so force-redefine offsetof() using

				 * old-school approach which works with CO-RE correctly

				 */

				#undef offsetof

				#define offsetof(type, member)	((unsigned long)&((type *)0)->member)

				/* redefined container_of() to ensure we use the above offsetof() macro */

				#undef container_of

				#define container_of(ptr, type, member)				\

					({							\

						void *__mptr = (void *)(ptr);			\

						((type *)(__mptr - offsetof(type, member)));	\

					})
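A user-space sketch of what container_of() computes: given a pointer to an embedded member, step back by the member's offset to reach the enclosing object. This version assumes the standard `offsetof` from `<stddef.h>` rather than the redefinition above, and avoids the GNU statement expression; `sk_container_of`, `struct task`, and `pid_of()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration. */
struct list_node { struct list_node *next; };
struct task { int pid; struct list_node node; };

/* Same idea as container_of(), using char * arithmetic. */
#define sk_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the owning task from a pointer to its embedded list node. */
static int pid_of(struct list_node *n)
{
	assert(n);
	return sk_container_of(n, struct task, node)->pid;
}
```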

				/*

				 * Compiler (optimization) barrier.

				 */

				#ifndef barrier

				#define barrier() asm volatile("" ::: "memory")

				#endif

				/* Variable-specific compiler (optimization) barrier. It's a no-op which makes

				 * compiler believe that there is some black box modification of a given

				 * variable and thus prevents compiler from making extra assumption about its

				 * value and potential simplifications and optimizations on this variable.

				 *

				 * E.g., compiler might often delay or even omit 32-bit to 64-bit casting of

				 * a variable, making some code patterns unverifiable. Putting barrier_var()

				 * in place will ensure that cast is performed before the barrier_var()

				 * invocation, because compiler has to pessimistically assume that embedded

				 * asm section might perform some extra operations on that variable.

				 *

				 * This is a variable-specific variant of more global barrier().

				 */

				#ifndef barrier_var

				#define barrier_var(var) asm volatile("" : "+r"(var))

				#endif

				/*

				 * Helper macro to throw a compilation error if __bpf_unreachable() gets

				 * built into the resulting code. This works given BPF back end does not

				 * implement __builtin_trap(). This is useful to assert that certain paths

				 * of the program code are never used and hence eliminated by the compiler.

				 *

				 * For example, consider a switch statement that covers known cases used by

				 * the program. __bpf_unreachable() can then reside in the default case. If

				 * the program gets extended such that a case is not covered in the switch

				 * statement, then it will throw a build error due to the default case not

				 * being compiled out.

				 */

				#ifndef __bpf_unreachable

				# define __bpf_unreachable()	__builtin_trap()

				#endif

				/*

				 * Helper function to perform a tail call with a constant/immediate map slot.

				 */

				#if (defined(__clang__) && __clang_major__ >= 8) || (!defined(__clang__) && __GNUC__ > 12)

				#if defined(__bpf__)

				static __always_inline void

				bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)

				{

					if (!__builtin_constant_p(slot))

						__bpf_unreachable();

					/*

					 * Provide a hard guarantee that LLVM won't optimize setting r2 (map

					 * pointer) and r3 (constant map index) from _different paths_ ending

					 * up at the _same_ call insn as otherwise we won't be able to use the

					 * jmpq/nopl retpoline-free patching by the x86-64 JIT in the kernel

					 * given they mismatch. See also d2e4c1e6c294 ("bpf: Constant map key

					 * tracking for prog array pokes") for details on verifier tracking.

					 *

					 * Note on clobber list: we need to stay in-line with BPF calling

					 * convention, so even if we don't end up using r0, r4, r5, we need

					 * to mark them as clobber so that LLVM doesn't end up using them

					 * before / after the call.

					 */

					asm volatile("r1 = %[ctx]\n\t"

						     "r2 = %[map]\n\t"

						     "r3 = %[slot]\n\t"

						     "call 12"

						     :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)

						     : "r0", "r1", "r2", "r3", "r4", "r5");

				}

				#endif

				#endif

				enum libbpf_pin_type {

					LIBBPF_PIN_NONE,

					/* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */

					LIBBPF_PIN_BY_NAME,

				};

				enum libbpf_tristate {

					TRI_NO = 0,

					TRI_YES = 1,

					TRI_MODULE = 2,

				};

				#define __kconfig __attribute__((section(".kconfig")))

				#define __ksym __attribute__((section(".ksyms")))

				#define __kptr_untrusted __attribute__((btf_type_tag("kptr_untrusted")))

				#define __kptr __attribute__((btf_type_tag("kptr")))

				#define __percpu_kptr __attribute__((btf_type_tag("percpu_kptr")))

				#define __uptr __attribute__((btf_type_tag("uptr")))

				#if defined (__clang__)

				#define bpf_ksym_exists(sym) ({						\

					_Static_assert(!__builtin_constant_p(!!sym),			\

						       #sym " should be marked as __weak");		\

					!!sym;								\

				})

				#elif __GNUC__ > 8

				#define bpf_ksym_exists(sym) ({						\

					_Static_assert(__builtin_has_attribute (*sym, __weak__),	\

						       #sym " should be marked as __weak");		\

					!!sym;								\

				})

				#else

				#define bpf_ksym_exists(sym) !!sym

				#endif

				#define __arg_ctx __attribute__((btf_decl_tag("arg:ctx")))

				#define __arg_nonnull __attribute((btf_decl_tag("arg:nonnull")))

				#define __arg_nullable __attribute((btf_decl_tag("arg:nullable")))

				#define __arg_trusted __attribute((btf_decl_tag("arg:trusted")))

				#define __arg_arena __attribute((btf_decl_tag("arg:arena")))

				#ifndef ___bpf_concat

				#define ___bpf_concat(a, b) a ## b

				#endif

				#ifndef ___bpf_apply

				#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)

				#endif

				#ifndef ___bpf_nth

				#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N

				#endif

				#ifndef ___bpf_narg

				#define ___bpf_narg(...) \

					___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

				#endif

				#define ___bpf_fill0(arr, p, x) do {} while (0)

				#define ___bpf_fill1(arr, p, x) arr[p] = x

				#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)

				#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)

				#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)

				#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)

				#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)

				#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)

				#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)

				#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)

				#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)

				#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)

				#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)

				#define ___bpf_fill(arr, args...) \

					___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)

				/*

				 * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values

				 * in a structure.

				 */

				#define BPF_SEQ_PRINTF(seq, fmt, args...)			\

				({								\

					static const char ___fmt[] = fmt;			\

					unsigned long long ___param[___bpf_narg(args)];		\

												\

					_Pragma("GCC diagnostic push")				\

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	\

					___bpf_fill(___param, args);				\

					_Pragma("GCC diagnostic pop")				\

												\

					bpf_seq_printf(seq, ___fmt, sizeof(___fmt),		\

						       ___param, sizeof(___param));		\

				})

				/*

				 * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of

				 * an array of u64.

				 */

				#define BPF_SNPRINTF(out, out_size, fmt, args...)		\

				({								\

					static const char ___fmt[] = fmt;			\

					unsigned long long ___param[___bpf_narg(args)];		\

												\

					_Pragma("GCC diagnostic push")				\

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	\

					___bpf_fill(___param, args);				\

					_Pragma("GCC diagnostic pop")				\

												\

					bpf_snprintf(out, out_size, ___fmt,			\

						     ___param, sizeof(___param));		\

				})
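The argument packing shared by BPF_SEQ_PRINTF and BPF_SNPRINTF can be tried outside BPF. The sketch below is a cut-down copy of the counting and filling macros under hypothetical `sk_` names, limited to three arguments; it assumes GCC or Clang in their default GNU mode because of `##__VA_ARGS__`. `pack3()` is invented for the example.

```c
#include <assert.h>

#define sk_concat(a, b) a ## b
#define sk_apply(fn, n) sk_concat(fn, n)
/* Returns its 5th argument; each extra caller argument shifts the count list right. */
#define sk_nth(_, _1, _2, _3, N, ...) N
#define sk_narg(...) sk_nth(_, ##__VA_ARGS__, 3, 2, 1, 0)

#define sk_fill1(arr, p, x) arr[p] = x
#define sk_fill2(arr, p, x, ...) arr[p] = x; sk_fill1(arr, p + 1, __VA_ARGS__)
#define sk_fill3(arr, p, x, ...) arr[p] = x; sk_fill2(arr, p + 1, __VA_ARGS__)
/* Dispatches to sk_fillN, where N is the number of values. */
#define sk_fill(arr, ...) sk_apply(sk_fill, sk_narg(__VA_ARGS__))(arr, 0, __VA_ARGS__)

/* Packs three values of different types into 64-bit slots; returns the slot count. */
static int pack3(unsigned long long *out, int a, long b, unsigned c)
{
	unsigned long long param[sk_narg(a, b, c)];
	int i, n = sizeof(param) / sizeof(param[0]);

	assert(out);
	sk_fill(param, a, b, c);
	for (i = 0; i < n; i++)
		out[i] = param[i];
	return n;
}
```

The array is sized by the same count that selects the fill macro, so the slot count and the number of stores cannot drift apart.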

				#ifdef BPF_NO_GLOBAL_DATA

				#define BPF_PRINTK_FMT_MOD

				#else

				#define BPF_PRINTK_FMT_MOD static const

				#endif

				#define __bpf_printk(fmt, ...)				\

				({							\

					BPF_PRINTK_FMT_MOD char ____fmt[] = fmt;	\

					bpf_trace_printk(____fmt, sizeof(____fmt),	\

							 ##__VA_ARGS__);		\

				})

				/*

				 * __bpf_vprintk wraps the bpf_trace_vprintk helper with variadic arguments

				 * instead of an array of u64.

				 */

				#define __bpf_vprintk(fmt, args...)				\

				({								\

					static const char ___fmt[] = fmt;			\

					unsigned long long ___param[___bpf_narg(args)];		\

												\

					_Pragma("GCC diagnostic push")				\

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	\

					___bpf_fill(___param, args);				\

					_Pragma("GCC diagnostic pop")				\

												\

					bpf_trace_vprintk(___fmt, sizeof(___fmt),		\

							  ___param, sizeof(___param));		\

				})

				/* Use __bpf_printk when bpf_printk call has 3 or fewer fmt args

				 * Otherwise use __bpf_vprintk

				 */

				#define ___bpf_pick_printk(...) \

					___bpf_nth(_, ##__VA_ARGS__, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk,	\

						   __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk,		\

						   __bpf_vprintk, __bpf_vprintk, __bpf_printk /*3*/, __bpf_printk /*2*/,\

						   __bpf_printk /*1*/, __bpf_printk /*0*/)

				/* Helper macro to print out debug messages */

				#define bpf_printk(fmt, args...) ___bpf_pick_printk(args)(fmt, ##args)
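The selection by argument count can be sketched in user space. Everything here is hypothetical: `pk_direct()` and `pk_array()` stand in for __bpf_printk() and __bpf_vprintk() and only report which path was chosen, and the sketch supports at most five arguments. It assumes GCC or Clang for `##__VA_ARGS__`.

```c
#include <assert.h>

enum pk_path { PK_DIRECT, PK_ARRAY };

/* Stand-ins for __bpf_printk() and __bpf_vprintk(): report the chosen path only. */
#define pk_direct(fmt, ...) PK_DIRECT
#define pk_array(fmt, ...)  PK_ARRAY

/* Returns its 7th argument. */
#define pk_nth(_, _1, _2, _3, _4, _5, N, ...) N
/* 0..3 arguments select pk_direct, 4..5 select pk_array. */
#define pk_pick(...) \
	pk_nth(_, ##__VA_ARGS__, pk_array, pk_array, \
	       pk_direct /*3*/, pk_direct /*2*/, pk_direct /*1*/, pk_direct /*0*/)
#define pk_printk(fmt, ...) pk_pick(__VA_ARGS__)(fmt, ##__VA_ARGS__)

/* Self-check of the dispatch. */
static void pk_check(void)
{
	assert(pk_printk("no args") == PK_DIRECT);
	assert(pk_printk("%d %d %d", 1, 2, 3) == PK_DIRECT);
	assert(pk_printk("%d %d %d %d", 1, 2, 3, 4) == PK_ARRAY);
}
```

Each caller argument pushes the candidate list one position to the right, so the name that lands in the selected slot depends only on how many arguments were passed.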

				struct bpf_iter_num;

				extern int bpf_iter_num_new(struct bpf_iter_num *it, int start, int end) __weak __ksym;

				extern int *bpf_iter_num_next(struct bpf_iter_num *it) __weak __ksym;

				extern void bpf_iter_num_destroy(struct bpf_iter_num *it) __weak __ksym;

				#ifndef bpf_for_each

				/* bpf_for_each(iter_type, cur_elem, args...) provides generic construct for

				 * using BPF open-coded iterators without having to write mundane explicit

				 * low-level loop logic. Instead, it provides for()-like generic construct

				 * that can be used pretty naturally. E.g., for some hypothetical cgroup

				 * iterator, you'd write:

				 *

				 * struct cgroup *cg, *parent_cg = <...>;

				 *

				 * bpf_for_each(cgroup, cg, parent_cg, CG_ITER_CHILDREN) {

				 *     bpf_printk("Child cgroup id = %d", cg->cgroup_id);

				 *     if (cg->cgroup_id == 123)

				 *         break;

				 * }

				 *

				 * I.e., it looks almost like high-level for each loop in other languages,

				 * supports continue/break, and is verifiable by BPF verifier.

				 *

				 * For iterating integers, the difference between bpf_for_each(num, i, N, M)

				 * and bpf_for(i, N, M) is in that bpf_for() provides additional proof to

				 * verifier that i is in [N, M) range, and in bpf_for_each() case i is `int

				 * *`, not just `int`. So for integers bpf_for() is more convenient.

				 *

				 * Note: this macro relies on C99 feature of allowing to declare variables

				 * inside for() loop, bound to for() loop lifetime. It also utilizes GCC

				 * extension: __attribute__((cleanup(<func>))), supported by both GCC and

				 * Clang.

				 */

				#define bpf_for_each(type, cur, args...) for (							\

					/* initialize and define destructor */							\

					struct bpf_iter_##type ___it __attribute__((aligned(8), /* enforce, just in case */,	\

										    cleanup(bpf_iter_##type##_destroy))),	\

					/* ___p pointer is just to call bpf_iter_##type##_new() *once* to init ___it */		\

							       *___p __attribute__((unused)) = (				\

									bpf_iter_##type##_new(&___it, ##args),			\

					/* this is a workaround for Clang bug: it currently doesn't emit BTF */			\

					/* for bpf_iter_##type##_destroy() when used from cleanup() attribute */		\

									(void)bpf_iter_##type##_destroy, (void *)0);		\

					/* iteration and termination check */							\

					(((cur) = bpf_iter_##type##_next(&___it)));						\

				)

				#endif /* bpf_for_each */

				#ifndef bpf_for

				/* bpf_for(i, start, end) implements a for()-like looping construct that sets

				 * provided integer variable *i* to values starting from *start* through,

				 * but not including, *end*. It also proves to BPF verifier that *i* belongs

				 * to range [start, end), so this can be used for accessing arrays without

				 * extra checks.

				 *

				 * Note: *start* and *end* are assumed to be expressions with no side effects

				 * and whose values do not change throughout bpf_for() loop execution. They do

				 * not have to be statically known or constant, though.

				 *

				 * Note: similarly to bpf_for_each(), it relies on C99 feature of declaring for()

				 * loop bound variables and cleanup attribute, supported by GCC and Clang.

				 */

				#define bpf_for(i, start, end) for (								\

					/* initialize and define destructor */							\

					struct bpf_iter_num ___it __attribute__((aligned(8), /* enforce, just in case */	\

										 cleanup(bpf_iter_num_destroy))),		\

					/* ___p pointer is necessary to call bpf_iter_num_new() *once* to init ___it */		\

							    *___p __attribute__((unused)) = (					\

								bpf_iter_num_new(&___it, (start), (end)),			\

					/* this is a workaround for Clang bug: it currently doesn't emit BTF */			\

					/* for bpf_iter_num_destroy() when used from cleanup() attribute */			\

								(void)bpf_iter_num_destroy, (void *)0);				\

					({											\

						/* iteration step */								\

						int *___t = bpf_iter_num_next(&___it);						\

						/* termination and bounds check */						\

						(___t && ((i) = *___t, (i) >= (start) && (i) < (end)));				\

					});											\

				)

				#endif /* bpf_for */
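The loop shape used by bpf_for() can be exercised in user space with a stub iterator. This is a sketch under stated assumptions: `struct sk_iter` and the `sk_iter_*` functions are hypothetical stand-ins for struct bpf_iter_num and its kfuncs, `sk_destroy_calls` exists only to make the destructor visible, and the code needs GCC or Clang for the cleanup attribute and statement expressions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct bpf_iter_num. */
struct sk_iter { int cur, next, end; };
static int sk_destroy_calls;	/* counts destructor runs, for demonstration */

static int sk_iter_new(struct sk_iter *it, int start, int end)
{
	assert(it);
	it->next = start;
	it->end = end;
	return 0;
}

/* Returns a pointer to the next value, or NULL once the range is exhausted. */
static int *sk_iter_next(struct sk_iter *it)
{
	if (it->next >= it->end)
		return NULL;
	it->cur = it->next++;
	return &it->cur;
}

static void sk_iter_destroy(struct sk_iter *it)
{
	(void)it;
	sk_destroy_calls++;
}

#define sk_for(i, start, end) for (						\
	/* iterator with a scope-exit destructor */				\
	struct sk_iter sk_it __attribute__((cleanup(sk_iter_destroy))),	\
	/* dummy pointer whose initializer runs sk_iter_new() exactly once */	\
		       *sk_p __attribute__((unused)) = (			\
			sk_iter_new(&sk_it, (start), (end)), (void *)0);	\
	({									\
		int *sk_t = sk_iter_next(&sk_it);				\
		/* termination and bounds check */				\
		(sk_t && ((i) = *sk_t, (i) >= (start) && (i) < (end)));	\
	});									\
)

/* Sums the integers in [start, end) using the loop macro. */
static int sum_range(int start, int end)
{
	int i, sum = 0;

	sk_for(i, start, end)
		sum += i;
	return sum;
}
```

The destructor runs once per loop, when the iterator variable leaves the scope of the for statement, including for an empty range.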

				#ifndef bpf_repeat

				/* bpf_repeat(N) performs N iterations without exposing iteration number

				 *

				 * Note: similarly to bpf_for_each(), it relies on C99 feature of declaring for()

				 * loop bound variables and cleanup attribute, supported by GCC and Clang.

				 */

				#define bpf_repeat(N) for (									\

					/* initialize and define destructor */							\

					struct bpf_iter_num ___it __attribute__((aligned(8), /* enforce, just in case */	\

										 cleanup(bpf_iter_num_destroy))),		\

					/* ___p pointer is necessary to call bpf_iter_num_new() *once* to init ___it */		\

							    *___p __attribute__((unused)) = (					\

								bpf_iter_num_new(&___it, 0, (N)),				\

					/* this is a workaround for Clang bug: it currently doesn't emit BTF */			\

					/* for bpf_iter_num_destroy() when used from cleanup() attribute */			\

								(void)bpf_iter_num_destroy, (void *)0);				\

					bpf_iter_num_next(&___it);								\

					/* nothing here  */									\

				)

				#endif /* bpf_repeat */

				#endif

									
src/bpf_prog_linfo.c (32 lines changed)
												
				@@ -101,11 +101,12 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)

				{

					struct bpf_prog_linfo *prog_linfo;

					__u32 nr_linfo, nr_jited_func;

				+	__u64 data_sz;

					nr_linfo = info->nr_line_info;

					if (!nr_linfo)

				-		return NULL;

				+		return errno = EINVAL, NULL;

					/*

					 * The min size that bpf_prog_linfo has to access for

				@@ -113,20 +114,20 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)

					 */

					if (info->line_info_rec_size <

					    offsetof(struct bpf_line_info, file_name_off))

				-		return NULL;

				+		return errno = EINVAL, NULL;

					prog_linfo = calloc(1, sizeof(*prog_linfo));

					if (!prog_linfo)

				-		return NULL;

				+		return errno = ENOMEM, NULL;

					/* Copy xlated line_info */

					prog_linfo->nr_linfo = nr_linfo;

					prog_linfo->rec_size = info->line_info_rec_size;

				-	prog_linfo->raw_linfo = malloc(nr_linfo * prog_linfo->rec_size);

				+	data_sz = (__u64)nr_linfo * prog_linfo->rec_size;

				+	prog_linfo->raw_linfo = malloc(data_sz);

					if (!prog_linfo->raw_linfo)

						goto err_free;

				-	memcpy(prog_linfo->raw_linfo, (void *)(long)info->line_info,

				-	       nr_linfo * prog_linfo->rec_size);

				+	memcpy(prog_linfo->raw_linfo, (void *)(long)info->line_info, data_sz);

					nr_jited_func = info->nr_jited_ksyms;

					if (!nr_jited_func ||

				@@ -142,13 +143,12 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)

					/* Copy jited_line_info */

					prog_linfo->nr_jited_func = nr_jited_func;

					prog_linfo->jited_rec_size = info->jited_line_info_rec_size;

				-	prog_linfo->raw_jited_linfo = malloc(nr_linfo *

				-					     prog_linfo->jited_rec_size);

				+	data_sz = (__u64)nr_linfo * prog_linfo->jited_rec_size;

				+	prog_linfo->raw_jited_linfo = malloc(data_sz);

					if (!prog_linfo->raw_jited_linfo)

						goto err_free;

					memcpy(prog_linfo->raw_jited_linfo,

				-	       (void *)(long)info->jited_line_info,

				-	       nr_linfo * prog_linfo->jited_rec_size);

				+	       (void *)(long)info->jited_line_info, data_sz);

					/* Number of jited_line_info per jited func */

					prog_linfo->nr_jited_linfo_per_func = malloc(nr_jited_func *

				@@ -174,7 +174,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)

				err_free:

					bpf_prog_linfo__free(prog_linfo);

				-	return NULL;

				+	return errno = EINVAL, NULL;

				}

				const struct bpf_line_info *

				@@ -186,11 +186,11 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,

					const __u64 *jited_linfo;

					if (func_idx >= prog_linfo->nr_jited_func)

				-		return NULL;

				+		return errno = ENOENT, NULL;

					nr_linfo = prog_linfo->nr_jited_linfo_per_func[func_idx];

					if (nr_skip >= nr_linfo)

				-		return NULL;

				+		return errno = ENOENT, NULL;

					start = prog_linfo->jited_linfo_func_idx[func_idx] + nr_skip;

					jited_rec_size = prog_linfo->jited_rec_size;

				@@ -198,7 +198,7 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,

						(start * jited_rec_size);

					jited_linfo = raw_jited_linfo;

					if (addr < *jited_linfo)

				-		return NULL;

				+		return errno = ENOENT, NULL;

					nr_linfo -= nr_skip;

					rec_size = prog_linfo->rec_size;

				@@ -225,13 +225,13 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo,

					nr_linfo = prog_linfo->nr_linfo;

					if (nr_skip >= nr_linfo)

				-		return NULL;

				+		return errno = ENOENT, NULL;

					rec_size = prog_linfo->rec_size;

					raw_linfo = prog_linfo->raw_linfo + (nr_skip * rec_size);

					linfo = raw_linfo;

					if (insn_off < linfo->insn_off)

				-		return NULL;

				+		return errno = ENOENT, NULL;

					nr_linfo -= nr_skip;

					for (i = 0; i < nr_linfo; i++) {

									

src/bpf_tracing.h
									
										Normal file
									
				@@ -0,0 +1,929 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				#ifndef __BPF_TRACING_H__

				#define __BPF_TRACING_H__

				#include "bpf_helpers.h"

				/* Scan the ARCH passed in from ARCH env variable (see Makefile) */

				#if defined(__TARGET_ARCH_x86)

					#define bpf_target_x86

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_s390)

					#define bpf_target_s390

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_arm)

					#define bpf_target_arm

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_arm64)

					#define bpf_target_arm64

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_mips)

					#define bpf_target_mips

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_powerpc)

					#define bpf_target_powerpc

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_sparc)

					#define bpf_target_sparc

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_riscv)

					#define bpf_target_riscv

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_arc)

					#define bpf_target_arc

					#define bpf_target_defined

				#elif defined(__TARGET_ARCH_loongarch)

					#define bpf_target_loongarch

					#define bpf_target_defined

				#else

				/* Fall back to what the compiler says */

				#if defined(__x86_64__)

					#define bpf_target_x86

					#define bpf_target_defined

				#elif defined(__s390__)

					#define bpf_target_s390

					#define bpf_target_defined

				#elif defined(__arm__)

					#define bpf_target_arm

					#define bpf_target_defined

				#elif defined(__aarch64__)

					#define bpf_target_arm64

					#define bpf_target_defined

				#elif defined(__mips__)

					#define bpf_target_mips

					#define bpf_target_defined

				#elif defined(__powerpc__)

					#define bpf_target_powerpc

					#define bpf_target_defined

				#elif defined(__sparc__)

					#define bpf_target_sparc

					#define bpf_target_defined

				#elif defined(__riscv) && __riscv_xlen == 64

					#define bpf_target_riscv

					#define bpf_target_defined

				#elif defined(__arc__)

					#define bpf_target_arc

					#define bpf_target_defined

				#elif defined(__loongarch__)

					#define bpf_target_loongarch

					#define bpf_target_defined

				#endif /* no compiler target */

				#endif

				#ifndef __BPF_TARGET_MISSING

				#define __BPF_TARGET_MISSING "GCC error \"Must specify a BPF target arch via __TARGET_ARCH_xxx\""

				#endif

				#if defined(bpf_target_x86)

				/*

				 * https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI

				 */

				#if defined(__KERNEL__) || defined(__VMLINUX_H__)

				#define __PT_PARM1_REG di

				#define __PT_PARM2_REG si

				#define __PT_PARM3_REG dx

				#define __PT_PARM4_REG cx

				#define __PT_PARM5_REG r8

				#define __PT_PARM6_REG r9

				/*

				 * Syscall uses r10 for PARM4. See arch/x86/entry/entry_64.S:entry_SYSCALL_64

				 * comments in Linux sources. And refer to syscall(2) manpage.

				 */

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG r10

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define __PT_RET_REG sp

				#define __PT_FP_REG bp

				#define __PT_RC_REG ax

				#define __PT_SP_REG sp

				#define __PT_IP_REG ip

				#else

				#ifdef __i386__

				/* i386 kernel is built with -mregparm=3 */

				#define __PT_PARM1_REG eax

				#define __PT_PARM2_REG edx

				#define __PT_PARM3_REG ecx

				/* i386 syscall ABI is very different, refer to syscall(2) manpage */

				#define __PT_PARM1_SYSCALL_REG ebx

				#define __PT_PARM2_SYSCALL_REG ecx

				#define __PT_PARM3_SYSCALL_REG edx

				#define __PT_PARM4_SYSCALL_REG esi

				#define __PT_PARM5_SYSCALL_REG edi

				#define __PT_PARM6_SYSCALL_REG ebp

				#define __PT_RET_REG esp

				#define __PT_FP_REG ebp

				#define __PT_RC_REG eax

				#define __PT_SP_REG esp

				#define __PT_IP_REG eip

				#else /* __i386__ */

				#define __PT_PARM1_REG rdi

				#define __PT_PARM2_REG rsi

				#define __PT_PARM3_REG rdx

				#define __PT_PARM4_REG rcx

				#define __PT_PARM5_REG r8

				#define __PT_PARM6_REG r9

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG r10

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define __PT_RET_REG rsp

				#define __PT_FP_REG rbp

				#define __PT_RC_REG rax

				#define __PT_SP_REG rsp

				#define __PT_IP_REG rip

				#endif /* __i386__ */

				#endif /* __KERNEL__ || __VMLINUX_H__ */

				#elif defined(bpf_target_s390)

				/*

				 * https://github.com/IBM/s390x-abi/releases/download/v1.6/lzsabi_s390x.pdf

				 */

				struct pt_regs___s390 {

					unsigned long orig_gpr2;

				} __attribute__((preserve_access_index));

				/* s390 provides user_pt_regs instead of struct pt_regs to userspace */

				#define __PT_REGS_CAST(x) ((const user_pt_regs *)(x))

				#define __PT_PARM1_REG gprs[2]

				#define __PT_PARM2_REG gprs[3]

				#define __PT_PARM3_REG gprs[4]

				#define __PT_PARM4_REG gprs[5]

				#define __PT_PARM5_REG gprs[6]

				#define __PT_PARM1_SYSCALL_REG orig_gpr2

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG gprs[7]

				#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___s390 *)(x))->__PT_PARM1_SYSCALL_REG)

				#define PT_REGS_PARM1_CORE_SYSCALL(x) \

					BPF_CORE_READ((const struct pt_regs___s390 *)(x), __PT_PARM1_SYSCALL_REG)

				#define __PT_RET_REG gprs[14]

				#define __PT_FP_REG gprs[11]	/* Works only with CONFIG_FRAME_POINTER */

				#define __PT_RC_REG gprs[2]

				#define __PT_SP_REG gprs[15]

				#define __PT_IP_REG psw.addr

				#elif defined(bpf_target_arm)

				/*

				 * https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#machine-registers

				 */

				#define __PT_PARM1_REG uregs[0]

				#define __PT_PARM2_REG uregs[1]

				#define __PT_PARM3_REG uregs[2]

				#define __PT_PARM4_REG uregs[3]

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG uregs[4]

				#define __PT_PARM6_SYSCALL_REG uregs[5]

				#define __PT_PARM7_SYSCALL_REG uregs[6]

				#define __PT_RET_REG uregs[14]

				#define __PT_FP_REG uregs[11]	/* Works only with CONFIG_FRAME_POINTER */

				#define __PT_RC_REG uregs[0]

				#define __PT_SP_REG uregs[13]

				#define __PT_IP_REG uregs[12]

				#elif defined(bpf_target_arm64)

				/*

				 * https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#machine-registers

				 */

				struct pt_regs___arm64 {

					unsigned long orig_x0;

				} __attribute__((preserve_access_index));

				/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */

				#define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x))

				#define __PT_PARM1_REG regs[0]

				#define __PT_PARM2_REG regs[1]

				#define __PT_PARM3_REG regs[2]

				#define __PT_PARM4_REG regs[3]

				#define __PT_PARM5_REG regs[4]

				#define __PT_PARM6_REG regs[5]

				#define __PT_PARM7_REG regs[6]

				#define __PT_PARM8_REG regs[7]

				#define __PT_PARM1_SYSCALL_REG orig_x0

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___arm64 *)(x))->__PT_PARM1_SYSCALL_REG)

				#define PT_REGS_PARM1_CORE_SYSCALL(x) \

					BPF_CORE_READ((const struct pt_regs___arm64 *)(x), __PT_PARM1_SYSCALL_REG)

				#define __PT_RET_REG regs[30]

				#define __PT_FP_REG regs[29]	/* Works only with CONFIG_FRAME_POINTER */

				#define __PT_RC_REG regs[0]

				#define __PT_SP_REG sp

				#define __PT_IP_REG pc

				#elif defined(bpf_target_mips)

				/*

				 * N64 ABI is assumed right now.

				 * https://en.wikipedia.org/wiki/MIPS_architecture#Calling_conventions

				 */

				#define __PT_PARM1_REG regs[4]

				#define __PT_PARM2_REG regs[5]

				#define __PT_PARM3_REG regs[6]

				#define __PT_PARM4_REG regs[7]

				#define __PT_PARM5_REG regs[8]

				#define __PT_PARM6_REG regs[9]

				#define __PT_PARM7_REG regs[10]

				#define __PT_PARM8_REG regs[11]

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG /* only N32/N64 */

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG /* only N32/N64 */

				#define __PT_RET_REG regs[31]

				#define __PT_FP_REG regs[30]	/* Works only with CONFIG_FRAME_POINTER */

				#define __PT_RC_REG regs[2]

				#define __PT_SP_REG regs[29]

				#define __PT_IP_REG cp0_epc

				#elif defined(bpf_target_powerpc)

				/*

				 * http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf (page 3-14,

				 * section "Function Calling Sequence")

				 */

				#define __PT_PARM1_REG gpr[3]

				#define __PT_PARM2_REG gpr[4]

				#define __PT_PARM3_REG gpr[5]

				#define __PT_PARM4_REG gpr[6]

				#define __PT_PARM5_REG gpr[7]

				#define __PT_PARM6_REG gpr[8]

				#define __PT_PARM7_REG gpr[9]

				#define __PT_PARM8_REG gpr[10]

				/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER. */

				#define PT_REGS_SYSCALL_REGS(ctx) ctx

				#define __PT_PARM1_SYSCALL_REG orig_gpr3

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#if !defined(__arch64__)

				#define __PT_PARM7_SYSCALL_REG __PT_PARM7_REG /* only powerpc (not powerpc64) */

				#endif

				#define __PT_RET_REG regs[31]

				#define __PT_FP_REG __unsupported__

				#define __PT_RC_REG gpr[3]

				#define __PT_SP_REG sp

				#define __PT_IP_REG nip

				#elif defined(bpf_target_sparc)

				/*

				 * https://en.wikipedia.org/wiki/Calling_convention#SPARC

				 */

				#define __PT_PARM1_REG u_regs[UREG_I0]

				#define __PT_PARM2_REG u_regs[UREG_I1]

				#define __PT_PARM3_REG u_regs[UREG_I2]

				#define __PT_PARM4_REG u_regs[UREG_I3]

				#define __PT_PARM5_REG u_regs[UREG_I4]

				#define __PT_PARM6_REG u_regs[UREG_I5]

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define __PT_RET_REG u_regs[UREG_I7]

				#define __PT_FP_REG __unsupported__

				#define __PT_RC_REG u_regs[UREG_I0]

				#define __PT_SP_REG u_regs[UREG_FP]

				/* Should this also be a bpf_target check for the sparc case? */

				#if defined(__arch64__)

				#define __PT_IP_REG tpc

				#else

				#define __PT_IP_REG pc

				#endif

				#elif defined(bpf_target_riscv)

				/*

				 * https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#risc-v-calling-conventions

				 */

				struct pt_regs___riscv {

					unsigned long orig_a0;

				} __attribute__((preserve_access_index));

				/* riscv provides struct user_regs_struct instead of struct pt_regs to userspace */

				#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x))

				#define __PT_PARM1_REG a0

				#define __PT_PARM2_REG a1

				#define __PT_PARM3_REG a2

				#define __PT_PARM4_REG a3

				#define __PT_PARM5_REG a4

				#define __PT_PARM6_REG a5

				#define __PT_PARM7_REG a6

				#define __PT_PARM8_REG a7

				#define __PT_PARM1_SYSCALL_REG orig_a0

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___riscv *)(x))->__PT_PARM1_SYSCALL_REG)

				#define PT_REGS_PARM1_CORE_SYSCALL(x) \

					BPF_CORE_READ((const struct pt_regs___riscv *)(x), __PT_PARM1_SYSCALL_REG)

				#define __PT_RET_REG ra

				#define __PT_FP_REG s0

				#define __PT_RC_REG a0

				#define __PT_SP_REG sp

				#define __PT_IP_REG pc

				#elif defined(bpf_target_arc)

				/*

				 * Section "Function Calling Sequence" (page 24):

				 * https://raw.githubusercontent.com/wiki/foss-for-synopsys-dwc-arc-processors/toolchain/files/ARCv2_ABI.pdf

				 */

				/* arc provides struct user_regs_struct instead of struct pt_regs to userspace */

				#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x))

				#define __PT_PARM1_REG scratch.r0

				#define __PT_PARM2_REG scratch.r1

				#define __PT_PARM3_REG scratch.r2

				#define __PT_PARM4_REG scratch.r3

				#define __PT_PARM5_REG scratch.r4

				#define __PT_PARM6_REG scratch.r5

				#define __PT_PARM7_REG scratch.r6

				#define __PT_PARM8_REG scratch.r7

				/* arc does not select ARCH_HAS_SYSCALL_WRAPPER. */

				#define PT_REGS_SYSCALL_REGS(ctx) ctx

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define __PT_RET_REG scratch.blink

				#define __PT_FP_REG scratch.fp

				#define __PT_RC_REG scratch.r0

				#define __PT_SP_REG scratch.sp

				#define __PT_IP_REG scratch.ret

				#elif defined(bpf_target_loongarch)

				/*

				 * https://docs.kernel.org/loongarch/introduction.html

				 * https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

				 */

				/* loongarch provides struct user_pt_regs instead of struct pt_regs to userspace */

				#define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x))

				#define __PT_PARM1_REG regs[4]

				#define __PT_PARM2_REG regs[5]

				#define __PT_PARM3_REG regs[6]

				#define __PT_PARM4_REG regs[7]

				#define __PT_PARM5_REG regs[8]

				#define __PT_PARM6_REG regs[9]

				#define __PT_PARM7_REG regs[10]

				#define __PT_PARM8_REG regs[11]

				/* loongarch does not select ARCH_HAS_SYSCALL_WRAPPER. */

				#define PT_REGS_SYSCALL_REGS(ctx) ctx

				#define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG

				#define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG

				#define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG

				#define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG

				#define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG

				#define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG

				#define __PT_RET_REG regs[1]

				#define __PT_FP_REG regs[22]

				#define __PT_RC_REG regs[4]

				#define __PT_SP_REG regs[3]

				#define __PT_IP_REG csr_era

				#endif

				#if defined(bpf_target_defined)

				struct pt_regs;

				/* allow some architectures to override `struct pt_regs` */

				#ifndef __PT_REGS_CAST

				#define __PT_REGS_CAST(x) (x)

				#endif

				/*

				 * Different architectures support different number of arguments passed

				 * through registers. i386 supports just 3, some arches support up to 8.

				 */

				#ifndef __PT_PARM4_REG

				#define __PT_PARM4_REG __unsupported__

				#endif

				#ifndef __PT_PARM5_REG

				#define __PT_PARM5_REG __unsupported__

				#endif

				#ifndef __PT_PARM6_REG

				#define __PT_PARM6_REG __unsupported__

				#endif

				#ifndef __PT_PARM7_REG

				#define __PT_PARM7_REG __unsupported__

				#endif

				#ifndef __PT_PARM8_REG

				#define __PT_PARM8_REG __unsupported__

				#endif

				/*

				 * Similarly, syscall-specific conventions might differ from function call

				 * conventions within each architecture. All supported architectures pass

				 * either 6 or 7 syscall arguments in registers.

				 *

				 * See syscall(2) manpage for succinct table with information on each arch.

				 */

				#ifndef __PT_PARM7_SYSCALL_REG

				#define __PT_PARM7_SYSCALL_REG __unsupported__

				#endif

				#define PT_REGS_PARM1(x) (__PT_REGS_CAST(x)->__PT_PARM1_REG)

				#define PT_REGS_PARM2(x) (__PT_REGS_CAST(x)->__PT_PARM2_REG)

				#define PT_REGS_PARM3(x) (__PT_REGS_CAST(x)->__PT_PARM3_REG)

				#define PT_REGS_PARM4(x) (__PT_REGS_CAST(x)->__PT_PARM4_REG)

				#define PT_REGS_PARM5(x) (__PT_REGS_CAST(x)->__PT_PARM5_REG)

				#define PT_REGS_PARM6(x) (__PT_REGS_CAST(x)->__PT_PARM6_REG)

				#define PT_REGS_PARM7(x) (__PT_REGS_CAST(x)->__PT_PARM7_REG)

				#define PT_REGS_PARM8(x) (__PT_REGS_CAST(x)->__PT_PARM8_REG)

				#define PT_REGS_RET(x) (__PT_REGS_CAST(x)->__PT_RET_REG)

				#define PT_REGS_FP(x) (__PT_REGS_CAST(x)->__PT_FP_REG)

				#define PT_REGS_RC(x) (__PT_REGS_CAST(x)->__PT_RC_REG)

				#define PT_REGS_SP(x) (__PT_REGS_CAST(x)->__PT_SP_REG)

				#define PT_REGS_IP(x) (__PT_REGS_CAST(x)->__PT_IP_REG)

				#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM1_REG)

				#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM2_REG)

				#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_REG)

				#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_REG)

				#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_REG)

				#define PT_REGS_PARM6_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM6_REG)

				#define PT_REGS_PARM7_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM7_REG)

				#define PT_REGS_PARM8_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM8_REG)

				#define PT_REGS_RET_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RET_REG)

				#define PT_REGS_FP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_FP_REG)

				#define PT_REGS_RC_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RC_REG)

				#define PT_REGS_SP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_SP_REG)

				#define PT_REGS_IP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_IP_REG)

				#if defined(bpf_target_powerpc)

				#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = (ctx)->link; })

				#define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP

				#elif defined(bpf_target_sparc) || defined(bpf_target_arm64)

				#define BPF_KPROBE_READ_RET_IP(ip, ctx)		({ (ip) = PT_REGS_RET(ctx); })

				#define BPF_KRETPROBE_READ_RET_IP		BPF_KPROBE_READ_RET_IP

				#else

				#define BPF_KPROBE_READ_RET_IP(ip, ctx)					    \

					({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })

				#define BPF_KRETPROBE_READ_RET_IP(ip, ctx)				    \

					({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)(PT_REGS_FP(ctx) + sizeof(ip))); })

				#endif

				#ifndef PT_REGS_PARM1_SYSCALL

				#define PT_REGS_PARM1_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM1_SYSCALL_REG)

				#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM1_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM2_SYSCALL

				#define PT_REGS_PARM2_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM2_SYSCALL_REG)

				#define PT_REGS_PARM2_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM2_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM3_SYSCALL

				#define PT_REGS_PARM3_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM3_SYSCALL_REG)

				#define PT_REGS_PARM3_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM4_SYSCALL

				#define PT_REGS_PARM4_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM4_SYSCALL_REG)

				#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM5_SYSCALL

				#define PT_REGS_PARM5_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM5_SYSCALL_REG)

				#define PT_REGS_PARM5_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM6_SYSCALL

				#define PT_REGS_PARM6_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM6_SYSCALL_REG)

				#define PT_REGS_PARM6_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM6_SYSCALL_REG)

				#endif

				#ifndef PT_REGS_PARM7_SYSCALL

				#define PT_REGS_PARM7_SYSCALL(x) (__PT_REGS_CAST(x)->__PT_PARM7_SYSCALL_REG)

				#define PT_REGS_PARM7_CORE_SYSCALL(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM7_SYSCALL_REG)

				#endif

				#else /* defined(bpf_target_defined) */

				#define PT_REGS_PARM1(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM2(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM3(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM4(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM5(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM6(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM7(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM8(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_RET(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_FP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_RC(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_SP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_IP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM1_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM2_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM3_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM4_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM5_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM6_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM7_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM8_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_RET_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_FP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_RC_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_SP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_IP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM2_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM3_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM4_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM5_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM6_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM7_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM1_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM2_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM3_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM4_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM5_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM6_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#define PT_REGS_PARM7_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })

				#endif /* defined(bpf_target_defined) */

				/*

				 * When invoked from a syscall handler kprobe, returns a pointer to a

				 * struct pt_regs containing syscall arguments and suitable for passing to

				 * PT_REGS_PARMn_SYSCALL() and PT_REGS_PARMn_CORE_SYSCALL().

				 */

				#ifndef PT_REGS_SYSCALL_REGS

				/* By default, assume that the arch selects ARCH_HAS_SYSCALL_WRAPPER. */

				#define PT_REGS_SYSCALL_REGS(ctx) ((struct pt_regs *)PT_REGS_PARM1(ctx))

				#endif

				#ifndef ___bpf_concat

				#define ___bpf_concat(a, b) a ## b

				#endif

				#ifndef ___bpf_apply

				#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)

				#endif

				#ifndef ___bpf_nth

				#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N

				#endif

				#ifndef ___bpf_narg

				#define ___bpf_narg(...) ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

				#endif

				#define ___bpf_ctx_cast0()            ctx

				#define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), ctx[0]

				#define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), ctx[1]

				#define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), ctx[2]

				#define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), ctx[3]

				#define ___bpf_ctx_cast5(x, args...)  ___bpf_ctx_cast4(args), ctx[4]

				#define ___bpf_ctx_cast6(x, args...)  ___bpf_ctx_cast5(args), ctx[5]

				#define ___bpf_ctx_cast7(x, args...)  ___bpf_ctx_cast6(args), ctx[6]

				#define ___bpf_ctx_cast8(x, args...)  ___bpf_ctx_cast7(args), ctx[7]

				#define ___bpf_ctx_cast9(x, args...)  ___bpf_ctx_cast8(args), ctx[8]

				#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), ctx[9]

				#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), ctx[10]

				#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), ctx[11]

				#define ___bpf_ctx_cast(args...)      ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)

				/*

				 * BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and

				 * similar kinds of BPF programs, that accept input arguments as a single

				 * pointer to untyped u64 array, where each u64 can actually be a typed

				 * pointer or integer of different size. Instead of requiring user to write

				 * manual casts and work with array elements by index, BPF_PROG macro

				 * allows user to declare a list of named and typed input arguments in the

				 * same syntax as for normal C function. All the casting is hidden and

				 * performed transparently, while user code can just assume working with

				 * function arguments of specified type and name.

				 *

				 * Original raw context argument is preserved as well as 'ctx' argument.

				 * This is useful when using BPF helpers that expect original context

				 * as one of the parameters (e.g., for bpf_perf_event_output()).

				 */

				#define BPF_PROG(name, args...)						    \

				name(unsigned long long *ctx);						    \

				static __always_inline typeof(name(0))					    \

				____##name(unsigned long long *ctx, ##args);				    \

				typeof(name(0)) name(unsigned long long *ctx)				    \

				{									    \

					_Pragma("GCC diagnostic push")					    \

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")		    \

					return ____##name(___bpf_ctx_cast(args));			    \

					_Pragma("GCC diagnostic pop")					    \

				}									    \

				static __always_inline typeof(name(0))					    \

				____##name(unsigned long long *ctx, ##args)

				#ifndef ___bpf_nth2

				#define ___bpf_nth2(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13,	\

						    _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, N, ...) N

				#endif

				#ifndef ___bpf_narg2

				#define ___bpf_narg2(...)	\

					___bpf_nth2(_, ##__VA_ARGS__, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7,	\

						    6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 0)

				#endif

				#define ___bpf_treg_cnt(t) \

					__builtin_choose_expr(sizeof(t) == 1, 1,	\

					__builtin_choose_expr(sizeof(t) == 2, 1,	\

					__builtin_choose_expr(sizeof(t) == 4, 1,	\

					__builtin_choose_expr(sizeof(t) == 8, 1,	\

					__builtin_choose_expr(sizeof(t) == 16, 2,	\

							      (void)0)))))

				#define ___bpf_reg_cnt0()		(0)

				#define ___bpf_reg_cnt1(t, x)		(___bpf_reg_cnt0() + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt2(t, x, args...)	(___bpf_reg_cnt1(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt3(t, x, args...)	(___bpf_reg_cnt2(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt4(t, x, args...)	(___bpf_reg_cnt3(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt5(t, x, args...)	(___bpf_reg_cnt4(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt6(t, x, args...)	(___bpf_reg_cnt5(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt7(t, x, args...)	(___bpf_reg_cnt6(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt8(t, x, args...)	(___bpf_reg_cnt7(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt9(t, x, args...)	(___bpf_reg_cnt8(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt10(t, x, args...)	(___bpf_reg_cnt9(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt11(t, x, args...)	(___bpf_reg_cnt10(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt12(t, x, args...)	(___bpf_reg_cnt11(args) + ___bpf_treg_cnt(t))

				#define ___bpf_reg_cnt(args...)	 ___bpf_apply(___bpf_reg_cnt, ___bpf_narg2(args))(args)

				#define ___bpf_union_arg(t, x, n) \

					__builtin_choose_expr(sizeof(t) == 1, ({ union { __u8 z[1]; t x; } ___t = { .z = {ctx[n]}}; ___t.x; }), \

					__builtin_choose_expr(sizeof(t) == 2, ({ union { __u16 z[1]; t x; } ___t = { .z = {ctx[n]} }; ___t.x; }), \

					__builtin_choose_expr(sizeof(t) == 4, ({ union { __u32 z[1]; t x; } ___t = { .z = {ctx[n]} }; ___t.x; }), \

					__builtin_choose_expr(sizeof(t) == 8, ({ union { __u64 z[1]; t x; } ___t = {.z = {ctx[n]} }; ___t.x; }), \

					__builtin_choose_expr(sizeof(t) == 16, ({ union { __u64 z[2]; t x; } ___t = {.z = {ctx[n], ctx[n + 1]} }; ___t.x; }), \

							      (void)0)))))

				#define ___bpf_ctx_arg0(n, args...)

				#define ___bpf_ctx_arg1(n, t, x)		, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt1(t, x))

				#define ___bpf_ctx_arg2(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt2(t, x, args)) ___bpf_ctx_arg1(n, args)

				#define ___bpf_ctx_arg3(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt3(t, x, args)) ___bpf_ctx_arg2(n, args)

				#define ___bpf_ctx_arg4(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt4(t, x, args)) ___bpf_ctx_arg3(n, args)

				#define ___bpf_ctx_arg5(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt5(t, x, args)) ___bpf_ctx_arg4(n, args)

				#define ___bpf_ctx_arg6(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt6(t, x, args)) ___bpf_ctx_arg5(n, args)

				#define ___bpf_ctx_arg7(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt7(t, x, args)) ___bpf_ctx_arg6(n, args)

				#define ___bpf_ctx_arg8(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt8(t, x, args)) ___bpf_ctx_arg7(n, args)

				#define ___bpf_ctx_arg9(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt9(t, x, args)) ___bpf_ctx_arg8(n, args)

				#define ___bpf_ctx_arg10(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt10(t, x, args)) ___bpf_ctx_arg9(n, args)

				#define ___bpf_ctx_arg11(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt11(t, x, args)) ___bpf_ctx_arg10(n, args)

				#define ___bpf_ctx_arg12(n, t, x, args...)	, ___bpf_union_arg(t, x, n - ___bpf_reg_cnt12(t, x, args)) ___bpf_ctx_arg11(n, args)

				#define ___bpf_ctx_arg(args...)	___bpf_apply(___bpf_ctx_arg, ___bpf_narg2(args))(___bpf_reg_cnt(args), args)

				#define ___bpf_ctx_decl0()

				#define ___bpf_ctx_decl1(t, x)			, t x

				#define ___bpf_ctx_decl2(t, x, args...)		, t x ___bpf_ctx_decl1(args)

				#define ___bpf_ctx_decl3(t, x, args...)		, t x ___bpf_ctx_decl2(args)

				#define ___bpf_ctx_decl4(t, x, args...)		, t x ___bpf_ctx_decl3(args)

				#define ___bpf_ctx_decl5(t, x, args...)		, t x ___bpf_ctx_decl4(args)

				#define ___bpf_ctx_decl6(t, x, args...)		, t x ___bpf_ctx_decl5(args)

				#define ___bpf_ctx_decl7(t, x, args...)		, t x ___bpf_ctx_decl6(args)

				#define ___bpf_ctx_decl8(t, x, args...)		, t x ___bpf_ctx_decl7(args)

				#define ___bpf_ctx_decl9(t, x, args...)		, t x ___bpf_ctx_decl8(args)

				#define ___bpf_ctx_decl10(t, x, args...)	, t x ___bpf_ctx_decl9(args)

				#define ___bpf_ctx_decl11(t, x, args...)	, t x ___bpf_ctx_decl10(args)

				#define ___bpf_ctx_decl12(t, x, args...)	, t x ___bpf_ctx_decl11(args)

				#define ___bpf_ctx_decl(args...)	___bpf_apply(___bpf_ctx_decl, ___bpf_narg2(args))(args)

				/*

				 * BPF_PROG2 is an enhanced version of BPF_PROG in order to handle struct

				 * arguments. Since each struct argument might take one or two u64 values

				 * in the trampoline stack, argument type size is needed to place proper number

				 * of u64 values for each argument. Therefore, BPF_PROG2 has different

				 * syntax from BPF_PROG. For example, for the following BPF_PROG syntax:

				 *

				 *   int BPF_PROG(test2, int a, int b) { ... }

				 *

				 * the corresponding BPF_PROG2 syntax is:

				 *

				 *   int BPF_PROG2(test2, int, a, int, b) { ... }

				 *

				 * where type and the corresponding argument name are separated by comma.

				 *

				 * Use BPF_PROG2 macro if one of the arguments might be a struct/union larger

				 * than 8 bytes:

				 *

				 *   int BPF_PROG2(test_struct_arg, struct bpf_testmod_struct_arg_1, a, int, b,

				 *		   int, c, int, d, struct bpf_testmod_struct_arg_2, e, int, ret)

				 *   {

				 *        // access a, b, c, d, e, and ret directly

				 *        ...

				 *   }

				 */

				#define BPF_PROG2(name, args...)						\

				name(unsigned long long *ctx);							\

				static __always_inline typeof(name(0))						\

				____##name(unsigned long long *ctx ___bpf_ctx_decl(args));			\

				typeof(name(0)) name(unsigned long long *ctx)					\

				{										\

					return ____##name(ctx ___bpf_ctx_arg(args));				\

				}										\

				static __always_inline typeof(name(0))						\

				____##name(unsigned long long *ctx ___bpf_ctx_decl(args))

				struct pt_regs;

				#define ___bpf_kprobe_args0()           ctx

				#define ___bpf_kprobe_args1(x)          ___bpf_kprobe_args0(), (unsigned long long)PT_REGS_PARM1(ctx)

				#define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (unsigned long long)PT_REGS_PARM2(ctx)

				#define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (unsigned long long)PT_REGS_PARM3(ctx)

				#define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (unsigned long long)PT_REGS_PARM4(ctx)

				#define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (unsigned long long)PT_REGS_PARM5(ctx)

				#define ___bpf_kprobe_args6(x, args...) ___bpf_kprobe_args5(args), (unsigned long long)PT_REGS_PARM6(ctx)

				#define ___bpf_kprobe_args7(x, args...) ___bpf_kprobe_args6(args), (unsigned long long)PT_REGS_PARM7(ctx)

				#define ___bpf_kprobe_args8(x, args...) ___bpf_kprobe_args7(args), (unsigned long long)PT_REGS_PARM8(ctx)

				#define ___bpf_kprobe_args(args...)     ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)

				/*

				 * BPF_KPROBE serves the same purpose for kprobes as BPF_PROG for

				 * tp_btf/fentry/fexit BPF programs. It hides the underlying platform-specific

				 * low-level way of getting kprobe input arguments from struct pt_regs, and

				 * provides a familiar typed and named function arguments syntax and

				 * semantics of accessing kprobe input parameters.

				 *

				 * Original struct pt_regs* context is preserved as 'ctx' argument. This might

				 * be necessary when using BPF helpers like bpf_perf_event_output().

				 */

				#define BPF_KPROBE(name, args...)					    \

				name(struct pt_regs *ctx);						    \

				static __always_inline typeof(name(0))					    \

				____##name(struct pt_regs *ctx, ##args);				    \

				typeof(name(0)) name(struct pt_regs *ctx)				    \

				{									    \

					_Pragma("GCC diagnostic push")					    \

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")		    \

					return ____##name(___bpf_kprobe_args(args));			    \

					_Pragma("GCC diagnostic pop")					    \

				}									    \

				static __always_inline typeof(name(0))					    \

				____##name(struct pt_regs *ctx, ##args)

				#define ___bpf_kretprobe_args0()       ctx

				#define ___bpf_kretprobe_args1(x)      ___bpf_kretprobe_args0(), (unsigned long long)PT_REGS_RC(ctx)

				#define ___bpf_kretprobe_args(args...) ___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)

				/*

				 * BPF_KRETPROBE is similar to BPF_KPROBE, except that it only provides an optional

				 * return value (in addition to `struct pt_regs *ctx`), but no input

				 * arguments, because they will be clobbered by the time probed function

				 * returns.

				 */

				#define BPF_KRETPROBE(name, args...)					    \

				name(struct pt_regs *ctx);						    \

				static __always_inline typeof(name(0))					    \

				____##name(struct pt_regs *ctx, ##args);				    \

				typeof(name(0)) name(struct pt_regs *ctx)				    \

				{									    \

					_Pragma("GCC diagnostic push")					    \

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")		    \

					return ____##name(___bpf_kretprobe_args(args));			    \

					_Pragma("GCC diagnostic pop")					    \

				}									    \

				static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)

				/* If kernel doesn't have CONFIG_ARCH_HAS_SYSCALL_WRAPPER, read pt_regs directly */

				#define ___bpf_syscall_args0()           ctx

				#define ___bpf_syscall_args1(x)          ___bpf_syscall_args0(), (unsigned long long)PT_REGS_PARM1_SYSCALL(regs)

				#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (unsigned long long)PT_REGS_PARM2_SYSCALL(regs)

				#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (unsigned long long)PT_REGS_PARM3_SYSCALL(regs)

				#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (unsigned long long)PT_REGS_PARM4_SYSCALL(regs)

				#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (unsigned long long)PT_REGS_PARM5_SYSCALL(regs)

				#define ___bpf_syscall_args6(x, args...) ___bpf_syscall_args5(args), (unsigned long long)PT_REGS_PARM6_SYSCALL(regs)

				#define ___bpf_syscall_args7(x, args...) ___bpf_syscall_args6(args), (unsigned long long)PT_REGS_PARM7_SYSCALL(regs)

				#define ___bpf_syscall_args(args...)     ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args)

				/* If kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER, we have to BPF_CORE_READ from pt_regs */

				#define ___bpf_syswrap_args0()           ctx

				#define ___bpf_syswrap_args1(x)          ___bpf_syswrap_args0(), (unsigned long long)PT_REGS_PARM1_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args), (unsigned long long)PT_REGS_PARM2_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args3(x, args...) ___bpf_syswrap_args2(args), (unsigned long long)PT_REGS_PARM3_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args4(x, args...) ___bpf_syswrap_args3(args), (unsigned long long)PT_REGS_PARM4_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args5(x, args...) ___bpf_syswrap_args4(args), (unsigned long long)PT_REGS_PARM5_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args6(x, args...) ___bpf_syswrap_args5(args), (unsigned long long)PT_REGS_PARM6_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args7(x, args...) ___bpf_syswrap_args6(args), (unsigned long long)PT_REGS_PARM7_CORE_SYSCALL(regs)

				#define ___bpf_syswrap_args(args...)     ___bpf_apply(___bpf_syswrap_args, ___bpf_narg(args))(args)

				/*

				 * BPF_KSYSCALL is a variant of BPF_KPROBE, which is intended for

				 * tracing syscall functions, like __x64_sys_close. It hides the underlying

				 * platform-specific low-level way of getting syscall input arguments from

				 * struct pt_regs, and provides a familiar typed and named function arguments

				 * syntax and semantics of accessing syscall input parameters.

				 *

				 * Original struct pt_regs * context is preserved as 'ctx' argument. This might

				 * be necessary when using BPF helpers like bpf_perf_event_output().

				 *

				 * At the moment BPF_KSYSCALL does not transparently handle all the calling

				 * convention quirks for the following syscalls:

				 *

				 * - mmap(): __ARCH_WANT_SYS_OLD_MMAP.

				 * - clone(): CONFIG_CLONE_BACKWARDS, CONFIG_CLONE_BACKWARDS2 and

				 *            CONFIG_CLONE_BACKWARDS3.

				 * - socket-related syscalls: __ARCH_WANT_SYS_SOCKETCALL.

				 * - compat syscalls.

				 *

				 * This may or may not change in the future. User needs to take extra measures

				 * to handle such quirks explicitly, if necessary.

				 *

				 * This macro relies on BPF CO-RE support and virtual __kconfig externs.

				 */

				#define BPF_KSYSCALL(name, args...)					    \

				name(struct pt_regs *ctx);						    \

				extern _Bool LINUX_HAS_SYSCALL_WRAPPER __kconfig;			    \

				static __always_inline typeof(name(0))					    \

				____##name(struct pt_regs *ctx, ##args);				    \

				typeof(name(0)) name(struct pt_regs *ctx)				    \

				{									    \

					struct pt_regs *regs = LINUX_HAS_SYSCALL_WRAPPER		    \

							       ? (struct pt_regs *)PT_REGS_PARM1(ctx)	    \

							       : ctx;					    \

					_Pragma("GCC diagnostic push")					    \

					_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")		    \

					if (LINUX_HAS_SYSCALL_WRAPPER)					    \

						return ____##name(___bpf_syswrap_args(args));		    \

					else								    \

						return ____##name(___bpf_syscall_args(args));		    \

					_Pragma("GCC diagnostic pop")					    \

				}									    \

				static __always_inline typeof(name(0))					    \

				____##name(struct pt_regs *ctx, ##args)

				#define BPF_KPROBE_SYSCALL BPF_KSYSCALL

				/* BPF_UPROBE and BPF_URETPROBE are identical to BPF_KPROBE and BPF_KRETPROBE,

				 * but are named way less confusingly for SEC("uprobe") and SEC("uretprobe")

				 * use cases.

				 */

				#define BPF_UPROBE(name, args...)  BPF_KPROBE(name, ##args)

				#define BPF_URETPROBE(name, args...)  BPF_KRETPROBE(name, ##args)

				#endif
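The BPF_PROG2 helpers above combine two preprocessor tricks: counting (type, name) pairs and dispatching to a numbered helper macro that sums the u64 slots each argument needs. The following is a minimal standalone sketch of that idea, not the libbpf implementation. All `DEMO_*` and `demo_*` names are hypothetical, the pair limit is reduced to three, and the code assumes the GNU `##__VA_ARGS__` and named variadic (`args...`) extensions supported by GCC and Clang.

```c
/* Hypothetical standalone sketch; DEMO_* names are not part of libbpf. */

/* Pick the 8th argument; the padding list below turns a token count
 * into a pair count.
 */
#define DEMO_NTH(_, _1, _2, _3, _4, _5, _6, N, ...) N
#define DEMO_NPAIRS(...) DEMO_NTH(_, ##__VA_ARGS__, 3, 3, 2, 2, 1, 1, 0)

/* Two-level concatenation so that the count is expanded before pasting. */
#define DEMO_CAT_(a, b) a##b
#define DEMO_CAT(a, b) DEMO_CAT_(a, b)

/* A type wider than 8 bytes is assumed to occupy two u64 slots. */
#define DEMO_TYPE_SLOTS(t) (sizeof(t) > 8 ? 2 : 1)

/* One helper per pair count, each peeling off the leading (type, name). */
#define DEMO_SLOTS1(t, x)          (DEMO_TYPE_SLOTS(t))
#define DEMO_SLOTS2(t, x, args...) (DEMO_SLOTS1(args) + DEMO_TYPE_SLOTS(t))
#define DEMO_SLOTS3(t, x, args...) (DEMO_SLOTS2(args) + DEMO_TYPE_SLOTS(t))

/* Dispatch: paste the pair count onto the helper name, then call it. */
#define DEMO_TOTAL_SLOTS(args...) \
	DEMO_CAT(DEMO_SLOTS, DEMO_NPAIRS(args))(args)

/* A 16-byte struct, comparable to a struct argument passed in two slots. */
struct demo_pair {
	long long a;
	long long b;
};

/* Two slots for the struct plus one for the int. */
int demo_example_slots(void)
{
	return DEMO_TOTAL_SLOTS(struct demo_pair, p, int, x);
}
```

In the real macros the total slot count is then used to compute each argument's index into `ctx`, which the sketch leaves out.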

src/btf.c

File diff suppressed because it is too large

									
src/btf.h
												
				@@ -1,22 +1,24 @@

				/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

				/* Copyright (c) 2018 Facebook */

				/*! \file */

				#ifndef __LIBBPF_BTF_H

				#define __LIBBPF_BTF_H

				#include <stdarg.h>

				#include <stdbool.h>

				#include <linux/btf.h>

				#include <linux/types.h>

				#include "libbpf_common.h"

				#ifdef __cplusplus

				extern "C" {

				#endif

				#ifndef LIBBPF_API

				#define LIBBPF_API __attribute__((visibility("default")))

				#endif

				#define BTF_ELF_SEC ".BTF"

				#define BTF_EXT_ELF_SEC ".BTF.ext"

				#define BTF_BASE_ELF_SEC ".BTF.base"

				#define MAPS_ELF_SEC ".maps"

				struct btf;

				@@ -25,101 +27,589 @@ struct btf_type;

				struct bpf_object;

				/*

				 * The .BTF.ext ELF section layout defined as

				 *   struct btf_ext_header

				 *   func_info subsection

				 *

				 * The func_info subsection layout:

				 *   record size for struct bpf_func_info in the func_info subsection

				 *   struct btf_sec_func_info for section #1

				 *   a list of bpf_func_info records for section #1

				 *     where struct bpf_func_info mimics one in include/uapi/linux/bpf.h

				 *     but may not be identical

				 *   struct btf_sec_func_info for section #2

				 *   a list of bpf_func_info records for section #2

				 *   ......

				 *

				 * Note that the bpf_func_info record size in .BTF.ext may not

				 * be the same as the one defined in include/uapi/linux/bpf.h.

				 * The loader should ensure that record_size meets minimum

				 * requirement and pass the record as is to the kernel. The

				 * kernel will handle the func_info properly based on its contents.

				 */

				struct btf_ext_header {

					__u16	magic;

					__u8	version;

					__u8	flags;

					__u32	hdr_len;

					/* All offsets are in bytes relative to the end of this header */

					__u32	func_info_off;

					__u32	func_info_len;

					__u32	line_info_off;

					__u32	line_info_len;

				enum btf_endianness {

					BTF_LITTLE_ENDIAN = 0,

					BTF_BIG_ENDIAN = 1,

				};

				/**

				 * @brief **btf__free()** frees all data of a BTF object

				 * @param btf BTF object to free

				 */

				LIBBPF_API void btf__free(struct btf *btf);

				LIBBPF_API struct btf *btf__new(__u8 *data, __u32 size);

				LIBBPF_API struct btf *btf__parse_elf(const char *path,

								      struct btf_ext **btf_ext);

				LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);

				LIBBPF_API int btf__load(struct btf *btf);

				/**

				 * @brief **btf__new()** creates a new instance of a BTF object from the raw

				 * bytes of an ELF's BTF section

				 * @param data raw bytes

				 * @param size number of bytes passed in `data`

				 * @return new BTF object instance which has to be eventually freed with

				 * **btf__free()**

				 *

				 * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

				 * error code from such a pointer `libbpf_get_error()` should be used. If

				 * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is

				 * returned on error instead. In both cases thread-local `errno` variable is

				 * always set to error code as well.

				 */

				LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
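The "error-code-encoded-as-pointer" convention described above can be illustrated with a small standalone sketch. It assumes the conventional kernel-style scheme in which small negative errno values are stored in the topmost addresses of the pointer range; the `demo_*` helpers and `DEMO_MAX_ERRNO` are hypothetical names used only for illustration and are not libbpf API.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: error codes fit in the range [-4095, -1]. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative error code as a pointer value. */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* A pointer in the top DEMO_MAX_ERRNO addresses is treated as an error. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Decode the error code, or return 0 for an ordinary pointer. */
static inline long demo_ptr_err(const void *p)
{
	return demo_is_err(p) ? (long)(intptr_t)p : 0;
}
```

A caller following this scheme checks the returned pointer with a helper like `demo_is_err()` rather than comparing it against NULL, which is why the documentation stresses that NULL is not returned unless strict mode is enabled.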

				/**

				 * @brief **btf__new_split()** creates a new instance of a BTF object from the

				 * provided raw data bytes. It takes another BTF instance, **base_btf**, which

				 * serves as a base BTF, which is extended by types in a newly created BTF

				 * instance

				 * @param data raw bytes

				 * @param size length of raw bytes

				 * @param base_btf the base BTF object

				 * @return new BTF object instance which has to be eventually freed with

				 * **btf__free()**

				 *

				 * If *base_btf* is NULL, `btf__new_split()` is equivalent to `btf__new()` and

				 * creates non-split BTF.

				 *

				 * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

				 * error code from such a pointer `libbpf_get_error()` should be used. If

				 * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is

				 * returned on error instead. In both cases thread-local `errno` variable is

				 * always set to error code as well.

				 */

				LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf);

				/**

				 * @brief **btf__new_empty()** creates an empty BTF object.  Use

				 * `btf__add_*()` to populate such BTF object.

				 * @return new BTF object instance which has to be eventually freed with

				 * **btf__free()**

				 *

				 * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

				 * error code from such a pointer `libbpf_get_error()` should be used. If

				 * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is

				 * returned on error instead. In both cases thread-local `errno` variable is

				 * always set to error code as well.

				 */

				LIBBPF_API struct btf *btf__new_empty(void);

				/**

				 * @brief **btf__new_empty_split()** creates an empty split BTF object that is

				 * layered on top of the base BTF *base_btf*. Use `btf__add_*()` to populate it.

				 * @param base_btf the base BTF object

				 * @return new BTF object instance which has to be eventually freed with

				 * **btf__free()**

				 *

				 * If *base_btf* is NULL, `btf__new_empty_split()` is equivalent to

				 * `btf__new_empty()` and creates non-split BTF.

				 *

				 * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

				 * error code from such a pointer `libbpf_get_error()` should be used. If

				 * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is

				 * returned on error instead. In both cases thread-local `errno` variable is

				 * always set to error code as well.

				 */

				LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);

				/**

				 * @brief **btf__distill_base()** creates new versions of the split BTF

				 * *src_btf* and its base BTF. The new base BTF will only contain the types

				 * needed to improve robustness of the split BTF to small changes in base BTF.

				 * When that split BTF is loaded against a (possibly changed) base, this

				 * distilled base BTF will help update references to that (possibly changed)

				 * base BTF.

				 *

				 * If successful, 0 is returned and **new_base_btf** and **new_split_btf**

				 * will point at new base/split BTF. Both the new split and its associated

				 * new base BTF must be freed by the caller.

				 *

				 * A negative value is returned on error and the thread-local `errno` variable

				 * is set to the error code as well.

				 */

				LIBBPF_API int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,

								 struct btf **new_split_btf);

				LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);

				LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);

				LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);

				LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf);

				LIBBPF_API struct btf *btf__parse_raw(const char *path);

				LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);

				LIBBPF_API struct btf *btf__load_vmlinux_btf(void);

				LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);

				LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id);

				LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf);

				LIBBPF_API int btf__load_into_kernel(struct btf *btf);

				LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,

								   const char *type_name);

				LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);

				LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,

									const char *type_name, __u32 kind);

				LIBBPF_API __u32 btf__type_cnt(const struct btf *btf);

				LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);

				LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,

										  __u32 id);

				LIBBPF_API size_t btf__pointer_size(const struct btf *btf);

				LIBBPF_API int btf__set_pointer_size(struct btf *btf, size_t ptr_sz);

				LIBBPF_API enum btf_endianness btf__endianness(const struct btf *btf);

				LIBBPF_API int btf__set_endianness(struct btf *btf, enum btf_endianness endian);

				LIBBPF_API __s64 btf__resolve_size(const struct btf *btf, __u32 type_id);

				LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);

				LIBBPF_API int btf__align_of(const struct btf *btf, __u32 id);

				LIBBPF_API int btf__fd(const struct btf *btf);

				LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);

				LIBBPF_API void btf__set_fd(struct btf *btf, int fd);

				LIBBPF_API const void *btf__raw_data(const struct btf *btf, __u32 *size);

				LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);

				LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);

				LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,

								    __u32 expected_key_size,

								    __u32 expected_value_size,

								    __u32 *key_type_id, __u32 *value_type_id);

				LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);

				LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);

				LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);

				LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);

				LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,

									     __u32 *size);

				LIBBPF_API int btf_ext__reloc_func_info(const struct btf *btf,

									const struct btf_ext *btf_ext,

									const char *sec_name, __u32 insns_cnt,

									void **func_info, __u32 *cnt);

				LIBBPF_API int btf_ext__reloc_line_info(const struct btf *btf,

									const struct btf_ext *btf_ext,

									const char *sec_name, __u32 insns_cnt,

									void **line_info, __u32 *cnt);

				LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);

				LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);

				LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);

				LIBBPF_API enum btf_endianness btf_ext__endianness(const struct btf_ext *btf_ext);

				LIBBPF_API int btf_ext__set_endianness(struct btf_ext *btf_ext,

								       enum btf_endianness endian);

				struct btf_dedup_opts {

					unsigned int dedup_table_size;

					bool dont_resolve_fwds;

				LIBBPF_API int btf__find_str(struct btf *btf, const char *s);

				LIBBPF_API int btf__add_str(struct btf *btf, const char *s);

				LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf,

							     const struct btf_type *src_type);

				/**

				 * @brief **btf__add_btf()** appends all the BTF types from *src_btf* into *btf*

				 * @param btf BTF object which all the BTF types and strings are added to

				 * @param src_btf BTF object which all BTF types and referenced strings are copied from

				 * @return BTF type ID of the first appended BTF type, or negative error code

				 *

				 * **btf__add_btf()** can be used to simply and efficiently append the entire

				 * contents of one BTF object to another one. All the BTF type data is copied

				 * over, all referenced type IDs are adjusted by adding a necessary ID offset.

				 * Only strings referenced from BTF types are copied over and deduplicated, so

				 * if there were some unused strings in *src_btf*, those won't be copied over,

				 * which is consistent with the general string deduplication semantics of BTF

				 * writing APIs.

				 *

				 * If any error is encountered during this process, the contents of *btf* are

				 * left intact, which means that **btf__add_btf()** follows the transactional

				 * semantics and the operation as a whole is all-or-nothing.

				 *

				 * *src_btf* has to be non-split BTF; as of now, copying types from split BTF

				 * is not supported and will result in an -ENOTSUP error code being returned.

				 */

				LIBBPF_API int btf__add_btf(struct btf *btf, const struct btf *src_btf);

				LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding);

				LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz);

				LIBBPF_API int btf__add_ptr(struct btf *btf, int ref_type_id);

				LIBBPF_API int btf__add_array(struct btf *btf,

							      int index_type_id, int elem_type_id, __u32 nr_elems);

				/* struct/union construction APIs */

				LIBBPF_API int btf__add_struct(struct btf *btf, const char *name, __u32 sz);

				LIBBPF_API int btf__add_union(struct btf *btf, const char *name, __u32 sz);

				LIBBPF_API int btf__add_field(struct btf *btf, const char *name, int field_type_id,

							      __u32 bit_offset, __u32 bit_size);

				/* enum construction APIs */

				LIBBPF_API int btf__add_enum(struct btf *btf, const char *name, __u32 bytes_sz);

				LIBBPF_API int btf__add_enum_value(struct btf *btf, const char *name, __s64 value);

				LIBBPF_API int btf__add_enum64(struct btf *btf, const char *name, __u32 bytes_sz, bool is_signed);

				LIBBPF_API int btf__add_enum64_value(struct btf *btf, const char *name, __u64 value);

				enum btf_fwd_kind {

					BTF_FWD_STRUCT = 0,

					BTF_FWD_UNION = 1,

					BTF_FWD_ENUM = 2,

				};

				LIBBPF_API int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,

							  const struct btf_dedup_opts *opts);

				LIBBPF_API int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind);

				LIBBPF_API int btf__add_typedef(struct btf *btf, const char *name, int ref_type_id);

				LIBBPF_API int btf__add_volatile(struct btf *btf, int ref_type_id);

				LIBBPF_API int btf__add_const(struct btf *btf, int ref_type_id);

				LIBBPF_API int btf__add_restrict(struct btf *btf, int ref_type_id);

				LIBBPF_API int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id);

				LIBBPF_API int btf__add_type_attr(struct btf *btf, const char *value, int ref_type_id);

				/* func and func_proto construction APIs */

				LIBBPF_API int btf__add_func(struct btf *btf, const char *name,

							     enum btf_func_linkage linkage, int proto_type_id);

				LIBBPF_API int btf__add_func_proto(struct btf *btf, int ret_type_id);

				LIBBPF_API int btf__add_func_param(struct btf *btf, const char *name, int type_id);

				/* var & datasec construction APIs */

				LIBBPF_API int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id);

				LIBBPF_API int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz);

				LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id,

									 __u32 offset, __u32 byte_sz);

				/* tag construction API */

				LIBBPF_API int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,

							    int component_idx);

				LIBBPF_API int btf__add_decl_attr(struct btf *btf, const char *value, int ref_type_id,

								  int component_idx);

				struct btf_dedup_opts {

					size_t sz;

					/* optional .BTF.ext info to dedup along the main BTF info */

					struct btf_ext *btf_ext;

					/* force hash collisions (used for testing) */

					bool force_collisions;

					size_t :0;

				};

				#define btf_dedup_opts__last_field force_collisions

				LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);

				/**

				 * @brief **btf__relocate()** will check the split BTF *btf* for references

				 * to base BTF kinds, and verify those references are compatible with

				 * *base_btf*; if they are, *btf* is adjusted such that it is re-parented to

				 * *base_btf* and type ids and strings are adjusted to accommodate this.

				 *

				 * If successful, 0 is returned and **btf** now has **base_btf** as its

				 * base.

				 *

				 * A negative value is returned on error and the thread-local `errno` variable

				 * is set to the error code as well.

				 */

				LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);

				struct btf_dump;

				struct btf_dump_opts {

					void *ctx;

					size_t sz;

				};

				#define btf_dump_opts__last_field sz

				typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);

				LIBBPF_API struct btf_dump *btf_dump__new(const struct btf *btf,

									  const struct btf_ext *btf_ext,

									  const struct btf_dump_opts *opts,

									  btf_dump_printf_fn_t printf_fn);

									  btf_dump_printf_fn_t printf_fn,

									  void *ctx,

									  const struct btf_dump_opts *opts);

				LIBBPF_API void btf_dump__free(struct btf_dump *d);

				LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);

				struct btf_dump_emit_type_decl_opts {

					/* size of this struct, for forward/backward compatibility */

					size_t sz;

					/* optional field name for type declaration, e.g.:

					 * - struct my_struct <FNAME>

					 * - void (*<FNAME>)(int)

					 * - char (*<FNAME>)[123]

					 */

					const char *field_name;

					/* extra indentation level (in number of tabs) to emit for multi-line

					 * type declarations (e.g., anonymous struct); applies for lines

					 * starting from the second one (first line is assumed to have

					 * necessary indentation already)

					 */

					int indent_level;

					/* strip all the const/volatile/restrict mods */

					bool strip_mods;

					size_t :0;

				};

				#define btf_dump_emit_type_decl_opts__last_field strip_mods

				LIBBPF_API int

				btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,

							 const struct btf_dump_emit_type_decl_opts *opts);

				struct btf_dump_type_data_opts {

					/* size of this struct, for forward/backward compatibility */

					size_t sz;

					const char *indent_str;

					int indent_level;

					/* below match "show" flags for bpf_show_snprintf() */

					bool compact;		/* no newlines/indentation */

					bool skip_names;	/* skip member/type names */

					bool emit_zeroes;	/* show 0-valued fields */

					size_t :0;

				};

				#define btf_dump_type_data_opts__last_field emit_zeroes

				LIBBPF_API int

				btf_dump__dump_type_data(struct btf_dump *d, __u32 id,

							 const void *data, size_t data_sz,

							 const struct btf_dump_type_data_opts *opts);

				/*

				 * A set of helpers for easier BTF types handling.

				 *

				 * The inline functions below rely on constants from the kernel headers which

				 * may not be available for applications including this header file. To avoid

				 * compilation errors, we define all the constants here that were added after

				 * the initial introduction of the BTF_KIND* constants.

				 */

				#ifndef BTF_KIND_FUNC

				#define BTF_KIND_FUNC		12	/* Function	*/

				#define BTF_KIND_FUNC_PROTO	13	/* Function Proto	*/

				#endif

				#ifndef BTF_KIND_VAR

				#define BTF_KIND_VAR		14	/* Variable	*/

				#define BTF_KIND_DATASEC	15	/* Section	*/

				#endif

				#ifndef BTF_KIND_FLOAT

				#define BTF_KIND_FLOAT		16	/* Floating point	*/

				#endif

				/* The kernel header switched to enums, so the following were never #defined */

				#define BTF_KIND_DECL_TAG	17	/* Decl Tag */

				#define BTF_KIND_TYPE_TAG	18	/* Type Tag */

				#define BTF_KIND_ENUM64		19	/* Enum for up-to 64bit values */

				static inline __u16 btf_kind(const struct btf_type *t)

				{

					return BTF_INFO_KIND(t->info);

				}

				static inline __u16 btf_vlen(const struct btf_type *t)

				{

					return BTF_INFO_VLEN(t->info);

				}

				static inline bool btf_kflag(const struct btf_type *t)

				{

					return BTF_INFO_KFLAG(t->info);

				}

				static inline bool btf_is_void(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_UNKN;

				}

				static inline bool btf_is_int(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_INT;

				}

				static inline bool btf_is_ptr(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_PTR;

				}

				static inline bool btf_is_array(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_ARRAY;

				}

				static inline bool btf_is_struct(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_STRUCT;

				}

				static inline bool btf_is_union(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_UNION;

				}

				static inline bool btf_is_composite(const struct btf_type *t)

				{

					__u16 kind = btf_kind(t);

					return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;

				}

				static inline bool btf_is_enum(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_ENUM;

				}

				static inline bool btf_is_enum64(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_ENUM64;

				}

				static inline bool btf_is_fwd(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_FWD;

				}

				static inline bool btf_is_typedef(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_TYPEDEF;

				}

				static inline bool btf_is_volatile(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_VOLATILE;

				}

				static inline bool btf_is_const(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_CONST;

				}

				static inline bool btf_is_restrict(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_RESTRICT;

				}

				static inline bool btf_is_mod(const struct btf_type *t)

				{

					__u16 kind = btf_kind(t);

					return kind == BTF_KIND_VOLATILE ||

					       kind == BTF_KIND_CONST ||

					       kind == BTF_KIND_RESTRICT ||

					       kind == BTF_KIND_TYPE_TAG;

				}

				static inline bool btf_is_func(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_FUNC;

				}

				static inline bool btf_is_func_proto(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_FUNC_PROTO;

				}

				static inline bool btf_is_var(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_VAR;

				}

				static inline bool btf_is_datasec(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_DATASEC;

				}

				static inline bool btf_is_float(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_FLOAT;

				}

				static inline bool btf_is_decl_tag(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_DECL_TAG;

				}

				static inline bool btf_is_type_tag(const struct btf_type *t)

				{

					return btf_kind(t) == BTF_KIND_TYPE_TAG;

				}

				static inline bool btf_is_any_enum(const struct btf_type *t)

				{

					return btf_is_enum(t) || btf_is_enum64(t);

				}

				static inline bool btf_kind_core_compat(const struct btf_type *t1,

									const struct btf_type *t2)

				{

					return btf_kind(t1) == btf_kind(t2) ||

					       (btf_is_any_enum(t1) && btf_is_any_enum(t2));

				}

				static inline __u8 btf_int_encoding(const struct btf_type *t)

				{

					return BTF_INT_ENCODING(*(__u32 *)(t + 1));

				}

				static inline __u8 btf_int_offset(const struct btf_type *t)

				{

					return BTF_INT_OFFSET(*(__u32 *)(t + 1));

				}

				static inline __u8 btf_int_bits(const struct btf_type *t)

				{

					return BTF_INT_BITS(*(__u32 *)(t + 1));

				}

				static inline struct btf_array *btf_array(const struct btf_type *t)

				{

					return (struct btf_array *)(t + 1);

				}

				static inline struct btf_enum *btf_enum(const struct btf_type *t)

				{

					return (struct btf_enum *)(t + 1);

				}

				struct btf_enum64;

				static inline struct btf_enum64 *btf_enum64(const struct btf_type *t)

				{

					return (struct btf_enum64 *)(t + 1);

				}

				static inline __u64 btf_enum64_value(const struct btf_enum64 *e)

				{

					/* struct btf_enum64 is introduced in Linux 6.0, which is very

					 * bleeding-edge. Here we are avoiding relying on struct btf_enum64

					 * definition coming from kernel UAPI headers to support wider range

					 * of system-wide kernel headers.

					 *

					 * Given this header can be also included from C++ applications, that

					 * further restricts C tricks we can use (like using compatible

					 * anonymous struct). So just treat struct btf_enum64 as

					 * a three-element array of u32 and access second (lo32) and third

					 * (hi32) elements directly.

					 *

					 * For reference, here is a struct btf_enum64 definition:

					 *

					 * const struct btf_enum64 {

					 *	__u32	name_off;

					 *	__u32	val_lo32;

					 *	__u32	val_hi32;

					 * };

					 */

					const __u32 *e64 = (const __u32 *)e;

					return ((__u64)e64[2] << 32) | e64[1];

				}

				static inline struct btf_member *btf_members(const struct btf_type *t)

				{

					return (struct btf_member *)(t + 1);

				}

				/* Get bit offset of a member with specified index. */

				static inline __u32 btf_member_bit_offset(const struct btf_type *t,

									  __u32 member_idx)

				{

					const struct btf_member *m = btf_members(t) + member_idx;

					bool kflag = btf_kflag(t);

					return kflag ? BTF_MEMBER_BIT_OFFSET(m->offset) : m->offset;

				}

				/*

				 * Get bitfield size of a member, assuming t is BTF_KIND_STRUCT or

				 * BTF_KIND_UNION. If member is not a bitfield, zero is returned.

				 */

				static inline __u32 btf_member_bitfield_size(const struct btf_type *t,

									     __u32 member_idx)

				{

					const struct btf_member *m = btf_members(t) + member_idx;

					bool kflag = btf_kflag(t);

					return kflag ? BTF_MEMBER_BITFIELD_SIZE(m->offset) : 0;

				}

				static inline struct btf_param *btf_params(const struct btf_type *t)

				{

					return (struct btf_param *)(t + 1);

				}

				static inline struct btf_var *btf_var(const struct btf_type *t)

				{

					return (struct btf_var *)(t + 1);

				}

				static inline struct btf_var_secinfo *

				btf_var_secinfos(const struct btf_type *t)

				{

					return (struct btf_var_secinfo *)(t + 1);

				}

				struct btf_decl_tag;

				static inline struct btf_decl_tag *btf_decl_tag(const struct btf_type *t)

				{

					return (struct btf_decl_tag *)(t + 1);

				}

				#ifdef __cplusplus

				} /* extern "C" */

				#endif
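The inline helpers above all decode the same packed layout: a 32-bit `info` word carrying kind, vlen and kflag, plus kind-specific records that follow `struct btf_type`. The following is a minimal, self-contained sketch of that decoding. It does not include libbpf; the `DEMO_*` macros and `demo_*` names are hypothetical stand-ins that are assumed to mirror the `BTF_INFO_*` and `BTF_MEMBER_*` macros from the UAPI `<linux/btf.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins assumed to mirror BTF_INFO_* and BTF_MEMBER_* from the
 * UAPI <linux/btf.h>; the DEMO_/demo_ names are hypothetical.
 */
#define DEMO_INFO_KIND(info)		(((info) >> 24) & 0x1f)
#define DEMO_INFO_VLEN(info)		((info) & 0xffff)
#define DEMO_INFO_KFLAG(info)		((info) >> 31)
#define DEMO_MEMBER_BITFIELD_SIZE(val)	((val) >> 24)
#define DEMO_MEMBER_BIT_OFFSET(val)	((val) & 0xffffff)

/* Same three-word layout as the struct btf_enum64 quoted in the header. */
struct demo_enum64 {
	uint32_t name_off;
	uint32_t val_lo32;
	uint32_t val_hi32;
};

/* Pack kind, vlen and kflag into one 32-bit info word. */
static uint32_t demo_make_info(uint32_t kind, uint32_t vlen, uint32_t kflag)
{
	return (kflag << 31) | (kind << 24) | vlen;
}

/* Rebuild the 64-bit enumerator value from words 1 (lo32) and 2 (hi32),
 * treating the record as an array of three 32-bit words.
 */
static uint64_t demo_enum64_value(const struct demo_enum64 *e)
{
	const uint32_t *w = (const uint32_t *)e;

	return ((uint64_t)w[2] << 32) | w[1];
}

/* With kflag set, the top 8 bits of a member offset hold the bitfield size
 * and the low 24 bits hold the bit offset; otherwise the whole word is the
 * bit offset.
 */
static uint32_t demo_member_bit_offset(uint32_t info, uint32_t offset)
{
	return DEMO_INFO_KFLAG(info) ? DEMO_MEMBER_BIT_OFFSET(offset) : offset;
}
```

Treating the enum64 record as three 32-bit words is the same trick the header uses to avoid depending on a `struct btf_enum64` definition from system headers.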


src/btf_dump.c


File diff suppressed because it is too large.

									

src/btf_iter.c
									
										Normal file
									
												
				@@ -0,0 +1,177 @@

				// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				/* Copyright (c) 2021 Facebook */

				/* Copyright (c) 2024, Oracle and/or its affiliates. */

				#ifdef __KERNEL__

				#include <linux/bpf.h>

				#include <linux/btf.h>

				#define btf_var_secinfos(t)	(struct btf_var_secinfo *)btf_type_var_secinfo(t)

				#else

				#include "btf.h"

				#include "libbpf_internal.h"

				#endif

				int btf_field_iter_init(struct btf_field_iter *it, struct btf_type *t,

							enum btf_field_iter_kind iter_kind)

				{

					it->p = NULL;

					it->m_idx = -1;

					it->off_idx = 0;

					it->vlen = 0;

					switch (iter_kind) {

					case BTF_FIELD_ITER_IDS:

						switch (btf_kind(t)) {

						case BTF_KIND_UNKN:

						case BTF_KIND_INT:

						case BTF_KIND_FLOAT:

						case BTF_KIND_ENUM:

						case BTF_KIND_ENUM64:

							it->desc = (struct btf_field_desc) {};

							break;

						case BTF_KIND_FWD:

						case BTF_KIND_CONST:

						case BTF_KIND_VOLATILE:

						case BTF_KIND_RESTRICT:

						case BTF_KIND_PTR:

						case BTF_KIND_TYPEDEF:

						case BTF_KIND_FUNC:

						case BTF_KIND_VAR:

						case BTF_KIND_DECL_TAG:

						case BTF_KIND_TYPE_TAG:

							it->desc = (struct btf_field_desc) { 1, {offsetof(struct btf_type, type)} };

							break;

						case BTF_KIND_ARRAY:

							it->desc = (struct btf_field_desc) {

								2, {sizeof(struct btf_type) + offsetof(struct btf_array, type),

								sizeof(struct btf_type) + offsetof(struct btf_array, index_type)}

							};

							break;

						case BTF_KIND_STRUCT:

						case BTF_KIND_UNION:

							it->desc = (struct btf_field_desc) {

								0, {},

								sizeof(struct btf_member),

								1, {offsetof(struct btf_member, type)}

							};

							break;

						case BTF_KIND_FUNC_PROTO:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, type)},

								sizeof(struct btf_param),

								1, {offsetof(struct btf_param, type)}

							};

							break;

						case BTF_KIND_DATASEC:

							it->desc = (struct btf_field_desc) {

								0, {},

								sizeof(struct btf_var_secinfo),

								1, {offsetof(struct btf_var_secinfo, type)}

							};

							break;

						default:

							return -EINVAL;

						}

						break;

					case BTF_FIELD_ITER_STRS:

						switch (btf_kind(t)) {

						case BTF_KIND_UNKN:

							it->desc = (struct btf_field_desc) {};

							break;

						case BTF_KIND_INT:

						case BTF_KIND_FLOAT:

						case BTF_KIND_FWD:

						case BTF_KIND_ARRAY:

						case BTF_KIND_CONST:

						case BTF_KIND_VOLATILE:

						case BTF_KIND_RESTRICT:

						case BTF_KIND_PTR:

						case BTF_KIND_TYPEDEF:

						case BTF_KIND_FUNC:

						case BTF_KIND_VAR:

						case BTF_KIND_DECL_TAG:

						case BTF_KIND_TYPE_TAG:

						case BTF_KIND_DATASEC:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, name_off)}

							};

							break;

						case BTF_KIND_ENUM:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, name_off)},

								sizeof(struct btf_enum),

								1, {offsetof(struct btf_enum, name_off)}

							};

							break;

						case BTF_KIND_ENUM64:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, name_off)},

								sizeof(struct btf_enum64),

								1, {offsetof(struct btf_enum64, name_off)}

							};

							break;

						case BTF_KIND_STRUCT:

						case BTF_KIND_UNION:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, name_off)},

								sizeof(struct btf_member),

								1, {offsetof(struct btf_member, name_off)}

							};

							break;

						case BTF_KIND_FUNC_PROTO:

							it->desc = (struct btf_field_desc) {

								1, {offsetof(struct btf_type, name_off)},

								sizeof(struct btf_param),

								1, {offsetof(struct btf_param, name_off)}

							};

							break;

						default:

							return -EINVAL;

						}

						break;

					default:

						return -EINVAL;

					}

					if (it->desc.m_sz)

						it->vlen = btf_vlen(t);

					it->p = t;

					return 0;

				}

				__u32 *btf_field_iter_next(struct btf_field_iter *it)

				{

					if (!it->p)

						return NULL;

					if (it->m_idx < 0) {

						if (it->off_idx < it->desc.t_off_cnt)

							return it->p + it->desc.t_offs[it->off_idx++];

						/* move to per-member iteration */

						it->m_idx = 0;

						it->p += sizeof(struct btf_type);

						it->off_idx = 0;

					}

					/* if type doesn't have members, stop */

					if (it->desc.m_sz == 0) {

						it->p = NULL;

						return NULL;

					}

					if (it->off_idx >= it->desc.m_off_cnt) {

						/* exhausted this member's fields, go to the next member */

						it->m_idx++;

						it->p += it->desc.m_sz;

						it->off_idx = 0;

					}

					if (it->m_idx < it->vlen)

						return it->p + it->desc.m_offs[it->off_idx++];

					it->p = NULL;

					return NULL;

				}
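The two-phase walk in `btf_field_iter_next()` (header-level offsets first, then the same member-level offsets for each of `vlen` trailing records) can be sketched without libbpf. The version below is an illustrative re-implementation over a plain array of 32-bit words; `demo_desc`, `demo_iter` and `demo_shift_ids` are hypothetical names, and the record layout (a three-word header followed by two-word members) is an assumption chosen to resemble a function prototype with parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: byte offsets of interesting 32-bit fields. */
struct demo_desc {
	int t_off_cnt, t_offs[2];	/* offsets inside the fixed header */
	int m_sz;			/* size of one member record, 0 if none */
	int m_off_cnt, m_offs[1];	/* offsets inside each member record */
};

struct demo_iter {
	struct demo_desc desc;
	unsigned char *p;	/* current header or member record */
	int hdr_sz, m_idx, off_idx, vlen;
};

static void demo_iter_init(struct demo_iter *it, uint32_t *rec, int hdr_sz,
			   int vlen, struct demo_desc desc)
{
	it->desc = desc;
	it->p = (unsigned char *)rec;
	it->hdr_sz = hdr_sz;
	it->m_idx = -1;		/* -1 means "still visiting the header" */
	it->off_idx = 0;
	it->vlen = desc.m_sz ? vlen : 0;
}

static uint32_t *demo_iter_next(struct demo_iter *it)
{
	if (!it->p)
		return NULL;
	if (it->m_idx < 0) {
		if (it->off_idx < it->desc.t_off_cnt)
			return (uint32_t *)(it->p + it->desc.t_offs[it->off_idx++]);
		/* header exhausted: move to the first member record */
		it->m_idx = 0;
		it->p += it->hdr_sz;
		it->off_idx = 0;
	}
	if (it->desc.m_sz == 0) {
		it->p = NULL;
		return NULL;
	}
	if (it->off_idx >= it->desc.m_off_cnt) {
		/* this member's fields are exhausted: advance to the next one */
		it->m_idx++;
		it->p += it->desc.m_sz;
		it->off_idx = 0;
	}
	if (it->m_idx < it->vlen)
		return (uint32_t *)(it->p + it->desc.m_offs[it->off_idx++]);
	it->p = NULL;
	return NULL;
}

/* Add delta to every visited field; return how many fields were visited. */
static int demo_shift_ids(uint32_t *rec, int vlen, uint32_t delta)
{
	/* header word 2 and member word 1 hold the ids in this demo layout */
	struct demo_desc ids = {
		.t_off_cnt = 1, .t_offs = { 8 },
		.m_sz = 8, .m_off_cnt = 1, .m_offs = { 4 },
	};
	struct demo_iter it;
	uint32_t *id;
	int n = 0;

	demo_iter_init(&it, rec, 12, vlen, ids);
	while ((id = demo_iter_next(&it))) {
		*id += delta;
		n++;
	}
	return n;
}
```

Because the iterator hands back pointers into the record, a caller can rewrite fields in place, which is how the relocation code uses it to remap type ids and string offsets.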

									

src/btf_relocate.c
									
										Normal file
									
												
				@@ -0,0 +1,519 @@

				// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				/* Copyright (c) 2024, Oracle and/or its affiliates. */

				#ifndef _GNU_SOURCE

				#define _GNU_SOURCE

				#endif

				#ifdef __KERNEL__

				#include <linux/bpf.h>

				#include <linux/bsearch.h>

				#include <linux/btf.h>

				#include <linux/sort.h>

				#include <linux/string.h>

				#include <linux/bpf_verifier.h>

				#define btf_type_by_id				(struct btf_type *)btf_type_by_id

				#define btf__type_cnt				btf_nr_types

				#define btf__base_btf				btf_base_btf

				#define btf__name_by_offset			btf_name_by_offset

				#define btf__str_by_offset			btf_str_by_offset

				#define btf_kflag				btf_type_kflag

				#define calloc(nmemb, sz)			kvcalloc(nmemb, sz, GFP_KERNEL | __GFP_NOWARN)

				#define free(ptr)				kvfree(ptr)

				#define qsort(base, num, sz, cmp)		sort(base, num, sz, cmp, NULL)

				#else

				#include "btf.h"

				#include "bpf.h"

				#include "libbpf.h"

				#include "libbpf_internal.h"

				#endif /* __KERNEL__ */

				struct btf;

				struct btf_relocate {

					struct btf *btf;

					const struct btf *base_btf;

					const struct btf *dist_base_btf;

					unsigned int nr_base_types;

					unsigned int nr_split_types;

					unsigned int nr_dist_base_types;

					int dist_str_len;

					int base_str_len;

					__u32 *id_map;

					__u32 *str_map;

				};

				/* Set temporarily in relocation id_map if distilled base struct/union is

				 * embedded in a split BTF struct/union; in such a case, size information must

				 * match between distilled base BTF and base BTF representation of type.

				 */

				#define BTF_IS_EMBEDDED ((__u32)-1)

				/* <name, size, id> triple used in sorting/searching distilled base BTF. */

				struct btf_name_info {

					const char *name;

					/* set when search requires a size match */

					bool needs_size: 1;

					unsigned int size: 31;

					__u32 id;

				};

				static int btf_relocate_rewrite_type_id(struct btf_relocate *r, __u32 i)

				{

					struct btf_type *t = btf_type_by_id(r->btf, i);

					struct btf_field_iter it;

					__u32 *id;

					int err;

					err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);

					if (err)

						return err;

					while ((id = btf_field_iter_next(&it)))

						*id = r->id_map[*id];

					return 0;

				}

				/* Simple string comparison used for sorting within BTF, since all distilled

				 * types are named.  If the names match and both elements require a size

				 * match, fall back to ordering by size (larger sizes first).

				 */

				static int cmp_btf_name_size(const void *n1, const void *n2)

				{

					const struct btf_name_info *ni1 = n1;

					const struct btf_name_info *ni2 = n2;

					int name_diff = strcmp(ni1->name, ni2->name);

					if (!name_diff && ni1->needs_size && ni2->needs_size)

						return ni2->size - ni1->size;

					return name_diff;

				}

				/* Binary search with a small twist; find leftmost element that matches

				 * so that we can then iterate through all exact matches.  So for example

				 * searching { "a", "bb", "bb", "c" }  we would always match on the

				 * leftmost "bb".

				 */

				static struct btf_name_info *search_btf_name_size(struct btf_name_info *key,

										  struct btf_name_info *vals,

										  int nelems)

				{

					struct btf_name_info *ret = NULL;

					int high = nelems - 1;

					int low = 0;

					while (low <= high) {

						int mid = (low + high)/2;

						struct btf_name_info *val = &vals[mid];

						int diff = cmp_btf_name_size(key, val);

						if (diff == 0)

							ret = val;

						/* even if found, keep searching for leftmost match */

						if (diff <= 0)

							high = mid - 1;

						else

							low = mid + 1;

					}

					return ret;

				}

				/* If a member of a split BTF struct/union refers to a base BTF

				 * struct/union, mark that struct/union id temporarily in the id_map

				 * with BTF_IS_EMBEDDED.  Members may reach the struct/union through

				 * const/restrict/volatile/typedef/type tag modifiers or arrays, but if a

				 * pointer is encountered, the type is no longer considered embedded.

				 */

				static int btf_mark_embedded_composite_type_ids(struct btf_relocate *r, __u32 i)

				{

					struct btf_type *t = btf_type_by_id(r->btf, i);

					struct btf_field_iter it;

					__u32 *id;

					int err;

					if (!btf_is_composite(t))

						return 0;

					err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_IDS);

					if (err)

						return err;

					while ((id = btf_field_iter_next(&it))) {

						__u32 next_id = *id;

						while (next_id) {

							t = btf_type_by_id(r->btf, next_id);

							switch (btf_kind(t)) {

							case BTF_KIND_CONST:

							case BTF_KIND_RESTRICT:

							case BTF_KIND_VOLATILE:

							case BTF_KIND_TYPEDEF:

							case BTF_KIND_TYPE_TAG:

								next_id = t->type;

								break;

							case BTF_KIND_ARRAY: {

								struct btf_array *a = btf_array(t);

								next_id = a->type;

								break;

							}

							case BTF_KIND_STRUCT:

							case BTF_KIND_UNION:

								if (next_id < r->nr_dist_base_types)

									r->id_map[next_id] = BTF_IS_EMBEDDED;

								next_id = 0;

								break;

							default:

								next_id = 0;

								break;

							}

						}

					}

					return 0;

				}

				/* Build a map from distilled base BTF ids to base BTF ids. To do so, iterate

				 * through base BTF looking up distilled type (using binary search) equivalents.

				 */

				static int btf_relocate_map_distilled_base(struct btf_relocate *r)

				{

					struct btf_name_info *info, *info_end;

					struct btf_type *base_t, *dist_t;

					__u8 *base_name_cnt = NULL;

					int err = 0;

					__u32 id;

					/* generate a sort index array of name/type ids sorted by name for

					 * distilled base BTF to speed name-based lookups.

					 */

					info = calloc(r->nr_dist_base_types, sizeof(*info));

					if (!info) {

						err = -ENOMEM;

						goto done;

					}

					info_end = info + r->nr_dist_base_types;

					for (id = 0; id < r->nr_dist_base_types; id++) {

						dist_t = btf_type_by_id(r->dist_base_btf, id);

						info[id].name = btf__name_by_offset(r->dist_base_btf, dist_t->name_off);

						info[id].id = id;

						info[id].size = dist_t->size;

						info[id].needs_size = true;

					}

					qsort(info, r->nr_dist_base_types, sizeof(*info), cmp_btf_name_size);

					/* Mark distilled base struct/union members of split BTF structs/unions

					 * in id_map with BTF_IS_EMBEDDED; this signals that these types

					 * need to match both name and size, otherwise embedding the base

					 * struct/union in the split type is invalid.

					 */

					for (id = r->nr_dist_base_types; id < r->nr_dist_base_types + r->nr_split_types; id++) {

						err = btf_mark_embedded_composite_type_ids(r, id);

						if (err)

							goto done;

					}

					/* Collect name counts for composite types in base BTF.  If multiple

					 * instances of a struct/union of the same name exist, we need to use

					 * size to determine which to map to since name alone is ambiguous.

					 */

					base_name_cnt = calloc(r->base_str_len, sizeof(*base_name_cnt));

					if (!base_name_cnt) {

						err = -ENOMEM;

						goto done;

					}

					for (id = 1; id < r->nr_base_types; id++) {

						base_t = btf_type_by_id(r->base_btf, id);

						if (!btf_is_composite(base_t) || !base_t->name_off)

							continue;

						if (base_name_cnt[base_t->name_off] < 255)

							base_name_cnt[base_t->name_off]++;

					}

					/* Now search base BTF for matching distilled base BTF types. */

					for (id = 1; id < r->nr_base_types; id++) {

						struct btf_name_info *dist_info, base_info = {};

						int dist_kind, base_kind;

						base_t = btf_type_by_id(r->base_btf, id);

						/* distilled base consists of named types only. */

						if (!base_t->name_off)

							continue;

						base_kind = btf_kind(base_t);

						base_info.id = id;

						base_info.name = btf__name_by_offset(r->base_btf, base_t->name_off);

						switch (base_kind) {

						case BTF_KIND_INT:

						case BTF_KIND_FLOAT:

						case BTF_KIND_ENUM:

						case BTF_KIND_ENUM64:

							/* These types should match both name and size */

							base_info.needs_size = true;

							base_info.size = base_t->size;

							break;

						case BTF_KIND_FWD:

							/* No size considerations for fwds. */

							break;

						case BTF_KIND_STRUCT:

						case BTF_KIND_UNION:

							/* Size only needs to be used for struct/union if there

							 * are multiple types in base BTF with the same name.

							 * If there are multiple _distilled_ types with the same

							 * name (a very unlikely scenario), that doesn't matter

							 * unless corresponding _base_ types to match them are

							 * missing.

							 */

							base_info.needs_size = base_name_cnt[base_t->name_off] > 1;

							base_info.size = base_t->size;

							break;

						default:

							continue;

						}

						/* iterate over all matching distilled base types */

						for (dist_info = search_btf_name_size(&base_info, info, r->nr_dist_base_types);

						     dist_info != NULL && dist_info < info_end &&

						     cmp_btf_name_size(&base_info, dist_info) == 0;

						     dist_info++) {

							if (!dist_info->id || dist_info->id >= r->nr_dist_base_types) {

								pr_warn("base BTF id [%d] maps to invalid distilled base BTF id [%d]\n",

									id, dist_info->id);

								err = -EINVAL;

								goto done;

							}

							dist_t = btf_type_by_id(r->dist_base_btf, dist_info->id);

							dist_kind = btf_kind(dist_t);

							/* Validate that the found distilled type is compatible.

							 * Do not error out on mismatch as another match may

							 * occur for an identically-named type.

							 */

							switch (dist_kind) {

							case BTF_KIND_FWD:

								switch (base_kind) {

								case BTF_KIND_FWD:

									if (btf_kflag(dist_t) != btf_kflag(base_t))

										continue;

									break;

								case BTF_KIND_STRUCT:

									if (btf_kflag(base_t))

										continue;

									break;

								case BTF_KIND_UNION:

									if (!btf_kflag(base_t))

										continue;

									break;

								default:

									continue;

								}

								break;

							case BTF_KIND_INT:

								if (dist_kind != base_kind ||

								    btf_int_encoding(base_t) != btf_int_encoding(dist_t))

									continue;

								break;

							case BTF_KIND_FLOAT:

								if (dist_kind != base_kind)

									continue;

								break;

							case BTF_KIND_ENUM:

								/* ENUM and ENUM64 are encoded as sized ENUM in

								 * distilled base BTF.

								 */

								if (base_kind != dist_kind && base_kind != BTF_KIND_ENUM64)

									continue;

								break;

							case BTF_KIND_STRUCT:

							case BTF_KIND_UNION:

								/* size verification is required for embedded

								 * struct/unions.

								 */

								if (r->id_map[dist_info->id] == BTF_IS_EMBEDDED &&

								    base_t->size != dist_t->size)

									continue;

								break;

							default:

								continue;

							}

							if (r->id_map[dist_info->id] &&

							    r->id_map[dist_info->id] != BTF_IS_EMBEDDED) {

								/* we already have a match; this tells us that

								 * multiple base types of the same name

								 * have the same size, since for cases where

								 * multiple types have the same name we match

								 * on name and size.  In this case, we have

								 * no way of determining which to relocate

								 * to in base BTF, so error out.

								 */

								pr_warn("distilled base BTF type '%s' [%u], size %u has multiple candidates of the same size (ids [%u, %u]) in base BTF\n",

									base_info.name, dist_info->id,

									base_t->size, id, r->id_map[dist_info->id]);

								err = -EINVAL;

								goto done;

							}

							/* map id and name */

							r->id_map[dist_info->id] = id;

							r->str_map[dist_t->name_off] = base_t->name_off;

						}

					}

					/* ensure all distilled BTF ids now have a mapping... */

					for (id = 1; id < r->nr_dist_base_types; id++) {

						const char *name;

						if (r->id_map[id] && r->id_map[id] != BTF_IS_EMBEDDED)

							continue;

						dist_t = btf_type_by_id(r->dist_base_btf, id);

						name = btf__name_by_offset(r->dist_base_btf, dist_t->name_off);

						pr_warn("distilled base BTF type '%s' [%d] is not mapped to base BTF id\n",

							name, id);

						err = -EINVAL;

						break;

					}

				done:

					free(base_name_cnt);

					free(info);

					return err;

				}

				/* distilled base should only have named int/float/enum/fwd/struct/union types. */

				static int btf_relocate_validate_distilled_base(struct btf_relocate *r)

				{

					unsigned int i;

					for (i = 1; i < r->nr_dist_base_types; i++) {

						struct btf_type *t = btf_type_by_id(r->dist_base_btf, i);

						int kind = btf_kind(t);

						switch (kind) {

						case BTF_KIND_INT:

						case BTF_KIND_FLOAT:

						case BTF_KIND_ENUM:

						case BTF_KIND_STRUCT:

						case BTF_KIND_UNION:

						case BTF_KIND_FWD:

							if (t->name_off)

								break;

							pr_warn("type [%d], kind [%d] is invalid for distilled base BTF; it is anonymous\n",

								i, kind);

							return -EINVAL;

						default:

							pr_warn("type [%d] in distilled base BTF has unexpected kind [%d]\n",

								i, kind);

							return -EINVAL;

						}

					}

					return 0;

				}

				static int btf_relocate_rewrite_strs(struct btf_relocate *r, __u32 i)

				{

					struct btf_type *t = btf_type_by_id(r->btf, i);

					struct btf_field_iter it;

					__u32 *str_off;

					int off, err;

					err = btf_field_iter_init(&it, t, BTF_FIELD_ITER_STRS);

					if (err)

						return err;

					while ((str_off = btf_field_iter_next(&it))) {

						if (!*str_off)

							continue;

						if (*str_off >= r->dist_str_len) {

							*str_off += r->base_str_len - r->dist_str_len;

						} else {

							off = r->str_map[*str_off];

							if (!off) {

								pr_warn("string '%s' [offset %u] is not mapped to base BTF\n",

									btf__str_by_offset(r->btf, *str_off), *str_off);

								return -ENOENT;

							}

							*str_off = off;

						}

					}

					return 0;

				}

				/* If successful, output of relocation is updated BTF with base BTF pointing

				 * at base_btf, and type ids, strings adjusted accordingly.

				 */

				int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **id_map)

				{

					unsigned int nr_types = btf__type_cnt(btf);

					const struct btf_header *dist_base_hdr;

					const struct btf_header *base_hdr;

					struct btf_relocate r = {};

					int err = 0;

					__u32 id, i;

					r.dist_base_btf = btf__base_btf(btf);

					if (!base_btf || r.dist_base_btf == base_btf)

						return -EINVAL;

					r.nr_dist_base_types = btf__type_cnt(r.dist_base_btf);

					r.nr_base_types = btf__type_cnt(base_btf);

					r.nr_split_types = nr_types - r.nr_dist_base_types;

					r.btf = btf;

					r.base_btf = base_btf;

					r.id_map = calloc(nr_types, sizeof(*r.id_map));

					r.str_map = calloc(btf_header(r.dist_base_btf)->str_len, sizeof(*r.str_map));

					dist_base_hdr = btf_header(r.dist_base_btf);

					base_hdr = btf_header(r.base_btf);

					r.dist_str_len = dist_base_hdr->str_len;

					r.base_str_len = base_hdr->str_len;

					if (!r.id_map || !r.str_map) {

						err = -ENOMEM;

						goto err_out;

					}

					err = btf_relocate_validate_distilled_base(&r);

					if (err)

						goto err_out;

					/* Split BTF ids need to be adjusted as base and distilled base

					 * have different numbers of types, changing the start id of split

					 * BTF.

					 */

					for (id = r.nr_dist_base_types; id < nr_types; id++)

						r.id_map[id] = id + r.nr_base_types - r.nr_dist_base_types;

					/* Build a map from distilled base ids to actual base BTF ids; it is used

					 * to update split BTF id references.  Also build a str_map mapping from

					 * distilled base BTF names to base BTF names.

					 */

					err = btf_relocate_map_distilled_base(&r);

					if (err)

						goto err_out;

					/* Next, rewrite type ids in split BTF, replacing split ids with updated

					 * ids based on number of types in base BTF, and base ids with

					 * relocated ids from base_btf.

					 */

					for (i = 0, id = r.nr_dist_base_types; i < r.nr_split_types; i++, id++) {

						err = btf_relocate_rewrite_type_id(&r, id);

						if (err)

							goto err_out;

					}

					/* String offsets now need to be updated using the str_map. */

					for (i = 0; i < r.nr_split_types; i++) {

						err = btf_relocate_rewrite_strs(&r, i + r.nr_dist_base_types);

						if (err)

							goto err_out;

					}

					/* Finally reset base BTF to be base_btf */

					btf_set_base_btf(btf, base_btf);

					if (id_map) {

						*id_map = r.id_map;

						r.id_map = NULL;

					}

				err_out:

					free(r.id_map);

					free(r.str_map);

					return err;

				}
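The leftmost-match binary search used when mapping distilled base types can be tried in isolation. The sketch below is a simplified, libbpf-free variant: it compares by name only (the comparator in the file also falls back to size when both entries require it), and `demo_name_info`, `demo_cmp`, `demo_search_leftmost` and `demo_count_matches` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical <name, id> pair; the real struct also carries a size. */
struct demo_name_info {
	const char *name;
	unsigned int id;
};

/* Name-only ordering, usable with qsort(). */
static int demo_cmp(const void *a, const void *b)
{
	const struct demo_name_info *x = a, *y = b;

	return strcmp(x->name, y->name);
}

/* Binary search that keeps going left after a hit, so the returned entry
 * is the first of any run of equal names. vals must be sorted by demo_cmp.
 */
static struct demo_name_info *
demo_search_leftmost(const struct demo_name_info *key,
		     struct demo_name_info *vals, int nelems)
{
	struct demo_name_info *ret = NULL;
	int low = 0, high = nelems - 1;

	while (low <= high) {
		int mid = low + (high - low) / 2;
		int diff = demo_cmp(key, &vals[mid]);

		if (diff == 0)
			ret = &vals[mid];
		if (diff <= 0)
			high = mid - 1;	/* even on a hit, look further left */
		else
			low = mid + 1;
	}
	return ret;
}

/* Count all entries equal to key by walking right from the leftmost hit. */
static int demo_count_matches(const struct demo_name_info *key,
			      struct demo_name_info *vals, int nelems)
{
	struct demo_name_info *end = vals + nelems;
	struct demo_name_info *v = demo_search_leftmost(key, vals, nelems);
	int n = 0;

	for (; v && v < end && demo_cmp(key, v) == 0; v++)
		n++;
	return n;
}
```

Returning the leftmost hit is what lets the caller visit every equally named candidate with a simple forward loop, instead of scanning in both directions from an arbitrary match.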

									

src/elf.c
									
										Normal file
									
												
				@@ -0,0 +1,559 @@

				// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				#ifndef _GNU_SOURCE

				#define _GNU_SOURCE

				#endif

				#include <libelf.h>

				#include <gelf.h>

				#include <fcntl.h>

				#include <linux/kernel.h>

				#include "libbpf_internal.h"

				#include "str_error.h"

				/* A SHT_GNU_versym section holds 16-bit words. This bit is set if

				 * the symbol is hidden and can only be seen when referenced using an

				 * explicit version number. This is a GNU extension.

				 */

				#define VERSYM_HIDDEN	0x8000

				/* This is the mask for the rest of the data in a word read from a

				 * SHT_GNU_versym section.

				 */

				#define VERSYM_VERSION	0x7fff

				int elf_open(const char *binary_path, struct elf_fd *elf_fd)

				{

					int fd, ret;

					Elf *elf;

					elf_fd->elf = NULL;

					elf_fd->fd = -1;

					if (elf_version(EV_CURRENT) == EV_NONE) {

						pr_warn("elf: failed to init libelf for %s\n", binary_path);

						return -LIBBPF_ERRNO__LIBELF;

					}

					fd = open(binary_path, O_RDONLY | O_CLOEXEC);

					if (fd < 0) {

						ret = -errno;

						pr_warn("elf: failed to open %s: %s\n", binary_path, errstr(ret));

						return ret;

					}

					elf = elf_begin(fd, ELF_C_READ_MMAP, NULL);

					if (!elf) {

						pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1));

						close(fd);

						return -LIBBPF_ERRNO__FORMAT;

					}

					elf_fd->fd = fd;

					elf_fd->elf = elf;

					return 0;

				}

				void elf_close(struct elf_fd *elf_fd)

				{

					if (!elf_fd)

						return;

					elf_end(elf_fd->elf);

					close(elf_fd->fd);

				}

				/* Return next ELF section of sh_type after scn, or first of that type if scn is NULL. */

				static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn)

				{

					while ((scn = elf_nextscn(elf, scn)) != NULL) {

						GElf_Shdr sh;

						if (!gelf_getshdr(scn, &sh))

							continue;

						if (sh.sh_type == sh_type)

							return scn;

					}

					return NULL;

				}

				struct elf_sym {

					const char *name;

					GElf_Sym sym;

					GElf_Shdr sh;

					int ver;

					bool hidden;

				};

				struct elf_sym_iter {

					Elf *elf;

					Elf_Data *syms;

					Elf_Data *versyms;

					Elf_Data *verdefs;

					size_t nr_syms;

					size_t strtabidx;

					size_t verdef_strtabidx;

					size_t next_sym_idx;

					struct elf_sym sym;

					int st_type;

				};

				static int elf_sym_iter_new(struct elf_sym_iter *iter,

							    Elf *elf, const char *binary_path,

							    int sh_type, int st_type)

				{

					Elf_Scn *scn = NULL;

					GElf_Ehdr ehdr;

					GElf_Shdr sh;

					memset(iter, 0, sizeof(*iter));

					if (!gelf_getehdr(elf, &ehdr)) {

						pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));

						return -EINVAL;

					}

					scn = elf_find_next_scn_by_type(elf, sh_type, NULL);

					if (!scn) {

						pr_debug("elf: failed to find symbol table ELF sections in '%s'\n",

							 binary_path);

						return -ENOENT;

					}

					if (!gelf_getshdr(scn, &sh))

						return -EINVAL;

					iter->strtabidx = sh.sh_link;

					iter->syms = elf_getdata(scn, 0);

					if (!iter->syms) {

						pr_warn("elf: failed to get symbols for symtab section in '%s': %s\n",

							binary_path, elf_errmsg(-1));

						return -EINVAL;

					}

					iter->nr_syms = iter->syms->d_size / sh.sh_entsize;

					iter->elf = elf;

					iter->st_type = st_type;

					/* Version symbol table is meaningful to dynsym only */

					if (sh_type != SHT_DYNSYM)

						return 0;

					scn = elf_find_next_scn_by_type(elf, SHT_GNU_versym, NULL);

					if (!scn)

						return 0;

					iter->versyms = elf_getdata(scn, 0);

					scn = elf_find_next_scn_by_type(elf, SHT_GNU_verdef, NULL);

					if (!scn)

						return 0;

					iter->verdefs = elf_getdata(scn, 0);

					if (!iter->verdefs || !gelf_getshdr(scn, &sh)) {

						pr_warn("elf: failed to get verdef ELF section in '%s'\n", binary_path);

						return -EINVAL;

					}

					iter->verdef_strtabidx = sh.sh_link;

					return 0;

				}

				static struct elf_sym *elf_sym_iter_next(struct elf_sym_iter *iter)

				{

					struct elf_sym *ret = &iter->sym;

					GElf_Sym *sym = &ret->sym;

					const char *name = NULL;

					GElf_Versym versym;

					Elf_Scn *sym_scn;

					size_t idx;

					for (idx = iter->next_sym_idx; idx < iter->nr_syms; idx++) {

						if (!gelf_getsym(iter->syms, idx, sym))

							continue;

						if (GELF_ST_TYPE(sym->st_info) != iter->st_type)

							continue;

						name = elf_strptr(iter->elf, iter->strtabidx, sym->st_name);

						if (!name)

							continue;

						sym_scn = elf_getscn(iter->elf, sym->st_shndx);

						if (!sym_scn)

							continue;

						if (!gelf_getshdr(sym_scn, &ret->sh))

							continue;

						iter->next_sym_idx = idx + 1;

						ret->name = name;

						ret->ver = 0;

						ret->hidden = false;

						if (iter->versyms) {

							if (!gelf_getversym(iter->versyms, idx, &versym))

								continue;

							ret->ver = versym & VERSYM_VERSION;

							ret->hidden = versym & VERSYM_HIDDEN;

						}

						return ret;

					}

					return NULL;

				}

				static const char *elf_get_vername(struct elf_sym_iter *iter, int ver)

				{

					GElf_Verdaux verdaux;

					GElf_Verdef verdef;

					int offset;

					if (!iter->verdefs)

						return NULL;

					offset = 0;

					while (gelf_getverdef(iter->verdefs, offset, &verdef)) {

						if (verdef.vd_ndx != ver) {

							if (!verdef.vd_next)

								break;

							offset += verdef.vd_next;

							continue;

						}

						if (!gelf_getverdaux(iter->verdefs, offset + verdef.vd_aux, &verdaux))

							break;

						return elf_strptr(iter->elf, iter->verdef_strtabidx, verdaux.vda_name);

					}

					return NULL;

				}

				static bool symbol_match(struct elf_sym_iter *iter, int sh_type, struct elf_sym *sym,

							 const char *name, size_t name_len, const char *lib_ver)

				{

					const char *ver_name;

					/* Symbols take the form func, func@LIB_VER or func@@LIB_VER;

					 * make sure the func part matches the user-specified name.

					 */

					if (strncmp(sym->name, name, name_len) != 0)

						return false;

					/* ...but we don't want a search for "foo" to match "foo2" also, so any

					 * additional characters in sym->name should be of the form "@LIB" or "@@LIB".

					 */

					if (sym->name[name_len] != '\0' && sym->name[name_len] != '@')

						return false;

					/* If user does not specify symbol version, then we got a match */

					if (!lib_ver)

						return true;

					/* If user specifies symbol version, for dynamic symbols,

					 * get version name from ELF verdef section for comparison.

					 */

					if (sh_type == SHT_DYNSYM) {

						ver_name = elf_get_vername(iter, sym->ver);

						if (!ver_name)

							return false;

						return strcmp(ver_name, lib_ver) == 0;

					}

					/* For normal symbols, it is already in form of func@LIB_VER */

					return strcmp(sym->name, name) == 0;

				}

				/* Transform symbol's virtual address (absolute for binaries and relative

				 * for shared libs) into file offset, which is what kernel is expecting

				 * for uprobe/uretprobe attachment.

				 * See Documentation/trace/uprobetracer.rst for more details. This is done

				 * by looking up symbol's containing section's header and using its virtual

				 * address (sh_addr) and corresponding file offset (sh_offset) to transform

				 * sym.st_value (virtual address) into desired final file offset.

				 */

				static unsigned long elf_sym_offset(struct elf_sym *sym)

				{

					return sym->sym.st_value - sym->sh.sh_addr + sym->sh.sh_offset;

				}

				/* Find offset of function name in the provided ELF object. "binary_path" is

				 * the path to the ELF binary represented by "elf", and only used for error

				 * reporting matters. "name" matches symbol name or name@@LIB for library

				 * functions.

				 */

				long elf_find_func_offset(Elf *elf, const char *binary_path, const char *name)

				{

					int i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };

					const char *at_symbol, *lib_ver;

					bool is_shared_lib;

					long ret = -ENOENT;

					size_t name_len;

					GElf_Ehdr ehdr;

					if (!gelf_getehdr(elf, &ehdr)) {

						pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1));

						ret = -LIBBPF_ERRNO__FORMAT;

						goto out;

					}

					/* for shared lib case, we do not need to calculate relative offset */

					is_shared_lib = ehdr.e_type == ET_DYN;

					/* Does name specify "@@LIB_VER" or "@LIB_VER" ? */

					at_symbol = strchr(name, '@');

					if (at_symbol) {

						name_len = at_symbol - name;

						/* skip second @ if it's @@LIB_VER case */

						if (at_symbol[1] == '@')

							at_symbol++;

						lib_ver = at_symbol + 1;

					} else {

						name_len = strlen(name);

						lib_ver = NULL;

					}

					/* Search SHT_DYNSYM, SHT_SYMTAB for symbol. This search order is used because if

					 * a binary is stripped, it may only have SHT_DYNSYM, and a fully-statically

					 * linked binary may not have SHT_DYNSYM, so absence of a section should not be

					 * reported as a warning/error.

					 */

					for (i = 0; i < ARRAY_SIZE(sh_types); i++) {

						struct elf_sym_iter iter;

						struct elf_sym *sym;

						int last_bind = -1;

						int cur_bind;

						ret = elf_sym_iter_new(&iter, elf, binary_path, sh_types[i], STT_FUNC);

						if (ret == -ENOENT)

							continue;

						if (ret)

							goto out;

						while ((sym = elf_sym_iter_next(&iter))) {

							if (!symbol_match(&iter, sh_types[i], sym, name, name_len, lib_ver))

								continue;

							cur_bind = GELF_ST_BIND(sym->sym.st_info);

							if (ret > 0) {

								/* handle multiple matches */

								if (elf_sym_offset(sym) == ret) {

									/* same offset, no problem */

									continue;

								} else if (last_bind != STB_WEAK && cur_bind != STB_WEAK) {

									/* Only accept one non-weak bind. */

									pr_warn("elf: ambiguous match for '%s', '%s' in '%s'\n",

										sym->name, name, binary_path);

									ret = -LIBBPF_ERRNO__FORMAT;

									goto out;

								} else if (cur_bind == STB_WEAK) {

									/* already have a non-weak bind, and

									 * this is a weak bind, so ignore.

									 */

									continue;

								}

							}

							ret = elf_sym_offset(sym);

							last_bind = cur_bind;

						}

						if (ret > 0)

							break;

					}

					if (ret > 0) {

						pr_debug("elf: symbol address match for '%s' in '%s': 0x%lx\n", name, binary_path,

							 ret);

					} else {

						if (ret == 0) {

							pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path,

								is_shared_lib ? "should not be 0 in a shared library" :

										"try using shared library path instead");

							ret = -ENOENT;

						} else {

							pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path);

						}

					}

				out:

					return ret;

				}

				/* Find offset of function name in ELF object specified by path. "name" matches

				 * symbol name or name@@LIB for library functions.

				 */

				long elf_find_func_offset_from_file(const char *binary_path, const char *name)

				{

					struct elf_fd elf_fd;

					long ret = -ENOENT;

					ret = elf_open(binary_path, &elf_fd);

					if (ret)

						return ret;

					ret = elf_find_func_offset(elf_fd.elf, binary_path, name);

					elf_close(&elf_fd);

					return ret;

				}

				struct symbol {

					const char *name;

					int bind;

					int idx;

				};

				static int symbol_cmp(const void *a, const void *b)

				{

					const struct symbol *sym_a = a;

					const struct symbol *sym_b = b;

					return strcmp(sym_a->name, sym_b->name);

				}

				/*

				 * Return offsets in @poffsets for symbols specified in @syms array argument.

				 * On success returns 0 and offsets are returned in allocated array with @cnt

				 * size, that needs to be released by the caller.

				 */

				int elf_resolve_syms_offsets(const char *binary_path, int cnt,

							     const char **syms, unsigned long **poffsets,

							     int st_type)

				{

					int sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB };

					int err = 0, i, cnt_done = 0;

					unsigned long *offsets;

					struct symbol *symbols;

					struct elf_fd elf_fd;

					err = elf_open(binary_path, &elf_fd);

					if (err)

						return err;

					offsets = calloc(cnt, sizeof(*offsets));

					symbols = calloc(cnt, sizeof(*symbols));

					if (!offsets || !symbols) {

						err = -ENOMEM;

						goto out;

					}

					for (i = 0; i < cnt; i++) {

						symbols[i].name = syms[i];

						symbols[i].idx = i;

					}

					qsort(symbols, cnt, sizeof(*symbols), symbol_cmp);

					for (i = 0; i < ARRAY_SIZE(sh_types); i++) {

						struct elf_sym_iter iter;

						struct elf_sym *sym;

						err = elf_sym_iter_new(&iter, elf_fd.elf, binary_path, sh_types[i], st_type);

						if (err == -ENOENT)

							continue;

						if (err)

							goto out;

						while ((sym = elf_sym_iter_next(&iter))) {

							unsigned long sym_offset = elf_sym_offset(sym);

							int bind = GELF_ST_BIND(sym->sym.st_info);

							struct symbol *found, tmp = {

								.name = sym->name,

							};

							unsigned long *offset;

							found = bsearch(&tmp, symbols, cnt, sizeof(*symbols), symbol_cmp);

							if (!found)

								continue;

							offset = &offsets[found->idx];

							if (*offset > 0) {

								/* same offset, no problem */

								if (*offset == sym_offset)

									continue;

								/* handle multiple matches */

								if (found->bind != STB_WEAK && bind != STB_WEAK) {

									/* Only accept one non-weak bind. */

									pr_warn("elf: ambiguous match found '%s@%lu' in '%s' previous offset %lu\n",

										sym->name, sym_offset, binary_path, *offset);

									err = -ESRCH;

									goto out;

								} else if (bind == STB_WEAK) {

									/* already have a non-weak bind, and

									 * this is a weak bind, so ignore.

									 */

									continue;

								}

							} else {

								cnt_done++;

							}

							*offset = sym_offset;

							found->bind = bind;

						}

					}

					if (cnt != cnt_done) {

						err = -ENOENT;

						goto out;

					}

					*poffsets = offsets;

				out:

					free(symbols);

					if (err)

						free(offsets);

					elf_close(&elf_fd);

					return err;

				}

				/*

				 * Return offsets in @poffsets for symbols specified by @pattern argument.

				 * On success returns 0 and offsets are returned in allocated @poffsets

				 * array with the @pcnt size, that needs to be released by the caller.

				 */

				int elf_resolve_pattern_offsets(const char *binary_path, const char *pattern,

								unsigned long **poffsets, size_t *pcnt)

				{

					int sh_types[2] = { SHT_SYMTAB, SHT_DYNSYM };

					unsigned long *offsets = NULL;

					size_t cap = 0, cnt = 0;

					struct elf_fd elf_fd;

					int err = 0, i;

					err = elf_open(binary_path, &elf_fd);

					if (err)

						return err;

					for (i = 0; i < ARRAY_SIZE(sh_types); i++) {

						struct elf_sym_iter iter;

						struct elf_sym *sym;

						err = elf_sym_iter_new(&iter, elf_fd.elf, binary_path, sh_types[i], STT_FUNC);

						if (err == -ENOENT)

							continue;

						if (err)

							goto out;

						while ((sym = elf_sym_iter_next(&iter))) {

							if (!glob_match(sym->name, pattern))

								continue;

							err = libbpf_ensure_mem((void **) &offsets, &cap, sizeof(*offsets),

										cnt + 1);

							if (err)

								goto out;

							offsets[cnt++] = elf_sym_offset(sym);

						}

						/* If we found anything in the first symbol section,

						 * do not search others to avoid duplicates.

						 */

						if (cnt)

							break;

					}

					if (cnt) {

						*poffsets = offsets;

						*pcnt = cnt;

					} else {

						err = -ENOENT;

					}

				out:

					if (err)

						free(offsets);

					elf_close(&elf_fd);

					return err;

				}
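
The name handling in `elf_find_func_offset()` and `symbol_match()` can be tried without libelf. This is a minimal sketch under stated assumptions: `struct sym_query`, `parse_sym_query` and `base_name_matches` are hypothetical names introduced here for illustration. They mirror the splitting of `func@LIB_VER` / `func@@LIB_VER` and the check that stops a search for "foo" from matching "foo2". The version comparison against the verdef section is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of libbpf. */
struct sym_query {
	size_t name_len;	/* length of the "func" part */
	const char *lib_ver;	/* text after '@' or '@@', NULL if no version given */
};

/* Split "func", "func@LIB_VER" or "func@@LIB_VER" into its parts. */
static struct sym_query parse_sym_query(const char *name)
{
	struct sym_query q = { 0, NULL };
	const char *at;

	assert(name != NULL);
	at = strchr(name, '@');
	if (at) {
		q.name_len = (size_t)(at - name);
		/* skip the second '@' in the "@@LIB_VER" case */
		if (at[1] == '@')
			at++;
		q.lib_ver = at + 1;
	} else {
		q.name_len = strlen(name);
	}
	return q;
}

/* True if sym_name is exactly the requested func part, optionally
 * followed by a version suffix starting with '@'.
 */
static bool base_name_matches(const char *sym_name, const char *name, size_t name_len)
{
	if (strncmp(sym_name, name, name_len) != 0)
		return false;
	return sym_name[name_len] == '\0' || sym_name[name_len] == '@';
}
```

With these helpers, `parse_sym_query("malloc@@GLIBC_2.2.5")` reports a name length of 6 and a version string of `GLIBC_2.2.5`, while `base_name_matches("foo2", "foo", 3)` is false because the character after the prefix is neither the terminator nor `@`.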

									
src/features.c

				@@ -0,0 +1,610 @@

				// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

				/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */

				#include <linux/kernel.h>

				#include <linux/filter.h>

				#include "bpf.h"

				#include "libbpf.h"

				#include "libbpf_common.h"

				#include "libbpf_internal.h"

				#include "str_error.h"

				static inline __u64 ptr_to_u64(const void *ptr)

				{

					return (__u64)(unsigned long)ptr;

				}

				int probe_fd(int fd)

				{

					if (fd >= 0)

						close(fd);

					return fd >= 0;

				}

				static int probe_kern_prog_name(int token_fd)

				{

					const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);

					struct bpf_insn insns[] = {

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					union bpf_attr attr;

					int ret;

					memset(&attr, 0, attr_sz);

					attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;

					attr.license = ptr_to_u64("GPL");

					attr.insns = ptr_to_u64(insns);

					attr.insn_cnt = (__u32)ARRAY_SIZE(insns);

					attr.prog_token_fd = token_fd;

					if (token_fd)

						attr.prog_flags |= BPF_F_TOKEN_FD;

					libbpf_strlcpy(attr.prog_name, "libbpf_nametest", sizeof(attr.prog_name));

					/* make sure loading with name works */

					ret = sys_bpf_prog_load(&attr, attr_sz, PROG_LOAD_ATTEMPTS);

					return probe_fd(ret);

				}

				static int probe_kern_global_data(int token_fd)

				{

					struct bpf_insn insns[] = {

						BPF_LD_MAP_VALUE(BPF_REG_1, 0, 16),

						BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 42),

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					LIBBPF_OPTS(bpf_map_create_opts, map_opts,

						.token_fd = token_fd,

						.map_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					LIBBPF_OPTS(bpf_prog_load_opts, prog_opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					int ret, map, insn_cnt = ARRAY_SIZE(insns);

					map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_global", sizeof(int), 32, 1, &map_opts);

					if (map < 0) {

						ret = -errno;

						pr_warn("Error in %s(): %s. Couldn't create simple array map.\n",

							__func__, errstr(ret));

						return ret;

					}

					insns[0].imm = map;

					ret = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &prog_opts);

					close(map);

					return probe_fd(ret);

				}

				static int probe_kern_btf(int token_fd)

				{

					static const char strs[] = "\0int";

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_func(int token_fd)

				{

					static const char strs[] = "\0int\0x\0a";

					/* void x(int a) {} */

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */

						/* FUNC_PROTO */                                /* [2] */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 0),

						BTF_PARAM_ENC(7, 1),

						/* FUNC x */                                    /* [3] */

						BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, 0), 2),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_func_global(int token_fd)

				{

					static const char strs[] = "\0int\0x\0a";

					/* void x(int a) {}, encoded with BTF_FUNC_GLOBAL linkage */

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */

						/* FUNC_PROTO */                                /* [2] */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 0),

						BTF_PARAM_ENC(7, 1),

						/* FUNC x BTF_FUNC_GLOBAL */                    /* [3] */

						BTF_TYPE_ENC(5, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 2),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_datasec(int token_fd)

				{

					static const char strs[] = "\0x\0.data";

					/* static int x; */

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */

						/* VAR x */                                     /* [2] */

						BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),

						BTF_VAR_STATIC,

						/* DATASEC .data */                             /* [3] */

						BTF_TYPE_ENC(3, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4),

						BTF_VAR_SECINFO_ENC(2, 0, 4),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_qmark_datasec(int token_fd)

				{

					static const char strs[] = "\0x\0?.data";

					/* static int x; */

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */

						/* VAR x */                                     /* [2] */

						BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),

						BTF_VAR_STATIC,

						/* DATASEC ?.data */                            /* [3] */

						BTF_TYPE_ENC(3, BTF_INFO_ENC(BTF_KIND_DATASEC, 0, 1), 4),

						BTF_VAR_SECINFO_ENC(2, 0, 4),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_float(int token_fd)

				{

					static const char strs[] = "\0float";

					__u32 types[] = {

						/* float */

						BTF_TYPE_FLOAT_ENC(1, 4),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_decl_tag(int token_fd)

				{

					static const char strs[] = "\0tag";

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),  /* [1] */

						/* VAR x */                                     /* [2] */

						BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_VAR, 0, 0), 1),

						BTF_VAR_STATIC,

						/* attr */

						BTF_TYPE_DECL_TAG_ENC(1, 2, -1),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_btf_type_tag(int token_fd)

				{

					static const char strs[] = "\0tag";

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),		/* [1] */

						/* attr */

						BTF_TYPE_TYPE_TAG_ENC(1, 1),				/* [2] */

						/* ptr */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 2),	/* [3] */

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_array_mmap(int token_fd)

				{

					LIBBPF_OPTS(bpf_map_create_opts, opts,

						.map_flags = BPF_F_MMAPABLE | (token_fd ? BPF_F_TOKEN_FD : 0),

						.token_fd = token_fd,

					);

					int fd;

					fd = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_mmap", sizeof(int), sizeof(int), 1, &opts);

					return probe_fd(fd);

				}

				static int probe_kern_exp_attach_type(int token_fd)

				{

					LIBBPF_OPTS(bpf_prog_load_opts, opts,

						.expected_attach_type = BPF_CGROUP_INET_SOCK_CREATE,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					struct bpf_insn insns[] = {

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					int fd, insn_cnt = ARRAY_SIZE(insns);

					/* use any valid combination of program type and (optional)

					 * non-zero expected attach type (i.e., not a BPF_CGROUP_INET_INGRESS)

					 * to see if kernel supports expected_attach_type field for

					 * BPF_PROG_LOAD command

					 */

					fd = bpf_prog_load(BPF_PROG_TYPE_CGROUP_SOCK, NULL, "GPL", insns, insn_cnt, &opts);

					return probe_fd(fd);

				}

				static int probe_kern_probe_read_kernel(int token_fd)

				{

					LIBBPF_OPTS(bpf_prog_load_opts, opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					struct bpf_insn insns[] = {

						BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */

						BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */

						BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */

						BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */

						BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),

						BPF_EXIT_INSN(),

					};

					int fd, insn_cnt = ARRAY_SIZE(insns);

					fd = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);

					return probe_fd(fd);

				}

				static int probe_prog_bind_map(int token_fd)

				{

					struct bpf_insn insns[] = {

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					LIBBPF_OPTS(bpf_map_create_opts, map_opts,

						.token_fd = token_fd,

						.map_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					LIBBPF_OPTS(bpf_prog_load_opts, prog_opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					int ret, map, prog, insn_cnt = ARRAY_SIZE(insns);

					map = bpf_map_create(BPF_MAP_TYPE_ARRAY, "libbpf_det_bind", sizeof(int), 32, 1, &map_opts);

					if (map < 0) {

						ret = -errno;

						pr_warn("Error in %s(): %s. Couldn't create simple array map.\n",

							__func__, errstr(ret));

						return ret;

					}

					prog = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL", insns, insn_cnt, &prog_opts);

					if (prog < 0) {

						close(map);

						return 0;

					}

					ret = bpf_prog_bind_map(prog, map, NULL);

					close(map);

					close(prog);

					return ret >= 0;

				}

				static int probe_module_btf(int token_fd)

				{

					static const char strs[] = "\0int";

					__u32 types[] = {

						/* int */

						BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),

					};

					struct bpf_btf_info info;

					__u32 len = sizeof(info);

					char name[16];

					int fd, err;

					fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs), token_fd);

					if (fd < 0)

						return 0; /* BTF not supported at all */

					memset(&info, 0, sizeof(info));

					info.name = ptr_to_u64(name);

					info.name_len = sizeof(name);

					/* check that BPF_OBJ_GET_INFO_BY_FD supports specifying name pointer;

					 * kernel's module BTF support coincides with support for

					 * name/name_len fields in struct bpf_btf_info.

					 */

					err = bpf_btf_get_info_by_fd(fd, &info, &len);

					close(fd);

					return !err;

				}

				static int probe_perf_link(int token_fd)

				{

					struct bpf_insn insns[] = {

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					LIBBPF_OPTS(bpf_prog_load_opts, opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					int prog_fd, link_fd, err;

					prog_fd = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL",

								insns, ARRAY_SIZE(insns), &opts);

					if (prog_fd < 0)

						return -errno;

					/* use invalid perf_event FD to get EBADF, if link is supported;

					 * otherwise EINVAL should be returned

					 */

					link_fd = bpf_link_create(prog_fd, -1, BPF_PERF_EVENT, NULL);

					err = -errno; /* close() can clobber errno */

					if (link_fd >= 0)

						close(link_fd);

					close(prog_fd);

					return link_fd < 0 && err == -EBADF;

				}

				static int probe_uprobe_multi_link(int token_fd)

				{

					LIBBPF_OPTS(bpf_prog_load_opts, load_opts,

						.expected_attach_type = BPF_TRACE_UPROBE_MULTI,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					LIBBPF_OPTS(bpf_link_create_opts, link_opts);

					struct bpf_insn insns[] = {

						BPF_MOV64_IMM(BPF_REG_0, 0),

						BPF_EXIT_INSN(),

					};

					int prog_fd, link_fd, err;

					unsigned long offset = 0;

					prog_fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL, "GPL",

								insns, ARRAY_SIZE(insns), &load_opts);

					if (prog_fd < 0)

						return -errno;

					/* Creating uprobe in '/' binary should fail with -EBADF. */

					link_opts.uprobe_multi.path = "/";

					link_opts.uprobe_multi.offsets = &offset;

					link_opts.uprobe_multi.cnt = 1;

					link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);

					err = -errno; /* close() can clobber errno */

					if (link_fd >= 0 || err != -EBADF) {

						if (link_fd >= 0)

							close(link_fd);

						close(prog_fd);

						return 0;

					}

					/* Initial multi-uprobe support in kernel didn't handle PID filtering

					 * correctly (it was doing thread filtering, not process filtering).

					 * So now we'll detect if PID filtering logic was fixed, and, if not,

					 * we'll pretend multi-uprobes are not supported.

					 * Multi-uprobes are used in USDT attachment logic, and we need to be

					 * conservative here, because multi-uprobe selection happens early at

					 * load time, while the use of PID filtering is known late at

					 * attachment time, at which point it's too late to undo multi-uprobe

					 * selection.

					 *

					 * Creating uprobe with pid == -1 for (invalid) '/' binary will fail

					 * early with -EINVAL on kernels with fixed PID filtering logic;

					 * otherwise -ESRCH would be returned if passed correct binary path

					 * (but we'll just get -EBADF, of course).

					 */

					link_opts.uprobe_multi.pid = -1; /* invalid PID */

					link_opts.uprobe_multi.path = "/"; /* invalid path */

					link_opts.uprobe_multi.offsets = &offset;

					link_opts.uprobe_multi.cnt = 1;

					link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);

					err = -errno; /* close() can clobber errno */

					if (link_fd >= 0)

						close(link_fd);

					close(prog_fd);

					return link_fd < 0 && err == -EINVAL;

				}

				static int probe_kern_bpf_cookie(int token_fd)

				{

					struct bpf_insn insns[] = {

						BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_attach_cookie),

						BPF_EXIT_INSN(),

					};

					LIBBPF_OPTS(bpf_prog_load_opts, opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					int ret, insn_cnt = ARRAY_SIZE(insns);

					ret = bpf_prog_load(BPF_PROG_TYPE_TRACEPOINT, NULL, "GPL", insns, insn_cnt, &opts);

					return probe_fd(ret);

				}

				static int probe_kern_btf_enum64(int token_fd)

				{

					static const char strs[] = "\0enum64";

					__u32 types[] = {

						BTF_TYPE_ENC(1, BTF_INFO_ENC(BTF_KIND_ENUM64, 0, 0), 8),

					};

					return probe_fd(libbpf__load_raw_btf((char *)types, sizeof(types),

									     strs, sizeof(strs), token_fd));

				}

				static int probe_kern_arg_ctx_tag(int token_fd)

				{

					static const char strs[] = "\0a\0b\0arg:ctx\0";

					const __u32 types[] = {

						/* [1] INT */

						BTF_TYPE_INT_ENC(1 /* "a" */, BTF_INT_SIGNED, 0, 32, 4),

						/* [2] PTR -> VOID */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_PTR, 0, 0), 0),

						/* [3] FUNC_PROTO `int(void *a)` */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 1),

						BTF_PARAM_ENC(1 /* "a" */, 2),

						/* [4] FUNC 'a' -> FUNC_PROTO (main prog) */

						BTF_TYPE_ENC(1 /* "a" */, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 3),

						/* [5] FUNC_PROTO `int(void *b __arg_ctx)` */

						BTF_TYPE_ENC(0, BTF_INFO_ENC(BTF_KIND_FUNC_PROTO, 0, 1), 1),

						BTF_PARAM_ENC(3 /* "b" */, 2),

						/* [6] FUNC 'b' -> FUNC_PROTO (subprog) */

						BTF_TYPE_ENC(3 /* "b" */, BTF_INFO_ENC(BTF_KIND_FUNC, 0, BTF_FUNC_GLOBAL), 5),

						/* [7] DECL_TAG 'arg:ctx' -> func 'b' arg 'b' */

						BTF_TYPE_DECL_TAG_ENC(5 /* "arg:ctx" */, 6, 0),

					};

					const struct bpf_insn insns[] = {

						/* main prog */

						BPF_CALL_REL(+1),

						BPF_EXIT_INSN(),

						/* global subprog */

						BPF_EMIT_CALL(BPF_FUNC_get_func_ip), /* needs PTR_TO_CTX */

						BPF_EXIT_INSN(),

					};

					const struct bpf_func_info_min func_infos[] = {

						{ 0, 4 }, /* main prog -> FUNC 'a' */

						{ 2, 6 }, /* subprog -> FUNC 'b' */

					};

					LIBBPF_OPTS(bpf_prog_load_opts, opts,

						.token_fd = token_fd,

						.prog_flags = token_fd ? BPF_F_TOKEN_FD : 0,

					);

					int prog_fd, btf_fd, insn_cnt = ARRAY_SIZE(insns);

					btf_fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs), token_fd);

					if (btf_fd < 0)

						return 0;

					opts.prog_btf_fd = btf_fd;

					opts.func_info = &func_infos;

					opts.func_info_cnt = ARRAY_SIZE(func_infos);

					opts.func_info_rec_size = sizeof(func_infos[0]);

					prog_fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, "det_arg_ctx",

								"GPL", insns, insn_cnt, &opts);

					close(btf_fd);

					return probe_fd(prog_fd);

				}

				typedef int (*feature_probe_fn)(int /* token_fd */);

				static struct kern_feature_cache feature_cache;

				static struct kern_feature_desc {

					const char *desc;

					feature_probe_fn probe;

				} feature_probes[__FEAT_CNT] = {

					[FEAT_PROG_NAME] = {

						"BPF program name", probe_kern_prog_name,

					},

					[FEAT_GLOBAL_DATA] = {

						"global variables", probe_kern_global_data,

					},

					[FEAT_BTF] = {

						"minimal BTF", probe_kern_btf,

					},

					[FEAT_BTF_FUNC] = {

						"BTF functions", probe_kern_btf_func,

					},

					[FEAT_BTF_GLOBAL_FUNC] = {

						"BTF global function", probe_kern_btf_func_global,

					},

					[FEAT_BTF_DATASEC] = {

						"BTF data section and variable", probe_kern_btf_datasec,

					},

					[FEAT_ARRAY_MMAP] = {

						"ARRAY map mmap()", probe_kern_array_mmap,

					},

					[FEAT_EXP_ATTACH_TYPE] = {

						"BPF_PROG_LOAD expected_attach_type attribute",

						probe_kern_exp_attach_type,

					},

					[FEAT_PROBE_READ_KERN] = {

						"bpf_probe_read_kernel() helper", probe_kern_probe_read_kernel,

					},

					[FEAT_PROG_BIND_MAP] = {

						"BPF_PROG_BIND_MAP support", probe_prog_bind_map,

					},

					[FEAT_MODULE_BTF] = {

						"module BTF support", probe_module_btf,

					},

					[FEAT_BTF_FLOAT] = {

						"BTF_KIND_FLOAT support", probe_kern_btf_float,

					},

					[FEAT_PERF_LINK] = {

						"BPF perf link support", probe_perf_link,

					},

					[FEAT_BTF_DECL_TAG] = {

						"BTF_KIND_DECL_TAG support", probe_kern_btf_decl_tag,

					},

					[FEAT_BTF_TYPE_TAG] = {

						"BTF_KIND_TYPE_TAG support", probe_kern_btf_type_tag,

					},

					[FEAT_MEMCG_ACCOUNT] = {

						"memcg-based memory accounting", probe_memcg_account,

					},

					[FEAT_BPF_COOKIE] = {

						"BPF cookie support", probe_kern_bpf_cookie,

					},

					[FEAT_BTF_ENUM64] = {

						"BTF_KIND_ENUM64 support", probe_kern_btf_enum64,

					},

					[FEAT_SYSCALL_WRAPPER] = {

						"Kernel using syscall wrapper", probe_kern_syscall_wrapper,

					},

					[FEAT_UPROBE_MULTI_LINK] = {

						"BPF multi-uprobe link support", probe_uprobe_multi_link,

					},

					[FEAT_ARG_CTX_TAG] = {

						"kernel-side __arg_ctx tag", probe_kern_arg_ctx_tag,

					},

					[FEAT_BTF_QMARK_DATASEC] = {

						"BTF DATASEC names starting from '?'", probe_kern_btf_qmark_datasec,

					},

				};

				bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_id)

				{

					struct kern_feature_desc *feat = &feature_probes[feat_id];

					int ret;

					/* assume global feature cache, unless custom one is provided */

					if (!cache)

						cache = &feature_cache;

					if (READ_ONCE(cache->res[feat_id]) == FEAT_UNKNOWN) {

						ret = feat->probe(cache->token_fd);

						if (ret > 0) {

							WRITE_ONCE(cache->res[feat_id], FEAT_SUPPORTED);

						} else if (ret == 0) {

							WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);

						} else {

							pr_warn("Detection of kernel %s support failed: %s\n",

								feat->desc, errstr(ret));

							WRITE_ONCE(cache->res[feat_id], FEAT_MISSING);

						}

					}

					return READ_ONCE(cache->res[feat_id]) == FEAT_SUPPORTED;

				}
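Note on feat_supported(): the function above probes each kernel feature at most once per cache and stores a tri-state result; a probe that errors out is logged and then cached as missing. A minimal single-threaded sketch of that control flow follows. All demo_* names are hypothetical stand-ins, the probes are stubs rather than real kernel checks, and the READ_ONCE/WRITE_ONCE accessors and token_fd plumbing are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* All demo_* names are hypothetical; only the control flow mirrors feat_supported(). */
enum demo_feat_res { DEMO_FEAT_UNKNOWN = 0, DEMO_FEAT_SUPPORTED, DEMO_FEAT_MISSING };
enum demo_feat_id { DEMO_FEAT_A, DEMO_FEAT_B, DEMO_FEAT_C, DEMO_FEAT_CNT };

static int demo_probe_calls; /* how many times any probe actually ran */

static int demo_probe_yes(void) { demo_probe_calls++; return 1; }   /* supported */
static int demo_probe_no(void)  { demo_probe_calls++; return 0; }   /* missing */
static int demo_probe_err(void) { demo_probe_calls++; return -22; } /* probe failed */

static const struct demo_feat_desc {
	const char *desc;
	int (*probe)(void);
} demo_probes[DEMO_FEAT_CNT] = {
	[DEMO_FEAT_A] = { "feature A", demo_probe_yes },
	[DEMO_FEAT_B] = { "feature B", demo_probe_no },
	[DEMO_FEAT_C] = { "feature C", demo_probe_err },
};

/* Zero-initialized, so every slot starts as DEMO_FEAT_UNKNOWN. */
static enum demo_feat_res demo_cache[DEMO_FEAT_CNT];

static bool demo_feat_supported(enum demo_feat_id id)
{
	if (demo_cache[id] == DEMO_FEAT_UNKNOWN) {
		int ret = demo_probes[id].probe();

		if (ret > 0) {
			demo_cache[id] = DEMO_FEAT_SUPPORTED;
		} else {
			if (ret < 0)
				fprintf(stderr, "Detection of %s failed: %d\n",
					demo_probes[id].desc, ret);
			/* a failed probe is cached as missing, like a clean "no" */
			demo_cache[id] = DEMO_FEAT_MISSING;
		}
	}
	return demo_cache[id] == DEMO_FEAT_SUPPORTED;
}
```

Calling demo_feat_supported() twice for the same id runs the stub probe only once; the second call is answered from the cache.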

1207

src/gen_loader.c Normal file

File diff suppressed because it is too large

									
37

src/hashmap.c
												
@@ -12,6 +12,12 @@
 #include <linux/err.h>
 #include "hashmap.h"
+/* make sure libbpf doesn't use kernel-only integer typedefs */
+#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
+/* prevent accidental re-addition of reallocarray() */
+#pragma GCC poison reallocarray
 /* start with 4 buckets */
 #define HASHMAP_MIN_CAP_BITS 2

@@ -56,13 +62,20 @@ struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
 void hashmap__clear(struct hashmap *map)
 {
+	struct hashmap_entry *cur, *tmp;
+	size_t bkt;
+	hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
+		free(cur);
+	}
 	free(map->buckets);
+	map->buckets = NULL;
 	map->cap = map->cap_bits = map->sz = 0;
 }
 void hashmap__free(struct hashmap *map)
 {
-	if (!map)
+	if (IS_ERR_OR_NULL(map))
 		return;
 	hashmap__clear(map);

@@ -90,8 +103,7 @@ static int hashmap_grow(struct hashmap *map)
 	struct hashmap_entry **new_buckets;
 	struct hashmap_entry *cur, *tmp;
 	size_t new_cap_bits, new_cap;
-	size_t h;
-	int bkt;
+	size_t h, bkt;
 	new_cap_bits = map->cap_bits + 1;
 	if (new_cap_bits < HASHMAP_MIN_CAP_BITS)

@@ -116,7 +128,7 @@ static int hashmap_grow(struct hashmap *map)
 }
 static bool hashmap_find_entry(const struct hashmap *map,
-			       const void *key, size_t hash,
+			       const long key, size_t hash,
 			       struct hashmap_entry ***pprev,
 			       struct hashmap_entry **entry)
 {

@@ -139,18 +151,18 @@ static bool hashmap_find_entry(const struct hashmap *map,
 	return false;
 }
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value)
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value)
 {
 	struct hashmap_entry *entry;
 	size_t h;
 	int err;
 	if (old_key)
-		*old_key = NULL;
+		*old_key = 0;
 	if (old_value)
-		*old_value = NULL;
+		*old_value = 0;
 	h = hash_bits(map->hash_fn(key, map->ctx), map->cap_bits);
 	if (strategy != HASHMAP_APPEND &&

@@ -191,7 +203,7 @@ int hashmap__insert(struct hashmap *map, const void *key, void *value,
 	return 0;
 }
-bool hashmap__find(const struct hashmap *map, const void *key, void **value)
+bool hashmap_find(const struct hashmap *map, long key, long *value)
 {
 	struct hashmap_entry *entry;
 	size_t h;

@@ -205,8 +217,8 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value)
 	return true;
 }
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value)
+bool hashmap_delete(struct hashmap *map, long key,
+		    long *old_key, long *old_value)
 {
 	struct hashmap_entry **pprev, *entry;
 	size_t h;

@@ -226,4 +238,3 @@ bool hashmap__delete(struct hashmap *map, const void *key,
 	return true;
 }
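Note on the hashmap__free() change: the guard moves from a plain NULL check to IS_ERR_OR_NULL(), so callers can pass through a pointer that encodes an error code without crashing in free(). The sketch below illustrates that idea under the usual kernel-style convention that the top 4095 values of the address space encode negative errno values. The demo_* helpers and the DEMO_MAX_ERRNO bound are assumptions for illustration, not libbpf's own definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095 /* assumed bound, mirroring the kernel-style convention */

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True for NULL and for pointers in the top DEMO_MAX_ERRNO values of the address space. */
static inline bool demo_is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Free only real allocations; returns true if free() was actually called. */
static bool demo_free(void *p)
{
	if (demo_is_err_or_null(p))
		return false;
	free(p);
	return true;
}
```

With this guard a cleanup path can call the free routine unconditionally, whether the constructor returned a valid object, NULL, or an encoded error.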

									
151

src/hashmap.h
												
@@ -10,20 +10,62 @@
 #include <stdbool.h>
 #include <stddef.h>
-#include "libbpf_internal.h"
+#include <limits.h>
 static inline size_t hash_bits(size_t h, int bits)
 {
 	/* shuffle bits and return requested number of upper bits */
-	return (h * 11400714819323198485llu) >> (__WORDSIZE - bits);
+	if (bits == 0)
+		return 0;
+#if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
+	/* LP64 case */
+	return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
+#elif (__SIZEOF_SIZE_T__ <= __SIZEOF_LONG__)
+	return (h * 2654435769lu) >> (__SIZEOF_LONG__ * 8 - bits);
+#else
+#	error "Unsupported size_t size"
+#endif
 }
-typedef size_t (*hashmap_hash_fn)(const void *key, void *ctx);
-typedef bool (*hashmap_equal_fn)(const void *key1, const void *key2, void *ctx);
+/* generic C-string hashing function */
+static inline size_t str_hash(const char *s)
+{
+	size_t h = 0;
+	while (*s) {
+		h = h * 31 + *s;
+		s++;
+	}
+	return h;
+}
+typedef size_t (*hashmap_hash_fn)(long key, void *ctx);
+typedef bool (*hashmap_equal_fn)(long key1, long key2, void *ctx);
+/*
+ * Hashmap interface is polymorphic, keys and values could be either
+ * long-sized integers or pointers, this is achieved as follows:
+ * - interface functions that operate on keys and values are hidden
+ *   behind auxiliary macros, e.g. hashmap_insert <-> hashmap__insert;
+ * - these auxiliary macros cast the key and value parameters as
+ *   long or long *, so the user does not have to specify the casts explicitly;
+ * - for pointer parameters (e.g. old_key) the size of the pointed
+ *   type is verified by hashmap_cast_ptr using _Static_assert;
+ * - when iterating using hashmap__for_each_* forms
+ *   hasmap_entry->key should be used for integer keys and
+ *   hasmap_entry->pkey should be used for pointer keys,
+ *   same goes for values.
+ */
 struct hashmap_entry {
-	const void *key;
-	void *value;
+	union {
+		long key;
+		const void *pkey;
+	};
+	union {
+		long value;
+		void *pvalue;
+	};
 	struct hashmap_entry *next;
 };
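Note on hash_bits(): the function multiplies the hash by a fixed odd constant and keeps the requested number of upper bits, and the new `bits == 0` guard avoids shifting by the full word width, which is undefined behavior in C. The sketch below shows the 64-bit branch only, using a fixed-width uint64_t so the arithmetic is the same on every platform; demo_hash_bits is a hypothetical name and this is not a drop-in replacement for the header's function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Multiplicative hashing: scramble h with a 64-bit odd constant, then keep
 * the top `bits` bits as the bucket index. Assumes 0 <= bits <= 32 so the
 * result always fits in size_t.
 */
static inline size_t demo_hash_bits(uint64_t h, int bits)
{
	if (bits == 0)
		return 0; /* a shift by 64 would be undefined behavior */
	return (size_t)((h * 11400714819323198485ULL) >> (64 - bits));
}
```

With bits = 2 (the 4-bucket minimum capacity defined in hashmap.c) every result falls in the range 0..3, and with bits = 0 (an empty map) the function returns 0 without shifting.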

@@ -38,16 +80,6 @@ struct hashmap {
 	size_t sz;
 };
-#define HASHMAP_INIT(hash_fn, equal_fn, ctx) {	\
-	.hash_fn = (hash_fn),			\
-	.equal_fn = (equal_fn),			\
-	.ctx = (ctx),				\
-	.buckets = NULL,			\
-	.cap = 0,				\
-	.cap_bits = 0,				\
-	.sz = 0,				\
-}
 void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
 		   hashmap_equal_fn equal_fn, void *ctx);
 struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,

@@ -80,6 +112,13 @@ enum hashmap_insert_strategy {
 	HASHMAP_APPEND,
 };
+#define hashmap_cast_ptr(p) ({								\
+	_Static_assert((__builtin_constant_p((p)) ? (p) == NULL : 0) ||			\
+				sizeof(*(p)) == sizeof(long),				\
+		       #p " pointee should be a long-sized integer or a pointer");	\
+	(long *)(p);									\
+})
 /*
  * hashmap__insert() adds key/value entry w/ various semantics, depending on
  * provided strategy value. If a given key/value pair replaced already

@@ -87,42 +126,38 @@ enum hashmap_insert_strategy {
  * through old_key and old_value to allow calling code do proper memory
  * management.
  */
-int hashmap__insert(struct hashmap *map, const void *key, void *value,
-		    enum hashmap_insert_strategy strategy,
-		    const void **old_key, void **old_value);
+int hashmap_insert(struct hashmap *map, long key, long value,
+		   enum hashmap_insert_strategy strategy,
+		   long *old_key, long *old_value);
-static inline int hashmap__add(struct hashmap *map,
-			       const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_ADD, NULL, NULL);
-}
+#define hashmap__insert(map, key, value, strategy, old_key, old_value) \
+	hashmap_insert((map), (long)(key), (long)(value), (strategy),  \
+		       hashmap_cast_ptr(old_key),		       \
+		       hashmap_cast_ptr(old_value))
-static inline int hashmap__set(struct hashmap *map,
-			       const void *key, void *value,
-			       const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_SET,
-			       old_key, old_value);
-}
+#define hashmap__add(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_ADD, NULL, NULL)
-static inline int hashmap__update(struct hashmap *map,
-				  const void *key, void *value,
-				  const void **old_key, void **old_value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_UPDATE,
-			       old_key, old_value);
-}
+#define hashmap__set(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_SET, (old_key), (old_value))
-static inline int hashmap__append(struct hashmap *map,
-				  const void *key, void *value)
-{
-	return hashmap__insert(map, key, value, HASHMAP_APPEND, NULL, NULL);
-}
+#define hashmap__update(map, key, value, old_key, old_value) \
+	hashmap__insert((map), (key), (value), HASHMAP_UPDATE, (old_key), (old_value))
-bool hashmap__delete(struct hashmap *map, const void *key,
-		     const void **old_key, void **old_value);
+#define hashmap__append(map, key, value) \
+	hashmap__insert((map), (key), (value), HASHMAP_APPEND, NULL, NULL)
-bool hashmap__find(const struct hashmap *map, const void *key, void **value);
+bool hashmap_delete(struct hashmap *map, long key, long *old_key, long *old_value);
+#define hashmap__delete(map, key, old_key, old_value)		       \
+	hashmap_delete((map), (long)(key),			       \
+		       hashmap_cast_ptr(old_key),		       \
+		       hashmap_cast_ptr(old_value))
+bool hashmap_find(const struct hashmap *map, long key, long *value);
+#define hashmap__find(map, key, value) \
+	hashmap_find((map), (long)(key), hashmap_cast_ptr(value))
 /*
  * hashmap__for_each_entry - iterate over all entries in hashmap

@@ -131,8 +166,8 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value);
  * @bkt: integer used as a bucket loop cursor
  */
 #define hashmap__for_each_entry(map, cur, bkt)				    \
-	for (bkt = 0; bkt < map->cap; bkt++)				    \
-		for (cur = map->buckets[bkt]; cur; cur = cur->next)
+	for (bkt = 0; bkt < (map)->cap; bkt++)				    \
+		for (cur = (map)->buckets[bkt]; cur; cur = cur->next)
 /*
  * hashmap__for_each_entry_safe - iterate over all entries in hashmap, safe

@@ -143,8 +178,8 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value);
  * @bkt: integer used as a bucket loop cursor
  */
 #define hashmap__for_each_entry_safe(map, cur, tmp, bkt)		    \
-	for (bkt = 0; bkt < map->cap; bkt++)				    \
-		for (cur = map->buckets[bkt];				    \
+	for (bkt = 0; bkt < (map)->cap; bkt++)				    \
+		for (cur = (map)->buckets[bkt];				    \
 		     cur && ({tmp = cur->next; true; });		    \
 		     cur = tmp)

@@ -155,19 +190,19 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value);
  * @key: key to iterate entries for
  */
 #define hashmap__for_each_key_entry(map, cur, _key)			    \
-	for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
-					     map->cap_bits);		    \
-		     map->buckets ? map->buckets[bkt] : NULL; });	    \
+	for (cur = (map)->buckets					    \
+		     ? (map)->buckets[hash_bits((map)->hash_fn((_key), (map)->ctx), (map)->cap_bits)] \
+		     : NULL;						    \
 	     cur;							    \
 	     cur = cur->next)						    \
-		if (map->equal_fn(cur->key, (_key), map->ctx))
+		if ((map)->equal_fn(cur->key, (_key), (map)->ctx))
 #define hashmap__for_each_key_entry_safe(map, cur, tmp, _key)		    \
-	for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
-					     map->cap_bits);		    \
-		     cur = map->buckets ? map->buckets[bkt] : NULL; });	    \
+	for (cur = (map)->buckets					    \
+		     ? (map)->buckets[hash_bits((map)->hash_fn((_key), (map)->ctx), (map)->cap_bits)] \
+		     : NULL;						    \
 	     cur && ({ tmp = cur->next; true; });			    \
 	     cur = tmp)							    \
-		if (map->equal_fn(cur->key, (_key), map->ctx))
+		if ((map)->equal_fn(cur->key, (_key), (map)->ctx))
 #endif /* __LIBBPF_HASHMAP_H */
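Note on the polymorphic interface: the header comment in this diff describes keys and values stored as long-sized slots, with macros that cast arguments so callers can pass either integers or pointers, and a _Static_assert that rejects out-parameters of the wrong size. The sketch below reduces that to a single entry so the mechanics are visible. The demo_* names are hypothetical, the NULL handling of hashmap_cast_ptr is omitted, and the code assumes GCC or Clang (statement expressions) and a platform where sizeof(long) == sizeof(void *).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-slot "map": each slot is readable as a long or a pointer. */
struct demo_entry {
	union {
		long key;
		const void *pkey;
	};
	union {
		long value;
		void *pvalue;
	};
};

/* Typed core: these functions only ever see longs. */
static void demo_entry_store(struct demo_entry *e, long key, long value)
{
	e->key = key;
	e->value = value;
}

static void demo_entry_load(const struct demo_entry *e, long *value)
{
	*value = e->value;
}

/* Reject, at compile time, out-parameters whose pointee is not long-sized. */
#define demo_cast_ptr(p) ({						\
	_Static_assert(sizeof(*(p)) == sizeof(long),			\
		       #p " pointee should be long-sized");		\
	(long *)(p);							\
})

/* Front-end macros: callers pass integers or pointers without explicit casts. */
#define demo_entry_set(e, key, value) \
	demo_entry_store((e), (long)(key), (long)(value))
#define demo_entry_get(e, value) \
	demo_entry_load((e), demo_cast_ptr(value))
```

Passing a pointer to a char or a short as the out-parameter of demo_entry_get() fails to compile, which is the property the size check is meant to provide.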

15125

src/libbpf.c

File diff suppressed because it is too large

1999

src/libbpf.h

File diff suppressed because it is too large

Compare commits

2435 Commits v0.0.4 ... netdata_pa

1 .gitattributes vendored Normal file
108617 .github/actions/build-selftests/vmlinux.h vendored Normal file
16 .github/actions/debian/action.yml vendored Normal file
23 .github/actions/setup/action.yml vendored Normal file
92 .github/workflows/build.yml vendored Normal file
40 .github/workflows/cifuzz.yml vendored Normal file
52 .github/workflows/codeql.yml vendored Normal file
32 .github/workflows/coverity.yml vendored Normal file
19 .github/workflows/lint.yml vendored Normal file
31 .github/workflows/ondemand.yml vendored Normal file
36 .github/workflows/test.yml vendored Normal file
117 .github/workflows/vmtest.yml vendored Normal file
22 .mailmap Normal file
26 .readthedocs.yaml Normal file
122 .travis.yml
1 BPF-CHECKPOINT-COMMIT Normal file
2 CHECKPOINT-COMMIT
1 LICENSE Normal file
32 LICENSE.BSD-2-Clause Normal file
503 LICENSE.LGPL-2.1 Normal file
189 README.md
281 SYNC.md Normal file
BIN assets/libbpf-logo-compact-darkbg.png Normal file
BIN assets/libbpf-logo-compact-mono.png Normal file
BIN assets/libbpf-logo-compact.png Normal file
BIN assets/libbpf-logo-sideways-darkbg.png Normal file
BIN assets/libbpf-logo-sideways-mono.png Normal file
BIN assets/libbpf-logo-sideways.png Normal file
BIN assets/libbpf-logo-sparse-darkbg.png Normal file
BIN assets/libbpf-logo-sparse-mono.png Normal file
BIN assets/libbpf-logo-sparse.png Normal file
14 ci/build-in-docker.sh Executable file
0 ci/diffs/.keep Normal file
85 ci/diffs/0001-selftests-bpf-set-test-path-for-token-obj_priv_impli.patch Normal file
69 ci/diffs/4000-selftests-bpf-Fix-tests-after-fields-reorder-in-stru.patch Normal file
71 ci/diffs/4001-selftests-bpf-Fix-verifier_bpf_fastcall-test.patch Normal file
71 ci/diffs/4002-selftests-bpf-Fix-verifier_private_stack-test-failur.patch Normal file
95 ci/managers/debian.sh Executable file
15 ci/managers/test_compile.sh Executable file
0 travis-ci/managers/travis_wait.bash → ci/managers/travis_wait.bash
24 ci/managers/ubuntu.sh Executable file
15 ci/vmtest/configs/DENYLIST Normal file
13 ci/vmtest/configs/DENYLIST-latest Normal file
17 ci/vmtest/configs/DENYLIST-latest.s390x Normal file
37 ci/vmtest/configs/run-vmtest.env Normal file
2 docs/.gitignore vendored Normal file
93 docs/api.rst Normal file
41 docs/conf.py Normal file
33 docs/index.rst Normal file
37 docs/libbpf_build.rst Normal file
87 src/README.rst → docs/libbpf_naming_convention.rst
236 docs/libbpf_overview.rst Normal file
235 docs/program_types.rst Normal file
9 docs/sphinx/Makefile Normal file
277 docs/sphinx/doxygen/Doxyfile Normal file
2 docs/sphinx/requirements.txt Normal file
23 fuzz/bpf-object-fuzzer.c Normal file
BIN fuzz/bpf-object-fuzzer_seed_corpus.zip Normal file
33 include/linux/filter.h
2 include/linux/kernel.h
9 include/linux/list.h
2 include/linux/types.h
20 include/tools/libc_compat.h
4833 include/uapi/linux/bpf.h
90 include/uapi/linux/btf.h
1022 include/uapi/linux/if_link.h
99 include/uapi/linux/if_xdp.h
230 include/uapi/linux/netdev.h Normal file
1475 include/uapi/linux/perf_event.h Normal file
565 include/uapi/linux/pkt_cls.h Normal file
1055 include/uapi/linux/pkt_sched.h Normal file
95 meson.build
82 scripts/build-fuzzers.sh Executable file
18 scripts/check-reallocarray.sh
105 scripts/coverity.sh Executable file
37 scripts/mailmap-update.sh Executable file
417 scripts/sync-kernel.sh
2 src/.gitignore vendored

173

src/Makefile
