libbpf

mirror of https://github.com/netdata/libbpf.git synced 2026-06-14 18:49:08 +08:00

Author	SHA1	Message	Date
Stanislav Fomichev	95134be22e	xsk: Add TX timestamp and TX checksum offload support This change actually defines the (initial) metadata layout that should be used by AF_XDP userspace (xsk_tx_metadata). The first field is flags which requests appropriate offloads, followed by the offload-specific fields. The supported per-device offloads are exported via netlink (new xsk-flags). The offloads themselves are still implemented in a bit of a framework-y fashion that's left from my initial kfunc attempt. I'm introducing new xsk_tx_metadata_ops which drivers are supposed to implement. The drivers are also supposed to call xsk_tx_metadata_request/xsk_tx_metadata_complete in the right places. Since xsk_tx_metadata_{request,_complete} are static inline, we don't incur any extra overhead doing indirect calls. The benefit of this scheme is as follows: - keeps all metadata layout parsing away from driver code - makes it easy to grep and see which drivers implement what - don't need any extra flags to maintain to keep track of what offloads are implemented; if the callback is implemented - the offload is supported (used by netlink reporting code) Two offloads are defined right now: 1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset 2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata area upon completion (tx_timestamp field) XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass metadata pointer). The struct is forward-compatible and can be extended in the future by appending more fields. Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20231127190319.1190813-3-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Stanislav Fomichev	2f95d28664	xsk: Support tx_metadata_len For zerocopy mode, tx_desc->addr can point to an arbitrary offset and carry some TX metadata in the headroom. For copy mode, there is no way currently to populate skb metadata. Introduce new tx_metadata_len umem config option that indicates how many bytes to treat as metadata. Metadata bytes come prior to tx_desc address (same as in RX case). The size of the metadata has mostly the same constraints as XDP: - less than 256 bytes - 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte timestamp in the completion) - non-zero This data is not interpreted in any way right now. Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20231127190319.1190813-2-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2024-01-04 19:15:17 -05:00
Jiri Olsa	afb384f685	bpf: Add link_info support for uprobe multi link Adding support to get uprobe_link details through bpf_link_info interface. Adding new struct uprobe_multi to struct bpf_link_info to carry the uprobe_multi link details. The uprobe_multi.count is passed from user space to denote size of array fields (offsets/ref_ctr_offsets/cookies). The actual array size is stored back to uprobe_multi.count (allowing user to find out the actual array size) and array fields are populated up to the user passed size. All the non-array fields (path/count/flags/pid) are always set. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20231125193130.834322-4-jolsa@kernel.org	2024-01-04 19:15:17 -05:00
Jiri Olsa	467dd7bda5	libbpf: Add st_type argument to elf_resolve_syms_offsets function We need to get offsets for static variables in following changes, so making elf_resolve_syms_offsets to take st_type value as argument and passing it to elf_sym_iter_new. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231125193130.834322-2-jolsa@kernel.org	2024-01-04 19:15:17 -05:00
Eduard Zingerman	9c794e5ab4	libbpf: Start v1.4 development cycle Bump libbpf.map to v1.4.0 to start a new libbpf version cycle. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231123000439.12025-1-eddyz87@gmail.com	2024-01-04 19:15:17 -05:00
Jakub Kicinski	eb40a93a10	tools: ynl: add sample for getting page-pool information Regenerate the tools/ code after netdev spec changes. Add sample to query page-pool info in a concise fashion: $ ./page-pool eth0[2] page pools: 10 (zombies: 0) refs: 41984 bytes: 171966464 (refs: 0 bytes: 0) recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201) Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-01-04 19:15:17 -05:00
Eduard Zingerman	1baa3e2355	ci: move /dev/kvm permissions setup from to actions/vmtest.yml The vmtest action is used by several workflows: test, pahole, ondemand. At the same time, vmtest action requires valid access rights to /dev/kvm and is the only action that uses it. This commit moves /dev/kvm permissions setup from test workflow to vmtest action, in order to make sure that setup logic is shared by all workflows that run vmtest. Should fix CI failures like [1]. [1] https://github.com/libbpf/libbpf/actions/runs/7104762048/job/19340484589 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-12-13 15:50:08 -05:00
Andrii Nakryiko	1b2ae67c1d	ci: custom patch to patch out BPF_F_TEST_REG_INVARIANTS flag Without needing to modify tons of BPF selftests file, make sure we don't pass BPF_F_TEST_REG_INVARIANTS to kernel, to make BPF selftests work on 4.9 and 5.5 kernels. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-12-05 12:51:08 -05:00
Andrii Nakryiko	20c0a9e3d7	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 155addf0814a92d08fce26a11b27e3315cdba977 Checkpoint bpf-next commit: 750011e239a50873251c16207b0fe78eabf8577e Baseline bpf commit: 83b9dda8afa4e968d9cce253f390b01c0612a2a5 Checkpoint bpf commit: bc4fbf022c68967cb49b2b820b465cf90de974b8 Andrii Nakryiko (2): bpf: add register bounds sanity checks and sanitization bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS Jordan Rome (1): bpf: Add crosstask check to __bpf_get_stack include/uapi/linux/bpf.h | 6 ++++++ 1 file changed, 6 insertions(+) Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	b88b3ac09d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	96ed1c508f	bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS Rename verifier internal flag BPF_F_TEST_SANITY_STRICT to more neutral BPF_F_TEST_REG_INVARIANTS. This is a follow up to [0]. A few selftests and veristat need to be adjusted in the same patch as well. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231112010609.848406-5-andrii@kernel.org/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231117171404.225508-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-22 16:20:56 -05:00
Andrii Nakryiko	7ccc41c138	bpf: add register bounds sanity checks and sanitization Add simple sanity checks that validate well-formed ranges (min <= max) across u64, s64, u32, and s32 ranges. Also for cases when the value is constant (either 64-bit or 32-bit), we validate that ranges and tnums are in agreement. These bounds checks are performed at the end of BPF_ALU/BPF_ALU64 operations, on conditional jumps, and for LDX instructions (where subreg zero/sign extension is probably the most important to check). This covers most of the interesting cases. Also, we validate the sanity of the return register when manually adjusting it for some special helpers. By default, sanity violation will trigger a warning in verifier log and resetting register bounds to "unbounded" ones. But to aid development and debugging, BPF_F_TEST_SANITY_STRICT flag is added, which will trigger hard failure of verification with -EFAULT on register bounds violations. This allows selftests to catch such issues. veristat will also gain a CLI option to enable this behavior. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20231112010609.848406-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-22 16:20:56 -05:00
Jordan Rome	785a079966	bpf: Add crosstask check to __bpf_get_stack Currently get_perf_callchain only supports user stack walking for the current task. Passing the correct crosstask param will return 0 frames if the task passed to __bpf_get_stack isn't the current one instead of a single incorrect frame/address. This change passes the correct crosstask param but also does a preemptive check in __bpf_get_stack if the task is current and returns -EOPNOTSUPP if it is not. This issue was found using bpf_get_task_stack inside a BPF iterator ("iter/task"), which iterates over all tasks. bpf_get_task_stack works fine for fetching kernel stacks but because get_perf_callchain relies on the caller to know if the requested task is the current one (via crosstask) it was failing in a confusing way. It might be possible to get user stacks for all tasks utilizing something like access_process_vm but that requires the bpf program calling bpf_get_task_stack to be sleepable and would therefore be a breaking change. Fixes: fa28dcb82a38 ("bpf: Introduce helper bpf_get_task_stack()") Signed-off-by: Jordan Rome <jordalgo@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231108112334.3433136-1-jordalgo@meta.com	2023-11-22 16:20:56 -05:00
Eduard Zingerman	a6b990991c	ci: disable sockopt selftest for 5.5 kernel The following 'sockopt' selftests fail on libbpf CI for kernel 5.5: - sockopt/getsockopt: read ctx->optlen:FAIL - sockopt/getsockopt: support smaller ctx->optlen:FAIL - sockopt/setsockopt: read ctx->level:FAIL - sockopt/setsockopt: read ctx->optname:FAIL - sockopt/setsockopt: read ctx->optlen:FAIL - sockopt/setsockopt: ctx->optlen == -1 is ok:FAIL Examples of failing CI runs: - https://github.com/libbpf/libbpf/actions/runs/6961182067 - https://github.com/libbpf/libbpf/actions/runs/6961088131 The failures are strange as all tests were added quite a while ago (Jun 27 2019) by commit: 9ec8a4c9489d ("selftests/bpf: add sockopt test") But seem to be unrelated to libbpf. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-22 16:20:43 -05:00
Eduard Zingerman	4161e1f41d	ci: disable a number of selftest causing CI for LATEST kernel All tests disabled in this commit pass on main kernel CI and fail or flip/flop on libbpf CI. Failures do not seem to be related to libbpf. It appears that common theme for all failing tests is that hardware perf events are not delivered as expected on github CI worker machines. Examples of failed CI runs: - https://github.com/libbpf/libbpf/actions/runs/6961182067 - https://github.com/libbpf/libbpf/actions/runs/6961088131 Fails with the following log: test_send_signal_common:FAIL:incorrect result \ unexpected incorrect result: actual 48 != expected 50 Test mode of operation: - fork' - child: - install handler for SIGUSR1; - send ready message to parent; - wait for SIGUSR1 in busy loop; - send message '2' (50) to parent if SIGUSR1 occured; - send message '0' (48) to parent if no SIGUSR1 occured. - parent: - wait for ready message from child; - install perf_event or tracepoint bpf program that uses bpf_send_signal() to send SIGUSR1; - wait for message '0' or '2' from child, '2' is expected for test success. It appears that perf event that should be triggered by parent never happens, thus message 48 is received by parent and test fails. Fails with the following log: test_and_reset_skel:FAIL:found_vm_exec \ unexpected found_vm_exec: actual 0 != expected 1 Such log is printed if variables set from BPF program are not set after some timeout. The program that should set the variable is SEC("perf_event") int handle_pe(void), it appears that it is never run. Fails with the following log: pe_subtest:FAIL:pe_res1 unexpected pe_res1: actual 0 != expected 1048576 Variable pe_res1 should be triggered by program SEC("perf_event") int handle_pe(struct pt_regs *ctx), it appears that it is never run. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-22 16:20:43 -05:00
Eduard Zingerman	93f360cf4b	ci: don't set /dev/kvm permissions when CI user is root s390 tests are executed on selfhosted runner using root user, avoid setting /dev/kvm permissions in such case. This should fix CI failures like [0]. (Still necessary for x86 tests executed on standard github runners). [0] https://github.com/libbpf/libbpf/actions/runs/6898545987/job/18768732980?pr=752 Fixes: 168630f852 ("ci: give /dev/kvm 0666 permissions inside CI runner") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-17 15:36:52 -05:00
Eduard Zingerman	5ff0102329	ci: use config.vm for kernel config when present Recent kernel commit [0] changed selftests config snippets structure by extracting VM specific options to the file 'config.vm'. This file has to be used in .github/actions/vmtest/action.yml at step 'Prepare to build BPF selftests', otherwise drivers necessary for e.g. root file system access are not compiled into the kernel, leading to CI failures like [1]. [0] b0cf0dcde8ca ("selftests/bpf: Consolidate VIRTIO/9P configs in config.vm file") [1] https://github.com/libbpf/libbpf/actions/runs/6830439839/job/18578379328?pr=747 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-16 20:25:07 -05:00
Andrii Nakryiko	0c54691bae	ci: apply temporary patch to make bpf-next build Apply fe69a1b1b6ed ("selftests: bpf: xskxceiver: ksft_print_msg: fix format type error") to make bpf-next build. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-11-13 21:51:02 -05:00
Eduard Zingerman	168630f852	ci: give /dev/kvm 0666 permissions inside CI runner Starting recently libbpf CI runs started failing with the following error: ##[group]vm_init - Starting virtual machine... Starting VM with 4 CPUs... INFO: /dev/kvm exists KVM acceleration can be used Could not access KVM kernel module: Permission denied qemu-system-x86_64: failed to initialize KVM: Permission denied ##[error]Process completed with exit code 2. E.g. see here [0]. The error happens because CI user has not enough rights to access /dev/kvm. On a regular machine the solution would be to add user to group 'kvm', however that would require a re-login, which is cumbersome to achieve in CI setting. Instead, use a recipe described in [1] to make udev set 0666 access permissions for /dev/kvm. [0] https://github.com/libbpf/libbpf/actions/runs/6819530119/job/18547589967?pr=746 [1] https://stackoverflow.com/questions/37300811/android-studio-dev-kvm-device-permission-denied/61984745#61984745 Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Eduard Zingerman	5d4237d52d	ci: regenerate vmlinux.h Regenerate latest vmlinux.h for old kernel CI tests. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Eduard Zingerman	fa0e866373	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 0e133a13370389d3894891eafe54fec2c44ad735 Checkpoint bpf-next commit: e80742d917492f10926b46b0caca050c6c9231d6 Baseline bpf commit: 8f8abb863fa5a4cc18955c6a0e17af0ded3e4a76 Checkpoint bpf commit: 83b9dda8afa4e968d9cce253f390b01c0612a2a5 Daniel Borkmann (3): netkit, bpf: Add bpf programmable net device tools: Sync if_link uapi header libbpf: Add link-based API for netkit Yonghong Song (2): libbpf: Fix potential uninitialized tail padding with LIBBPF_OPTS_RESET bpf: Use named fields for certain bpf uapi structs include/uapi/linux/bpf.h | 37 +++++---- include/uapi/linux/if_link.h | 141 +++++++++++++++++++++++++++++++++++ src/bpf.c | 16 ++++ src/bpf.h | 5 ++ src/libbpf.c | 39 ++++++++++ src/libbpf.h | 15 ++++ src/libbpf.map | 1 + src/libbpf_common.h | 13 ++-- 8 files changed, 246 insertions(+), 21 deletions(-) Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>	2023-11-13 18:21:02 -08:00
Yonghong Song	0fa5ff4f54	bpf: Use named fields for certain bpf uapi structs Martin and Vadim reported a verifier failure with bpf_dynptr usage. The issue is mentioned but Vadim worked around the issue with a source change ([1]). The following describes what the issue is and why there is a verification failure. int BPF_PROG(skb_crypto_setup) { struct bpf_dynptr algo, key; ... bpf_dynptr_from_mem(..., ..., 0, &algo); ... } The bpf program is using vmlinux.h, so we have the following definition in vmlinux.h: struct bpf_dynptr { long: 64; long: 64; }; Note that in uapi header bpf.h, we have struct bpf_dynptr { long: 64; long: 64; } __attribute__((aligned(8))); So we lost alignment information for struct bpf_dynptr by using vmlinux.h. Let us take a look at a simple program below: $ cat align.c typedef unsigned long long __u64; struct bpf_dynptr_no_align { __u64 :64; __u64 :64; }; struct bpf_dynptr_yes_align { __u64 :64; __u64 :64; } __attribute__((aligned(8))); void bar(void *, void *); int foo() { struct bpf_dynptr_no_align a; struct bpf_dynptr_yes_align b; bar(&a, &b); return 0; } $ clang --target=bpf -O2 -S -emit-llvm align.c Look at the generated IR file align.ll: ... %a = alloca %struct.bpf_dynptr_no_align, align 1 %b = alloca %struct.bpf_dynptr_yes_align, align 8 ... The compiler dictates the alignment for struct bpf_dynptr_no_align is 1 and the alignment for struct bpf_dynptr_yes_align is 8. So theoretically the compiler could allocate variable %a with alignment 1, although in reality the compiler may choose a different alignment by considering other local variables. In [1], the verification failure happens because variable 'algo' is allocated on the stack with alignment 4 (fp-28). But the verifier wants its alignment to be 8. To fix the issue, the RFC patch ([2]) tried to add '__attribute__((aligned(8)))' to struct bpf_dynptr plus other similar structs. 
Andrii suggested that we could directly modify uapi struct with named fields like struct 'bpf_iter_num': struct bpf_iter_num { /* opaque iterator state; having __u64 here allows to preserve correct * alignment requirements in vmlinux.h, generated from BTF */ __u64 __opaque[1]; } __attribute__((aligned(8))); Indeed, adding named fields for those affected structs in this patch can preserve alignment when bpf program references them in vmlinux.h. With this patch, the verification failure in [1] can also be resolved. [1] https://lore.kernel.org/bpf/1b100f73-7625-4c1f-3ae5-50ecf84d3ff0@linux.dev/ [2] https://lore.kernel.org/bpf/20231103055218.2395034-1-yonghong.song@linux.dev/ Cc: Vadim Fedorenko <vadfed@meta.com> Cc: Martin KaFai Lau <martin.lau@linux.dev> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231104024900.1539182-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-11-10 13:27:01 -08:00
Yonghong Song	2d5df9f626	libbpf: Fix potential uninitialized tail padding with LIBBPF_OPTS_RESET Martin reported that libbpf complains about non-zero-value tail padding with the LIBBPF_OPTS_RESET macro if struct bpf_netkit_opts is modified to have a 4-byte tail padding. This only happens with the clang compiler. The command line is: ./test_progs -t tc_netkit_multi_links Martin and I did some investigation and found this is indeed the case; the following are the investigation details. Clang: clang version 18.0.0 <I tried clang15/16/17 and they all have similar results> tools/lib/bpf/libbpf_common.h: #define LIBBPF_OPTS_RESET(NAME, ...) \ do { \ memset(&NAME, 0, sizeof(NAME)); \ NAME = (typeof(NAME)) { \ .sz = sizeof(NAME), \ __VA_ARGS__ \ }; \ } while (0) #endif tools/lib/bpf/libbpf.h: struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; size_t :0; }; #define bpf_netkit_opts__last_field expected_revision In the above struct bpf_netkit_opts, there is no tail padding. prog_tests/tc_netkit.c: static void serial_test_tc_netkit_multi_links_target(int mode, int target) { ... LIBBPF_OPTS(bpf_netkit_opts, optl); ... LIBBPF_OPTS_RESET(optl, .flags = BPF_F_BEFORE, .relative_fd = bpf_program__fd(skel->progs.tc1), ); ... } Let us make the following source change; note that we have a 4-byte tail padding now. 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 6cd9c501624f..0dd83910ae9a 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -803,13 +803,13 @@ bpf_program__attach_tcx(const struct bpf_program prog, int ifindex, struct bpf_netkit_opts { /* size of this struct, for forward/backward compatibility */ size_t sz; - __u32 flags; __u32 relative_fd; __u32 relative_id; __u64 expected_revision; + __u32 flags; size_t :0; }; -#define bpf_netkit_opts__last_field expected_revision +#define bpf_netkit_opts__last_field flags The clang 18 generated asm code looks like below: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d 7d 98 leaq -0x68(%rbp), %rdi 55e7: 31 f6 xorl %esi, %esi 55e9: ba 20 00 00 00 movl $0x20, %edx 55ee: e8 00 00 00 00 callq 0x55f3 <serial_test_tc_netkit_multi_links_target+0x18d3> 55f3: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 55fe: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5605: 48 8b 78 18 movq 0x18(%rax), %rdi 5609: e8 00 00 00 00 callq 0x560e <serial_test_tc_netkit_multi_links_target+0x18ee> 560e: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5614: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 561e: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5633: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563a: 48 89 45 98 movq %rax, -0x68(%rbp) 563e: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5645: 48 89 45 a0 movq %rax, -0x60(%rbp) 5649: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5650: 48 89 45 a8 movq %rax, -0x58(%rbp) 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); At -O0 level, the clang compiler creates an intermediate copy. 
We have below to store 'flags' with 4-byte store and leave another 4 byte in the same 8-byte-aligned storage undefined, 5629: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) and later we store 8-byte to the original zero'ed buffer 5654: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565b: 48 89 45 b0 movq %rax, -0x50(%rbp) This caused a problem as the 4-byte value at [%rbp-0x2dc, %rbp-0x2e0) may be garbage. gcc (gcc 11.4) does not have this issue as it does zeroing struct first before doing assignments: ; LIBBPF_OPTS_RESET(optl, 50fd: 48 8d 85 40 fc ff ff leaq -0x3c0(%rbp), %rax 5104: ba 20 00 00 00 movl $0x20, %edx 5109: be 00 00 00 00 movl $0x0, %esi 510e: 48 89 c7 movq %rax, %rdi 5111: e8 00 00 00 00 callq 0x5116 <serial_test_tc_netkit_multi_links_target+0x1522> 5116: 48 8b 45 f0 movq -0x10(%rbp), %rax 511a: 48 8b 40 18 movq 0x18(%rax), %rax 511e: 48 89 c7 movq %rax, %rdi 5121: e8 00 00 00 00 callq 0x5126 <serial_test_tc_netkit_multi_links_target+0x1532> 5126: 48 c7 85 40 fc ff ff 00 00 00 00 movq $0x0, -0x3c0(%rbp) 5131: 48 c7 85 48 fc ff ff 00 00 00 00 movq $0x0, -0x3b8(%rbp) 513c: 48 c7 85 50 fc ff ff 00 00 00 00 movq $0x0, -0x3b0(%rbp) 5147: 48 c7 85 58 fc ff ff 00 00 00 00 movq $0x0, -0x3a8(%rbp) 5152: 48 c7 85 40 fc ff ff 20 00 00 00 movq $0x20, -0x3c0(%rbp) 515d: 89 85 48 fc ff ff movl %eax, -0x3b8(%rbp) 5163: c7 85 58 fc ff ff 08 00 00 00 movl $0x8, -0x3a8(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); It is not clear how to resolve the compiler code generation as the compiler generates correct code w.r.t. how to handle unnamed padding in C standard. So this patch changed LIBBPF_OPTS_RESET macro to avoid uninitialized tail padding. We already knows LIBBPF_OPTS macro works on both gcc and clang, even with tail padding. So LIBBPF_OPTS_RESET is changed to be a LIBBPF_OPTS followed by a memcpy(), thus avoiding uninitialized tail padding. 
The below is asm code generated with this patch and with clang compiler: ; LIBBPF_OPTS_RESET(optl, 55e3: 48 8d bd 10 fd ff ff leaq -0x2f0(%rbp), %rdi 55ea: 31 f6 xorl %esi, %esi 55ec: ba 20 00 00 00 movl $0x20, %edx 55f1: e8 00 00 00 00 callq 0x55f6 <serial_test_tc_netkit_multi_links_target+0x18d6> 55f6: 48 c7 85 10 fd ff ff 20 00 00 00 movq $0x20, -0x2f0(%rbp) 5601: 48 8b 85 68 ff ff ff movq -0x98(%rbp), %rax 5608: 48 8b 78 18 movq 0x18(%rax), %rdi 560c: e8 00 00 00 00 callq 0x5611 <serial_test_tc_netkit_multi_links_target+0x18f1> 5611: 89 85 18 fd ff ff movl %eax, -0x2e8(%rbp) 5617: c7 85 1c fd ff ff 00 00 00 00 movl $0x0, -0x2e4(%rbp) 5621: 48 c7 85 20 fd ff ff 00 00 00 00 movq $0x0, -0x2e0(%rbp) 562c: c7 85 28 fd ff ff 08 00 00 00 movl $0x8, -0x2d8(%rbp) 5636: 48 8b 85 10 fd ff ff movq -0x2f0(%rbp), %rax 563d: 48 89 45 98 movq %rax, -0x68(%rbp) 5641: 48 8b 85 18 fd ff ff movq -0x2e8(%rbp), %rax 5648: 48 89 45 a0 movq %rax, -0x60(%rbp) 564c: 48 8b 85 20 fd ff ff movq -0x2e0(%rbp), %rax 5653: 48 89 45 a8 movq %rax, -0x58(%rbp) 5657: 48 8b 85 28 fd ff ff movq -0x2d8(%rbp), %rax 565e: 48 89 45 b0 movq %rax, -0x50(%rbp) ; link = bpf_program__attach_netkit(skel->progs.tc2, ifindex, &optl); In the above code, a temporary buffer is zeroed and then has proper value assigned. Finally, values in temporary buffer are copied to the original variable buffer, hence tail padding is guaranteed to be 0. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/bpf/20231107201511.2548645-1-yonghong.song@linux.dev	2023-11-10 13:27:01 -08:00
Daniel Borkmann	2cb0236318	libbpf: Add link-based API for netkit This adds bpf_program__attach_netkit() API to libbpf. Overall it is very similar to tcx. The API looks as follows: LIBBPF_API struct bpf_link * bpf_program__attach_netkit(const struct bpf_program *prog, int ifindex, const struct bpf_netkit_opts *opts); The struct bpf_netkit_opts is done in similar way as struct bpf_tcx_opts for supporting bpf_mprog control parameters. The attach location for the primary and peer device is derived from the program section "netkit/primary" and "netkit/peer", respectively. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Daniel Borkmann	cc7f085286	tools: Sync if_link uapi header Sync if_link uapi header to the latest version as we need the refresher in tooling for netkit device. Given it's been a while since the last sync and the diff is fairly big, it has been done as its own commit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20231024214904.29825-3-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Daniel Borkmann	62b1e4905b	netkit, bpf: Add bpf programmable net device This work adds a new, minimal BPF-programmable device called "netkit" (former PoC code-name "meta") we recently presented at LSF/MM/BPF. The core idea is that BPF programs are executed within the drivers xmit routine and therefore e.g. in case of containers/Pods moving BPF processing closer to the source. One of the goals was that in case of Pod egress traffic, this allows to move BPF programs from hostns tcx ingress into the device itself, providing earlier drop or forward mechanisms, for example, if the BPF program determines that the skb must be sent out of the node, then a redirect to the physical device can take place directly without going through per-CPU backlog queue. This helps to shift processing for such traffic from softirq to process context, leading to better scheduling decisions/performance (see measurements in the slides). In this initial version, the netkit device ships as a pair, but we plan to extend this further so it can also operate in single device mode. The pair comes with a primary and a peer device. Only the primary device, typically residing in hostns, can manage BPF programs for itself and its peer. The peer device is designated for containers/Pods and cannot attach/detach BPF programs. Upon the device creation, the user can set the default policy to 'pass' or 'drop' for the case when no BPF program is attached. Additionally, the device can be operated in L3 (default) or L2 mode. The management of BPF programs is done via bpf_mprog, so that multi-attach is supported right from the beginning with similar API and dependency controls as tcx. For details on the latter see commit 053c8e1f235d ("bpf: Add generic attach/detach/query API for multi-progs"). tc BPF compatibility is provided, so that existing programs can be easily migrated. Going forward, we plan to use netkit devices in Cilium as the main device type for connecting Pods. 
They will be operated in L3 mode in order to simplify a Pod's neighbor management and the peer will operate in default drop mode, so that no traffic is leaving between the time when a Pod is brought up by the CNI plugin and programs attached by the agent. Additionally, the programs we attach via tcx on the physical devices are using bpf_redirect_peer() for inbound traffic into netkit device, hence the latter is also supporting the ndo_get_peer_dev callback. Similarly, we use bpf_redirect_neigh() for the way out, pushing from netkit peer to phys device directly. Also, BIG TCP is supported on netkit device. For the follow-up work in single device mode, we plan to convert Cilium's cilium_host/_net devices into a single one. An extensive test suite for checking device operations and the BPF program and link management API comes as BPF selftests in this series. Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Stanislav Fomichev <sdf@google.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://github.com/borkmann/iproute2/tree/pr/netkit Link: http://vger.kernel.org/bpfconf2023_material/tcx_meta_netdev_borkmann.pdf (24ff.) Link: https://lore.kernel.org/r/20231024214904.29825-2-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-11-10 13:27:01 -08:00
Andrii Nakryiko	3189f70538	docs: attempt to fix .readthedocs.yaml Seems like we need to update the config ([0],[1]). [0] https://blog.readthedocs.com/migrate-configuration-v2/ [1] https://blog.readthedocs.com/use-build-os-config/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2023-10-27 14:07:51 -07:00
Yonghong Song	6a5776066c	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 2147c8d07e1abc8dfc3433ca18eed5295e230ede Checkpoint bpf-next commit: 0e133a13370389d3894891eafe54fec2c44ad735 Baseline bpf commit: 9ff8d2717fc8f63e5cb226ddbda20649eefa2728 Checkpoint bpf commit: 9ff8d2717fc8f63e5cb226ddbda20649eefa2728 Alexandre Ghiti (1): libbpf: Fix syscall access arguments on riscv Andrii Nakryiko (1): libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym section Daan De Meyer (3): bpf: Implement cgroup sockaddr hooks for unix sockets libbpf: Add support for cgroup unix socket address hooks documentation/bpf: Document cgroup unix socket address hooks David Vernet (1): bpf: Add ability to pin bpf timer to calling CPU Martynas Pumputis (1): bpf: Derive source IP addr via bpf_*_fib_lookup() docs/program_types.rst | 10 ++++++++++ include/uapi/linux/bpf.h | 27 +++++++++++++++++++++++---- src/bpf_tracing.h | 2 -- src/elf.c | 16 ++++++++++------ src/libbpf.c | 10 ++++++++++ 5 files changed, 53 insertions(+), 12 deletions(-) Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-26 09:00:01 -07:00
Yonghong Song	acecaf855d	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Andrii Nakryiko	365cefa149	libbpf: Don't assume SHT_GNU_verdef presence for SHT_GNU_versym section Fix too eager assumption that SHT_GNU_verdef ELF section is going to be present whenever binary has SHT_GNU_versym section. It seems like either SHT_GNU_verdef or SHT_GNU_verneed can be used, so failing on missing SHT_GNU_verdef actually breaks use cases in production. One specific reported issue, which was used to manually test this fix, was trying to attach to `readline` function in BASH binary. Fixes: bb7fa09399b9 ("libbpf: Support symbol versioning for uprobe") Reported-by: Liam Wisehart <liamwisehart@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Manu Bretelle <chantr4@gmail.com> Reviewed-by: Fangrui Song <maskray@google.com> Acked-by: Hengqi Chen <hengqi.chen@gmail.com> Link: https://lore.kernel.org/bpf/20231016182840.4033346-1-andrii@kernel.org	2023-10-19 11:36:22 -07:00
Daan De Meyer	f4b6dcfca1	documentation/bpf: Document cgroup unix socket address hooks Update the documentation to mention the new cgroup unix sockaddr hooks. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-8-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
Daan De Meyer	748787456b	libbpf: Add support for cgroup unix socket address hooks Add the necessary plumbing to hook up the new cgroup unix sockaddr hooks into libbpf. Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
Daan De Meyer	8a08d63f29	bpf: Implement cgroup sockaddr hooks for unix sockets These hooks allow intercepting connect(), getsockname(), getpeername(), sendmsg() and recvmsg() for unix sockets. The unix socket hooks get write access to the address length because the address length is not fixed when dealing with unix sockets and needs to be modified when a unix socket address is modified by the hook. Because abstract socket unix addresses start with a NUL byte, we cannot recalculate the socket address in kernelspace after running the hook by calculating the length of the unix socket path using strlen(). These hooks can be used when users want to multiplex syscall to a single unix socket to multiple different processes behind the scenes by redirecting the connect() and other syscalls to process specific sockets. We do not implement support for intercepting bind() because when using bind() with unix sockets with a pathname address, this creates an inode in the filesystem which must be cleaned up. If we rewrite the address, the user might try to clean up the wrong file, leaking the socket in the filesystem where it is never cleaned up. Until we figure out a solution for this (and a use case for intercepting bind()), we opt to not allow rewriting the sockaddr in bind() calls. We also implement recvmsg() support for connected streams so that after a connect() that is modified by a sockaddr hook, any corresponding recvmsg() on the connected socket can also be modified to make the connected program think it is connected to the "intended" remote. Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-5-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
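The point about abstract addresses and strlen() in the commit above can be illustrated in plain userspace C. This is only a sketch under assumptions: `make_abstract_addr` and `strlen_based_len` are hypothetical helper names, not kernel or libbpf functions. It shows that an abstract address begins with a NUL byte, so a strlen()-based recalculation sees an empty path and the real length has to be carried explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill `addr` with an abstract unix address, i.e. a
 * leading NUL byte followed by `name_len` bytes of `name`. Returns the
 * address length to pass to connect(), or 0 if the name does not fit. */
socklen_t make_abstract_addr(struct sockaddr_un *addr,
			     const char *name, size_t name_len)
{
	if (name_len + 1 > sizeof(addr->sun_path))
		return 0;
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	/* sun_path[0] stays '\0'; that byte marks the address as abstract. */
	memcpy(addr->sun_path + 1, name, name_len);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}

/* Hypothetical helper: the length a strlen()-based recalculation would
 * produce. For abstract addresses strlen() stops at the leading NUL. */
socklen_t strlen_based_len(const struct sockaddr_un *addr)
{
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) +
			   strlen(addr->sun_path));
}
```

For a name such as "demo", the first helper reports five path bytes while the strlen()-based one reports zero, which is why the hooks are given write access to the address length.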
Martynas Pumputis	c9f8eb5310	bpf: Derive source IP addr via bpf_*_fib_lookup() Extend the bpf_fib_lookup() helper by making it return the source IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set. For example, the following snippet can be used to derive the desired source IP address: struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr }; ret = bpf_skb_fib_lookup(skb, p, sizeof(p), BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH); if (ret != BPF_FIB_LKUP_RET_SUCCESS) return TC_ACT_SHOT; /* the p.ipv4_src now contains the source address */ The inability to derive the proper source address may cause malfunctions in BPF-based dataplanes for hosts containing netdevs with more than one routable IP address or for multi-homed hosts. For example, Cilium implements packet masquerading in BPF. If an egressing netdev to which the Cilium's BPF prog is attached has multiple IP addresses, then only one [hardcoded] IP address can be used for masquerading. This breaks connectivity if any other IP address should have been selected instead, for example, when public and private addresses are attached to the same egress interface. The change was tested with Cilium [1]. Nikolay Aleksandrov helped to figure out the IPv6 addr selection. [1]: https://github.com/cilium/cilium/pull/28283 Signed-off-by: Martynas Pumputis <m@lambda.lt> Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2023-10-19 11:36:22 -07:00
David Vernet	1c0358823c	bpf: Add ability to pin bpf timer to calling CPU BPF supports creating high resolution timers using bpf_timer_* helper functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which specifies that the timeout should be interpreted as absolute time. It would also be useful to be able to pin that timer to a core. For example, if you wanted to make a subset of cores run without timer interrupts, and only have the timer be invoked on a single core. This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag. When specified, the HRTIMER_MODE_PINNED flag is passed to hrtimer_start(). A subsequent patch will update selftests to validate. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com	2023-10-19 11:36:22 -07:00
Alexandre Ghiti	20c1170ea4	libbpf: Fix syscall access arguments on riscv Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org	2023-10-19 11:36:22 -07:00
Yonghong Song	b44eb3a8fa	libbpf: fix bpf-checkpoint-commit The previous sync bpf-checkpoint-commit becomes invalid due to upstream bpf tree force-push. This patch picked a new valid commit as the bpf-checkpoint-commit so the sync script can work with newer changes. Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Yonghong Song	14648264b1	ci: Regenerate latest vmlinux.h for old kernel CI tests Without the change, we will have failures like below: Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h' progs/getsockname_unix_prog.c:27:15: error: no member named 'uaddrlen' in 'struct bpf_sock_addr_kern' if (sa_kern->uaddrlen != unaddrlen) ~~~~~~~ ^ 1 error generated. make: *** [Makefile:605: /home/runner/work/libbpf/libbpf/.kernel/tools/testing/selftests/bpf/getsockname_unix_prog.bpf.o] Error 1 make: *** Waiting for unfinished jobs.... Error: Process completed with exit code 2. in Kernel 5.5.0 on ubuntu-20.04 + selftests Manu Bretelle kindly helped regenerate the vmlinux.h from latest bpf-next kernel for me. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>	2023-10-19 11:36:22 -07:00
Song Liu	e26b84dc33	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 45ee73a0722b9e1d0b7a524d06756291b13b5912 Checkpoint bpf-next commit: 2147c8d07e1abc8dfc3433ca18eed5295e230ede Baseline bpf commit: 57eb5e1c5c57972c95e8efab6bc81b87161b0b07 Checkpoint bpf commit: 4cb893e89221be9c791e43cab6a8e937cd57e17f Hengqi Chen (3): libbpf: Resolve symbol conflicts at the same offset for uprobe libbpf: Support symbol versioning for uprobe libbpf: Allow Golang symbols in uprobe secdef Jiri Olsa (2): bpf: Add missed value to kprobe_multi link info bpf: Add missed value to kprobe perf link info Kumar Kartikeya Dwivedi (2): libbpf: Refactor bpf_object__reloc_code libbpf: Add support for custom exception callbacks Martin Kelly (8): libbpf: Refactor cleanup in ring_buffer__add libbpf: Switch rings to array of pointers libbpf: Add ring_buffer__ring libbpf: Add ring__producer_pos, ring__consumer_pos libbpf: Add ring__avail_data_size libbpf: Add ring__size libbpf: Add ring__map_fd libbpf: Add ring__consume include/uapi/linux/bpf.h \| 2 + src/elf.c \| 139 ++++++++++++++++++++++++++--- src/libbpf.c \| 188 ++++++++++++++++++++++++++++++++------- src/libbpf.h \| 73 +++++++++++++++ src/libbpf.map \| 7 ++ src/ringbuf.c \| 85 +++++++++++++++--- 6 files changed, 439 insertions(+), 55 deletions(-) Signed-off-by: Song Liu <song@kernel.org>	2023-10-02 11:17:48 -07:00
Hengqi Chen	9a3a2e9303	libbpf: Allow Golang symbols in uprobe secdef Golang symbols in ELF files are different from C/C++ ones: they contain special characters like '*', '(' and ')'. With generics, things get more complicated, there are symbols like: github.com/cilium/ebpf/internal.(*Deque[go.shape.interface { Format(fmt.State, int32); TypeName() string;github.com/cilium/ebpf/btf.copy() github.com/cilium/ebpf/btf.Type}]).Grow Match such symbols using `%m[^\n]` in sscanf; this excludes newline, which typically does not appear in ELF symbols. This should work in most use-cases and also work for unicode letters in identifiers. If newlines do show up in ELF symbols, users can still attach to such a symbol by specifying bpf_uprobe_opts::func_name. A working example can be found at this repo ([0]). [0]: https://github.com/chenhengqi/libbpf-go-symbols Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230929155954.92448-1-hengqi.chen@gmail.com	2023-10-02 11:17:48 -07:00
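To make the `%m[^\n]` matching concrete, here is a small userspace sketch. It is an illustration under assumptions, not libbpf's actual parser: `parse_uprobe_sec` is a hypothetical helper, and the "type/path:symbol" layout is assumed for the example. The `m` modifier asks sscanf() to allocate each output string, and the `[^\n]` scanset accepts every character up to a newline, so spaces, brackets, '*' and parentheses in Go symbols all pass through.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libbpf code): split "type/path:symbol".
 * Each matched field is malloc'ed by sscanf() via the 'm' modifier and
 * must be freed by the caller. Returns the number of fields matched. */
int parse_uprobe_sec(const char *sec, char **type, char **path, char **func)
{
	*type = *path = *func = NULL;
	/* The last conversion takes everything up to a newline, so symbol
	 * names may contain '*', '(', ')', '[', ']', spaces and so on. */
	return sscanf(sec, "%m[^/]/%m[^:]:%m[^\n]", type, path, func);
}
```

A caller would check that the return value is 3 before using the strings, then free() all three. Symbols that do contain a newline fall outside this pattern, which is the case the commit covers with bpf_uprobe_opts::func_name.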
Jiri Olsa	96d70a52ad	bpf: Add missed value to kprobe perf link info Add missed value to kprobe attached through perf link info to hold the stats of missed kprobe handler execution. The kprobe's missed counter gets incremented when kprobe handler is not executed due to another kprobe running on the same cpu. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230920213145.1941596-4-jolsa@kernel.org	2023-10-02 11:17:48 -07:00
Jiri Olsa	de02cb1697	bpf: Add missed value to kprobe_multi link info Add missed value to kprobe_multi link info to hold the stats of missed kprobe_multi probe. The missed counter gets incremented when fprobe fails the recursion check or there's no rethook available for return probe. In either case the attached bpf program is not executed. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Song Liu <song@kernel.org> Reviewed-by: Song Liu <song@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230920213145.1941596-3-jolsa@kernel.org	2023-10-02 11:17:48 -07:00
Martin Kelly	b520bcd7d8	libbpf: Add ring__consume Add ring__consume to consume a single ringbuffer, analogous to ring_buffer__consume. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-14-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	6413c2d063	libbpf: Add ring__map_fd Add ring__map_fd to get the file descriptor underlying a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-12-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	cd3fe56c75	libbpf: Add ring__size Add ring__size to get the total size of a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-10-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	3e675ed6ab	libbpf: Add ring__avail_data_size Add ring__avail_data_size for querying the currently available data in the ringbuffer, similar to the BPF_RB_AVAIL_DATA flag in bpf_ringbuf_query. This is racy during ongoing operations but is still useful for overall information on how a ringbuffer is behaving. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-8-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	2ad16b970a	libbpf: Add ring__producer_pos, ring__consumer_pos Add APIs to get the producer and consumer position for a given ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-6-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	a20576f5f2	libbpf: Add ring_buffer__ring Add a new function ring_buffer__ring, which exposes struct ring * to the user, representing a single ringbuffer. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-4-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
Martin Kelly	bfa471bc85	libbpf: Switch rings to array of pointers Switch rb->rings to be an array of pointers instead of a contiguous block. This allows for each ring pointer to be stable after ring_buffer__add is called, which allows us to expose struct ring * to the user without gotchas. Without this change, the realloc in ring_buffer__add could invalidate a struct ring *, making it unsafe to give to the user. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-3-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
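The stability argument in the commit above can be sketched outside libbpf. In this minimal model, `struct item` stands in for `struct ring`, and `struct registry`, `registry_add` and `registry_free` are hypothetical names chosen for the illustration. Because only the array of pointers is passed to realloc(), a pointer handed out earlier keeps pointing at the same item after later additions.

```c
#include <assert.h>
#include <stdlib.h>

struct item { int id; };	/* stand-in for struct ring */

struct registry {
	struct item **items;	/* array of pointers, one allocation per item */
	size_t cnt;
};

/* Append an item. The returned pointer stays valid across later adds,
 * because realloc() may move the pointer array but never the items. */
struct item *registry_add(struct registry *r, int id)
{
	struct item **tmp = realloc(r->items, (r->cnt + 1) * sizeof(*tmp));
	struct item *it;

	if (!tmp)
		return NULL;
	r->items = tmp;

	it = calloc(1, sizeof(*it));
	if (!it)
		return NULL;
	it->id = id;
	r->items[r->cnt++] = it;
	return it;
}

/* Release every item and the pointer array itself. */
void registry_free(struct registry *r)
{
	for (size_t i = 0; i < r->cnt; i++)
		free(r->items[i]);
	free(r->items);
	r->items = NULL;
	r->cnt = 0;
}
```

With a contiguous `struct item *` array instead, the same realloc() could relocate every element and leave earlier pointers dangling, which is the hazard the commit describes.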
Martin Kelly	64f2b4ab49	libbpf: Refactor cleanup in ring_buffer__add Refactor the cleanup code in ring_buffer__add to use a unified err_out label. This reduces code duplication, as well as plugging a potential leak if mmap_sz != (__u64)(size_t)mmap_sz (currently this would miss unmapping tmp because ringbuf_unmap_ring isn't called). Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20230925215045.2375758-2-martin.kelly@crowdstrike.com	2023-10-02 11:17:48 -07:00
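The unified err_out pattern and the size check mentioned above can be sketched as follows. This is not the ring_buffer__add code: `struct buf` and `buf_create` are hypothetical, malloc() stands in for mmap(), and the choice of -E2BIG for the truncation case is an assumption of this sketch. Every failure path jumps to one label, so a resource acquired early cannot be missed by a later error branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct buf {
	void *hdr;	/* acquired first */
	void *data;	/* acquired second */
};

/* Hypothetical constructor showing single-exit cleanup. */
int buf_create(uint64_t data_sz, struct buf **out)
{
	struct buf *b;
	int err;

	b = calloc(1, sizeof(*b));
	if (!b)
		return -ENOMEM;

	b->hdr = malloc(64);
	if (!b->hdr) {
		err = -ENOMEM;
		goto err_out;
	}

	/* Reject sizes that would be truncated when narrowed to size_t
	 * (this can only trigger where size_t is narrower than 64 bits). */
	if (data_sz != (uint64_t)(size_t)data_sz) {
		err = -E2BIG;
		goto err_out;
	}

	b->data = malloc((size_t)data_sz);
	if (!b->data) {
		err = -ENOMEM;
		goto err_out;
	}

	*out = b;
	return 0;

err_out:
	/* free(NULL) is a no-op, so one label covers every failure point. */
	free(b->data);
	free(b->hdr);
	free(b);
	return err;
}
```

On success the caller owns `*out` and releases `data`, `hdr` and the struct itself; on failure nothing is left allocated, including `hdr` when the size check is the step that fails.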

2391 Commits