libbpf

mirror of https://github.com/netdata/libbpf.git synced 2026-06-22 14:39:08 +08:00

Author	SHA1	Message	Date
Alan Maguire	ebcae72279	libbpf: Propagate errors when retrieving enum value for typed data display When retrieving the enum value associated with typed data during "is data zero?" checking in btf_dump_type_data_check_zero(), the return value of btf_dump_get_enum_value() is not passed to the caller if the function returns a non-zero (error) value. Currently, 0 is returned if the function returns an error. We should instead propagate the error to the caller. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com	2021-07-21 11:16:01 -07:00
Alan Maguire	64362b8896	libbpf: Avoid use of __int128 in typed dump display __int128 is not supported for some 32-bit platforms (arm and i386). __int128 was used in carrying out computations on bitfields which aid display, but the same calculations could be done with __u64 with the small effect of not supporting 128-bit bitfields. With these changes, a big-endian issue with casting 128-bit integers to 64-bit for enum bitfields is solved also, as we now use 64-bit integers for bitfield calculations. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com	2021-07-21 11:16:01 -07:00
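The 64-bit bitfield computation described in this commit can be sketched as below. This is a minimal illustration and not the libbpf code: the helper name `extract_bitfield_le` and its parameters are hypothetical, and it assumes a little-endian byte layout and a storage unit of at most 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libbpf function): extract a bitfield of
 * bit_sz bits that starts bit_offset bits above the least significant
 * bit of a little-endian storage unit of sz bytes. Only 64-bit
 * arithmetic is used, so bitfields wider than 64 bits are rejected.
 */
static int extract_bitfield_le(const uint8_t *bytes, size_t sz,
			       unsigned int bit_offset, unsigned int bit_sz,
			       uint64_t *value)
{
	uint64_t num = 0;
	unsigned int left_shift, right_shift;
	size_t i;

	if (sz == 0 || sz > 8 || bit_sz == 0 || bit_sz > 64 ||
	    bit_offset + bit_sz > sz * 8)
		return -1;

	/* Step 1: assemble the storage unit into one 64-bit number. */
	for (i = sz; i-- > 0;)
		num = num * 256 + bytes[i];

	/* Step 2: shift left to drop the bits above the field, then
	 * shift right to drop the bits below it. Both shift counts are
	 * in the range 0..63 thanks to the checks above.
	 */
	left_shift = 64 - (bit_offset + bit_sz);
	right_shift = 64 - bit_sz;
	*value = (num << left_shift) >> right_shift;
	return 0;
}
```

For the two bytes `0xB4 0x01`, a 5-bit field at bit offset 2 comes out as 13 under these assumptions.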
Michal Suchanek	df01b246df	README: State the source origin more prominently. Signed-off-by: Michal Suchanek <msuchanek@suse.de>	2021-07-20 14:45:49 -07:00
Michal Suchanek	6eb5e25905	Makefile: Default LIBSUBDIR to lib64 on 64bit architectures. commit `a82a66e` ("Extend build and add install rules to Makefile") adds special handling for LIBSUBDIR on x86_64. Expand this to all architectures with 64 in name which suggests a 32bit variant exists, and s390x which is 64bit extension of s390. Fixes: #337 Fixes: `a82a66e` ("Extend build and add install rules to Makefile") Signed-off-by: Michal Suchanek <msuchanek@suse.de>	2021-07-20 14:45:49 -07:00
Andrii Nakryiko	a603965dad	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34 Checkpoint bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015 Baseline bpf commit: a6c39de76d709f30982d4b80a9b9537e1d388858 Checkpoint bpf commit: d6371c76e20d7d3f61b05fd67b596af4d14a8886 Alan Maguire (3): libbpf: Clarify/fix unaligned data issues for btf typed dump libbpf: Fix compilation errors on ppc64le for btf dump typed data libbpf: Btf typed dump does not need to allocate dump data Martynas Pumputis (1): libbpf: Fix removal of inner map in bpf_object__create_map src/btf_dump.c \| 41 ++++++++++++++++++++++++++++++----------- src/libbpf.c \| 10 ++++------ 2 files changed, 34 insertions(+), 17 deletions(-) -- 2.30.2	2021-07-19 17:45:10 -07:00
Martynas Pumputis	f61c3b318b	libbpf: Fix removal of inner map in bpf_object__create_map If creating an outer map of a BTF-defined map-in-map fails (via bpf_object__create_map()), then its previously created inner map won't be destroyed. Fix this by ensuring that the destroy routines are not bypassed in the case of a failure. Fixes: 646f02ffdd49c ("libbpf: Add BTF-defined map-in-map support") Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt	2021-07-19 17:45:10 -07:00
Alan Maguire	8235032464	libbpf: Btf typed dump does not need to allocate dump data By using the stack for this small structure, we avoid the need for freeing memory in error paths. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
Alan Maguire	dc2c53b7f6	libbpf: Fix compilation errors on ppc64le for btf dump typed data __s64 can be defined as either long or long long, depending on the architecture. On ppc64le it's defined as long, giving this error: In file included from btf_dump.c:22: btf_dump.c: In function 'btf_dump_type_data_check_overflow': libbpf_internal.h:111:22: error: format '%lld' expects argument of type 'long long int', but argument 3 has type '__s64' {aka 'long int'} [-Werror=format=] 111 \| libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \ \| ^~~~~~~~~~ libbpf_internal.h:114:27: note: in expansion of macro '__pr' 114 \| #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__) \| ^~~~ btf_dump.c:1992:3: note: in expansion of macro 'pr_warn' 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ^~~~~~~ btf_dump.c:1992:32: note: format string is defined here 1992 \| pr_warn("unexpected size [%lld] for id [%u]\n", \| ~~~^ \| \| \| long long int \| %ld Cast to size_t and use %zu instead. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
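The fix named in this commit (cast to size_t and print with %zu) can be illustrated with a small standalone sketch. The function name `format_size_warning` is hypothetical, and `int64_t` stands in for `__s64`; the point is only that `%zu` with a `size_t` argument is correct whether the 64-bit type is `long` or `long long`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the warning text into buf. int64_t may
 * be long or long long depending on the architecture, so neither %ld
 * nor %lld is portable for it; casting to size_t and using %zu is.
 * The caller is assumed to pass a non-negative size.
 */
static int format_size_warning(char *buf, size_t buf_sz,
			       int64_t size, unsigned int id)
{
	return snprintf(buf, buf_sz, "unexpected size [%zu] for id [%u]",
			(size_t)size, id);
}
```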
Alan Maguire	fb3809e940	libbpf: Clarify/fix unaligned data issues for btf typed dump If data is packed, data structures can store it outside of usual boundaries. For example a 4-byte int can be stored on an unaligned boundary in a case like this: struct s { char f1; int f2; } __attribute((packed)); ...the int is stored at an offset of one byte. Some platforms have problems dereferencing data that is not aligned with its size, and code exists to handle most cases of this for BTF typed data display. However pointer display was missed, and a simple function to test if "ptr_is_aligned(data, data_sz)" would help clarify this code. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com	2021-07-19 17:45:10 -07:00
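The alignment problem described in this commit can be sketched as follows. The name `ptr_is_aligned` comes from the commit message, but the body shown here is an assumption rather than the libbpf implementation, and `read_int_any_alignment` is a hypothetical helper. The `packed` attribute is a GCC/Clang extension.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packed layout from the commit message: f2 sits at byte offset 1. */
struct s {
	char f1;
	int f2;
} __attribute__((packed));

/* Assumed body: true if data is aligned to data_sz (must be non-zero). */
static bool ptr_is_aligned(const void *data, size_t data_sz)
{
	return ((uintptr_t)data) % data_sz == 0;
}

/*
 * Hypothetical helper: read an int that may live at an unaligned
 * address. A direct dereference is used only when the pointer is
 * aligned; otherwise the bytes are copied into an aligned local.
 */
static int read_int_any_alignment(const void *data)
{
	int v;

	if (ptr_is_aligned(data, sizeof(v)))
		return *(const int *)data;
	memcpy(&v, data, sizeof(v));
	return v;
}
```

Copying through `memcpy` is the usual portable way to avoid an unaligned load; compilers typically turn it into a single instruction on platforms that allow unaligned access.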
Fejes Ferenc	74d3571880	Update README.md	2021-07-19 16:43:05 -07:00
Fejes Ferenc	be570b29c1	Update README.md Manjaro is a popular and friendly Arch based distro. Recently they also enabled the BTF support: https://forum.manjaro.org/t/co-re-support-in-kernel/46134/19 I can confirm that: [user@pc ~]$ uname -a Linux pc 5.12.16-1-MANJARO #1 SMP PREEMPT Sun Jul 11 13:23:34 UTC 2021 x86_64 GNU/Linux [user@pc ~]$ ls -la /sys/kernel/btf/vmlinux -r--r--r-- 1 root root 4226769 jul 17 15.27 /sys/kernel/btf/vmlinux	2021-07-19 16:43:05 -07:00
Sergei Iudin	9aa71e1040	Run apt-get update as a first step for GH actions; otherwise the container may contain stale cached repo metadata	2021-07-19 14:57:35 -07:00
Andrii Nakryiko	b3ffd258fc	vmtest: blacklist 5.5 selftests Add a few new selftests to the blacklist. They can't succeed on 5.5. Also temporarily remove btf_dump for 4.9 due to newly added data dumping subtests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	4447ac82d4	ci: temporary work-around to get green CI builds back Temporarily disable tc_bpf tests that seem to have regressed. Temporarily and artificially bump pahole version from 1.21 to 1.22 to get per-CPU BTF data built. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	8fa229c455	ci: disable -Wstringop-truncation for GCC10 configurations as well We used to have it disabled for GCC8, but now GCC10 falsely reports the same warnings, so disable stringop-truncation warnings for GCC10 as well. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	8a670b7422	vmtest: regenerate latest vmlinux.h This is necessary to make runqslower compile with task->__state field on old kernels, for which we don't have an actual vmlinux.h. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-19 11:36:37 -07:00
Andrii Nakryiko	21f90f61b0	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: f42cfb469f9b4a1c002a03cce3d9329376800a6f Checkpoint bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34 Baseline bpf commit: 61e8aeda9398925f8c6fc290585bdd9727d154c4 Checkpoint bpf commit: a6c39de76d709f30982d4b80a9b9537e1d388858 Alan Maguire (2): libbpf: Allow specification of "kprobe/function+offset" libbpf: BTF dumper support for typed data Alexei Starovoitov (2): bpf: Sync tools/include/uapi/linux/bpf.h bpf: Introduce bpf timers. Jiri Olsa (3): bpf: Add bpf_get_func_ip helper for tracing programs bpf: Add bpf_get_func_ip helper for kprobe programs libbpf: Add bpf_program__attach_kprobe_opts function Jonathan Edwards (1): libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading Kumar Kartikeya Dwivedi (2): libbpf: Add request buffer type for netlink messages libbpf: Switch to void * casting in netlink helpers Kuniyuki Iwashima (1): bpf: Fix a typo of reuseport map in bpf.h. Martynas Pumputis (1): libbpf: Fix reuse of pinned map on older kernel Shuyi Cheng (2): libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' libbpf: Fix the possible memory leak on error Toke Høiland-Jørgensen (1): libbpf: Restore errno return for functions that were already returning it include/uapi/linux/bpf.h \| 85 +++- src/btf.h \| 19 + src/btf_dump.c \| 819 ++++++++++++++++++++++++++++++++++++++- src/libbpf.c \| 146 ++++++- src/libbpf.h \| 9 +- src/libbpf.map \| 1 + src/netlink.c \| 115 +++--- src/nlattr.c \| 2 +- src/nlattr.h \| 38 +- 9 files changed, 1117 insertions(+), 117 deletions(-) -- 2.30.2	2021-07-16 17:05:44 -07:00
Andrii Nakryiko	c8b1d14b03	sync: auto-generate latest BPF helpers Latest changes to BPF helper definitions.	2021-07-16 17:05:44 -07:00
Alan Maguire	c0b2ceba1d	libbpf: BTF dumper support for typed data Add a BTF dumper for typed data, so that the user can dump a typed version of the data provided. The API is int btf_dump__dump_type_data(struct btf_dump *d, __u32 id, void *data, size_t data_sz, const struct btf_dump_type_data_opts *opts); ...where the id is the BTF id of the data pointed to by the "void *" argument; for example the BTF id of "struct sk_buff" for a "struct skb *" data pointer. Options supported are - a starting indent level (indent_lvl) - a user-specified indent string which will be printed once per indent level; if NULL, tab is chosen but any string <= 32 chars can be provided. - a set of boolean options to control dump display, similar to those used for BPF helper bpf_snprintf_btf(). Options are - compact : omit newlines and other indentation - skip_names: omit member names - emit_zeroes: show zero-value members Default output format is identical to that dumped by bpf_snprintf_btf(), for example a "struct sk_buff" representation would look like this: (struct sk_buff){ (union){ (struct){ .next = (struct sk_buff *)0xffffffffffffffff, .prev = (struct sk_buff *)0xffffffffffffffff, (union){ .dev = (struct net_device *)0xffffffffffffffff, .dev_scratch = (long unsigned int)18446744073709551615, }, }, ... If the data structure is larger than the data_sz number of bytes that are available in data, as much of the data as possible will be dumped and -E2BIG will be returned. This is useful as tracers will sometimes not be able to capture all of the data associated with a type; for example a "struct task_struct" is ~16k. Being able to specify that only a subset is available is important for such cases. On success, the amount of data dumped is returned. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com	2021-07-16 17:05:44 -07:00
Shuyi Cheng	bd25fc7df1	libbpf: Fix the possible memory leak on error If the strdup() fails then we need to call bpf_object__close(obj) to avoid a resource leak. Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables") Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 17:05:44 -07:00
Shuyi Cheng	4920031c88	libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts' btf_custom_path allows developers to load custom BTF which libbpf will subsequently use for CO-RE relocation instead of vmlinux BTF. Having btf_custom_path in bpf_object_open_opts one can directly use the skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path parameter, as opposed to using bpf_object__load_xattr() which is slated to be deprecated ([0]). This work continues previous work started by another developer ([1]). [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/ Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com	2021-07-16 17:05:44 -07:00
Alan Maguire	8fa50e86c1	libbpf: Allow specification of "kprobe/function+offset" kprobes can be placed on most instructions in a function, not just entry, and ftrace and bpftrace support the function+offset notation for probe placement. Adding parsing of func_name into func+offset to bpf_program__attach_kprobe() allows the user to specify SEC("kprobe/bpf_fentry_test5+0x6") ...for example, and the offset can be passed to perf_event_open_probe() to support kprobe attachment. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
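Splitting a "function+offset" string can be sketched as below. This is an illustration only: `parse_func_offset` is a hypothetical name, the parsing approach is an assumption rather than what libbpf does internally, and the input is taken to be the part of the section name after "kprobe/".

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "func+offset" into a function name and
 * a numeric offset. A missing "+offset" part yields offset 0. The
 * offset is parsed with base 0, so "0x6" and "6" are both accepted.
 * Returns 0 on success, -EINVAL on malformed input.
 */
static int parse_func_offset(const char *spec, char *func, size_t func_sz,
			     unsigned long *offset)
{
	const char *plus = strchr(spec, '+');
	size_t name_len = plus ? (size_t)(plus - spec) : strlen(spec);
	char *end;

	if (name_len == 0 || name_len >= func_sz)
		return -EINVAL;
	memcpy(func, spec, name_len);
	func[name_len] = '\0';

	*offset = 0;
	if (!plus)
		return 0;

	errno = 0;
	*offset = strtoul(plus + 1, &end, 0);
	/* Reject an empty offset, trailing junk, or an out-of-range value. */
	if (errno || end == plus + 1 || *end != '\0')
		return -EINVAL;
	return 0;
}
```

With the example from the commit message, "bpf_fentry_test5+0x6" splits into the name "bpf_fentry_test5" and the offset 6.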
Jiri Olsa	330a158982	libbpf: Add bpf_program__attach_kprobe_opts function Adding bpf_program__attach_kprobe_opts that does the same as bpf_program__attach_kprobe, but takes opts argument. Currently opts struct holds just retprobe bool, but we will add new field in following patch. The function is not exported, so there's no need to add size to the struct bpf_program_attach_kprobe_opts for now. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Jiri Olsa	a524ae0bbf	bpf: Add bpf_get_func_ip helper for kprobe programs Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs, so it's now possible to call bpf_get_func_ip from both kprobe and kretprobe programs. Taking the caller's address from 'struct kprobe::addr', which is defined for both kprobe and kretprobe. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Jiri Olsa	97e2a9c9a1	bpf: Add bpf_get_func_ip helper for tracing programs Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs, specifically for all trampoline attach types. The trampoline's caller IP address is stored at the (ctx - 8) address, so there's no reason to actually call the helper, but rather fixup the call instruction and return [ctx - 8] value directly. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org	2021-07-16 17:05:44 -07:00
Alexei Starovoitov	bef77595ca	bpf: Introduce bpf timers. Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded in hash/array/lru maps as a regular field and helpers to operate on it: // Initialize the timer. // First 4 bits of 'flags' specify clockid. // Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags); // Configure the timer to call 'callback_fn' static function. long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn); // Arm the timer to expire 'nsec' nanoseconds from the current time. long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags); // Cancel the timer and wait for callback_fn to finish if it was running. long bpf_timer_cancel(struct bpf_timer *timer); Here is how a BPF program might look: struct map_elem { int counter; struct bpf_timer timer; }; struct { __uint(type, BPF_MAP_TYPE_HASH); __uint(max_entries, 1000); __type(key, int); __type(value, struct map_elem); } hmap SEC(".maps"); static int timer_cb(void *map, int *key, struct map_elem *val); /* val points to particular map element that contains bpf_timer. */ SEC("fentry/bpf_fentry_test1") int BPF_PROG(test1, int a) { struct map_elem *val; int key = 0; val = bpf_map_lookup_elem(&hmap, &key); if (val) { bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME); bpf_timer_set_callback(&val->timer, timer_cb); bpf_timer_start(&val->timer, 1000 /* call timer_cb in 1 usec */, 0); } } This patch adds helper implementations that rely on hrtimers to call bpf functions as timers expire. The following patches add necessary safety checks. Only programs with CAP_BPF are allowed to use bpf_timer. The number of timers used by the program is constrained by the memcg recorded at map creation time. The bpf_timer_init() helper needs explicit 'map' argument because inner maps are dynamic and not known at load time.
While the bpf_timer_set_callback() is receiving hidden 'aux->prog' argument supplied by the verifier. The prog pointer is needed to do refcnting of bpf program to make sure that program doesn't get freed while the timer is armed. This approach relies on "user refcnt" scheme used in prog_array that stores bpf programs for bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is paired with bpf_timer_cancel() that will drop the prog refcnt. The ops->map_release_uref is responsible for cancelling the timers and dropping prog refcnt when user space reference to a map reaches zero. This uref approach is done to make sure that Ctrl-C of user space process will not leave timers running forever unless the user space explicitly pinned a map that contained timers in bpffs. bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't have user references (is not held by open file descriptor from user space and not pinned in bpffs). The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel and free the timer if given map element had it allocated. "bpftool map update" command can be used to cancel timers. The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because '__u64 :64' has 1 byte alignment of 8 byte padding. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com	2021-07-16 17:05:44 -07:00
Kuniyuki Iwashima	6f7839f477	bpf: Fix a typo of reuseport map in bpf.h. Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo in bpf.h. Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp	2021-07-16 17:05:44 -07:00
Alexei Starovoitov	90aba5e582	bpf: Sync tools/include/uapi/linux/bpf.h Commit 47316f4a3053 missed updating tools/.../bpf.h. Sync it. Fixes: 47316f4a3053 ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN") Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2021-07-16 17:05:44 -07:00
Martynas Pumputis	4dc3aeb072	libbpf: Fix reuse of pinned map on older kernel When loading a BPF program with a pinned map, the loader checks whether the pinned map can be reused, i.e. their properties match. To derive the properties of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and then does the comparison. Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not available, so loading the program fails with the following error: libbpf: failed to get map info for map FD 5: Invalid argument libbpf: couldn't reuse pinned map at '/sys/fs/bpf/tc/globals/cilium_call_policy': parameter mismatch libbpf: map 'cilium_call_policy': error reusing pinned map libbpf: map 'cilium_call_policy': failed to create: Invalid argument(-22) libbpf: failed to load object 'bpf_overlay.o' To fix this, fall back to derivation of the map properties via /proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL, which can be used as an indicator that the kernel doesn't support the latter. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt	2021-07-16 17:05:44 -07:00
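The fdinfo fallback described in this commit amounts to parsing "label: value" lines. The sketch below parses such text from a string instead of reading /proc, so it can be tried without a BPF map. `struct map_props` and `parse_map_fdinfo` are hypothetical names, and the field labels (map_type, key_size, value_size, max_entries, map_flags) are an assumption about what the kernel prints for a BPF map file descriptor.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the properties compared on map reuse. */
struct map_props {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
	int map_flags;
};

/*
 * Hypothetical helper: scan fdinfo-style text line by line and pick
 * out the five assumed map fields. Returns 0 if all five were found,
 * -1 otherwise. %i is used for map_flags so a hex value is accepted.
 */
static int parse_map_fdinfo(const char *text, struct map_props *p)
{
	const char *line = text;
	int found = 0;

	memset(p, 0, sizeof(*p));
	while (line && *line) {
		if (sscanf(line, "map_type:\t%u", &p->type) == 1 ||
		    sscanf(line, "key_size:\t%u", &p->key_size) == 1 ||
		    sscanf(line, "value_size:\t%u", &p->value_size) == 1 ||
		    sscanf(line, "max_entries:\t%u", &p->max_entries) == 1 ||
		    sscanf(line, "map_flags:\t%i", &p->map_flags) == 1)
			found++;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return found == 5 ? 0 : -1;
}
```

A real fallback would read the text from /proc/$PID/fdinfo/$MAP_FD, and only after BPF_OBJ_GET_INFO_BY_FD has failed with EINVAL, as the commit message explains.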
Toke Høiland-Jørgensen	4ce0551ee5	libbpf: Restore errno return for functions that were already returning it The update to streamline libbpf error reporting intended to change all functions to return the errno as a negative return value if LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is not set, the return value changes for the two functions that were already returning a negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll(). This is a user-visible API change that breaks applications; so let's revert these two functions back to unconditionally returning a negative errno value. Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com	2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi	f8411901c4	libbpf: Switch to void * casting in netlink helpers Netlink helpers I added in 8bbb77b7c7a2 ("libbpf: Add various netlink helpers") used char * casts everywhere, and there were a few more that existed from before. Convert all of them to void * cast, as it is treated equivalently by clang/gcc for the purposes of pointer arithmetic and to follow the convention elsewhere in the kernel/libbpf. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com	2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi	9ff2b76693	libbpf: Add request buffer type for netlink messages Coverity complains about OOB writes to nlmsghdr. There is no OOB as we write to the trailing buffer, but static analyzers and compilers may rightfully be confused as the nlmsghdr pointer has subobject provenance (and hence subobject bounds). Fix this by using an explicit request structure containing the nlmsghdr, struct tcmsg/ifinfomsg, and attribute buffer. Also switch nh_tail (renamed to req_tail) to cast req * to char * so that it can be understood as arithmetic on pointer to the representation array (hence having same bound as request structure), which should further appease analyzers. As a bonus, callers don't have to pass sizeof(req) all the time now, as size is implicitly obtained using the pointer. While at it, also reduce the size of attribute buffer to 128 bytes (132 for ifinfomsg using functions due to the padding). Summary of problem: Even though C standard allows interconvertibility of pointer to first member and pointer to struct, for the purposes of alias analysis it would still consider the first as having pointer value "pointer to T" where T is type of first member hence having subobject bounds, allowing analyzers within reason to complain when object is accessed beyond the size of pointed to object. The only exception to this rule may be when a char * is formed to a member subobject. It is not possible for the compiler to be able to tell the intent of the programmer that it is a pointer to member object or the underlying representation array of the containing object, so such diagnosis is suppressed. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com	2021-07-16 17:05:44 -07:00
Jonathan Edwards	df023f5cfc	libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only the following program types are supported [1]: BPF_PROG_TYPE_KPROBE BPF_PROG_TYPE_TRACEPOINT BPF_PROG_TYPE_PERF_EVENT For libbpf this causes an EINVAL return during the bpf_object__probe_loading call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER can load. The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some kernels it requires knowledge of the LINUX_VERSION_CODE. [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7 [1] https://access.redhat.com/articles/3550581 Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com	2021-07-16 17:05:44 -07:00
Andrii Nakryiko	ae62c159ec	include: initial sync of pkt_cls.h and pkt_sched.h Add pkt_cls.h and pkt_sched.h to include/uapi/linux. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-16 14:22:07 -07:00
Yonghong Song	8bf016110e	sync uapi headers linux/pkt_cls.h and linux/pkt_sched.h Let us sync linux/{pkt_cls.h,pkt_sched.h} to libbpf repo. Otherwise, on ubuntu 16.04, system headers will be picked up and this will result in compilation error like: .../netlink.c:416:23: error: ‘TC_H_CLSACT’ undeclared (first use in this function) *parent = TC_H_MAKE(TC_H_CLSACT, ^ .../netlink.c:418:9: error: ‘TC_H_MIN_INGRESS’ undeclared (first use in this function) TC_H_MIN_INGRESS : TC_H_MIN_EGRESS); ^ .../netlink.c:418:28: error: ‘TC_H_MIN_EGRESS’ undeclared (first use in this function) TC_H_MIN_INGRESS : TC_H_MIN_EGRESS); ^ .../netlink.c: In function ‘__get_tc_info’: .../netlink.c:522:11: error: ‘TCA_BPF_ID’ undeclared (first use in this function) if (!tbb[TCA_BPF_ID]) ^ Signed-off-by: Yonghong Song <yhs@fb.com>	2021-07-12 14:01:21 -07:00
Yucong Sun	d3e4039a0a	create ondemand vmtest workflow	2021-07-09 14:09:51 -07:00
Jussi Mäki	dd34504b43	vmtest: Set CONFIG_BONDING=y in latest.config This is preparation for the XDP bonding patch set [1] to avoid having to mangle the kernel configuration from vmtest.sh. [1]: https://lore.kernel.org/bpf/202106221509.kwNvAAZg-lkp@intel.com/T/#m4635dc0003944f38a54059b11147ab46abeffa13 Signed-off-by: Jussi Maki <joamaki@gmail.com>	2021-07-08 15:18:32 -07:00
Andrii Nakryiko	bec2ae0c6e	sync: update rewritten bpf-next SHA Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-07-06 13:31:12 -07:00
Andrii Nakryiko	1d6106cf45	ci: blacklist few new tests on 5.5 tc_redirect and migrate_reuseport use new functionality not present on 5.5 Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-18 13:05:10 -07:00
Andrii Nakryiko	95e51c1dbe	ci: disable fail-fast for Github Actions tests Make sure we run all of the tests even if some of them fail. This allows testing all of them independently, especially the slow kernel LATEST test. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>	2021-06-18 13:05:10 -07:00
Andrii Nakryiko	db132757c9	sync: latest libbpf changes from kernel Syncing latest libbpf commits from kernel repository. Baseline bpf-next commit: cf68fa431d5da7ef0b5ea142dd603611696cbd44 Checkpoint bpf-next commit: f540a7d2c37f9ae0867de0a14bf06cf50b63d65e Baseline bpf commit: 11fc79fc9f2e395aa39fa5baccae62767c5d8280 Checkpoint bpf commit: 61e8aeda9398925f8c6fc290585bdd9727d154c4 Kumar Kartikeya Dwivedi (2): libbpf: Remove unneeded check for flags during tc detach libbpf: Set NLM_F_EXCL when creating qdisc Kuniyuki Iwashima (3): bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT. bpf: Support socket migration by eBPF. libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. Lorenz Bauer (1): libbpf: Fail compilation if target arch is missing Wang Hai (1): libbpf: Simplify the return expression of bpf_object__init_maps function grantseltzer (1): Add documentation for libbpf including API autogen include/uapi/linux/bpf.h \| 16 ++++ src/README.rst \| 168 --------------------------------------- src/bpf_tracing.h \| 46 ++++++++++- src/libbpf.c \| 9 ++- src/netlink.c \| 4 +- 5 files changed, 64 insertions(+), 179 deletions(-) delete mode 100644 src/README.rst -- 2.30.2	2021-06-18 13:05:10 -07:00
grantseltzer	41cddf18f4	Add documentation for libbpf including API autogen This patch is meant to start the initiative to document libbpf. It includes .rst files which are text documentation describing building, API naming convention, as well as an index to generated API documentation. In this approach the generated API documentation is enabled by the kernel's existing documentation system, which uses sphinx. The resulting docs would then be synced to kernel.org/doc. You can test this by running `make htmldocs` and serving the html in Documentation/output. Since libbpf does not yet have comments in kernel doc format, see kernel.org/doc/html/latest/doc-guide/kernel-doc.html for an example so you can test this. The advantage of this approach is to use the existing sphinx infrastructure that the kernel has, and have libbpf docs in the same place as everything else. The current plan is to have the libbpf mirror sync the generated docs and version them based on the libbpf releases which are cut on github. This patch includes the addition of libbpf_api.rst which pulls comment documentation from header files in libbpf under tools/lib/bpf/. The comment docs would be of the standard kernel doc format. Signed-off-by: grantseltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210618140459.9887-2-grantseltzer@gmail.com	2021-06-18 13:05:10 -07:00
Lorenz Bauer	f883bbf3f4	libbpf: Fail compilation if target arch is missing bpf2go is the Go equivalent of libbpf skeleton. The convention is that the compiled BPF is checked into the repository to facilitate distributing BPF as part of Go packages. To make this portable, bpf2go by default generates both bpfel and bpfeb variants of the C. Using bpf_tracing.h is inherently non-portable since the fields of struct pt_regs differ between platforms, so CO-RE can't help us here. The only way of working around this is to compile for each target platform independently. bpf2go can't do this by default since there are too many platforms. Define the various PT_... macros when no target can be determined and turn them into compilation failures. This works because bpf2go always compiles for bpf targets, so the compiler fallback doesn't kick in. Conditionally define __BPF_MISSING_TARGET so that we can inject a more appropriate error message at build time. The user can then choose which platform to target explicitly. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210616083635.11434-1-lmb@cloudflare.com	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	db8982bcaa	libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT. This commit introduces a new section (sk_reuseport/migrate) and sets expected_attach_type for each of the two BPF_PROG_TYPE_SK_REUSEPORT program sections. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	d1571ab5ce	bpf: Support socket migration by eBPF. This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT to check if the attached eBPF program is capable of migrating sockets. When the eBPF program is attached, we run it for socket migration if the expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or net.ipv4.tcp_migrate_req is enabled. Currently, the expected_attach_type is not enforced for the BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the earlier idea in the commit aac3fc320d94 ("bpf: Post-hooks for sys_bind") to fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type(). Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to select a new listener based on the child socket. migrating_sk varies depending on if it is migrating a request in the accept queue or during 3WHS. - accept_queue : sock (ESTABLISHED/SYN_RECV) - 3WHS : request_sock (NEW_SYN_RECV) In the eBPF program, we can select a new listener by BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning SK_DROP. This feature is useful when listeners have different settings at the socket API level or when we want to free resources as soon as possible. - SK_PASS with selected_sk, select it as a new listener - SK_PASS with selected_sk NULL, fallbacks to the random selection - SK_DROP, cancel the migration. There is a noteworthy point. We select a listening socket in three places, but we do not have struct skb at closing a listener or retransmitting a SYN+ACK. On the other hand, some helper functions do not expect skb is NULL (e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer() in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb temporarily before running the eBPF program. 
Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima	03b0787342	bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT. We will call sock_reuseport.prog for socket migration in the next commit, so the eBPF program has to know which listener is closing to select a new listener. We can currently get a unique ID of each listener in the userspace by calling bpf_map_lookup_elem() for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map. This patch makes the pointer of sk available in sk_reuseport_md so that we can get the ID by BPF_FUNC_get_socket_cookie() in the eBPF program. Suggested-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/netdev/20201119001154.kapwihc2plp4f7zc@kafai-mbp.dhcp.thefacebook.com/ Link: https://lore.kernel.org/bpf/20210612123224.12525-9-kuniyu@amazon.co.jp	2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi	a1bd8104a9	libbpf: Set NLM_F_EXCL when creating qdisc This got lost during the refactoring across versions. We always use NLM_F_EXCL when creating some TC object, so reflect what the function says and set the flag. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com	2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi	ccead28901	libbpf: Remove unneeded check for flags during tc detach Coverity complained about this being unreachable code. It is right because we already enforce flags to be unset, so a check validating the flag value is redundant. Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com	2021-06-18 13:05:10 -07:00
Wang Hai	0b59d75ecd	libbpf: Simplify the return expression of bpf_object__init_maps function There is no need for special treatment of the 'ret == 0' case. This patch simplifies the return expression. Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com	2021-06-18 13:05:10 -07:00
Sergei Iudin	a5ee05d505	Run pahole staging once a day	2021-06-17 17:49:56 -07:00
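
Note on d1571ab5ce ("bpf: Support socket migration by eBPF"): the three outcomes listed in that commit message (SK_PASS with selected_sk, SK_PASS with selected_sk NULL, SK_DROP) can be sketched as a small user-space model. This is only an illustration of the decision table, not the kernel implementation. The names `verdict`, `listener`, `migrate_result`, and `resolve_listener` are hypothetical, the enum values are illustrative, and the `fallback_idx` parameter is an assumption that stands in for the kernel's random selection so the sketch stays deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict names follow the commit message; the numeric values are
 * illustrative and not taken from the kernel headers. */
enum verdict { VERDICT_SK_DROP = 0, VERDICT_SK_PASS = 1 };

/* Hypothetical stand-in for a listening socket. */
struct listener {
    int id;
};

/* Hypothetical summary of what the eBPF program decided. */
struct migrate_result {
    enum verdict verdict;
    const struct listener *selected_sk; /* NULL if the program chose none */
};

/*
 * Resolve the listener that receives the migrated socket.
 *
 * - SK_DROP: migration is cancelled, return NULL.
 * - SK_PASS with selected_sk: use the selected listener.
 * - SK_PASS with selected_sk == NULL: fall back to another listener.
 *   The kernel picks one at random; here the caller supplies
 *   fallback_idx (an assumption of this sketch) in place of randomness.
 */
static const struct listener *
resolve_listener(struct migrate_result res,
                 const struct listener *listeners, size_t n,
                 size_t fallback_idx)
{
    assert(listeners != NULL && n > 0);

    if (res.verdict == VERDICT_SK_DROP)
        return NULL;
    if (res.selected_sk != NULL)
        return res.selected_sk;
    return &listeners[fallback_idx % n];
}
```

A caller would build a `migrate_result` from the program's return value and pass it with the candidate listeners. A NULL result means the migration was cancelled.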

1109 Commits