Compare commits

56 Commits
v0.2 ... v0.3

Author SHA1 Message Date
Luca Boccassi
051a4009f9 pkgconfig: use literal ${prefix} to allow override
Various workflows (--define-prefix, --define-variable=prefix) require variables in
the pc file to use a literal ${prefix} so that it is overridden. Change the Makefile
so that, by default and unless  is specified, it is set as expected.

Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Luca Boccassi
a3a5e9688a README: point to Debian source package rather than binary
For consistency with other links

Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Luca Boccassi
5569404346 README: note that Debian 11 (will) ship LLVM 11
Signed-off-by: Luca Boccassi <bluca@debian.org>
2021-01-03 10:41:31 -08:00
Andrii Nakryiko
e05f9be4f4 vmtests: temporarily disable test_maps
Disable test_maps test until it's debugged why they started failing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
4d3535ff7b vmtests: test_maps needs more memory, so bump to 4G
Memory is cheap.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
c66a9770e3 vmtests: fix up bpf_testmod.ko generation for 5.5 and 4.9
Selftests makefile deletes local bpf_testmod.ko, so that invalidates current
approach of faking bpf_testmod.ko "generation". Instead, generate a fake
Makefile that will create an empty bpf_testmod/bpf_testmod.ko.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
8262be6034 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5c667dca71095abec90420eb09503f35f66c9585
Checkpoint bpf-next commit: 3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9
Baseline bpf commit:        12c8a8ca117f3d734babc3fba131fdaa329d2163
Checkpoint bpf commit:      1a3449c19407a28f7019a887cdf0d6ba2444751a

Andrii Nakryiko (2):
  bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr()
    helpers
  libbpf: Support modules in bpf_program__set_attach_target() API

Brendan Jackman (1):
  libbpf: Expose libbpf ring_buffer epoll_fd

Florent Revest (1):
  bpf: Add a bpf_sock_from_file helper

 include/uapi/linux/bpf.h | 13 ++++++--
 src/libbpf.c             | 64 +++++++++++++++++++++++++---------------
 src/libbpf.h             |  1 +
 src/libbpf.map           |  1 +
 src/ringbuf.c            |  6 ++++
 5 files changed, 59 insertions(+), 26 deletions(-)

--
2.24.1
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
182e9dde0d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-12-20 17:00:58 -08:00
Brendan Jackman
30e2c16571 libbpf: Expose libbpf ring_buffer epoll_fd
This provides a convenient perf ringbuf -> libbpf ringbuf migration
path for users of external polling systems. It is analogous to
perf_buffer__epoll_fd.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201214113812.305274-1-jackmanb@google.com
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
ebcae62e7e libbpf: Support modules in bpf_program__set_attach_target() API
Support finding kernel targets in kernel modules when using
bpf_program__set_attach_target() API. This brings it up to par with what
libbpf supports when doing declarative SEC()-based target determination.

Some minor internal refactoring was needed to make sure vmlinux BTF can be
loaded before bpf_object's load phase.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201211215825.3646154-2-andrii@kernel.org
2020-12-20 17:00:58 -08:00
Florent Revest
252ad1f3eb bpf: Add a bpf_sock_from_file helper
While eBPF programs can check whether a file is a socket by file->f_op
== &socket_file_ops, they cannot convert the void private_data pointer
to a struct socket BTF pointer. In order to do this a new helper
wrapping sock_from_file is added.

This is useful to tracing programs but also other program types
inheriting this set of helpers such as iterators or LSM programs.

Signed-off-by: Florent Revest <revest@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201204113609.1850150-2-revest@google.com
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
3e68c60659 bpf: Fix enum names for bpf_this_cpu_ptr() and bpf_per_cpu_ptr() helpers
Remove bpf_ prefix, which causes these helpers to be reported in verifier
dump as bpf_bpf_this_cpu_ptr() and bpf_bpf_per_cpu_ptr(), respectively. Let's
fix it as long as it is still possible before UAPI freezes on these helpers.

Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-20 17:00:58 -08:00
Andrii Nakryiko
42baefba71 vmtests: update blacklist for 5.5
Blacklist selftests relying on newer kernel's features.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
46ecf7aef3 vmtest: omit building bpf_testmod.ko on non-latest kernels
Non-latest kernel versions don't build kernel from sources, so module building
fails, despite using `make prepare`. For now, just make sure no module is
built by overwriting bpf_testmod/Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
2981bb8d26 vmtests: update vmlinux.h to latest version
Update vmlinux.h to get some of BPF UAPI constants needed for the compilation
of new selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
2042df2fed sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad
Checkpoint bpf-next commit: 5c667dca71095abec90420eb09503f35f66c9585
Baseline bpf commit:        d3bec0138bfbe58606fc1d6f57a4cdc1a20218db
Checkpoint bpf commit:      12c8a8ca117f3d734babc3fba131fdaa329d2163

Alan Maguire (1):
  libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types()

Andrei Matei (1):
  libbpf: Fail early when loading programs with unspecified type

Andrii Nakryiko (11):
  bpf: Assign ID to vmlinux BTF and return extra info for BTF in
    GET_OBJ_INFO
  libbpf: Don't attempt to load unused subprog as an entry-point BPF
    program
  libbpf: Add base BTF accessor
  libbpf: Add internal helper to load BTF data by FD
  libbpf: Refactor CO-RE relocs to not assume a single BTF object
  libbpf: Add kernel module BTF support for CO-RE relocations
  bpf: Allow to specify kernel module BTFs when attaching BPF programs
  libbpf: Factor out low-level BPF program loading helper
  libbpf: Support attachment of BPF tracing programs to kernel modules
  libbpf: Use memcpy instead of strncpy to please GCC
  libbpf: Fix ring_buffer__poll() to return number of consumed samples

Dmitrii Banshchikov (1):
  bpf: Add bpf_ktime_get_coarse_ns helper

KP Singh (5):
  bpf: Implement task local storage
  libbpf: Add support for task local storage
  bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID
  bpf: Add bpf_bprm_opts_set helper
  bpf: Add a BPF helper for getting the IMA hash of an inode

Li RongQing (1):
  libbpf: Add support for canceling cached_cons advance

Magnus Karlsson (1):
  libbpf: Replace size_t with __u32 in xsk interfaces

Mariusz Dudek (1):
  libbpf: Separate XDP program load with xsk socket creation

Stanislav Fomichev (1):
  libbpf: Cap retries in sys_bpf_prog_load

Thomas Karlsson (1):
  macvlan: Support for high multicast packet rate

Toke Høiland-Jørgensen (1):
  libbpf: Sanitise map names before pinning

 include/uapi/linux/bpf.h     |  96 +++++-
 include/uapi/linux/if_link.h |   2 +
 src/bpf.c                    | 104 +++++--
 src/btf.c                    |  74 +++--
 src/btf.h                    |   1 +
 src/libbpf.c                 | 550 +++++++++++++++++++++++++++--------
 src/libbpf.map               |   3 +
 src/libbpf_internal.h        |  31 ++
 src/libbpf_probes.c          |   1 +
 src/ringbuf.c                |   2 +-
 src/xsk.c                    |  92 +++++-
 src/xsk.h                    |  22 +-
 12 files changed, 771 insertions(+), 207 deletions(-)

--
2.24.1
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
8c2c4c3451 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
21ae7bb113 libbpf: Fix ring_buffer__poll() to return number of consumed samples
Fix ring_buffer__poll() to return the number of non-discarded records
consumed, just like its documentation states. It's also consistent with
ring_buffer__consume() return. Fix up selftests with wrong expected results.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Fixes: cb1c9ddd5525 ("selftests/bpf: Add BPF ringbuf selftests")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201130223336.904192-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
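The return-value fix in 21ae7bb113 can be illustrated with a simplified model. The sketch below is not libbpf code: `record_sketch`, `sample_fn` and `consume_records` are hypothetical names, and a plain `discarded` flag stands in for however the real ring buffer marks discarded records. It only shows the counting rule the commit describes: discarded records are skipped and do not contribute to the returned count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one ring buffer record. */
struct record_sketch {
	uint32_t len;
	int discarded;	/* non-zero if the producer discarded this record */
};

/* Hypothetical per-sample callback. */
typedef void (*sample_fn)(void *ctx, uint32_t len);

/* Walk all records, invoke cb for each non-discarded one, and return
 * how many non-discarded records were consumed. */
static int consume_records(const struct record_sketch *recs, size_t n,
			   sample_fn cb, void *ctx)
{
	int consumed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (recs[i].discarded)
			continue;	/* skipped and not counted */
		if (cb)
			cb(ctx, recs[i].len);
		consumed++;
	}
	return consumed;
}
```

With three records of which the middle one is discarded, the function returns 2 rather than 3.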
Andrii Nakryiko
b2a34784b2 libbpf: Use memcpy instead of strncpy to please GCC
Some versions of GCC are really nit-picky about strncpy() use. Use memcpy(),
as they are pretty much equivalent for the case of fixed length strings.

Fixes: e459f49b4394 ("libbpf: Separate XDP program load with xsk socket creation")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203235440.2302137-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
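As a rough illustration of the strncpy() to memcpy() change in b2a34784b2, the sketch below copies a fixed-length name field. The struct, the function and the `IFNAME_LEN` macro are assumptions made for this example (16 matches IFNAMSIZ on Linux, but it is redefined here to keep the block self-contained); the actual code touched by the commit is not shown in the log.

```c
#include <string.h>

#define IFNAME_LEN 16	/* assumed size, chosen to match IFNAMSIZ on Linux */

/* Hypothetical context holding a fixed-length interface name. */
struct ctx_sketch {
	char ifname[IFNAME_LEN];
};

/* Copy a fixed-length name. src must point to at least IFNAME_LEN - 1
 * readable bytes, which is what makes memcpy() a valid substitute for
 * strncpy() here. The destination is always NUL-terminated. */
static void set_ifname(struct ctx_sketch *ctx, const char src[IFNAME_LEN])
{
	memcpy(ctx->ifname, src, IFNAME_LEN - 1);
	ctx->ifname[IFNAME_LEN - 1] = '\0';
}
```

Unlike strncpy(), memcpy() does not stop at the first NUL byte in the source, so the equivalence only holds when the source buffer is itself fixed length, as the commit message notes.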
Andrii Nakryiko
d95b12da56 libbpf: Support attachment of BPF tracing programs to kernel modules
Teach libbpf to search for BTF types in kernel modules for tracing BPF
programs. This allows attachment of raw_tp/fentry/fexit/fmod_ret/etc BPF
program types to tracepoints and functions in kernel modules.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-13-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
a1fd6dab54 libbpf: Factor out low-level BPF program loading helper
Refactor low-level API for BPF program loading to not rely on public API
types. This allows painless extension without constant efforts to cleverly not
break backwards compatibility.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-12-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
fde1be5a9c bpf: Allow to specify kernel module BTFs when attaching BPF programs
Add ability for user-space programs to specify non-vmlinux BTF when attaching
BTF-powered BPF programs: raw_tp, fentry/fexit/fmod_ret, LSM, etc. For this,
attach_prog_fd (now with the alias name attach_btf_obj_fd) should specify FD
of a module or vmlinux BTF object. For backwards compatibility reasons,
0 denotes vmlinux BTF. Only kernel BTF (vmlinux or module) can be specified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-11-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
6b08519a69 libbpf: Add kernel module BTF support for CO-RE relocations
Teach libbpf to search for candidate types for CO-RE relocations across kernel
modules BTFs, in addition to vmlinux BTF. If at least one candidate type is
found in vmlinux BTF, kernel module BTFs are not iterated. If vmlinux BTF has
no matching candidates, then find all kernel module BTFs and search for all
matching candidates across all of them.

Kernel support for module BTFs is inferred from the support for BTF name
pointer in BPF UAPI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-6-andrii@kernel.org
2020-12-04 20:04:52 -08:00
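The search order described in 6b08519a69 (vmlinux BTF first, module BTFs only when vmlinux has no match) can be sketched as follows. This is a simplified model, not libbpf's implementation: `btf_sketch`, `count_matches` and `find_candidates` are hypothetical names, and candidates are matched by exact type name only, whereas real CO-RE matching is more involved.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a BTF object: just a list of type names. */
struct btf_sketch {
	const char *const *type_names;
	size_t nr_types;
};

/* Count the types in one BTF whose name equals the requested name. */
static size_t count_matches(const struct btf_sketch *btf, const char *name)
{
	size_t i, n = 0;

	for (i = 0; i < btf->nr_types; i++)
		if (strcmp(btf->type_names[i], name) == 0)
			n++;
	return n;
}

/* Return the number of candidates found. *used_modules is set to 1 only
 * if module BTFs had to be searched. */
static size_t find_candidates(const struct btf_sketch *vmlinux,
			      const struct btf_sketch *modules,
			      size_t nr_modules, const char *name,
			      int *used_modules)
{
	size_t i, n = count_matches(vmlinux, name);

	*used_modules = 0;
	if (n > 0)
		return n;	/* vmlinux matched: modules are not iterated */

	*used_modules = 1;
	for (i = 0; i < nr_modules; i++)
		n += count_matches(&modules[i], name);
	return n;
}
```

A type that exists both in vmlinux and in a module resolves to the vmlinux candidate alone, because the module pass is never reached.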
Andrii Nakryiko
aff8028b6e libbpf: Refactor CO-RE relocs to not assume a single BTF object
Refactor CO-RE relocation candidate search to not expect a single BTF, rather
return all candidate types with their corresponding BTF objects. This will
allow to extend CO-RE relocations to accommodate kernel module BTFs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-5-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
10e321f100 libbpf: Add internal helper to load BTF data by FD
Add a btf_get_from_fd() helper, which constructs struct btf from in-kernel BTF
data by FD. This is used for loading module BTFs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201203204634.1325171-4-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Stanislav Fomichev
8051a539d8 libbpf: Cap retries in sys_bpf_prog_load
I've seen a situation, where a process that's under pprof constantly
generates SIGPROF which prevents program loading indefinitely.
The right thing to do probably is to disable signals in the upper
layers while loading, but it still would be nice to get some error from
libbpf instead of an endless loop.

Let's add some small retry limit to the program loading:
try loading the program 5 (arbitrary) times and give up.

v2:
* 10 -> 5 retries (Andrii Nakryiko)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201202231332.3923644-1-sdf@google.com
2020-12-04 20:04:52 -08:00
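The retry cap from 8051a539d8 can be sketched as a bounded loop. Everything named here is hypothetical (`LOAD_ATTEMPTS`, `load_fn`, `load_with_retry_cap`, `always_eagain`), and treating EAGAIN as the retryable error is an assumption; the commit message only says that loading is tried 5 times before giving up.

```c
#include <errno.h>

/* Assumed constant name; the commit says "5 (arbitrary) times". */
#define LOAD_ATTEMPTS 5

/* Hypothetical loader: returns an fd >= 0, or -1 with errno set. */
typedef int (*load_fn)(void *ctx);

/* Call load() until it succeeds, fails with an error other than EAGAIN,
 * or LOAD_ATTEMPTS calls have been made. The last result is returned, so
 * the caller sees an error instead of looping forever. */
static int load_with_retry_cap(load_fn load, void *ctx)
{
	int attempts_left = LOAD_ATTEMPTS;
	int fd;

	do {
		fd = load(ctx);
	} while (fd < 0 && errno == EAGAIN && --attempts_left > 0);

	return fd;
}

/* Stand-in loader that always reports EAGAIN and counts its calls. */
static int always_eagain(void *ctx)
{
	++*(int *)ctx;
	errno = EAGAIN;
	return -1;
}
```

Passing `always_eagain` with an `int` counter shows the cap in action: the loader is invoked exactly `LOAD_ATTEMPTS` times and the final error is propagated.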
Toke Høiland-Jørgensen
691c22dc0c libbpf: Sanitise map names before pinning
When we added sanitising of map names before loading programs to libbpf, we
still allowed periods in the name. While the kernel will accept these for
the map names themselves, they are not allowed in file names when pinning
maps. This means that bpf_object__pin_maps() will fail if called on an
object that contains internal maps (such as sections .rodata).

Fix this by replacing periods with underscores when constructing map pin
paths. This only affects the paths generated by libbpf when
bpf_object__pin_maps() is called with a path argument. Any pin paths set
by bpf_map__set_pin_path() are unaffected, and it will still be up to the
caller to avoid invalid characters in those.

Fixes: 113e6b7e15e2 ("libbpf: Sanitise internal map names so they are not rejected by the kernel")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201203093306.107676-1-toke@redhat.com
2020-12-04 20:04:52 -08:00
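A minimal sketch of the period-to-underscore replacement described in 691c22dc0c. `build_pin_path` is a hypothetical helper, not a libbpf function, and this version rewrites only the map-name component; the commit message does not say which part of the generated path libbpf rewrites.

```c
#include <stdio.h>
#include <string.h>

/* Build "<dir>/<map_name>" in buf, replacing periods in the map-name
 * part with underscores. Returns 0 on success, -1 if buf is too small. */
static int build_pin_path(char *buf, size_t buf_sz,
			  const char *dir, const char *map_name)
{
	int len = snprintf(buf, buf_sz, "%s/%s", dir, map_name);
	char *p;

	if (len < 0 || (size_t)len >= buf_sz)
		return -1;

	/* Skip the directory and the '/' separator, then sanitise. */
	for (p = buf + strlen(dir) + 1; *p; p++)
		if (*p == '.')
			*p = '_';
	return 0;
}
```

For an internal map named `prog.rodata` pinned under `/sys/fs/bpf/obj`, the resulting path ends in `prog_rodata`.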
Andrei Matei
5fe9c1217a libbpf: Fail early when loading programs with unspecified type
Before this patch, a program with unspecified type
(BPF_PROG_TYPE_UNSPEC) would be passed to the BPF syscall, only to have
the kernel reject it with an opaque invalid argument error. This patch
makes libbpf reject such programs with a nicer error message - in
particular libbpf now tries to diagnose bad ELF section names at both
open time and load time.

Signed-off-by: Andrei Matei <andreimatei1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20201203043410.59699-1-andreimatei1@gmail.com
2020-12-04 20:04:52 -08:00
Mariusz Dudek
78c76a1015 libbpf: Separate XDP program load with xsk socket creation
Add support for separation of eBPF program load and xsk socket
creation.

This is needed for the use case where you want to provide as few
privileges as possible to the data plane application that will
handle xsk socket creation and incoming traffic.

With this patch the data entity container can be run with only
CAP_NET_RAW capability to fulfill its purpose of creating xsk
socket and handling packets. In case your umem is larger than or
equal to the process limit for MEMLOCK, you need to either increase
the limit or have the CAP_IPC_LOCK capability.

To resolve the privileges issue, two APIs are introduced:

- xsk_setup_xdp_prog - loads the built in XDP program. It can
also return xsks_map_fd which is needed by unprivileged process
to update xsks_map with AF_XDP socket "fd"

- xsk_socket__update_xskmap - inserts an AF_XDP socket into an xskmap
for a particular xsk_socket

Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20201203090546.11976-2-mariuszx.dudek@intel.com
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
a741bc6479 libbpf: Add base BTF accessor
Add ability to get base BTF. It can also be used to check if BTF is split BTF.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201202065244.530571-3-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Magnus Karlsson
65e4be6f5d libbpf: Replace size_t with __u32 in xsk interfaces
Replace size_t with __u32 in the xsk interfaces that contain this.
There is no reason to have size_t since the internal variable that
is manipulated is a __u32. The following APIs are affected:

__u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
__u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)

The "nb" variable and the return values have been changed from size_t
to __u32.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1606383455-8243-1-git-send-email-magnus.karlsson@gmail.com
2020-12-04 20:04:52 -08:00
KP Singh
3a2739aa8a bpf: Add a BPF helper for getting the IMA hash of an inode
Provide a wrapper function to get the IMA hash of an inode. This helper
is useful in fingerprinting files (e.g. executables on execution) and
using these fingerprints in detections like an executable unlinking
itself.

Since the ima_inode_hash can sleep, it's only allowed for sleepable
LSM hooks.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201124151210.1081188-3-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Li RongQing
dd2369d2a8 libbpf: Add support for canceling cached_cons advance
Add a new function for returning descriptors the user received
after an xsk_ring_cons__peek call. After the application has
gotten a number of descriptors from a ring, it might not be able
to or want to process them all for various reasons. Therefore,
it would be useful to have an interface for returning or
cancelling a number of them so that they are returned to the ring.

This patch adds a new function called xsk_ring_cons__cancel that
performs this operation on nb descriptors counted from the end of
the batch of descriptors that was received through the peek call.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
[ Magnus Karlsson: rewrote changelog ]
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/1606202474-8119-1-git-send-email-lirongqing@baidu.com
2020-12-04 20:04:52 -08:00
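The peek/cancel interaction from dd2369d2a8 can be modelled with two cached indices. This is a self-contained sketch under assumed names (`ring_cons_sketch`, `peek_sketch`, `cancel_sketch`); it leaves out the shared-memory producer/consumer pointers and ring masking of the real xsk rings.

```c
#include <stdint.h>

/* Simplified consumer-side view of a ring. */
struct ring_cons_sketch {
	uint32_t cached_prod;	/* how far the producer has filled */
	uint32_t cached_cons;	/* how far the consumer has peeked */
};

/* Hand out up to nb descriptors, starting at *idx. */
static uint32_t peek_sketch(struct ring_cons_sketch *r, uint32_t nb,
			    uint32_t *idx)
{
	uint32_t avail = r->cached_prod - r->cached_cons;

	if (nb > avail)
		nb = avail;
	*idx = r->cached_cons;
	r->cached_cons += nb;
	return nb;
}

/* Return the last nb peeked descriptors to the ring, so that a later
 * peek hands them out again. */
static void cancel_sketch(struct ring_cons_sketch *r, uint32_t nb)
{
	r->cached_cons -= nb;
}
```

After peeking 5 descriptors and cancelling 2, the next peek starts at index 3, that is, at the first of the two returned descriptors.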
Dmitrii Banshchikov
39f5b2e75e bpf: Add bpf_ktime_get_coarse_ns helper
The helper uses CLOCK_MONOTONIC_COARSE source of time that is less
accurate but more performant.

We have a BPF CGROUP_SKB firewall that supports event logging through
bpf_perf_event_output(). Each event has a timestamp and currently we use
bpf_ktime_get_ns() for it. Use of bpf_ktime_get_coarse_ns() saves ~15-20
ns in time required for event logging.

bpf_ktime_get_ns():
EgressLogByRemoteEndpoint                              113.82ns    8.79M

bpf_ktime_get_coarse_ns():
EgressLogByRemoteEndpoint                               95.40ns   10.48M

Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201117184549.257280-1-me@ubique.spb.ru
2020-12-04 20:04:52 -08:00
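For readers unfamiliar with the clock source named in 39f5b2e75e, the user-space sketch below reads CLOCK_MONOTONIC_COARSE through clock_gettime(). It is an analogy only: it does not call the BPF helper, `clock_ns` is a hypothetical wrapper, and it assumes a Linux system where both clock IDs are available.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>

/* Read the given clock and return its value in nanoseconds,
 * or 0 if the clock cannot be read. */
static uint64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	if (clock_gettime(id, &ts) != 0)
		return 0;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
```

Calling `clock_ns(CLOCK_MONOTONIC_COARSE)` and then `clock_ns(CLOCK_MONOTONIC)` shows the trade-off: the coarse value only advances at timer-tick granularity, so it lags slightly behind the precise clock but is cheaper to read.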
KP Singh
6969a44914 bpf: Add bpf_bprm_opts_set helper
The helper allows modification of certain bits on the linux_binprm
struct starting with the secureexec bit which can be updated using the
BPF_F_BPRM_SECUREEXEC flag.

secureexec can be set by the LSM for privilege gaining executions to set
the AT_SECURE auxv for glibc.  When set, the dynamic linker disables the
use of certain environment variables (like LD_PRELOAD).

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201117232929.2156341-1-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Alan Maguire
de2edae80d libbpf: bpf__find_by_name[_kind] should use btf__get_nr_types()
When operating on split BTF, btf__find_by_name[_kind] will not
iterate over all types since they use btf->nr_types to show
the number of types to iterate over. For split BTF this is
the number of types _on top of base BTF_, so it will
underestimate the number of types to iterate over, especially
for vmlinux + module BTF, where the latter is much smaller.

Use btf__get_nr_types() instead.

Fixes: ba451366bf44 ("libbpf: Implement basic split BTF support")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1605437195-2175-1-git-send-email-alan.maguire@oracle.com
2020-12-04 20:04:52 -08:00
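The undercount fixed in de2edae80d is easy to see with a small model. The struct and field names below are assumptions made for illustration (`btf_sketch`, `start_id`, `nr_types`, `total_nr_types`); the sketch assumes type IDs start at 1 and that a split BTF continues numbering where its base ends.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified BTF object. For a split BTF, nr_types counts only the
 * types added on top of the base. */
struct btf_sketch {
	const struct btf_sketch *base;	/* NULL for a base BTF */
	uint32_t start_id;		/* ID of the first own type */
	uint32_t nr_types;		/* number of own types */
};

/* Highest valid type ID, counting base types too. This is the bound a
 * by-name search has to iterate up to. */
static uint32_t total_nr_types(const struct btf_sketch *btf)
{
	return btf->start_id + btf->nr_types - 1;
}
```

For a base BTF with 100 types and a module BTF adding 20, iterating up to `nr_types` would visit only 20 IDs, while `total_nr_types()` gives the full 120.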
Thomas Karlsson
2ea4ba9c96 macvlan: Support for high multicast packet rate
Background:
Broadcast and multicast packets are enqueued for later processing.
This queue was previously hardcoded to 1000.

This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.

The change:
This patch makes the queue length adjustable to accommodate
environments with a very high multicast packet rate,
while keeping the default value of 1000 unless specified otherwise.

The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.

The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.

This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.

Signed-off-by: Thomas Karlsson <thomas.karlsson@paneda.se>
Link: https://lore.kernel.org/r/dd4673b2-7eab-edda-6815-85c67ce87f63@paneda.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04 20:04:52 -08:00
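The rule in 2ea4ba9c96 that the effective queue length is the maximum requested by any macvlan on the same port can be written out as below. The function and macro names are hypothetical; only the default of 1000 and the maximum rule come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Default stated in the commit message; the macro name is assumed. */
#define DEFAULT_BC_QUEUE_LEN 1000

/* Given the queue length requested by each macvlan on one port, return
 * the length the port actually uses: the largest request. */
static uint32_t port_bc_queue_len_used(const uint32_t *requested, size_t n)
{
	uint32_t max = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (requested[i] > max)
			max = requested[i];
	return max;
}
```

If two macvlans keep the default and a third requests 4000, the port uses 4000 for all of them.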
Andrii Nakryiko
2dd5965052 libbpf: Don't attempt to load unused subprog as an entry-point BPF program
If BPF code contains an unused BPF subprogram and there are no other subprogram
calls (which can realistically happen in real-world applications given
sufficiently smart Clang code optimizations), libbpf will erroneously assume
that subprograms are entry-point programs and will attempt to load them with
UNSPEC program type.

Fix by not relying on subcall instructions and rather detect it based on the
structure of BPF object's sections.

Fixes: 9a94f277c4fb ("tools: libbpf: restore the ability to load programs from .text section")
Reported-by: Dmitrii Banshchikov <dbanschikov@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201107000251.256821-1-andrii@kernel.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
ef8820fea8 bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO
Allocate ID for vmlinux BTF. This makes it visible when iterating over all BTF
objects in the system. To allow distinguishing vmlinux BTF (and later kernel
module BTF) from user-provided BTFs, expose extra kernel_btf flag, as well as
BTF name ("vmlinux" for vmlinux BTF, will be equal to the module's name for module
BTF).  We might want to later allow specifying BTF name for user-provided BTFs
as well, if that makes sense. But currently this is reserved only for
in-kernel BTFs.

Having in-kernel BTFs exposed IDs will allow to extend BPF APIs that require
in-kernel BTF type with ability to specify BTF types from kernel modules, not
just vmlinux BTF. This will be implemented in a follow up patch set for
fentry/fexit/fmod_ret/lsm/etc.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201110011932.3201430-3-andrii@kernel.org
2020-12-04 20:04:52 -08:00
KP Singh
eae38a781c bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_ID
The currently available bpf_get_current_task returns an unsigned integer
which can be used along with BPF_CORE_READ to read data from
the task_struct but still cannot be used as an input argument to a
helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct.

In order to implement this helper a new return type, RET_PTR_TO_BTF_ID,
is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not
require checking the nullness of returned pointer.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-6-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
KP Singh
83c2c20acb libbpf: Add support for task local storage
Updates the bpf_probe_map_type API to also support
BPF_MAP_TYPE_TASK_STORAGE similar to other local storage maps.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-4-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
KP Singh
00ae5bac8f bpf: Implement task local storage
Similar to bpf_local_storage for sockets and inodes add local storage
for task_struct.

The life-cycle of storage is managed with the life-cycle of the
task_struct.  i.e. the storage is destroyed along with the owning task
with a callback to the bpf_task_storage_free from the task_free LSM
hook.

The BPF LSM allocates an __rcu pointer to the bpf_local_storage in
the security blob which are now stackable and can co-exist with other
LSMs.

The userspace map operations can be done by using a pid fd as a key
passed to the lookup, update and delete operations.

Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20201106103747.2780972-3-kpsingh@chromium.org
2020-12-04 20:04:52 -08:00
Andrii Nakryiko
f99c252cbc vmtest: update Kconfig to accommodate IMA test config
test_progs's IMA selftests requires extra Kconfig values, so update
latest.config to accommodate those.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-12-03 12:27:49 -08:00
Andrii Nakryiko
5ae2a2621c readme: move gory sync details down and add libbpf-bootstrap references
Move gory details about libbpf mirror and sync into a
separate section at the bottom of README.

Also add references to libbpf-bootstrap and blog about it,
as well as libbpf-tools reference.
2020-11-29 13:34:03 -08:00
Andrii Nakryiko
5af3d86b5a vmtests: blacklist two more tests on 5.5
tcpbpf_user uses cgroup bpf_link, not available in 5.5. hash_large_key is
testing a more permissive verifier check, implemented in 5.11. So blacklist
both.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
c55abf0752 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3cb12d27ff655e57e8efe3486dca2a22f4e30578
Checkpoint bpf-next commit: c6bde958a62b8ca5ee8d2c1fe429aec4ad54efad
Baseline bpf commit:        c66dca98a24cb5f3493dd08d40bcfa94a220fa92
Checkpoint bpf commit:      d3bec0138bfbe58606fc1d6f57a4cdc1a20218db

Andrii Nakryiko (6):
  libbpf: Factor out common operations in BTF writing APIs
  libbpf: Unify and speed up BTF string deduplication
  libbpf: Implement basic split BTF support
  libbpf: Fix BTF data layout checks and allow empty BTF
  libbpf: Support BTF dedup of split BTFs
  libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays

Ian Rogers (1):
  libbpf, hashmap: Fix undefined behavior in hash_bits

Magnus Karlsson (2):
  libbpf: Fix null dereference in xsk_socket__delete
  libbpf: Fix possible use after free in xsk_socket__delete

 src/btf.c      | 807 +++++++++++++++++++++++++++++--------------------
 src/btf.h      |   8 +
 src/hashmap.h  |  15 +-
 src/libbpf.map |   9 +
 src/xsk.c      |   9 +-
 5 files changed, 504 insertions(+), 344 deletions(-)

--
2.24.1
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
e30f758aab sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2020-11-05 21:20:45 -08:00
Magnus Karlsson
8caff995c7 libbpf: Fix possible use after free in xsk_socket__delete
Fix a possible use after free in xsk_socket__delete that will happen
if xsk_put_ctx() frees the ctx. To fix, save the umem reference taken
from the context and just use that instead.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1604396490-12129-3-git-send-email-magnus.karlsson@gmail.com
2020-11-05 21:20:45 -08:00
Magnus Karlsson
539aa6bea5 libbpf: Fix null dereference in xsk_socket__delete
Fix a possible null pointer dereference in xsk_socket__delete that
will occur if a null pointer is fed into the function.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1604396490-12129-2-git-send-email-magnus.karlsson@gmail.com
2020-11-05 21:20:45 -08:00
Ian Rogers
224db2db07 libbpf, hashmap: Fix undefined behavior in hash_bits
If bits is 0, the case when the map is empty, then the >> shift amount is the
size of the register, which is undefined behavior - on x86 it is the same as a
shift by 0.

Fix by handling the 0 case explicitly and guarding calls to hash_bits for
empty maps in hashmap__for_each_key_entry and hashmap__for_each_entry_safe.

Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap")
Suggested-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201029223707.494059-1-irogers@google.com
2020-11-05 21:20:45 -08:00
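The guard added in 224db2db07 can be sketched as follows. This is an illustrative version for a 64-bit hash value, with a hypothetical function name; the multiplier is a common golden-ratio constant used for this kind of bit mixing and should be read as an assumption, not as a quote of the patched code.

```c
#include <stddef.h>
#include <stdint.h>

/* Mix the hash and return its top `bits` bits. With bits == 0 the shift
 * amount would be 64, the full width of the operand, which is undefined
 * behavior in C, so that case is handled explicitly. */
static inline size_t hash_bits_sketch(uint64_t h, int bits)
{
	if (bits == 0)
		return 0;
	return (size_t)((h * 11400714819323198485ULL) >> (64 - bits));
}
```

An empty map has zero bucket bits, so every key maps to slot 0 without ever evaluating the out-of-range shift.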
Andrii Nakryiko
e6725d2467 libbpf: Accomodate DWARF/compiler bug with duplicated identical arrays
In some cases compiler seems to generate distinct DWARF types for identical
arrays within the same CU. That seems like a bug, but it's already out there
and breaks type graph equivalence checks, so accommodate it anyway by checking
for identical arrays, regardless of their type ID.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-10-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
658ac1ec19 libbpf: Support BTF dedup of split BTFs
Add support for deduplicating split BTFs. When deduplicating split BTF, base
BTF is considered to be immutable and can't be modified or adjusted. 99% of
BTF deduplication logic is left intact (modulo some type numbering adjustments).
There are only two differences.

First, each type in base BTF gets hashed (except VAR and DATASEC, of course,
those are always considered to be self-canonical instances) and added into
a table of canonical type candidates. Hashing is a shallow, fast operation,
so it mostly eliminates the overhead of having the entire base BTF be a part
of BTF dedup.

Second difference is very critical and subtle. While deduplicating split BTF
types, it is possible to discover that one of immutable base BTF BTF_KIND_FWD
types can and should be resolved to a full STRUCT/UNION type from the split
BTF part.  This, obviously, can't happen because we can't modify the base
BTF types anymore. So because of that, any type in split BTF that directly or
indirectly references that newly-to-be-resolved FWD type can't be considered
to be equivalent to the corresponding canonical types in base BTF, because
that would result in a loss of type resolution information. So in such case,
split BTF types will be deduplicated separately and will cause some
duplication of type information, which is unavoidable.

With those two changes, the rest of the algorithm manages to deduplicate split
BTF correctly, pointing all the duplicates to their canonical counter-parts in
base BTF, but also is deduplicating whatever unique types are present in split
BTF on their own.

Also, theoretically, split BTF after deduplication could end up with either
empty type section or empty string section. This is handled by libbpf
correctly in one of previous patches in the series.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-9-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
dd36215834 libbpf: Fix BTF data layout checks and allow empty BTF
Make data section layout checks stricter, disallowing overlap of types and
strings data.

Additionally, allow BTFs with no type data. There is nothing inherently wrong
with having BTF with no types (but potentially with some strings). This could
be a situation with kernel module BTFs, if a module doesn't introduce any new
type information.

Also fix invalid offset alignment check for btf->hdr->type_off.

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-8-andrii@kernel.org
2020-11-05 21:20:45 -08:00
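
The three layout rules named in this commit (no overlap between type and
string data, an empty type section is allowed, `type_off` must be aligned) can
be sketched as a small validator. This is an illustrative model only:
`struct toy_btf_hdr` and `toy_check_layout()` are hypothetical names, and the
assumptions that type data precedes string data and that the required
alignment is 4 bytes are mine, not stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down header (not the real struct btf_header):
 * offsets are relative to the end of the header, lengths are in bytes.
 */
struct toy_btf_hdr {
	uint32_t type_off, type_len;
	uint32_t str_off, str_len;
};

/* Validate section layout against `data_sz` bytes of payload that
 * follow the header. Returns 0 if the layout is acceptable.
 */
static int toy_check_layout(const struct toy_btf_hdr *h, uint32_t data_sz)
{
	uint64_t type_end, str_end;

	assert(h != NULL);

	/* use 64-bit sums so that off + len cannot wrap around */
	type_end = (uint64_t)h->type_off + h->type_len;
	str_end = (uint64_t)h->str_off + h->str_len;

	if (str_end > data_sz)
		return -EINVAL; /* strings run past the end of the data */
	if (type_end > h->str_off)
		return -EINVAL; /* type data overlaps string data */
	if (h->type_off % 4)
		return -EINVAL; /* assumed 4-byte alignment of type data */

	return 0; /* note: type_len == 0 is accepted */
}
```

A header with `type_len == 0` and a non-empty string section passes, which
matches the "BTF with no types" case that the message describes for kernel
modules.
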
Andrii Nakryiko
2811d54f8b libbpf: Implement basic split BTF support
Support split BTF operation, in which one BTF (base BTF) provides basic set of
types and strings, while another one (split BTF) builds on top of base's types
and strings and adds its own new types and strings. From API standpoint, the
fact that the split BTF is built on top of the base BTF is transparent.

Type numeration is transparent. If the base BTF had last type ID #N, then all
types in the split BTF start at type ID N+1. Any type in split BTF can
reference base BTF types, but not vice versa. Programmatic construction of
a split BTF on top of a base BTF is supported: one can create an empty split
BTF with btf__new_empty_split() and pass base BTF as an input, or pass raw
binary data to btf__new_split(), or use btf__parse_xxx_split() variants to get
initial set of split types/strings from the ELF file with .BTF section.

String offsets are similarly transparent and are a logical continuation of
base BTF's strings. When building BTF programmatically and adding a new string
(explicitly with btf__add_str() or implicitly through appending new
types/members), string-to-be-added would first be looked up from the base
BTF's string section and re-used if it's there. If not, it will be looked up
and/or added to the split BTF string section. Similarly to type IDs, types in
split BTF can refer to strings from base BTF absolutely transparently (but not
vice versa, of course, because base BTF doesn't "know" about existence of
split BTF).

Internal type index is slightly adjusted to be zero-indexed, ignoring a fake
[0] VOID type. This allows handling split/base BTF type lookups transparently
by using btf->start_id type ID offset, which is always 1 for base/non-split
BTF and equals btf__get_nr_types(base_btf) + 1 for the split BTF.

BTF deduplication is not yet supported for split BTF and support for it will
be added in separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-5-andrii@kernel.org
2020-11-05 21:20:45 -08:00
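
The type ID scheme described in this commit can be modeled in a few lines.
The sketch below is a simplified stand-in, not libbpf code: `struct toy_btf`,
`toy_init()`, `toy_nr_types()` and `toy_type_by_id()` are hypothetical names,
and types are represented by plain strings. It relies only on what the message
states: a base BTF has `start_id` 1, a split BTF starts right after the base's
last type ID, and the internal index is zero-based with the fake [0] VOID type
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a BTF object (not struct btf). */
struct toy_btf {
	const struct toy_btf *base; /* NULL for base (non-split) BTF */
	const char **types;         /* zero-indexed, no fake [0] VOID entry */
	unsigned int nr_types;      /* number of types owned by this object */
	unsigned int start_id;      /* type ID of types[0]; 1 for base BTF */
};

/* Highest valid type ID visible through this object (base + own types). */
static unsigned int toy_nr_types(const struct toy_btf *btf)
{
	return btf->start_id + btf->nr_types - 1;
}

static void toy_init(struct toy_btf *btf, const struct toy_btf *base,
		     const char **types, unsigned int nr_types)
{
	assert(btf != NULL);
	btf->base = base;
	btf->types = types;
	btf->nr_types = nr_types;
	/* split BTF continues numbering where the base BTF stopped */
	btf->start_id = base ? toy_nr_types(base) + 1 : 1;
}

/* Resolve a type ID, transparently falling back to the base BTF. */
static const char *toy_type_by_id(const struct toy_btf *btf, unsigned int id)
{
	if (id == 0)
		return "void";
	if (id > toy_nr_types(btf))
		return NULL;
	/* IDs below start_id belong to the base; base start_id is 1 */
	while (id < btf->start_id)
		btf = btf->base;
	return btf->types[id - btf->start_id];
}
```

With a two-type base and a one-type split BTF, IDs 1 and 2 resolve into the
base and ID 3 into the split part, while the base on its own cannot see ID 3.
That mirrors the "split can reference base, but not vice versa" rule.
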
Andrii Nakryiko
be2dc73ee2 libbpf: Unify and speed up BTF string deduplication
Revamp BTF dedup's string deduplication to match the approach of writable BTF
string management. This allows transferring the deduplicated string index back
to the BTF object after deduplication without expensive extra memory copying
and hash map re-construction. It also simplifies the code and speeds it up,
because hashmap-based string deduplication is faster than the sort + unique
approach.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-4-andrii@kernel.org
2020-11-05 21:20:45 -08:00
Andrii Nakryiko
4953827790 libbpf: Factor out common operations in BTF writing APIs
Factor out committing of appended type data. Also extract fetching the very
last type in the BTF (to append members to). These two operations are common
across many APIs and will be easier to refactor with split BTF, if they are
extracted into a single place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20201105043402.2530976-2-andrii@kernel.org
2020-11-05 21:20:45 -08:00
26 changed files with 80555 additions and 23676 deletions

@@ -1 +1 @@
c66dca98a24cb5f3493dd08d40bcfa94a220fa92
1a3449c19407a28f7019a887cdf0d6ba2444751a

@@ -1 +1 @@
3cb12d27ff655e57e8efe3486dca2a22f4e30578
3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9

@@ -1,17 +1,11 @@
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
BPF/libbpf usage and questions
==============================
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
BPF questions
=============
Please check out [libbpf-bootstrap](https://github.com/libbpf/libbpf-bootstrap)
and [the companion blog post](https://nakryiko.com/posts/libbpf-bootstrap/) for
the examples of building BPF applications with libbpf.
[libbpf-tools](https://github.com/iovisor/bcc/tree/master/libbpf-tools) are also
a good source of the real-world libbpf-based tracing tools.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list. You can
@@ -67,7 +61,7 @@ Distributions
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/sid/libbpf-dev)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/groovy/libbpf)
@@ -124,6 +118,7 @@ distributions have Clang/LLVM 10+ packaged by default:
- Ubuntu 20.04+
- Arch Linux
- Ubuntu 20.10 (LLVM 11)
- Debian 11 (LLVM 11)
Otherwise, please make sure to update it on your system.
@@ -135,6 +130,20 @@ use it:
contain lots of real-world tools converted from BCC to BPF CO-RE. Consider
converting some more to both contribute to the BPF community and gain some
more experience with it.
Details
=======
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
License
=======

@@ -157,6 +157,7 @@ enum bpf_map_type {
BPF_MAP_TYPE_STRUCT_OPS,
BPF_MAP_TYPE_RINGBUF,
BPF_MAP_TYPE_INODE_STORAGE,
BPF_MAP_TYPE_TASK_STORAGE,
};
/* Note that tracing related programs such as
@@ -556,7 +557,12 @@ union bpf_attr {
__aligned_u64 line_info; /* line info */
__u32 line_info_cnt; /* number of bpf_line_info records */
__u32 attach_btf_id; /* in-kernel BTF type id to attach to */
__u32 attach_prog_fd; /* 0 to attach to vmlinux */
union {
/* valid prog_fd to attach to bpf prog */
__u32 attach_prog_fd;
/* or valid module BTF object fd or 0 to attach to vmlinux */
__u32 attach_btf_obj_fd;
};
};
struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -3742,6 +3748,88 @@ union bpf_attr {
* Return
* The helper returns **TC_ACT_REDIRECT** on success or
* **TC_ACT_SHOT** on error.
*
* void *bpf_task_storage_get(struct bpf_map *map, struct task_struct *task, void *value, u64 flags)
* Description
* Get a bpf_local_storage from the *task*.
*
* Logically, it could be thought of as getting the value from
* a *map* with *task* as the **key**. From this
* perspective, the usage is not much different from
* **bpf_map_lookup_elem**\ (*map*, **&**\ *task*) except this
* helper enforces the key must be an task_struct and the map must also
* be a **BPF_MAP_TYPE_TASK_STORAGE**.
*
* Underneath, the value is stored locally at *task* instead of
* the *map*. The *map* is used as the bpf-local-storage
* "type". The bpf-local-storage "type" (i.e. the *map*) is
* searched against all bpf_local_storage residing at *task*.
*
* An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
* used such that a new bpf_local_storage will be
* created if one does not exist. *value* can be used
* together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
* the initial value of a bpf_local_storage. If *value* is
* **NULL**, the new bpf_local_storage will be zero initialized.
* Return
* A bpf_local_storage pointer is returned on success.
*
* **NULL** if not found or there was an error in adding
* a new bpf_local_storage.
*
* long bpf_task_storage_delete(struct bpf_map *map, struct task_struct *task)
* Description
* Delete a bpf_local_storage from a *task*.
* Return
* 0 on success.
*
* **-ENOENT** if the bpf_local_storage cannot be found.
*
* struct task_struct *bpf_get_current_task_btf(void)
* Description
* Return a BTF pointer to the "current" task.
* This pointer can also be used in helpers that accept an
* *ARG_PTR_TO_BTF_ID* of type *task_struct*.
* Return
* Pointer to the current task.
*
* long bpf_bprm_opts_set(struct linux_binprm *bprm, u64 flags)
* Description
* Set or clear certain options on *bprm*:
*
* **BPF_F_BPRM_SECUREEXEC** Set the secureexec bit
* which sets the **AT_SECURE** auxv for glibc. The bit
* is cleared if the flag is not specified.
* Return
* **-EINVAL** if invalid *flags* are passed, zero otherwise.
*
* u64 bpf_ktime_get_coarse_ns(void)
* Description
* Return a coarse-grained version of the time elapsed since
* system boot, in nanoseconds. Does not include time the system
* was suspended.
*
* See: **clock_gettime**\ (**CLOCK_MONOTONIC_COARSE**)
* Return
* Current *ktime*.
*
* long bpf_ima_inode_hash(struct inode *inode, void *dst, u32 size)
* Description
* Returns the stored IMA hash of the *inode* (if it's avaialable).
* If the hash is larger than *size*, then only *size*
* bytes will be copied to *dst*
* Return
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if IMA is disabled or **-EINVAL** if
* invalid arguments are passed.
*
* struct socket *bpf_sock_from_file(struct file *file)
* Description
* If the given file represents a socket, returns the associated
* socket.
* Return
* A pointer to a struct socket on success or NULL if the file is
* not a socket.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -3897,9 +3985,16 @@ union bpf_attr {
FN(seq_printf_btf), \
FN(skb_cgroup_classid), \
FN(redirect_neigh), \
FN(bpf_per_cpu_ptr), \
FN(bpf_this_cpu_ptr), \
FN(per_cpu_ptr), \
FN(this_cpu_ptr), \
FN(redirect_peer), \
FN(task_storage_get), \
FN(task_storage_delete), \
FN(get_current_task_btf), \
FN(bprm_opts_set), \
FN(ktime_get_coarse_ns), \
FN(ima_inode_hash), \
FN(sock_from_file), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@@ -4071,6 +4166,11 @@ enum bpf_lwt_encap_mode {
BPF_LWT_ENCAP_IP,
};
/* Flags for bpf_bprm_opts_set helper */
enum {
BPF_F_BPRM_SECUREEXEC = (1ULL << 0),
};
#define __bpf_md_ptr(type, name) \
union { \
type name; \
@@ -4418,6 +4518,9 @@ struct bpf_btf_info {
__aligned_u64 btf;
__u32 btf_size;
__u32 id;
__aligned_u64 name;
__u32 name_len;
__u32 kernel_btf;
} __attribute__((aligned(8)));
struct bpf_link_info {

@@ -409,6 +409,8 @@ enum {
IFLA_MACVLAN_MACADDR,
IFLA_MACVLAN_MACADDR_DATA,
IFLA_MACVLAN_MACADDR_COUNT,
IFLA_MACVLAN_BC_QUEUE_LEN,
IFLA_MACVLAN_BC_QUEUE_LEN_USED,
__IFLA_MACVLAN_MAX,
};

@@ -66,6 +66,13 @@ else
LIBSUBDIR := lib
endif
# By default let the pc file itself use ${prefix} in includedir/libdir so that
# the prefix can be overridden at runtime (eg: --define-prefix)
ifndef LIBDIR
LIBDIR_PC := $$\{prefix\}/$(LIBSUBDIR)
else
LIBDIR_PC := $(LIBDIR)
endif
PREFIX ?= /usr
LIBDIR ?= $(PREFIX)/$(LIBSUBDIR)
INCLUDEDIR ?= $(PREFIX)/include
@@ -93,7 +100,7 @@ $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)
$(OBJDIR)/libbpf.pc:
$(Q)sed -e "s|@PREFIX@|$(PREFIX)|" \
-e "s|@LIBDIR@|$(LIBDIR)|" \
-e "s|@LIBDIR@|$(LIBDIR_PC)|" \
-e "s|@VERSION@|$(LIBBPF_VERSION)|" \
< libbpf.pc.template > $@

104
src/bpf.c

@@ -67,11 +67,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size)
{
int retries = 5;
int fd;
do {
fd = sys_bpf(BPF_PROG_LOAD, attr, size);
} while (fd < 0 && errno == EAGAIN);
} while (fd < 0 && errno == EAGAIN && retries-- > 0);
return fd;
}
@@ -214,59 +215,55 @@ alloc_zero_tailing_info(const void *orecord, __u32 cnt,
return info;
}
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz)
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr)
{
void *finfo = NULL, *linfo = NULL;
union bpf_attr attr;
__u32 log_level;
int fd;
if (!load_attr || !log_buf != !log_buf_sz)
if (!load_attr->log_buf != !load_attr->log_buf_sz)
return -EINVAL;
log_level = load_attr->log_level;
if (log_level > (4 | 2 | 1) || (log_level && !log_buf))
if (load_attr->log_level > (4 | 2 | 1) || (load_attr->log_level && !load_attr->log_buf))
return -EINVAL;
memset(&attr, 0, sizeof(attr));
attr.prog_type = load_attr->prog_type;
attr.expected_attach_type = load_attr->expected_attach_type;
if (attr.prog_type == BPF_PROG_TYPE_STRUCT_OPS ||
attr.prog_type == BPF_PROG_TYPE_LSM) {
attr.attach_btf_id = load_attr->attach_btf_id;
} else if (attr.prog_type == BPF_PROG_TYPE_TRACING ||
attr.prog_type == BPF_PROG_TYPE_EXT) {
attr.attach_btf_id = load_attr->attach_btf_id;
if (load_attr->attach_prog_fd)
attr.attach_prog_fd = load_attr->attach_prog_fd;
} else {
attr.prog_ifindex = load_attr->prog_ifindex;
attr.kern_version = load_attr->kern_version;
}
attr.insn_cnt = (__u32)load_attr->insns_cnt;
else
attr.attach_btf_obj_fd = load_attr->attach_btf_obj_fd;
attr.attach_btf_id = load_attr->attach_btf_id;
attr.prog_ifindex = load_attr->prog_ifindex;
attr.kern_version = load_attr->kern_version;
attr.insn_cnt = (__u32)load_attr->insn_cnt;
attr.insns = ptr_to_u64(load_attr->insns);
attr.license = ptr_to_u64(load_attr->license);
attr.log_level = log_level;
if (log_level) {
attr.log_buf = ptr_to_u64(log_buf);
attr.log_size = log_buf_sz;
} else {
attr.log_buf = ptr_to_u64(NULL);
attr.log_size = 0;
attr.log_level = load_attr->log_level;
if (attr.log_level) {
attr.log_buf = ptr_to_u64(load_attr->log_buf);
attr.log_size = load_attr->log_buf_sz;
}
attr.prog_btf_fd = load_attr->prog_btf_fd;
attr.prog_flags = load_attr->prog_flags;
attr.func_info_rec_size = load_attr->func_info_rec_size;
attr.func_info_cnt = load_attr->func_info_cnt;
attr.func_info = ptr_to_u64(load_attr->func_info);
attr.line_info_rec_size = load_attr->line_info_rec_size;
attr.line_info_cnt = load_attr->line_info_cnt;
attr.line_info = ptr_to_u64(load_attr->line_info);
if (load_attr->name)
memcpy(attr.prog_name, load_attr->name,
min(strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
attr.prog_flags = load_attr->prog_flags;
min(strlen(load_attr->name), (size_t)BPF_OBJ_NAME_LEN - 1));
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)
@@ -306,19 +303,19 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
}
fd = sys_bpf_prog_load(&attr, sizeof(attr));
if (fd >= 0)
goto done;
}
if (log_level || !log_buf)
if (load_attr->log_level || !load_attr->log_buf)
goto done;
/* Try again with log */
attr.log_buf = ptr_to_u64(log_buf);
attr.log_size = log_buf_sz;
attr.log_buf = ptr_to_u64(load_attr->log_buf);
attr.log_size = load_attr->log_buf_sz;
attr.log_level = 1;
log_buf[0] = 0;
load_attr->log_buf[0] = 0;
fd = sys_bpf_prog_load(&attr, sizeof(attr));
done:
free(finfo);
@@ -326,6 +323,49 @@ done:
return fd;
}
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz)
{
struct bpf_prog_load_params p = {};
if (!load_attr || !log_buf != !log_buf_sz)
return -EINVAL;
p.prog_type = load_attr->prog_type;
p.expected_attach_type = load_attr->expected_attach_type;
switch (p.prog_type) {
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_LSM:
p.attach_btf_id = load_attr->attach_btf_id;
break;
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_EXT:
p.attach_btf_id = load_attr->attach_btf_id;
p.attach_prog_fd = load_attr->attach_prog_fd;
break;
default:
p.prog_ifindex = load_attr->prog_ifindex;
p.kern_version = load_attr->kern_version;
}
p.insn_cnt = load_attr->insns_cnt;
p.insns = load_attr->insns;
p.license = load_attr->license;
p.log_level = load_attr->log_level;
p.log_buf = log_buf;
p.log_buf_sz = log_buf_sz;
p.prog_btf_fd = load_attr->prog_btf_fd;
p.func_info_rec_size = load_attr->func_info_rec_size;
p.func_info_cnt = load_attr->func_info_cnt;
p.func_info = load_attr->func_info;
p.line_info_rec_size = load_attr->line_info_rec_size;
p.line_info_cnt = load_attr->line_info_cnt;
p.line_info = load_attr->line_info;
p.name = load_attr->name;
p.prog_flags = load_attr->prog_flags;
return libbpf__bpf_prog_load(&p);
}
int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
size_t insns_cnt, const char *license,
__u32 kern_version, char *log_buf,
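
The first hunk of this file bounds the `EAGAIN` retry loop in
`sys_bpf_prog_load()`. The pattern can be tried in isolation with the sketch
below, where `load_fn`, `load_with_retries()` and `flaky_load()` are
hypothetical names standing in for the real syscall wrapper. Note that with a
budget of `retries`, the loop makes at most `retries + 1` attempts in total.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type standing in for bpf(BPF_PROG_LOAD, ...). */
typedef int (*load_fn)(void *ctx);

/* Call `load` until it succeeds, fails with an error other than EAGAIN,
 * or the retry budget is used up.
 */
static int load_with_retries(load_fn load, void *ctx, int retries)
{
	int fd;

	assert(load != NULL);

	do {
		fd = load(ctx);
	} while (fd < 0 && errno == EAGAIN && retries-- > 0);

	return fd;
}

/* Fake loader for demonstration: fails with EAGAIN `*remaining` times,
 * then pretends to return file descriptor 3.
 */
static int flaky_load(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EAGAIN;
		return -1;
	}
	return 3;
}
```

With a budget of 5, a loader that fails five times still succeeds on the sixth
attempt, whereas one that fails six times makes the wrapper give up and return
the error instead of spinning indefinitely.
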

@@ -16,6 +16,7 @@ struct bpf_sysctl;
struct bpf_tcp_sock;
struct bpf_tunnel_key;
struct bpf_xfrm_state;
struct linux_binprm;
struct pt_regs;
struct sk_reuseport_md;
struct sockaddr;
@@ -32,6 +33,9 @@ struct sk_msg_md;
struct xdp_md;
struct path;
struct btf_ptr;
struct inode;
struct socket;
struct file;
/*
* bpf_map_lookup_elem
@@ -3617,4 +3621,114 @@ static void *(*bpf_this_cpu_ptr)(const void *percpu_ptr) = (void *) 154;
*/
static long (*bpf_redirect_peer)(__u32 ifindex, __u64 flags) = (void *) 155;
/*
* bpf_task_storage_get
*
* Get a bpf_local_storage from the *task*.
*
* Logically, it could be thought of as getting the value from
* a *map* with *task* as the **key**. From this
* perspective, the usage is not much different from
* **bpf_map_lookup_elem**\ (*map*, **&**\ *task*) except this
* helper enforces the key must be an task_struct and the map must also
* be a **BPF_MAP_TYPE_TASK_STORAGE**.
*
* Underneath, the value is stored locally at *task* instead of
* the *map*. The *map* is used as the bpf-local-storage
* "type". The bpf-local-storage "type" (i.e. the *map*) is
* searched against all bpf_local_storage residing at *task*.
*
* An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be
* used such that a new bpf_local_storage will be
* created if one does not exist. *value* can be used
* together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify
* the initial value of a bpf_local_storage. If *value* is
* **NULL**, the new bpf_local_storage will be zero initialized.
*
* Returns
* A bpf_local_storage pointer is returned on success.
*
* **NULL** if not found or there was an error in adding
* a new bpf_local_storage.
*/
static void *(*bpf_task_storage_get)(void *map, struct task_struct *task, void *value, __u64 flags) = (void *) 156;
/*
* bpf_task_storage_delete
*
* Delete a bpf_local_storage from a *task*.
*
* Returns
* 0 on success.
*
* **-ENOENT** if the bpf_local_storage cannot be found.
*/
static long (*bpf_task_storage_delete)(void *map, struct task_struct *task) = (void *) 157;
/*
* bpf_get_current_task_btf
*
* Return a BTF pointer to the "current" task.
* This pointer can also be used in helpers that accept an
* *ARG_PTR_TO_BTF_ID* of type *task_struct*.
*
* Returns
* Pointer to the current task.
*/
static struct task_struct *(*bpf_get_current_task_btf)(void) = (void *) 158;
/*
* bpf_bprm_opts_set
*
* Set or clear certain options on *bprm*:
*
* **BPF_F_BPRM_SECUREEXEC** Set the secureexec bit
* which sets the **AT_SECURE** auxv for glibc. The bit
* is cleared if the flag is not specified.
*
* Returns
* **-EINVAL** if invalid *flags* are passed, zero otherwise.
*/
static long (*bpf_bprm_opts_set)(struct linux_binprm *bprm, __u64 flags) = (void *) 159;
/*
* bpf_ktime_get_coarse_ns
*
* Return a coarse-grained version of the time elapsed since
* system boot, in nanoseconds. Does not include time the system
* was suspended.
*
* See: **clock_gettime**\ (**CLOCK_MONOTONIC_COARSE**)
*
* Returns
* Current *ktime*.
*/
static __u64 (*bpf_ktime_get_coarse_ns)(void) = (void *) 160;
/*
* bpf_ima_inode_hash
*
* Returns the stored IMA hash of the *inode* (if it's avaialable).
* If the hash is larger than *size*, then only *size*
* bytes will be copied to *dst*
*
* Returns
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if IMA is disabled or **-EINVAL** if
* invalid arguments are passed.
*/
static long (*bpf_ima_inode_hash)(struct inode *inode, void *dst, __u32 size) = (void *) 161;
/*
* bpf_sock_from_file
*
* If the given file represents a socket, returns the associated
* socket.
*
* Returns
* A pointer to a struct socket on success or NULL if the file is
* not a socket.
*/
static struct socket *(*bpf_sock_from_file)(struct file *file) = (void *) 162;

883
src/btf.c

File diff suppressed because it is too large

@@ -31,11 +31,19 @@ enum btf_endianness {
};
LIBBPF_API void btf__free(struct btf *btf);
LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf);
LIBBPF_API struct btf *btf__new_empty(void);
LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__parse_raw(const char *path);
LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
@@ -43,6 +51,7 @@ LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
const char *type_name, __u32 kind);
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
__u32 id);
LIBBPF_API size_t btf__pointer_size(const struct btf *btf);

@@ -15,6 +15,9 @@
static inline size_t hash_bits(size_t h, int bits)
{
/* shuffle bits and return requested number of upper bits */
if (bits == 0)
return 0;
#if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
/* LP64 case */
return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
@@ -174,17 +177,17 @@ bool hashmap__find(const struct hashmap *map, const void *key, void **value);
* @key: key to iterate entries for
*/
#define hashmap__for_each_key_entry(map, cur, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
map->buckets ? map->buckets[bkt] : NULL; }); \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
: NULL; \
cur; \
cur = cur->next) \
if (map->equal_fn(cur->key, (_key), map->ctx))
#define hashmap__for_each_key_entry_safe(map, cur, tmp, _key) \
for (cur = ({ size_t bkt = hash_bits(map->hash_fn((_key), map->ctx),\
map->cap_bits); \
cur = map->buckets ? map->buckets[bkt] : NULL; }); \
for (cur = map->buckets \
? map->buckets[hash_bits(map->hash_fn((_key), map->ctx), map->cap_bits)] \
: NULL; \
cur && ({ tmp = cur->next; true; }); \
cur = tmp) \
if (map->equal_fn(cur->key, (_key), map->ctx))

@@ -176,6 +176,8 @@ enum kern_feature_id {
FEAT_PROBE_READ_KERN,
/* BPF_PROG_BIND_MAP is supported */
FEAT_PROG_BIND_MAP,
/* Kernel support for module BTFs */
FEAT_MODULE_BTF,
__FEAT_CNT,
};
@@ -276,6 +278,7 @@ struct bpf_program {
enum bpf_prog_type type;
enum bpf_attach_type expected_attach_type;
int prog_ifindex;
__u32 attach_btf_obj_fd;
__u32 attach_btf_id;
__u32 attach_prog_fd;
void *func_info;
@@ -402,6 +405,13 @@ struct extern_desc {
static LIST_HEAD(bpf_objects_list);
struct module_btf {
struct btf *btf;
char *name;
__u32 id;
int fd;
};
struct bpf_object {
char name[BPF_OBJ_NAME_LEN];
char license[64];
@@ -462,11 +472,19 @@ struct bpf_object {
struct list_head list;
struct btf *btf;
struct btf_ext *btf_ext;
/* Parse and load BTF vmlinux if any of the programs in the object need
* it at load time.
*/
struct btf *btf_vmlinux;
struct btf_ext *btf_ext;
/* vmlinux BTF override for CO-RE relocations */
struct btf *btf_vmlinux_override;
/* Lazily initialized kernel module BTFs */
struct module_btf *btf_modules;
bool btf_modules_loaded;
size_t btf_module_cnt;
size_t btf_module_cap;
void *priv;
bpf_object_clear_priv_t clear_priv;
@@ -560,8 +578,6 @@ bpf_object__init_prog(struct bpf_object *obj, struct bpf_program *prog,
const char *name, size_t sec_idx, const char *sec_name,
size_t sec_off, void *insn_data, size_t insn_data_sz)
{
int i;
if (insn_data_sz == 0 || insn_data_sz % BPF_INSN_SZ || sec_off % BPF_INSN_SZ) {
pr_warn("sec '%s': corrupted program '%s', offset %zu, size %zu\n",
sec_name, name, sec_off, insn_data_sz);
@@ -600,13 +616,6 @@ bpf_object__init_prog(struct bpf_object *obj, struct bpf_program *prog,
goto errout;
memcpy(prog->insns, insn_data, insn_data_sz);
for (i = 0; i < prog->insns_cnt; i++) {
if (insn_is_subprog_call(&prog->insns[i])) {
obj->has_subcalls = true;
break;
}
}
return 0;
errout:
pr_warn("sec '%s': failed to allocate memory for prog '%s'\n", sec_name, name);
@@ -2509,7 +2518,7 @@ static int bpf_object__finalize_btf(struct bpf_object *obj)
return 0;
}
static inline bool libbpf_prog_needs_vmlinux_btf(struct bpf_program *prog)
static bool prog_needs_vmlinux_btf(struct bpf_program *prog)
{
if (prog->type == BPF_PROG_TYPE_STRUCT_OPS ||
prog->type == BPF_PROG_TYPE_LSM)
@@ -2524,37 +2533,43 @@ static inline bool libbpf_prog_needs_vmlinux_btf(struct bpf_program *prog)
return false;
}
static int bpf_object__load_vmlinux_btf(struct bpf_object *obj)
static bool obj_needs_vmlinux_btf(const struct bpf_object *obj)
{
bool need_vmlinux_btf = false;
struct bpf_program *prog;
int i, err;
int i;
/* CO-RE relocations need kernel BTF */
if (obj->btf_ext && obj->btf_ext->core_relo_info.len)
need_vmlinux_btf = true;
return true;
/* Support for typed ksyms needs kernel BTF */
for (i = 0; i < obj->nr_extern; i++) {
const struct extern_desc *ext;
ext = &obj->externs[i];
if (ext->type == EXT_KSYM && ext->ksym.type_id) {
need_vmlinux_btf = true;
break;
}
if (ext->type == EXT_KSYM && ext->ksym.type_id)
return true;
}
bpf_object__for_each_program(prog, obj) {
if (!prog->load)
continue;
if (libbpf_prog_needs_vmlinux_btf(prog)) {
need_vmlinux_btf = true;
break;
}
if (prog_needs_vmlinux_btf(prog))
return true;
}
if (!need_vmlinux_btf)
return false;
}
static int bpf_object__load_vmlinux_btf(struct bpf_object *obj, bool force)
{
int err;
/* btf_vmlinux could be loaded earlier */
if (obj->btf_vmlinux)
return 0;
if (!force && !obj_needs_vmlinux_btf(obj))
return 0;
obj->btf_vmlinux = libbpf_find_kernel_btf();
@@ -3280,7 +3295,19 @@ bpf_object__find_program_by_title(const struct bpf_object *obj,
static bool prog_is_subprog(const struct bpf_object *obj,
const struct bpf_program *prog)
{
return prog->sec_idx == obj->efile.text_shndx && obj->has_subcalls;
/* For legacy reasons, libbpf supports an entry-point BPF programs
* without SEC() attribute, i.e., those in the .text section. But if
* there are 2 or more such programs in the .text section, they all
* must be subprograms called from entry-point BPF programs in
* designated SEC()'tions, otherwise there is no way to distinguish
* which of those programs should be loaded vs which are a subprogram.
* Similarly, if there is a function/program in .text and at least one
* other BPF program with custom SEC() attribute, then we just assume
* .text programs are subprograms (even if they are not called from
* other programs), because libbpf never explicitly supported mixing
* SEC()-designated BPF programs and .text entry-point BPF programs.
*/
return prog->sec_idx == obj->efile.text_shndx && obj->nr_programs > 1;
}
struct bpf_program *
@@ -3957,6 +3984,35 @@ static int probe_prog_bind_map(void)
return ret >= 0;
}
static int probe_module_btf(void)
{
static const char strs[] = "\0int";
__u32 types[] = {
/* int */
BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4),
};
struct bpf_btf_info info;
__u32 len = sizeof(info);
char name[16];
int fd, err;
fd = libbpf__load_raw_btf((char *)types, sizeof(types), strs, sizeof(strs));
if (fd < 0)
return 0; /* BTF not supported at all */
memset(&info, 0, sizeof(info));
info.name = ptr_to_u64(name);
info.name_len = sizeof(name);
/* check that BPF_OBJ_GET_INFO_BY_FD supports specifying name pointer;
* kernel's module BTF support coincides with support for
* name/name_len fields in struct bpf_btf_info.
*/
err = bpf_obj_get_info_by_fd(fd, &info, &len);
close(fd);
return !err;
}
enum kern_feature_result {
FEAT_UNKNOWN = 0,
FEAT_SUPPORTED = 1,
@@ -4000,7 +4056,10 @@ static struct kern_feature_desc {
},
[FEAT_PROG_BIND_MAP] = {
"BPF_PROG_BIND_MAP support", probe_prog_bind_map,
}
},
[FEAT_MODULE_BTF] = {
"module BTF support", probe_module_btf,
},
};
static bool kernel_supports(enum kern_feature_id feat_id)
@@ -4600,46 +4659,43 @@ static size_t bpf_core_essential_name_len(const char *name)
return n;
}
/* dynamically sized list of type IDs */
struct ids_vec {
__u32 *data;
struct core_cand
{
const struct btf *btf;
const struct btf_type *t;
const char *name;
__u32 id;
};
/* dynamically sized list of type IDs and their associated struct btf */
struct core_cand_list {
struct core_cand *cands;
int len;
};
static void bpf_core_free_cands(struct ids_vec *cand_ids)
static void bpf_core_free_cands(struct core_cand_list *cands)
{
free(cand_ids->data);
free(cand_ids);
free(cands->cands);
free(cands);
}
static struct ids_vec *bpf_core_find_cands(const struct btf *local_btf,
__u32 local_type_id,
const struct btf *targ_btf)
static int bpf_core_add_cands(struct core_cand *local_cand,
size_t local_essent_len,
const struct btf *targ_btf,
const char *targ_btf_name,
int targ_start_id,
struct core_cand_list *cands)
{
size_t local_essent_len, targ_essent_len;
const char *local_name, *targ_name;
const struct btf_type *t, *local_t;
struct ids_vec *cand_ids;
__u32 *new_ids;
int i, err, n;
local_t = btf__type_by_id(local_btf, local_type_id);
if (!local_t)
return ERR_PTR(-EINVAL);
local_name = btf__name_by_offset(local_btf, local_t->name_off);
if (str_is_empty(local_name))
return ERR_PTR(-EINVAL);
local_essent_len = bpf_core_essential_name_len(local_name);
cand_ids = calloc(1, sizeof(*cand_ids));
if (!cand_ids)
return ERR_PTR(-ENOMEM);
struct core_cand *new_cands, *cand;
const struct btf_type *t;
const char *targ_name;
size_t targ_essent_len;
int n, i;
n = btf__get_nr_types(targ_btf);
for (i = 1; i <= n; i++) {
for (i = targ_start_id; i <= n; i++) {
t = btf__type_by_id(targ_btf, i);
if (btf_kind(t) != btf_kind(local_t))
if (btf_kind(t) != btf_kind(local_cand->t))
continue;
targ_name = btf__name_by_offset(targ_btf, t->name_off);
@@ -4650,24 +4706,174 @@ static struct ids_vec *bpf_core_find_cands(const struct btf *local_btf,
if (targ_essent_len != local_essent_len)
continue;
if (strncmp(local_name, targ_name, local_essent_len) == 0) {
pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s\n",
local_type_id, btf_kind_str(local_t),
local_name, i, btf_kind_str(t), targ_name);
new_ids = libbpf_reallocarray(cand_ids->data,
cand_ids->len + 1,
sizeof(*cand_ids->data));
if (!new_ids) {
err = -ENOMEM;
goto err_out;
}
cand_ids->data = new_ids;
cand_ids->data[cand_ids->len++] = i;
}
if (strncmp(local_cand->name, targ_name, local_essent_len) != 0)
continue;
pr_debug("CO-RE relocating [%d] %s %s: found target candidate [%d] %s %s in [%s]\n",
local_cand->id, btf_kind_str(local_cand->t),
local_cand->name, i, btf_kind_str(t), targ_name,
targ_btf_name);
new_cands = libbpf_reallocarray(cands->cands, cands->len + 1,
sizeof(*cands->cands));
if (!new_cands)
return -ENOMEM;
cand = &new_cands[cands->len];
cand->btf = targ_btf;
cand->t = t;
cand->name = targ_name;
cand->id = i;
cands->cands = new_cands;
cands->len++;
}
return cand_ids;
return 0;
}
static int load_module_btfs(struct bpf_object *obj)
{
struct bpf_btf_info info;
struct module_btf *mod_btf;
struct btf *btf;
char name[64];
__u32 id = 0, len;
int err, fd;
if (obj->btf_modules_loaded)
return 0;
/* don't do this again, even if we find no module BTFs */
obj->btf_modules_loaded = true;
/* kernel too old to support module BTFs */
if (!kernel_supports(FEAT_MODULE_BTF))
return 0;
while (true) {
err = bpf_btf_get_next_id(id, &id);
if (err && errno == ENOENT)
return 0;
if (err) {
err = -errno;
pr_warn("failed to iterate BTF objects: %d\n", err);
return err;
}
fd = bpf_btf_get_fd_by_id(id);
if (fd < 0) {
if (errno == ENOENT)
continue; /* expected race: BTF was unloaded */
err = -errno;
pr_warn("failed to get BTF object #%d FD: %d\n", id, err);
return err;
}
len = sizeof(info);
memset(&info, 0, sizeof(info));
info.name = ptr_to_u64(name);
info.name_len = sizeof(name);
err = bpf_obj_get_info_by_fd(fd, &info, &len);
if (err) {
err = -errno;
pr_warn("failed to get BTF object #%d info: %d\n", id, err);
goto err_out;
}
/* ignore non-module BTFs */
if (!info.kernel_btf || strcmp(name, "vmlinux") == 0) {
close(fd);
continue;
}
btf = btf_get_from_fd(fd, obj->btf_vmlinux);
if (IS_ERR(btf)) {
pr_warn("failed to load module [%s]'s BTF object #%d: %ld\n",
name, id, PTR_ERR(btf));
err = PTR_ERR(btf);
goto err_out;
}
err = btf_ensure_mem((void **)&obj->btf_modules, &obj->btf_module_cap,
sizeof(*obj->btf_modules), obj->btf_module_cnt + 1);
if (err)
goto err_out;
mod_btf = &obj->btf_modules[obj->btf_module_cnt++];
mod_btf->btf = btf;
mod_btf->id = id;
mod_btf->fd = fd;
mod_btf->name = strdup(name);
if (!mod_btf->name) {
err = -ENOMEM;
goto err_out;
}
continue;
err_out:
bpf_core_free_cands(cand_ids);
close(fd);
return err;
}
return 0;
}
static struct core_cand_list *
bpf_core_find_cands(struct bpf_object *obj, const struct btf *local_btf, __u32 local_type_id)
{
struct core_cand local_cand = {};
struct core_cand_list *cands;
const struct btf *main_btf;
size_t local_essent_len;
int err, i;
local_cand.btf = local_btf;
local_cand.t = btf__type_by_id(local_btf, local_type_id);
if (!local_cand.t)
return ERR_PTR(-EINVAL);
local_cand.name = btf__name_by_offset(local_btf, local_cand.t->name_off);
if (str_is_empty(local_cand.name))
return ERR_PTR(-EINVAL);
local_essent_len = bpf_core_essential_name_len(local_cand.name);
cands = calloc(1, sizeof(*cands));
if (!cands)
return ERR_PTR(-ENOMEM);
/* Attempt to find target candidates in vmlinux BTF first */
main_btf = obj->btf_vmlinux_override ?: obj->btf_vmlinux;
err = bpf_core_add_cands(&local_cand, local_essent_len, main_btf, "vmlinux", 1, cands);
if (err)
goto err_out;
/* if vmlinux BTF has any candidate, don't go for module BTFs */
if (cands->len)
return cands;
/* if vmlinux BTF was overridden, don't attempt to load module BTFs */
if (obj->btf_vmlinux_override)
return cands;
/* now look through module BTFs, trying to still find candidates */
err = load_module_btfs(obj);
if (err)
goto err_out;
for (i = 0; i < obj->btf_module_cnt; i++) {
err = bpf_core_add_cands(&local_cand, local_essent_len,
obj->btf_modules[i].btf,
obj->btf_modules[i].name,
btf__get_nr_types(obj->btf_vmlinux) + 1,
cands);
if (err)
goto err_out;
}
return cands;
err_out:
bpf_core_free_cands(cands);
return ERR_PTR(err);
}
@@ -5661,7 +5867,6 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
const struct bpf_core_relo *relo,
int relo_idx,
const struct btf *local_btf,
const struct btf *targ_btf,
struct hashmap *cand_cache)
{
struct bpf_core_spec local_spec, cand_spec, targ_spec = {};
@@ -5669,8 +5874,8 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
struct bpf_core_relo_res cand_res, targ_res;
const struct btf_type *local_type;
const char *local_name;
struct ids_vec *cand_ids;
__u32 local_id, cand_id;
struct core_cand_list *cands = NULL;
__u32 local_id;
const char *spec_str;
int i, j, err;
@@ -5717,24 +5922,24 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
return -EOPNOTSUPP;
}
if (!hashmap__find(cand_cache, type_key, (void **)&cand_ids)) {
cand_ids = bpf_core_find_cands(local_btf, local_id, targ_btf);
if (IS_ERR(cand_ids)) {
pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld",
if (!hashmap__find(cand_cache, type_key, (void **)&cands)) {
cands = bpf_core_find_cands(prog->obj, local_btf, local_id);
if (IS_ERR(cands)) {
pr_warn("prog '%s': relo #%d: target candidate search failed for [%d] %s %s: %ld\n",
prog->name, relo_idx, local_id, btf_kind_str(local_type),
local_name, PTR_ERR(cand_ids));
return PTR_ERR(cand_ids);
local_name, PTR_ERR(cands));
return PTR_ERR(cands);
}
err = hashmap__set(cand_cache, type_key, cand_ids, NULL, NULL);
err = hashmap__set(cand_cache, type_key, cands, NULL, NULL);
if (err) {
bpf_core_free_cands(cand_ids);
bpf_core_free_cands(cands);
return err;
}
}
for (i = 0, j = 0; i < cand_ids->len; i++) {
cand_id = cand_ids->data[i];
err = bpf_core_spec_match(&local_spec, targ_btf, cand_id, &cand_spec);
for (i = 0, j = 0; i < cands->len; i++) {
err = bpf_core_spec_match(&local_spec, cands->cands[i].btf,
cands->cands[i].id, &cand_spec);
if (err < 0) {
pr_warn("prog '%s': relo #%d: error matching candidate #%d ",
prog->name, relo_idx, i);
@@ -5778,7 +5983,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
return -EINVAL;
}
cand_ids->data[j++] = cand_spec.root_type_id;
cands->cands[j++] = cands->cands[i];
}
/*
@@ -5790,7 +5995,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
* depending on relo's kind.
*/
if (j > 0)
cand_ids->len = j;
cands->len = j;
/*
* If no candidates were found, it might be both a programmer error,
@@ -5834,20 +6039,19 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
struct hashmap_entry *entry;
struct hashmap *cand_cache = NULL;
struct bpf_program *prog;
struct btf *targ_btf;
const char *sec_name;
int i, err = 0, insn_idx, sec_idx;
if (obj->btf_ext->core_relo_info.len == 0)
return 0;
if (targ_btf_path)
targ_btf = btf__parse(targ_btf_path, NULL);
else
targ_btf = obj->btf_vmlinux;
if (IS_ERR_OR_NULL(targ_btf)) {
pr_warn("failed to get target BTF: %ld\n", PTR_ERR(targ_btf));
return PTR_ERR(targ_btf);
if (targ_btf_path) {
obj->btf_vmlinux_override = btf__parse(targ_btf_path, NULL);
if (IS_ERR_OR_NULL(obj->btf_vmlinux_override)) {
err = PTR_ERR(obj->btf_vmlinux_override);
pr_warn("failed to parse target BTF: %d\n", err);
return err;
}
}
cand_cache = hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL);
@@ -5899,8 +6103,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
if (!prog->load)
continue;
err = bpf_core_apply_relo(prog, rec, i, obj->btf,
targ_btf, cand_cache);
err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache);
if (err) {
pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
prog->name, i, err);
@@ -5910,9 +6113,10 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
}
out:
/* obj->btf_vmlinux is freed at the end of object load phase */
if (targ_btf != obj->btf_vmlinux)
btf__free(targ_btf);
/* obj->btf_vmlinux and module BTFs are freed after object load */
btf__free(obj->btf_vmlinux_override);
obj->btf_vmlinux_override = NULL;
if (!IS_ERR_OR_NULL(cand_cache)) {
hashmap__for_each_entry(cand_cache, entry, i) {
bpf_core_free_cands(entry->value);
@@ -6623,16 +6827,25 @@ static int
load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
char *license, __u32 kern_version, int *pfd)
{
struct bpf_load_program_attr load_attr;
struct bpf_prog_load_params load_attr = {};
char *cp, errmsg[STRERR_BUFSIZE];
size_t log_buf_size = 0;
char *log_buf = NULL;
int btf_fd, ret;
if (prog->type == BPF_PROG_TYPE_UNSPEC) {
/*
* The program type must be set. Most likely we couldn't find a proper
* section definition at load time, and thus we didn't infer the type.
*/
pr_warn("prog '%s': missing BPF prog type, check ELF section name '%s'\n",
prog->name, prog->sec_name);
return -EINVAL;
}
if (!insns || !insns_cnt)
return -EINVAL;
memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
load_attr.prog_type = prog->type;
/* old kernels might not support specifying expected_attach_type */
if (!kernel_supports(FEAT_EXP_ATTACH_TYPE) && prog->sec_def &&
@@ -6643,19 +6856,17 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
if (kernel_supports(FEAT_PROG_NAME))
load_attr.name = prog->name;
load_attr.insns = insns;
load_attr.insns_cnt = insns_cnt;
load_attr.insn_cnt = insns_cnt;
load_attr.license = license;
if (prog->type == BPF_PROG_TYPE_STRUCT_OPS ||
prog->type == BPF_PROG_TYPE_LSM) {
load_attr.attach_btf_id = prog->attach_btf_id;
} else if (prog->type == BPF_PROG_TYPE_TRACING ||
prog->type == BPF_PROG_TYPE_EXT) {
load_attr.attach_btf_id = prog->attach_btf_id;
if (prog->attach_prog_fd)
load_attr.attach_prog_fd = prog->attach_prog_fd;
load_attr.attach_btf_id = prog->attach_btf_id;
} else {
load_attr.kern_version = kern_version;
load_attr.prog_ifindex = prog->prog_ifindex;
}
else
load_attr.attach_btf_obj_fd = prog->attach_btf_obj_fd;
load_attr.attach_btf_id = prog->attach_btf_id;
load_attr.kern_version = kern_version;
load_attr.prog_ifindex = prog->prog_ifindex;
/* specify func_info/line_info only if kernel supports them */
btf_fd = bpf_object__btf_fd(prog->obj);
if (btf_fd >= 0 && kernel_supports(FEAT_BTF_FUNC)) {
@@ -6679,7 +6890,9 @@ retry_load:
*log_buf = 0;
}
ret = bpf_load_program_xattr(&load_attr, log_buf, log_buf_size);
load_attr.log_buf = log_buf;
load_attr.log_buf_sz = log_buf_size;
ret = libbpf__bpf_prog_load(&load_attr);
if (ret >= 0) {
if (log_buf && load_attr.log_level)
@@ -6720,9 +6933,9 @@ retry_load:
pr_warn("-- BEGIN DUMP LOG ---\n");
pr_warn("\n%s\n", log_buf);
pr_warn("-- END LOG --\n");
} else if (load_attr.insns_cnt >= BPF_MAXINSNS) {
} else if (load_attr.insn_cnt >= BPF_MAXINSNS) {
pr_warn("Program too large (%zu insns), at most %d insns\n",
load_attr.insns_cnt, BPF_MAXINSNS);
load_attr.insn_cnt, BPF_MAXINSNS);
ret = -LIBBPF_ERRNO__PROG2BIG;
} else if (load_attr.prog_type != BPF_PROG_TYPE_KPROBE) {
/* Wrong program type? */
@@ -6730,7 +6943,9 @@ retry_load:
load_attr.prog_type = BPF_PROG_TYPE_KPROBE;
load_attr.expected_attach_type = 0;
fd = bpf_load_program_xattr(&load_attr, NULL, 0);
load_attr.log_buf = NULL;
load_attr.log_buf_sz = 0;
fd = libbpf__bpf_prog_load(&load_attr);
if (fd >= 0) {
close(fd);
ret = -LIBBPF_ERRNO__PROGTYPE;
@@ -6743,11 +6958,11 @@ out:
return ret;
}
static int libbpf_find_attach_btf_id(struct bpf_program *prog);
static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id);
int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
{
int err = 0, fd, i, btf_id;
int err = 0, fd, i;
if (prog->obj->loaded) {
pr_warn("prog '%s': can't load after object was loaded\n", prog->name);
@@ -6757,10 +6972,14 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
if ((prog->type == BPF_PROG_TYPE_TRACING ||
prog->type == BPF_PROG_TYPE_LSM ||
prog->type == BPF_PROG_TYPE_EXT) && !prog->attach_btf_id) {
btf_id = libbpf_find_attach_btf_id(prog);
if (btf_id <= 0)
return btf_id;
prog->attach_btf_id = btf_id;
int btf_obj_fd = 0, btf_type_id = 0;
err = libbpf_find_attach_btf_id(prog, &btf_obj_fd, &btf_type_id);
if (err)
return err;
prog->attach_btf_obj_fd = btf_obj_fd;
prog->attach_btf_id = btf_type_id;
}
if (prog->instances.nr < 0 || !prog->instances.fds) {
@@ -6920,9 +7139,12 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
bpf_object__for_each_program(prog, obj) {
prog->sec_def = find_sec_def(prog->sec_name);
if (!prog->sec_def)
if (!prog->sec_def) {
/* couldn't guess, but user might manually specify */
pr_debug("prog '%s': unrecognized ELF section name '%s'\n",
prog->name, prog->sec_name);
continue;
}
if (prog->sec_def->is_sleepable)
prog->prog_flags |= BPF_F_SLEEPABLE;
@@ -7259,7 +7481,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
}
err = bpf_object__probe_loading(obj);
err = err ? : bpf_object__load_vmlinux_btf(obj);
err = err ? : bpf_object__load_vmlinux_btf(obj, false);
err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
err = err ? : bpf_object__sanitize_and_load_btf(obj);
err = err ? : bpf_object__sanitize_maps(obj);
@@ -7268,6 +7490,15 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
err = err ? : bpf_object__load_progs(obj, attr->log_level);
/* clean up module BTFs */
for (i = 0; i < obj->btf_module_cnt; i++) {
close(obj->btf_modules[i].fd);
btf__free(obj->btf_modules[i].btf);
free(obj->btf_modules[i].name);
}
free(obj->btf_modules);
/* clean up vmlinux BTF */
btf__free(obj->btf_vmlinux);
obj->btf_vmlinux = NULL;
@@ -7646,6 +7877,16 @@ bool bpf_map__is_pinned(const struct bpf_map *map)
return map->pinned;
}
static void sanitize_pin_path(char *s)
{
/* bpffs disallows periods in path names */
while (*s) {
if (*s == '.')
*s = '_';
s++;
}
}
int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
{
struct bpf_map *map;
@@ -7675,6 +7916,7 @@ int bpf_object__pin_maps(struct bpf_object *obj, const char *path)
err = -ENAMETOOLONG;
goto err_unpin_maps;
}
sanitize_pin_path(buf);
pin_path = buf;
} else if (!map->pin_path) {
continue;
@@ -7719,6 +7961,7 @@ int bpf_object__unpin_maps(struct bpf_object *obj, const char *path)
return -EINVAL;
else if (len >= PATH_MAX)
return -ENAMETOOLONG;
sanitize_pin_path(buf);
pin_path = buf;
} else if (!map->pin_path) {
continue;
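The pin-path sanitization in the two hunks above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper `build_pin_path()` that is not part of libbpf; it copies `sanitize_pin_path()` from the diff and applies it to a formatted `<dir>/<map_name>` string, which is roughly what `bpf_object__pin_maps()` does with `buf`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local copy of the helper added in the diff, so the sketch compiles on
 * its own.
 */
static void sanitize_pin_path(char *s)
{
	/* bpffs disallows periods in path names */
	while (*s) {
		if (*s == '.')
			*s = '_';
		s++;
	}
}

/* Hypothetical helper (not a libbpf API): format "<dir>/<map_name>" into
 * buf and sanitize the result. Returns 0 on success, -1 on a formatting
 * error or if the path does not fit in buf.
 */
static int build_pin_path(char *buf, size_t buf_sz, const char *dir,
			  const char *map_name)
{
	int len;

	assert(buf && dir && map_name);
	len = snprintf(buf, buf_sz, "%s/%s", dir, map_name);
	if (len < 0 || (size_t)len >= buf_sz)
		return -1;
	sanitize_pin_path(buf);
	return 0;
}
```

Because the whole buffer is sanitized, a period in the directory part of the path is rewritten as well, not only one in the map name.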
@@ -8604,8 +8847,8 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix,
return btf__find_by_name_kind(btf, btf_type_name, kind);
}
static inline int __find_vmlinux_btf_id(struct btf *btf, const char *name,
enum bpf_attach_type attach_type)
static inline int find_attach_btf_id(struct btf *btf, const char *name,
enum bpf_attach_type attach_type)
{
int err;
@@ -8621,9 +8864,6 @@ static inline int __find_vmlinux_btf_id(struct btf *btf, const char *name,
else
err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC);
if (err <= 0)
pr_warn("%s is not found in vmlinux BTF\n", name);
return err;
}
@@ -8639,7 +8879,10 @@ int libbpf_find_vmlinux_btf_id(const char *name,
return -EINVAL;
}
err = __find_vmlinux_btf_id(btf, name, attach_type);
err = find_attach_btf_id(btf, name, attach_type);
if (err <= 0)
pr_warn("%s is not found in vmlinux BTF\n", name);
btf__free(btf);
return err;
}
@@ -8677,11 +8920,49 @@ out:
return err;
}
static int libbpf_find_attach_btf_id(struct bpf_program *prog)
static int find_kernel_btf_id(struct bpf_object *obj, const char *attach_name,
enum bpf_attach_type attach_type,
int *btf_obj_fd, int *btf_type_id)
{
int ret, i;
ret = find_attach_btf_id(obj->btf_vmlinux, attach_name, attach_type);
if (ret > 0) {
*btf_obj_fd = 0; /* vmlinux BTF */
*btf_type_id = ret;
return 0;
}
if (ret != -ENOENT)
return ret;
ret = load_module_btfs(obj);
if (ret)
return ret;
for (i = 0; i < obj->btf_module_cnt; i++) {
const struct module_btf *mod = &obj->btf_modules[i];
ret = find_attach_btf_id(mod->btf, attach_name, attach_type);
if (ret > 0) {
*btf_obj_fd = mod->fd;
*btf_type_id = ret;
return 0;
}
if (ret == -ENOENT)
continue;
return ret;
}
return -ESRCH;
}
static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id)
{
enum bpf_attach_type attach_type = prog->expected_attach_type;
__u32 attach_prog_fd = prog->attach_prog_fd;
const char *name = prog->sec_name;
const char *name = prog->sec_name, *attach_name;
const struct bpf_sec_def *sec = NULL;
int i, err;
if (!name)
@@ -8692,17 +8973,37 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog)
continue;
if (strncmp(name, section_defs[i].sec, section_defs[i].len))
continue;
if (attach_prog_fd)
err = libbpf_find_prog_btf_id(name + section_defs[i].len,
attach_prog_fd);
else
err = __find_vmlinux_btf_id(prog->obj->btf_vmlinux,
name + section_defs[i].len,
attach_type);
sec = &section_defs[i];
break;
}
if (!sec) {
pr_warn("failed to identify BTF ID based on ELF section name '%s'\n", name);
return -ESRCH;
}
attach_name = name + sec->len;
/* BPF program's BTF ID */
if (attach_prog_fd) {
err = libbpf_find_prog_btf_id(attach_name, attach_prog_fd);
if (err < 0) {
pr_warn("failed to find BPF program (FD %d) BTF ID for '%s': %d\n",
attach_prog_fd, attach_name, err);
return err;
}
*btf_obj_fd = 0;
*btf_type_id = err;
return 0;
}
/* kernel/module BTF ID */
err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
if (err) {
pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
return err;
}
pr_warn("failed to identify btf_id based on ELF section name '%s'\n", name);
return -ESRCH;
return 0;
}
int libbpf_attach_type_by_name(const char *name,
@@ -10575,22 +10876,33 @@ int bpf_program__set_attach_target(struct bpf_program *prog,
int attach_prog_fd,
const char *attach_func_name)
{
int btf_id;
int btf_obj_fd = 0, btf_id = 0, err;
if (!prog || attach_prog_fd < 0 || !attach_func_name)
return -EINVAL;
if (attach_prog_fd)
if (prog->obj->loaded)
return -EINVAL;
if (attach_prog_fd) {
btf_id = libbpf_find_prog_btf_id(attach_func_name,
attach_prog_fd);
else
btf_id = libbpf_find_vmlinux_btf_id(attach_func_name,
prog->expected_attach_type);
if (btf_id < 0)
return btf_id;
if (btf_id < 0)
return btf_id;
} else {
/* load btf_vmlinux, if not yet */
err = bpf_object__load_vmlinux_btf(prog->obj, true);
if (err)
return err;
err = find_kernel_btf_id(prog->obj, attach_func_name,
prog->expected_attach_type,
&btf_obj_fd, &btf_id);
if (err)
return err;
}
prog->attach_btf_id = btf_id;
prog->attach_btf_obj_fd = btf_obj_fd;
prog->attach_prog_fd = attach_prog_fd;
return 0;
}
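
The lookup order that `find_kernel_btf_id()` introduces (vmlinux BTF first, module BTFs only on `-ENOENT`, `-ESRCH` if nothing matches) can be sketched without a kernel. This is an illustration under stated assumptions: `struct mock_btf`, `mock_find_id()` and `mock_find_kernel_btf_id()` are hypothetical stand-ins, and type IDs are simply array index + 1, whereas real module BTF IDs continue after the vmlinux IDs.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for a BTF object: type IDs are index + 1. */
struct mock_btf {
	int fd;			/* 0 is reserved to mean "vmlinux BTF" */
	const char **names;
	int nr;
};

/* Returns a positive type ID, or -ENOENT if the name is absent. */
static int mock_find_id(const struct mock_btf *btf, const char *name)
{
	int i;

	for (i = 0; i < btf->nr; i++) {
		if (strcmp(btf->names[i], name) == 0)
			return i + 1;
	}
	return -ENOENT;
}

/* Same control flow as find_kernel_btf_id() in the diff, on mock data. */
static int mock_find_kernel_btf_id(const struct mock_btf *vmlinux,
				   const struct mock_btf *mods, int nr_mods,
				   const char *name,
				   int *btf_obj_fd, int *btf_type_id)
{
	int ret, i;

	assert(btf_obj_fd && btf_type_id);

	ret = mock_find_id(vmlinux, name);
	if (ret > 0) {
		*btf_obj_fd = 0; /* vmlinux BTF */
		*btf_type_id = ret;
		return 0;
	}
	if (ret != -ENOENT)
		return ret;

	for (i = 0; i < nr_mods; i++) {
		ret = mock_find_id(&mods[i], name);
		if (ret > 0) {
			*btf_obj_fd = mods[i].fd;
			*btf_type_id = ret;
			return 0;
		}
		if (ret != -ENOENT)
			return ret;
	}
	return -ESRCH;
}
```

A vmlinux hit reports object FD 0, which is how the caller distinguishes it from a module match that carries the module BTF's FD.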


@@ -536,6 +536,7 @@ LIBBPF_API int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ring_buffer_sample_fn sample_cb, void *ctx);
LIBBPF_API int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms);
LIBBPF_API int ring_buffer__consume(struct ring_buffer *rb);
LIBBPF_API int ring_buffer__epoll_fd(const struct ring_buffer *rb);
/* Perf buffer APIs */
struct perf_buffer;


@@ -337,3 +337,16 @@ LIBBPF_0.2.0 {
perf_buffer__consume_buffer;
xsk_socket__create_shared;
} LIBBPF_0.1.0;
LIBBPF_0.3.0 {
global:
btf__base_btf;
btf__parse_elf_split;
btf__parse_raw_split;
btf__parse_split;
btf__new_empty_split;
btf__new_split;
ring_buffer__epoll_fd;
xsk_setup_xdp_prog;
xsk_socket__update_xskmap;
} LIBBPF_0.2.0;


@@ -151,10 +151,41 @@ int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
struct bpf_prog_load_params {
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
const char *name;
const struct bpf_insn *insns;
size_t insn_cnt;
const char *license;
__u32 kern_version;
__u32 attach_prog_fd;
__u32 attach_btf_obj_fd;
__u32 attach_btf_id;
__u32 prog_ifindex;
__u32 prog_btf_fd;
__u32 prog_flags;
__u32 func_info_rec_size;
const void *func_info;
__u32 func_info_cnt;
__u32 line_info_rec_size;
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
char *log_buf;
size_t log_buf_sz;
};
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
struct btf_ext_info {
/*


@@ -230,6 +230,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
break;
case BPF_MAP_TYPE_SK_STORAGE:
case BPF_MAP_TYPE_INODE_STORAGE:
case BPF_MAP_TYPE_TASK_STORAGE:
btf_key_type_id = 1;
btf_value_type_id = 3;
value_size = 8;


@@ -278,7 +278,13 @@ int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
err = ringbuf_process_ring(ring);
if (err < 0)
return err;
res += cnt;
res += err;
}
return cnt < 0 ? -errno : res;
}
/* Get an fd that can be used to sleep until data is available in the ring(s) */
int ring_buffer__epoll_fd(const struct ring_buffer *rb)
{
return rb->epoll_fd;
}
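
The `res += err` change in `ring_buffer__poll()` makes the return value the number of records consumed rather than a multiple of the number of ready rings. A minimal sketch of that accumulation, assuming hypothetical stand-ins `mock_process_ring()` and `mock_poll()` in place of the real ring processing:

```c
#include <assert.h>

/* Hypothetical stand-in for ringbuf_process_ring(): returns the number of
 * records consumed from one ring, or a negative error code.
 */
static int mock_process_ring(int pending)
{
	return pending;
}

/* Sums per-ring results the way the fixed loop does. */
static int mock_poll(const int *pending, int cnt)
{
	int i, err, res = 0;

	assert(pending || cnt == 0);
	for (i = 0; i < cnt; i++) {
		err = mock_process_ring(pending[i]);
		if (err < 0)
			return err;	/* first error aborts the poll */
		res += err;		/* records consumed, not ready rings */
	}
	return res;
}
```

With three ready rings holding 3, 0 and 5 records, the sketch returns 8; adding the ring count on each iteration, as the old code did, would return 9.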

src/xsk.c

@@ -566,8 +566,35 @@ static int xsk_set_bpf_maps(struct xsk_socket *xsk)
&xsk->fd, 0);
}
static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
{
char ifname[IFNAMSIZ];
struct xsk_ctx *ctx;
char *interface;
ctx = calloc(1, sizeof(*ctx));
if (!ctx)
return -ENOMEM;
interface = if_indextoname(ifindex, &ifname[0]);
if (!interface) {
free(ctx);
return -errno;
}
ctx->ifindex = ifindex;
memcpy(ctx->ifname, ifname, IFNAMSIZ -1);
ctx->ifname[IFNAMSIZ - 1] = 0;
xsk->ctx = ctx;
return 0;
}
static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
int *xsks_map_fd)
{
struct xsk_socket *xsk = _xdp;
struct xsk_ctx *ctx = xsk->ctx;
__u32 prog_id = 0;
int err;
@@ -584,8 +611,7 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
err = xsk_load_xdp_prog(xsk);
if (err) {
xsk_delete_bpf_maps(xsk);
return err;
goto err_load_xdp_prog;
}
} else {
ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
@@ -598,15 +624,29 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
}
}
if (xsk->rx)
if (xsk->rx) {
err = xsk_set_bpf_maps(xsk);
if (err) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
return err;
if (err) {
if (!prog_id) {
goto err_set_bpf_maps;
} else {
close(ctx->prog_fd);
return err;
}
}
}
if (xsks_map_fd)
*xsks_map_fd = ctx->xsks_map_fd;
return 0;
err_set_bpf_maps:
close(ctx->prog_fd);
bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
err_load_xdp_prog:
xsk_delete_bpf_maps(xsk);
return err;
}
static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex,
@@ -689,6 +729,40 @@ static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk,
return ctx;
}
static void xsk_destroy_xsk_struct(struct xsk_socket *xsk)
{
free(xsk->ctx);
free(xsk);
}
int xsk_socket__update_xskmap(struct xsk_socket *xsk, int fd)
{
xsk->ctx->xsks_map_fd = fd;
return xsk_set_bpf_maps(xsk);
}
int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd)
{
struct xsk_socket *xsk;
int res;
xsk = calloc(1, sizeof(*xsk));
if (!xsk)
return -ENOMEM;
res = xsk_create_xsk_struct(ifindex, xsk);
if (res) {
free(xsk);
return -EINVAL;
}
res = __xsk_setup_xdp_prog(xsk, xsks_map_fd);
xsk_destroy_xsk_struct(xsk);
return res;
}
int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
__u32 queue_id, struct xsk_umem *umem,
@@ -838,7 +912,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
ctx->prog_fd = -1;
if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) {
err = xsk_setup_xdp_prog(xsk);
err = __xsk_setup_xdp_prog(xsk, NULL);
if (err)
goto out_mmap_tx;
}
@@ -891,13 +965,16 @@ int xsk_umem__delete(struct xsk_umem *umem)
void xsk_socket__delete(struct xsk_socket *xsk)
{
size_t desc_sz = sizeof(struct xdp_desc);
struct xsk_ctx *ctx = xsk->ctx;
struct xdp_mmap_offsets off;
struct xsk_umem *umem;
struct xsk_ctx *ctx;
int err;
if (!xsk)
return;
ctx = xsk->ctx;
umem = ctx->umem;
if (ctx->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
@@ -917,11 +994,11 @@ void xsk_socket__delete(struct xsk_socket *xsk)
xsk_put_ctx(ctx);
ctx->umem->refcount--;
umem->refcount--;
/* Do not close an fd that also has an associated umem connected
* to it.
*/
if (xsk->fd != ctx->umem->fd)
if (xsk->fd != umem->fd)
close(xsk->fd);
free(xsk);
}


@@ -113,8 +113,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
return (entries > nb) ? nb : entries;
}
static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
size_t nb, __u32 *idx)
static inline __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
{
if (xsk_prod_nb_free(prod, nb) < nb)
return 0;
@@ -125,7 +124,7 @@ static inline size_t xsk_ring_prod__reserve(struct xsk_ring_prod *prod,
return nb;
}
static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
{
/* Make sure everything has been written to the ring before indicating
* this to the kernel by writing the producer pointer.
@@ -135,10 +134,9 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, size_t nb)
*prod->producer += nb;
}
static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
size_t nb, __u32 *idx)
static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
{
size_t entries = xsk_cons_nb_avail(cons, nb);
__u32 entries = xsk_cons_nb_avail(cons, nb);
if (entries > 0) {
/* Make sure we do not speculatively read the data before
@@ -153,7 +151,12 @@ static inline size_t xsk_ring_cons__peek(struct xsk_ring_cons *cons,
return entries;
}
static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, size_t nb)
static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
{
cons->cached_cons -= nb;
}
static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
{
/* Make sure data has been read before indicating we are done
* with the entries by updating the consumer pointer.
@@ -201,6 +204,11 @@ struct xsk_umem_config {
__u32 flags;
};
LIBBPF_API int xsk_setup_xdp_prog(int ifindex,
int *xsks_map_fd);
LIBBPF_API int xsk_socket__update_xskmap(struct xsk_socket *xsk,
int xsks_map_fd);
/* Flags for the libbpf_flags field. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
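
The new `xsk_ring_cons__cancel()` only rewinds the cached consumer index, so entries that were peeked but not processed become visible to the next peek. The sketch below models just that bookkeeping; `struct mock_ring_cons` and the `mock_*` functions are hypothetical simplifications that omit the shared producer/consumer pointers and the memory barriers of the real ring.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xsk_ring_cons: only the cached
 * indices are modeled.
 */
struct mock_ring_cons {
	uint32_t cached_prod;
	uint32_t cached_cons;
};

static uint32_t mock_cons_nb_avail(struct mock_ring_cons *r, uint32_t nb)
{
	uint32_t entries = r->cached_prod - r->cached_cons;

	return entries > nb ? nb : entries;
}

/* Reserve up to nb entries for reading; *idx is the first reserved slot. */
static uint32_t mock_cons_peek(struct mock_ring_cons *r, uint32_t nb,
			       uint32_t *idx)
{
	uint32_t entries = mock_cons_nb_avail(r, nb);

	assert(idx);
	if (entries > 0) {
		*idx = r->cached_cons;
		r->cached_cons += entries;
	}
	return entries;
}

/* Hand back the last nb peeked entries, as xsk_ring_cons__cancel() does. */
static void mock_cons_cancel(struct mock_ring_cons *r, uint32_t nb)
{
	r->cached_cons -= nb;
}
```

A caller that peeks four entries but can only handle one would cancel three, and the following peek starts at index 1 again.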


@@ -20,18 +20,23 @@ fexit_test # bpf_prog_test_tracing missing
flow_dissector # bpf_link-based flow dissector is in 5.8+
flow_dissector_reattach
get_stack_raw_tp # exercising BPF verifier bug causing infinite loop
hash_large_key # v5.11+
ima # v5.11+
kfree_skb # 32-bit pointer arith in test_pkt_access
ksyms # __start_BTF has different name
link_pinning # bpf_link is missing
load_bytes_relative # new functionality in 5.8
map_init # per-CPU LRU missing
map_ptr # test uses BPF_MAP_TYPE_RINGBUF, added in 5.8
metadata # v5.10+
mmap # 5.5 kernel is too permissive with re-mmaping
modify_return # fmod_ret support is missing
module_attach # module BTF support missing (v5.11+)
ns_current_pid_tgid # bpf_get_ns_current_pid_tgid() helper is missing
pe_preserve_elems # v5.10+
perf_branches # bpf_read_branch_records() helper is missing
pkt_access # 32-bit pointer arith in test_pkt_access
probe_read_user_str # kernel bug with garbage bytes at the end
prog_run_xattr # 32-bit pointer arith in test_pkt_access
raw_tp_test_run # v5.10+
ringbuf # BPF_MAP_TYPE_RINGBUF is supported in 5.8+
@@ -44,6 +49,7 @@ select_reuseport # UDP support is missing
send_signal # bpf_send_signal_thread() helper is missing
sk_assign # bpf_sk_assign helper missing
skb_helpers # helpers added in 5.8+
sk_storage_tracing # missing bpf_sk_storage_get() helper
snprintf_btf # v5.10+
sock_fields # v5.10+
sockmap_basic # uses new socket fields, 5.8+
@@ -52,12 +58,15 @@ sockopt_sk
sk_lookup # v5.9+
skb_ctx # ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN is missing
tcp_hdr_options # v5.10+, new TCP header options feature in BPF
tcpbpf_user # LINK_CREATE is missing
test_bpffs # v5.10+, new CONFIG_BPF_PRELOAD=y and CONFIG_BPF_PRELOAD_UMD=y|m
test_bprm_opts # v5.11+
test_global_funcs # kernel doesn't support BTF linkage=global on FUNCs
test_local_storage # v5.10+ feature
test_lsm # no BPF_LSM support
test_overhead # no fmod_ret support
test_profiler # needs verifier logic improvements from v5.10+
test_skb_pkt_end # v5.11+
trace_ext # v5.10+
udp_limit # no cgroup/sock_release BPF program type (5.9+)
varlen # verifier bug fixed in later kernels


@@ -1388,7 +1388,7 @@ CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_FD is not set
# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_LOOP is not set
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_DRBD is not set
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SKD is not set
@@ -2394,8 +2394,8 @@ CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
CONFIG_IMA_DEFAULT_HASH_SHA1=y
# CONFIG_IMA_DEFAULT_HASH_SHA256 is not set
CONFIG_IMA_DEFAULT_HASH="sha1"
# CONFIG_IMA_WRITE_POLICY is not set
# CONFIG_IMA_READ_POLICY is not set
CONFIG_IMA_WRITE_POLICY=y
CONFIG_IMA_READ_POLICY=y
# CONFIG_IMA_APPRAISE is not set
CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
@@ -2403,7 +2403,7 @@ CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
# CONFIG_EVM is not set
# CONFIG_DEFAULT_SECURITY_SELINUX is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="selinux,bpf"
CONFIG_LSM="selinux,bpf,integrity"
#
# Kernel hardening options


@@ -0,0 +1,3 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile


@@ -0,0 +1,3 @@
#!/bin/bash
printf "all:\n\ttouch bpf_testmod.ko\n\nclean:\n" > bpf_testmod/Makefile


@@ -434,8 +434,10 @@ sudo chmod 755 "$mnt/etc/rcS.d/S99-poweroff"
sudo umount "$mnt"
echo "Starting VM with $(nproc) CPUs..."
qemu-system-x86_64 -nodefaults -display none -serial mon:stdio \
-cpu kvm64 -enable-kvm -smp "$(nproc)" -m 2G \
-cpu kvm64 -enable-kvm -smp "$(nproc)" -m 4G \
-drive file="$IMG",format=raw,index=1,media=disk,if=virtio,cache=none \
-kernel "$vmlinuz" -append "root=/dev/vda rw console=ttyS0,115200$APPEND"


@@ -46,6 +46,6 @@ cd libbpf/selftests/bpf
test_progs
if [[ "${KERNEL}" == 'latest' ]]; then
test_maps
#test_maps
test_verifier
fi

File diff suppressed because it is too large