Compare commits

..

219 Commits

Author SHA1 Message Date
thiagoftsm
70599f3a1e Merge branch 'libbpf:master' into master 2022-07-15 16:43:22 +00:00
Andrii Nakryiko
b78c75fcb3 Makefile: remove xsk.c and xsk.h
xsk.{c,h} are not part of libbpf anymore, remove them from Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
f42d136c1c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   bb7a4257892717caf82fe6da45b259b35f73445c
Checkpoint bpf-next commit: b0d93b44641a83c28014ca38001e85bf6dc8501e
Baseline bpf commit:        a2b1a5d40bd12b44322c2ccd40bb0ec1699708b6
Checkpoint bpf commit:      d28b25a62a47a8c8aa19bd543863aab6717e68c9

Andrii Nakryiko (14):
  libbpf: move xsk.{c,h} into selftests/bpf
  libbpf: remove deprecated low-level APIs
  libbpf: remove deprecated XDP APIs
  libbpf: remove deprecated probing APIs
  libbpf: remove deprecated BTF APIs
  libbpf: clean up perfbuf APIs
  libbpf: remove prog_info_linear APIs
  libbpf: remove most other deprecated high-level APIs
  libbpf: remove multi-instance and custom private data APIs
  libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x
  libbpf: remove internal multi-instance prog support
  libbpf: clean up SEC() handling
  libbpf: enforce strict libbpf 1.0 behaviors
  libbpf: fix up few libbpf.map problems

Daniel Müller (1):
  bpf: Merge "types_are_compat" logic into relo_core.c

Stanislav Fomichev (4):
  bpf: per-cgroup lsm flavor
  tools/bpf: Sync btf_ids.h to tools
  libbpf: add lsm_cgoup_sock type
  libbpf: implement bpf_prog_query_opts

 include/uapi/linux/bpf.h |    4 +
 src/bpf.c                |  200 +----
 src/bpf.h                |   98 +--
 src/btf.c                |  183 +----
 src/btf.h                |   86 +--
 src/btf_dump.c           |   23 +-
 src/libbpf.c             | 1500 ++++----------------------------------
 src/libbpf.h             |  469 +-----------
 src/libbpf.map           |  114 +--
 src/libbpf_common.h      |   16 +-
 src/libbpf_internal.h    |   24 +-
 src/libbpf_legacy.h      |   28 +-
 src/libbpf_probes.c      |  125 +---
 src/netlink.c            |   62 +-
 src/relo_core.c          |   80 ++
 src/relo_core.h          |    2 +
 src/xsk.c                | 1260 --------------------------------
 src/xsk.h                |  336 ---------
 18 files changed, 339 insertions(+), 4271 deletions(-)
 delete mode 100644 src/xsk.c
 delete mode 100644 src/xsk.h

--
2.30.2
2022-07-03 20:23:34 -07:00
Stanislav Fomichev
812a95fdf7 libbpf: implement bpf_prog_query_opts
Implement bpf_prog_query_opts as a more expendable version of
bpf_prog_query. Expose new prog_attach_flags and attach_btf_func_id as
well:

* prog_attach_flags is a per-program attach_type; relevant only for
  lsm cgroup program which might have different attach_flags
  per attach_btf_id
* attach_btf_func_id is a new field expose for prog_query which
  specifies real btf function id for lsm cgroup attachments

Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-10-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Stanislav Fomichev
f9f7f2d30a libbpf: add lsm_cgoup_sock type
lsm_cgroup/ is the prefix for BPF_LSM_CGROUP.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-9-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Stanislav Fomichev
25ba007681 tools/bpf: Sync btf_ids.h to tools
Has been slowly getting out of sync, let's update it.

resolve_btfids usage has been updated to match the header changes.

Also bring new parts of tools/include/uapi/linux/bpf.h.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-8-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Stanislav Fomichev
9bdb296ec6 bpf: per-cgroup lsm flavor
Allow attaching to lsm hooks in the cgroup context.

Attaching to per-cgroup LSM works exactly like attaching
to other per-cgroup hooks. New BPF_LSM_CGROUP is added
to trigger new mode; the actual lsm hook we attach to is
signaled via existing attach_btf_id.

For the hooks that have 'struct socket' or 'struct sock' as its first
argument, we use the cgroup associated with that socket. For the rest,
we use 'current' cgroup (this is all on default hierarchy == v2 only).
Note that for some hooks that work on 'struct sock' we still
take the cgroup from 'current' because some of them work on the socket
that hasn't been properly initialized yet.

Behind the scenes, we allocate a shim program that is attached
to the trampoline and runs cgroup effective BPF programs array.
This shim has some rudimentary ref counting and can be shared
between several programs attaching to the same lsm hook from
different cgroups.

Note that this patch bloats cgroup size because we add 211
cgroup_bpf_attach_type(s) for simplicity sake. This will be
addressed in the subsequent patch.

Also note that we only add non-sleepable flavor for now. To enable
sleepable use-cases, bpf_prog_run_array_cg has to grab trace rcu,
shim programs have to be freed via trace rcu, cgroup_bpf.effective
should be also trace-rcu-managed + maybe some other changes that
I'm not aware of.

Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20220628174314.1216643-4-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
f009af7889 libbpf: fix up few libbpf.map problems
Seems like we missed to add 2 APIs to libbpf.map and another API was
misspelled. Fix it in libbpf.map.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-16-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
62e8af46d2 libbpf: enforce strict libbpf 1.0 behaviors
Remove support for legacy features and behaviors that previously had to
be disabled by calling libbpf_set_strict_mode():
  - legacy BPF map definitions are not supported now;
  - RLIMIT_MEMLOCK auto-setting, if necessary, is always on (but see
    libbpf_set_memlock_rlim());
  - program name is used for program pinning (instead of section name);
  - cleaned up error returning logic;
  - entry BPF programs should have SEC() always.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-15-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
fcd1b668c6 libbpf: clean up SEC() handling
Get rid of sloppy prefix logic and remove deprecated xdp_{devmap,cpumap}
sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-13-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
0eb12dca7e libbpf: remove internal multi-instance prog support
Clean up internals that had to deal with the possibility of
multi-instance bpf_programs. Libbpf 1.0 doesn't support this, so all
this is not necessary now and can be simplified.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-12-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
fedeba74b7 libbpf: cleanup LIBBPF_DEPRECATED_SINCE supporting macros for v0.x
Keep the LIBBPF_DEPRECATED_SINCE macro "framework" for future
deprecations, but clean up 0.x related helper macros.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-11-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
bf51e3c336 libbpf: remove multi-instance and custom private data APIs
Remove all the public APIs that are related to creating multi-instance
bpf_programs through custom preprocessing callback and generally working
with them.

Also remove all the bpf_{object,map,program}__[set_]priv() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-10-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
d8454ba8ad libbpf: remove most other deprecated high-level APIs
Remove a bunch of high-level bpf_object/bpf_map/bpf_program related
APIs. All the APIs related to private per-object/map/prog state,
program preprocessing callback, and generally everything multi-instance
related is removed in a separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
ec3bbc05c0 libbpf: remove prog_info_linear APIs
Remove prog_info_linear-related APIs previously used by perf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
d32e7ea952 libbpf: clean up perfbuf APIs
Remove deprecated perfbuf APIs and clean up opts structs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
6abeb4203d libbpf: remove deprecated BTF APIs
Get rid of deprecated BTF-related APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
e28a540c59 libbpf: remove deprecated probing APIs
Get rid of deprecated feature-probing APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
e8802d6319 libbpf: remove deprecated XDP APIs
Get rid of deprecated bpf_set_link*() and bpf_get_link*() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
9476dce6fe libbpf: remove deprecated low-level APIs
Drop low-level APIs as well as high-level (and very confusingly named)
BPF object loading bpf_prog_load_xattr() and bpf_prog_load_deprecated()
APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Andrii Nakryiko
8ee1202ff4 libbpf: move xsk.{c,h} into selftests/bpf
Remove deprecated xsk APIs from libbpf. But given we have selftests
relying on this, move those files (with minimal adjustments to make them
compilable) under selftests/bpf.

We also remove all the removed APIs from libbpf.map, while overall
keeping version inheritance chain, as most APIs are backwards
compatible so there is no need to reassign them as LIBBPF_1.0.0 versions.

Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220627211527.2245459-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-03 20:23:34 -07:00
Daniel Müller
7013b92fef bpf: Merge "types_are_compat" logic into relo_core.c
BPF type compatibility checks (bpf_core_types_are_compat()) are
currently duplicated between kernel and user space. That's a historical
artifact more than intentional doing and can lead to subtle bugs where
one implementation is adjusted but another is forgotten.

That happened with the enum64 work, for example, where the libbpf side
was changed (commit 23b2a3a8f63a ("libbpf: Add enum64 relocation
support")) to use the btf_kind_core_compat() helper function but the
kernel side was not (commit 6089fb325cf7 ("bpf: Add btf enum64
support")).

This patch addresses both the duplication issue, by merging both
implementations and moving them into relo_core.c, and fixes the alluded
to kind check (by giving preference to libbpf's already adjusted logic).

For discussion of the topic, please refer to:
https://lore.kernel.org/bpf/CAADnVQKbWR7oarBdewgOBZUPzryhRYvEbkhyPJQHHuxq=0K1gw@mail.gmail.com/T/#mcc99f4a33ad9a322afaf1b9276fb1f0b7add9665

Changelog:
v1 -> v2:
- limited libbpf recursion limit to 32
- changed name to __bpf_core_types_are_compat
- included warning previously present in libbpf version
- merged kernel and user space changes into a single patch

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220623182934.2582827-1-deso@posteo.net
2022-07-03 20:23:34 -07:00
Daniel Müller
20f0330235 Remove unused .travis.yml configuration
Checking earlier pull requests, to the best of my understanding nothing
is using Travis anymore -- all CI checks are GitHub Actions based.
Further checking the Travis repository [0] the last CI run there was 2
years ago.
Hence, let's remove stale configuration for Travis, as it's seemingly
only bitrotting and causing confusion.

[0]: https://travis-ci.org/github/libbpf/libbpf/builds

Signed-off-by: Daniel Müller <deso@posteo.net>
2022-06-28 18:26:00 -07:00
Andrii Nakryiko
29869d6ef0 ci: disable attach_probe test on 5.5
It's assuming kprobe w/ sleepable flag is loadable, which is failing on
5.5 kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-24 13:32:31 -07:00
Andrii Nakryiko
72dbaf2ac3 ci: update vmlinux.h for 5.5 and 4.9 kernels
Update vmlinux.h to fix selftests build on 5.5 and 4.9 kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-24 13:32:31 -07:00
Andrii Nakryiko
bc3673cdd5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3e6fe5ce4d4860c3a111c246fddc6f31492f4fb0
Checkpoint bpf-next commit: bb7a4257892717caf82fe6da45b259b35f73445c
Baseline bpf commit:        5e0b0a4c52d30bb09659446f40b77a692361600d
Checkpoint bpf commit:      a2b1a5d40bd12b44322c2ccd40bb0ec1699708b6

Delyan Kratunov (1):
  libbpf: add support for sleepable uprobe programs

Maxim Mikityanskiy (2):
  bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
  bpf: Add helpers to issue and check SYN cookies in XDP

 include/uapi/linux/bpf.h | 88 ++++++++++++++++++++++++++++++++++++++--
 src/libbpf.c             |  5 ++-
 2 files changed, 88 insertions(+), 5 deletions(-)

--
2.30.2
2022-06-24 13:32:31 -07:00
Andrii Nakryiko
78909b8caf sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-06-24 13:32:31 -07:00
Maxim Mikityanskiy
ec718073b0 bpf: Add helpers to issue and check SYN cookies in XDP
The new helpers bpf_tcp_raw_{gen,check}_syncookie_ipv{4,6} allow an XDP
program to generate SYN cookies in response to TCP SYN packets and to
check those cookies upon receiving the first ACK packet (the final
packet of the TCP handshake).

Unlike bpf_tcp_{gen,check}_syncookie these new helpers don't need a
listening socket on the local machine, which allows to use them together
with synproxy to accelerate SYN cookie generation.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220615134847.3753567-4-maximmi@nvidia.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-24 13:32:31 -07:00
Maxim Mikityanskiy
9c73b6d422 bpf: Fix documentation of th_len in bpf_tcp_{gen,check}_syncookie
bpf_tcp_gen_syncookie expects the full length of the TCP header (with
all options), and bpf_tcp_check_syncookie accepts lengths bigger than
sizeof(struct tcphdr). Fix the documentation that says these lengths
should be exactly sizeof(struct tcphdr).

While at it, fix a typo in the name of struct ipv6hdr.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220615134847.3753567-2-maximmi@nvidia.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-24 13:32:31 -07:00
Delyan Kratunov
0c84902331 libbpf: add support for sleepable uprobe programs
Add section mappings for u(ret)probe.s programs.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Link: https://lore.kernel.org/r/aedbc3b74f3523f00010a7b0df8f3388cca59f16.1655248076.git.delyank@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-24 13:32:31 -07:00
Roberto Sassu
4cb682229d configs: Enable CONFIG_MODULE_SIG
Enable CONFIG_MODULE_SIG to test the new helper
bpf_verify_pkcs7_signature().

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
2022-06-17 22:05:28 -07:00
Eyal Birger
0304a3c027 ci: enable vrf configs for x86_64
Set CONFIG_NET_L3_MASTER_DEV=y, CONFIG_NET_VRF=y for x86_64.

These options are needed for performing LWT BPF tests in test_progs.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
2022-06-17 09:58:20 -07:00
Mykola Lysenko
a459010926 ci: temporarily disable varlen test 2022-06-17 09:47:40 -07:00
Andrii Nakryiko
e5ff285a44 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   fe92833524e368e59bba9c57e00f7359f133667f
Checkpoint bpf-next commit: 3e6fe5ce4d4860c3a111c246fddc6f31492f4fb0
Baseline bpf commit:        825464e79db4aac936e0fdae62cdfb7546d0028f
Checkpoint bpf commit:      5e0b0a4c52d30bb09659446f40b77a692361600d

Andrii Nakryiko (1):
  libbpf: Fix internal USDT address translation logic for shared
    libraries

Yonghong Song (1):
  libbpf: Fix an unsigned < 0 bug

 src/libbpf.c |   2 +-
 src/usdt.c   | 123 ++++++++++++++++++++++++++-------------------------
 2 files changed, 64 insertions(+), 61 deletions(-)

--
2.30.2
2022-06-16 16:58:52 -07:00
Andrii Nakryiko
2d91c46d1a libbpf: Fix internal USDT address translation logic for shared libraries
Perform the same virtual address to file offset translation that libbpf
is doing for executable ELF binaries also for shared libraries.
Currently libbpf is making a simplifying and sometimes wrong assumption
that for shared libraries relative virtual addresses inside ELF are
always equal to file offsets.

Unfortunately, this is not always the case with LLVM's lld linker, which
now by default generates quite more complicated ELF segments layout.
E.g., for liburandom_read.so from selftests/bpf, here's an excerpt from
readelf output listing ELF segments (a.k.a. program headers):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0001f8 0x0001f8 R   0x8
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0005e4 0x0005e4 R   0x1000
  LOAD           0x0005f0 0x00000000000015f0 0x00000000000015f0 0x000160 0x000160 R E 0x1000
  LOAD           0x000750 0x0000000000002750 0x0000000000002750 0x000210 0x000210 RW  0x1000
  LOAD           0x000960 0x0000000000003960 0x0000000000003960 0x000028 0x000029 RW  0x1000

Compare that to what is generated by GNU ld (or LLVM lld's with extra
-znoseparate-code argument which disables this cleverness in the name of
file size reduction):

  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x000550 0x000550 R   0x1000
  LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000131 0x000131 R E 0x1000
  LOAD           0x002000 0x0000000000002000 0x0000000000002000 0x0000ac 0x0000ac R   0x1000
  LOAD           0x002dc0 0x0000000000003dc0 0x0000000000003dc0 0x000262 0x000268 RW  0x1000

You can see from the first example above that for executable (Flg == "R E")
PT_LOAD segment (LOAD #2), Offset doesn't match VirtAddr columns.
And it does in the second case (GNU ld output).

This is important because all the addresses, including USDT specs,
operate in a virtual address space, while kernel is expecting file
offsets when performing uprobe attach. So such mismatches have to be
properly taken care of and compensated by libbpf, which is what this
patch is fixing.

Also patch clarifies few function and variable names, as well as updates
comments to reflect this important distinction (virtaddr vs file offset)
and to ephasize that shared libraries are not all that different from
executables in this regard.

This patch also changes selftests/bpf Makefile to force urand_read and
liburand_read.so to be built with Clang and LLVM's lld (and explicitly
request this ELF file size optimization through -znoseparate-code linker
parameter) to validate libbpf logic and ensure regressions don't happen
in the future. I've bundled these selftests changes together with libbpf
changes to keep the above description tied with both libbpf and
selftests changes.

Fixes: 74cc6311cec9 ("libbpf: Add USDT notes parsing and resolution logic")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220616055543.3285835-1-andrii@kernel.org
2022-06-16 16:58:52 -07:00
Yonghong Song
d3e41fc1aa libbpf: Fix an unsigned < 0 bug
Andrii reported a bug with the following information:

  2859 	if (enum64_placeholder_id == 0) {
  2860 		enum64_placeholder_id = btf__add_int(btf, "enum64_placeholder", 1, 0);
  >>>     CID 394804:  Control flow issues  (NO_EFFECT)
  >>>     This less-than-zero comparison of an unsigned value is never true. "enum64_placeholder_id < 0U".
  2861 		if (enum64_placeholder_id < 0)
  2862 			return enum64_placeholder_id;
  2863    	...

Here enum64_placeholder_id declared as '__u32' so enum64_placeholder_id < 0
is always false. Declare enum64_placeholder_id as 'int' in order to capture
the potential error properly.

Fixes: f2a625889bb8 ("libbpf: Add enum64 sanitization")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220613054314.1251905-1-yhs@fb.com
2022-06-16 16:58:52 -07:00
Andrii Nakryiko
645500dd7d ci: blacklist mptcp test on s390x
It is also blacklisted in kernel-patches CI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-10 14:13:02 -07:00
Andrii Nakryiko
5497411f48 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   02f4afebf8a54ba16f99f4f6ca10df3efeac6229
Checkpoint bpf-next commit: fe92833524e368e59bba9c57e00f7359f133667f
Baseline bpf commit:        d08af2c46881b62f4efad8ebb7eae381fa1f1033
Checkpoint bpf commit:      825464e79db4aac936e0fdae62cdfb7546d0028f

Andrii Nakryiko (1):
  libbpf: Fix uprobe symbol file offset calculation logic

Yonghong Song (10):
  bpf: Add btf enum64 support
  libbpf: Permit 64bit relocation value
  libbpf: Fix an error in 64bit relocation value computation
  libbpf: Refactor btf__add_enum() for future code sharing
  libbpf: Add enum64 parsing and new enum64 public API
  libbpf: Add enum64 deduplication support
  libbpf: Add enum64 support for btf_dump
  libbpf: Add enum64 sanitization
  libbpf: Add enum64 support for bpf linking
  libbpf: Add enum64 relocation support

 include/uapi/linux/btf.h |  17 +++-
 src/btf.c                | 201 +++++++++++++++++++++++++++++++++++----
 src/btf.h                |  32 ++++++-
 src/btf_dump.c           | 137 +++++++++++++++++++-------
 src/libbpf.c             | 126 ++++++++++++++----------
 src/libbpf.map           |   2 +
 src/libbpf_internal.h    |   2 +
 src/linker.c             |   2 +
 src/relo_core.c          | 105 ++++++++++++--------
 src/relo_core.h          |   4 +-
 10 files changed, 483 insertions(+), 145 deletions(-)

--
2.30.2
2022-06-10 14:13:02 -07:00
Andrii Nakryiko
74b22b6c8a libbpf: Fix uprobe symbol file offset calculation logic
Fix libbpf's bpf_program__attach_uprobe() logic of determining
function's *file offset* (which is what kernel is actually expecting)
when attaching uprobe/uretprobe by function name. Previously calculation
was determining virtual address offset relative to base load address,
which (offset) is not always the same as file offset (though very
frequently it is which is why this went unnoticed for a while).

Fixes: 433966e3ae04 ("libbpf: Support function name-based attach uprobes")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Riham Selim <rihams@fb.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220606220143.3796908-1-andrii@kernel.org
2022-06-10 14:13:02 -07:00
Yonghong Song
416351822c libbpf: Add enum64 relocation support
The enum64 relocation support is added. The bpf local type
could be either enum or enum64 and the remote type could be
either enum or enum64 too. The all combinations of local enum/enum64
and remote enum/enum64 are supported.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
3f9d041e19 libbpf: Add enum64 support for bpf linking
Add BTF_KIND_ENUM64 support for bpf linking, which is
very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
a945df2439 libbpf: Add enum64 sanitization
When old kernel does not support enum64 but user space btf
contains non-zero enum kflag or enum64, libbpf needs to
do proper sanitization so modified btf can be accepted
by the kernel.

Sanitization for enum kflag can be achieved by clearing
the kflag bit. For enum64, the type is replaced with an
union of integer member types and the integer member size
must be smaller than enum64 size. If such an integer
type cannot be found, a new type is created and used
for union members.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
f429a582bf libbpf: Add enum64 support for btf_dump
Add enum64 btf dumping support. For long long and unsigned long long
dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors
in some cases.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
25238de149 libbpf: Add enum64 deduplication support
Add enum64 deduplication support. BTF_KIND_ENUM64 handling
is very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062626.3720166-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
c3f8eecb16 libbpf: Add enum64 parsing and new enum64 public API
Add enum64 parsing support and two new enum64 public APIs:
  btf__add_enum64
  btf__add_enum64_value

Also add support of signedness for BTF_KIND_ENUM. The
BTF_KIND_ENUM API signatures are not changed. The signedness
will be changed from unsigned to signed if btf__add_enum_value()
finds any negative values.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
25fd7a1cf5 libbpf: Refactor btf__add_enum() for future code sharing
Refactor btf__add_enum() function to create a separate
function btf_add_enum_common() so later the common function
can be used to add enum64 btf type. There is no functionality
change for this patch.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
0167a88355 libbpf: Fix an error in 64bit relocation value computation
Currently, the 64bit relocation value in the instruction
is computed as follows:
  __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32)

Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1.
With the above computation, insn[0].imm will first sign-extend
to 64bit -1 (0xffffffffFFFFFFFF) and then add 0x1FFFFFFFF,
producing incorrect value 0xFFFFFFFF. The correct value
should be 0x1FFFFFFFF.

Changing insn[0].imm to __u32 first will prevent 64bit sign
extension and fix the issue. Merging high and low 32bit values
also changed from '+' to '|' to be consistent with other
similar occurences in kernel and libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
23e3d8cf31 libbpf: Permit 64bit relocation value
Currently, the libbpf limits the relocation value to be 32bit
since all current relocations have such a limit. But with
BTF_KIND_ENUM64 support, the enum value could be 64bit.
So let us permit 64bit relocation value in libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Yonghong Song
9a976c6b98 bpf: Add btf enum64 support
Currently, BTF only supports upto 32bit enum value with BTF_KIND_ENUM.
But in kernel, some enum indeed has 64bit values, e.g.,
in uapi bpf.h, we have
  enum {
        BPF_F_INDEX_MASK                = 0xffffffffULL,
        BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
        BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
  };
In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK
as 0, which certainly is incorrect.

This patch added a new btf kind, BTF_KIND_ENUM64, which permits
64bit value to cover the above use case. The BTF_KIND_ENUM64 has
the following three fields followed by the common type:
  struct bpf_enum64 {
    __u32 nume_off;
    __u32 val_lo32;
    __u32 val_hi32;
  };
Currently, btf type section has an alignment of 4 as all element types
are u32. Representing the value with __u64 will introduce a pad
for bpf_enum64 and may also introduce misalignment for the 64bit value.
Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues.

The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64
to indicate whether the value is signed or unsigned. The kflag intends
to provide consistent output of BTF C fortmat with the original
source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff.
The format C has two choices, printing out 0xffffffff or -1 and current libbpf
prints out as unsigned value. But if the signedness is preserved in btf,
the value can be printed the same as the original source code.
The kflag value 0 means unsigned values, which is consistent to the default
by libbpf and should also cover most cases as well.

The new BTF_KIND_ENUM64 is intended to support the enum value represented as
64bit value. But it can represent all BTF_KIND_ENUM values as well.
The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has
to be represented with 64 bits.

In addition, a static inline function btf_kind_core_compat() is introduced which
will be used later when libbpf relo_core.c changed. Here the kernel shares the
same relo_core.c with libbpf.

  [1] https://reviews.llvm.org/D124641

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-10 14:13:02 -07:00
Andrii Nakryiko
e93b1010f3 ci: disable unpriv_bpf_disabled test on s390x
Seems like it's relying on fentry which is not supported on s390x.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
76fc1ad6d5 ci: make sure to not override CFLAGS
Use EXTRA_CFLAGS instead of overriding CFLAGS.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
33c5f2bec3 libbpf: bump Makefile version to 1.0.0 to match libbpf.map
We are now in v1.0 dev cycle.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
d4998cbb6c ci: update Kconfigs to make all selftests working
Also disable fexit_stress which is using test_run's support for TRACING
progs now.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
eb1d1ad83f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ac6a65868a5a45db49d5ee8524df3b701110d844
Checkpoint bpf-next commit: 02f4afebf8a54ba16f99f4f6ca10df3efeac6229
Baseline bpf commit:        f3f19f939c11925dadd3f4776f99f8c278a7017b
Checkpoint bpf commit:      d08af2c46881b62f4efad8ebb7eae381fa1f1033

Andrii Nakryiko (2):
  libbpf: start 1.0 development cycle
  libbpf: remove bpf_create_map*() APIs

Daniel Müller (5):
  libbpf: Introduce libbpf_bpf_prog_type_str
  libbpf: Introduce libbpf_bpf_map_type_str
  libbpf: Introduce libbpf_bpf_attach_type_str
  libbpf: Introduce libbpf_bpf_link_type_str
  libbpf: Fix a couple of typos

Douglas Raillard (1):
  libbpf: Fix determine_ptr_size() guessing

Eric Dumazet (1):
  net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes

Geliang Tang (1):
  bpf: Add bpf_skc_to_mptcp_sock_proto

Joanne Koong (5):
  bpf: Add verifier support for dynptrs
  bpf: Add bpf_dynptr_from_mem for local dynptrs
  bpf: Dynptr support for ring buffers
  bpf: Add bpf_dynptr_read and bpf_dynptr_write
  bpf: Add dynptr data slices

Julia Lawall (1):
  libbpf: Fix typo in comment

Yuze Chi (1):
  libbpf: Fix is_pow_of_2

 include/uapi/linux/bpf.h     |  90 +++++++++++++++++++
 include/uapi/linux/if_link.h |   2 +
 src/bpf.c                    |  80 -----------------
 src/bpf.h                    |  42 ---------
 src/btf.c                    |  28 ++++--
 src/libbpf.c                 | 167 +++++++++++++++++++++++++++++++++--
 src/libbpf.h                 |  38 +++++++-
 src/libbpf.map               |  10 +++
 src/libbpf_internal.h        |   5 ++
 src/libbpf_version.h         |   4 +-
 src/linker.c                 |   5 --
 src/relo_core.c              |   8 +-
 12 files changed, 332 insertions(+), 147 deletions(-)

--
2.30.2
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
8aa946389d sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-06-07 17:39:28 -07:00
Yuze Chi
ad0783c430 libbpf: Fix is_pow_of_2
Move the correct definition from linker.c into libbpf_internal.h.

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2022-06-07 17:39:28 -07:00
Daniel Müller
55638904af libbpf: Fix a couple of typos
This change fixes a couple of typos that were encountered while studying
the source code.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220601154025.3295035-1-deso@posteo.net
2022-06-07 17:39:28 -07:00
Douglas Raillard
a5d75daa8c libbpf: Fix determine_ptr_size() guessing
One strategy employed by libbpf to guess the pointer size is by finding
the size of "unsigned long" type. This is achieved by looking for a type
of with the expected name and checking its size.

Unfortunately, the C syntax is friendlier to humans than to computers
as there is some variety in how such a type can be named. Specifically,
gcc and clang do not use the same names for integer types in debug info:

    - clang uses "unsigned long"
    - gcc uses "long unsigned int"

Lookup all the names for such a type so that libbpf can hope to find the
information it wants.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com
2022-06-07 17:39:28 -07:00
Daniel Müller
37218f49fa libbpf: Introduce libbpf_bpf_link_type_str
This change introduces a new function, libbpf_bpf_link_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_link_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2022-06-07 17:39:28 -07:00
Daniel Müller
bdbce77631 libbpf: Introduce libbpf_bpf_attach_type_str
This change introduces a new function, libbpf_bpf_attach_type_str, to
the public libbpf API. The function allows users to get a string
representation for a bpf_attach_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2022-06-07 17:39:28 -07:00
Daniel Müller
242c116f04 libbpf: Introduce libbpf_bpf_map_type_str
This change introduces a new function, libbpf_bpf_map_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_map_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2022-06-07 17:39:28 -07:00
Daniel Müller
4d9cd51e7e libbpf: Introduce libbpf_bpf_prog_type_str
This change introduces a new function, libbpf_bpf_prog_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_prog_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
2022-06-07 17:39:28 -07:00
Joanne Koong
f035838503 bpf: Add dynptr data slices
This patch adds a new helper function

void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len);

which returns a pointer to the underlying data of a dynptr. *len*
must be a statically known value. The bpf program may access the returned
data slice as a normal buffer (eg can do direct reads and writes), since
the verifier associates the length with the returned pointer, and
enforces that no out of bounds accesses occur.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-6-joannelkoong@gmail.com
2022-06-07 17:39:28 -07:00
Joanne Koong
7ed5bf8f4c bpf: Add bpf_dynptr_read and bpf_dynptr_write
This patch adds two helper functions, bpf_dynptr_read and
bpf_dynptr_write:

long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset);

long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len);

The dynptr passed into these functions must be valid dynptrs that have
been initialized.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-5-joannelkoong@gmail.com
2022-06-07 17:39:28 -07:00
Joanne Koong
1a0f5d1c87 bpf: Dynptr support for ring buffers
Currently, our only way of writing dynamically-sized data into a ring
buffer is through bpf_ringbuf_output but this incurs an extra memcpy
cost. bpf_ringbuf_reserve + bpf_ringbuf_commit avoids this extra
memcpy, but it can only safely support reservation sizes that are
statically known since the verifier cannot guarantee that the bpf
program won’t access memory outside the reserved space.

The bpf_dynptr abstraction allows for dynamically-sized ring buffer
reservations without the extra memcpy.

There are 3 new APIs:

long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr);
void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags);
void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags);

These closely follow the functionalities of the original ringbuf APIs.
For example, all ringbuffer dynptrs that have been reserved must be
either submitted or discarded before the program exits.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com
2022-06-07 17:39:28 -07:00
Joanne Koong
c68a2738fd bpf: Add bpf_dynptr_from_mem for local dynptrs
This patch adds a new api bpf_dynptr_from_mem:

long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr);

which initializes a dynptr to point to a bpf program's local memory. For now
only local memory that is of reg type PTR_TO_MAP_VALUE is supported.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com
2022-06-07 17:39:28 -07:00
Joanne Koong
97009215cb bpf: Add verifier support for dynptrs
This patch adds the bulk of the verifier work for supporting dynamic
pointers (dynptrs) in bpf.

A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure
defined internally as:

struct bpf_dynptr_kern {
    void *data;
    u32 size;
    u32 offset;
} __aligned(8);

The upper 8 bits of *size* is reserved (it contains extra metadata about
read-only status and dynptr type). Consequently, a dynptr only supports
memory less than 16 MB.

There are different types of dynptrs (eg malloc, ringbuf, ...). In this
patchset, the most basic one, dynptrs to a bpf program's local memory,
is added. For now only local memory that is of reg type PTR_TO_MAP_VALUE
is supported.

In the verifier, dynptr state information will be tracked in stack
slots. When the program passes in an uninitialized dynptr
(ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots corresponding
to the frame pointer where the dynptr resides at are marked
STACK_DYNPTR. For helper functions that take in initialized dynptrs (eg
bpf_dynptr_read + bpf_dynptr_write which are added later in this
patchset), the verifier enforces that the dynptr has been initialized
properly by checking that their corresponding stack slots have been
marked as STACK_DYNPTR.

The 6th patch in this patchset adds test cases that the verifier should
successfully reject, such as for example attempting to use a dynptr
after doing a direct write into it inside the bpf program.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
2022-06-07 17:39:28 -07:00
Julia Lawall
4c39a3e1aa libbpf: Fix typo in comment
Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr
2022-06-07 17:39:28 -07:00
Geliang Tang
cb11988cf4 bpf: Add bpf_skc_to_mptcp_sock_proto
This patch implements a new struct bpf_func_proto, named
bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP,
and a new helper bpf_skc_to_mptcp_sock(), which invokes another new
helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct
mptcp_sock from a given subflow socket.

v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT
v5: Drop EXPORT_SYMBOL (Martin)

Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
7e8d4234ac libbpf: remove bpf_create_map*() APIs
To test API removal, get rid of bpf_create_map*() APIs. Perf defines
__weak implementation of bpf_map_create() that redirects to old
bpf_create_map() and that seems to compile and run fine.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 17:39:28 -07:00
Andrii Nakryiko
00f40c01fb libbpf: start 1.0 development cycle
Start libbpf 1.0 development cycle by adding LIBBPF_1.0.0 section to
libbpf.map file and marking all current symbols as local. As we remove
all the deprecated APIs we'll populate global list before the final 1.0
release.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-07 17:39:28 -07:00
Eric Dumazet
881eba7ef5 net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes
New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS
are used to report to user-space the device TSO limits.

ip -d link sh dev eth1
...
   tso_max_size 65536 tso_max_segs 65535

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-07 17:39:28 -07:00
wangjie
4eb6485c08 Makefile: add support for cross compilation
Support CROSS_COMPILE and EXTRA_CFLAGS/EXTRA_LDFLAGS environments,
to make cross compiling more flexible.

Signed-off-by: Jie Wang <wangjie22@lixiang.com>
2022-05-24 23:24:54 -07:00
Ilya Leoshkevich
eaf9123419 vmtest: add netfilter to s390x config
This is required for the new synproxy test.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-05-23 17:39:33 -07:00
Ilya Leoshkevich
cc904c1a74 vmtest: keep coreutils
Kernel's vmtest.sh uses stdbuf, which is unfortunately not present in
busybox. Do not delete coreutils, which has it. As a result, the
compressed image grows by 1M (~5%).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-05-23 17:39:33 -07:00
Ilya Leoshkevich
f3b96c873d vmtest: add iptables
iptables is required by the new selftests for raw syncookie helpers.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-05-23 17:39:33 -07:00
Maxim Mikityanskiy
47595c2f08 ci: blacklist xdp_syncookie on s390x
The xdp_syncookie test uses kfunc, and BPF JIT doesn't support kfunc on
s390x.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-05-20 17:14:41 -07:00
thiagoftsm
e4f2e6e865 Merge branch 'libbpf:master' into master 2022-05-17 02:16:02 +00:00
Andrii Nakryiko
86eb09863c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b2531d4bdce19f28364b45aac9132e153b1f23a4
Checkpoint bpf-next commit: ac6a65868a5a45db49d5ee8524df3b701110d844
Baseline bpf commit:        f3f19f939c11925dadd3f4776f99f8c278a7017b
Checkpoint bpf commit:      f3f19f939c11925dadd3f4776f99f8c278a7017b

Andrii Nakryiko (1):
  libbpf: fix memory leak in attach_tp for target-less tracepoint
    program

 src/libbpf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--
2.30.2
2022-05-16 13:46:05 -07:00
Andrii Nakryiko
d43fc5a42f libbpf: fix memory leak in attach_tp for target-less tracepoint program
Fix sec_name memory leak if user defines target-less SEC("tp").

Fixes: 9af8efc45eb1 ("libbpf: Allow "incomplete" basic tracing SEC() definitions")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220516184547.3204674-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-16 13:46:05 -07:00
Andrii Nakryiko
12e932ac0e ci: whitelist 'usdt' test on 5.5 and update vmlinux.h
Update vmlinux.h for latest selftests. Also whitelist usdt test on 5.5,
as it should work.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
75452cd290 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   d54d06a4c4bc5d76815d02e4b041b31d9dbb3fef
Checkpoint bpf-next commit: b2531d4bdce19f28364b45aac9132e153b1f23a4
Baseline bpf commit:        ba3beec2ec1d3b4fd8672ca6e781dac4b3267f6e
Checkpoint bpf commit:      f3f19f939c11925dadd3f4776f99f8c278a7017b

Andrii Nakryiko (12):
  libbpf: Allow "incomplete" basic tracing SEC() definitions
  libbpf: Support target-less SEC() definitions for BTF-backed programs
  libbpf: Append "..." in fixed up log if CO-RE spec is truncated
  libbpf: Use libbpf_mem_ensure() when allocating new map
  libbpf: Allow to opt-out from creating BPF maps
  libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag()
    attr
  libbpf: Improve usability of field-based CO-RE helpers
  libbpf: Complete field-based CO-RE helpers with field offset helper
  libbpf: Provide barrier() and barrier_var() in bpf_helpers.h
  libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary
  libbpf: Clean up ringbuf size adjustment implementation
  libbpf: Add safer high-level wrappers for map operations

Feng Zhou (1):
  bpf: add bpf_map_lookup_percpu_elem for percpu map

Jiri Olsa (1):
  libbpf: Add bpf_program__set_insns function

Kaixi Fan (1):
  bpf: Add source ip in "struct bpf_tunnel_key"

Kui-Feng Lee (3):
  bpf, x86: Generate trampolines from bpf_tramp_links
  bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.
  libbpf: Assign cookies to links in libbpf.

 include/uapi/linux/bpf.h |  23 ++
 src/bpf.c                |  22 ++
 src/bpf.h                |   4 +
 src/bpf_core_read.h      |  37 ++-
 src/bpf_helpers.h        |  29 ++-
 src/libbpf.c             | 473 ++++++++++++++++++++++++++++++++-------
 src/libbpf.h             | 156 +++++++++++++
 src/libbpf.map           |  12 +-
 8 files changed, 659 insertions(+), 97 deletions(-)

--
2.30.2
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
ae67bfbae3 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
650adc5118 libbpf: Add safer high-level wrappers for map operations
Add high-level API wrappers for most common and typical BPF map
operations that works directly on instances of struct bpf_map * (so
you don't have to call bpf_map__fd()) and validate key/value size
expectations.

These helpers require users to specify key (and value, where
appropriate) sizes when performing lookup/update/delete/etc. This forces
user to actually think and validate (for themselves) those. This is
a good thing as user is expected by kernel to implicitly provide correct
key/value buffer sizes and kernel will just read/write necessary amount
of data. If it so happens that user doesn't set up buffers correctly
(which bit people for per-CPU maps especially) kernel either randomly
overwrites stack data or return -EFAULT, depending on user's luck and
circumstances. These high-level APIs are meant to prevent such
unpleasant and hard to debug bugs.

This patch also adds bpf_map_delete_elem_flags() low-level API and
requires passing flags to bpf_map__delete_elem() API for consistency
across all similar APIs, even though currently kernel doesn't expect
any extra flags for BPF_MAP_DELETE_ELEM operation.

List of map operations that get these high-level APIs:

  - bpf_map_lookup_elem;
  - bpf_map_update_elem;
  - bpf_map_delete_elem;
  - bpf_map_lookup_and_delete_elem;
  - bpf_map_get_next_key.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220512220713.2617964-1-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Feng Zhou
babc92b9f1 bpf: add bpf_map_lookup_percpu_elem for percpu map
Add new ebpf helpers bpf_map_lookup_percpu_elem.

The implementation method is relatively simple, refer to the implementation
method of map_lookup_elem of percpu map, increase the parameters of cpu, and
obtain it according to the specified cpu.

Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-13 16:13:31 -07:00
Jiri Olsa
e335f3fa5f libbpf: Add bpf_program__set_insns function
Adding bpf_program__set_insns that allows to set new instructions
for a BPF program.

This is a very advanced libbpf API and users need to know what
they are doing. This should be used from prog_prepare_load_fn
callback only.

We can have changed instructions after calling prog_prepare_load_fn
callback, reloading them.

One of the users of this new API will be perf's internal BPF prologue
generation.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
7062757357 libbpf: Clean up ringbuf size adjustment implementation
Drop unused iteration variable, move overflow prevention check into the
for loop.

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Kui-Feng Lee
aec48fffee libbpf: Assign cookies to links in libbpf.
Add a cookie field to the attributes of bpf_link_create().
Add bpf_program__attach_trace_opts() to attach a cookie to a link.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com
2022-05-13 16:13:31 -07:00
Kui-Feng Lee
c116ae6130 bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.
Pass a cookie along with BPF_LINK_CREATE requests.

Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie.
The cookie of a bpf_tracing_link is available by calling
bpf_get_attach_cookie when running the BPF program of the attached
link.

The value of a cookie will be set at bpf_tramp_run_ctx by the
trampoline of the link.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com
2022-05-13 16:13:31 -07:00
Kui-Feng Lee
99b21d41e3 bpf, x86: Generate trampolines from bpf_tramp_links
Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect
struct bpf_tramp_link(s) for a trampoline.  struct bpf_tramp_link
extends bpf_link to act as a linked list node.

arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to
collects all bpf_tramp_link(s) that a trampoline should call.

Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links
instead of bpf_tramp_progs.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com
2022-05-13 16:13:31 -07:00
Kaixi Fan
7a443259de bpf: Add source ip in "struct bpf_tunnel_key"
Add tunnel source ip field in "struct bpf_tunnel_key". Add related code
to set and get tunnel source field.

Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
b3197662ba libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary
Kernel imposes a pretty particular restriction on ringbuf map size. It
has to be a power-of-2 multiple of page size. While generally this isn't
hard for user to satisfy, sometimes it's impossible to do this
declaratively in BPF source code or just plain inconvenient to do at
runtime.

One such example might be BPF libraries that are supposed to work on
different architectures, which might not agree on what the common page
size is.

Let libbpf find the right size for user instead, if it turns out to not
satisfy kernel requirements. If user didn't set size at all, that's most
probably a mistake so don't upsize such zero size to one full page,
though. Also we need to be careful about not overflowing __u32
max_entries.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
486b1a080b libbpf: Provide barrier() and barrier_var() in bpf_helpers.h
Add barrier() and barrier_var() macros into bpf_helpers.h to be used by
end users. While a bit advanced and specialized instruments, they are
sometimes indispensable. Instead of requiring each user to figure out
exact asm volatile incantations for themselves, provide them from
bpf_helpers.h.

Also remove conflicting definitions from selftests. Some tests rely on
barrier_var() definition being nothing, those will still work as libbpf
does the #ifndef/#endif guarding for barrier() and barrier_var(),
allowing users to redefine them, if necessary.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-8-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
ba9850c048 libbpf: Complete field-based CO-RE helpers with field offset helper
Add bpf_core_field_offset() helper to complete field-based CO-RE
helpers. This helper can be useful for feature-detection and for some
more advanced cases of field reading (e.g., reading flexible array members).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-6-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
5c1d6799df libbpf: Improve usability of field-based CO-RE helpers
Allow to specify field reference in two ways:

  - if user has variable of necessary type, they can use variable-based
    reference (my_var.my_field or my_var_ptr->my_field). This was the
    only supported syntax up till now.
  - now, bpf_core_field_exists() and bpf_core_field_size() support also
    specifying field in a fashion similar to offsetof() macro, by
    specifying type of the containing struct/union separately and field
    name separately: bpf_core_field_exists(struct my_type, my_field).
    This forms is quite often more convenient in practice and it matches
    type-based CO-RE helpers that support specifying type by its name
    without requiring any variables.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-4-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
1f30788b41 libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr
It will be annoying and surprising for users of __kptr and __kptr_ref if
libbpf silently ignores them just because Clang used for compilation
didn't support btf_type_tag(). It's much better to get clear compiler
error than debug BPF verifier failures later on.

Fixes: ef89654f2bc7 ("libbpf: Add kptr type tag macros to bpf_helpers.h")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-3-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
a8bc578af9 libbpf: Allow to opt-out from creating BPF maps
Add bpf_map__set_autocreate() API that allows user to opt-out from
libbpf automatically creating BPF map during BPF object load.

This is a useful feature when building CO-RE-enabled BPF application
that takes advantage of some new-ish BPF map type (e.g., socket-local
storage) if kernel supports it, but otherwise uses some alternative way
(e.g., extra HASH map). In such case, being able to disable the creation
of a map that kernel doesn't support allows to successfully create and
load BPF object file with all its other maps and programs.

It's still up to user to make sure that no "live" code in any of their BPF
programs are referencing such map instance, which can be achieved by
guarding such code with CO-RE relocation check or by using .rodata
global variables.

If user fails to properly guard such code to turn it into "dead code",
libbpf will helpfully post-process BPF verifier log and will provide
more meaningful error and map name that needs to be guarded properly. As
such, instead of:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: (85) call unknown#2001000000
  invalid func unknown#2001000000

... user will see:

  ; value = bpf_map_lookup_elem(&missing_map, &zero);
  4: <invalid BPF map reference>
  BPF map 'missing_map' is referenced but wasn't created

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-4-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
d46f1aaa7c libbpf: Use libbpf_mem_ensure() when allocating new map
Reuse libbpf_mem_ensure() when adding a new map to the list of maps
inside bpf_object. It takes care of proper resizing and reallocating of
map array and zeroing out newly allocated memory.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-3-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
1a18c6f051 libbpf: Append "..." in fixed up log if CO-RE spec is truncated
Detect CO-RE spec truncation and append "..." to make user aware that
there was supposed to be more of the spec there.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220428041523.4089853-2-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
97ab064bc0 libbpf: Support target-less SEC() definitions for BTF-backed programs
Similar to previous patch, support target-less definitions like
SEC("fentry"), SEC("freplace"), etc. For such BTF-backed program types
it is expected that user will specify BTF target programmatically at
runtime using bpf_program__set_attach_target() *before* load phase. If
not, libbpf will report this as an error.

Aslo use SEC_ATTACH_BTF flag instead of explicitly listing a set of
types that are expected to require attach_btf_id. This was an accidental
omission during custom SEC() support refactoring.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-3-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Andrii Nakryiko
eee09dc704 libbpf: Allow "incomplete" basic tracing SEC() definitions
In a lot of cases the target of kprobe/kretprobe, tracepoint, raw
tracepoint, etc BPF program might not be known at the compilation time
and will be discovered at runtime. This was always a supported case by
libbpf, with APIs like bpf_program__attach_{kprobe,tracepoint,etc}()
accepting full target definition, regardless of what was defined in
SEC() definition in BPF source code.

Unfortunately, up till now libbpf still enforced users to specify at
least something for the fake target, e.g., SEC("kprobe/whatever"), which
is cumbersome and somewhat misleading.

This patch allows target-less SEC() definitions for basic tracing BPF
program types:

  - kprobe/kretprobe;
  - multi-kprobe/multi-kretprobe;
  - tracepoints;
  - raw tracepoints.

Such target-less SEC() definitions are meant to specify declaratively
proper BPF program type only. Attachment of them will have to be handled
programmatically using correct APIs. As such, skeleton's auto-attachment
of such BPF programs is skipped and generic bpf_program__attach() will
fail, if attempted, due to the lack of enough target information.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220428185349.3799599-2-andrii@kernel.org
2022-05-13 16:13:31 -07:00
Ilya Leoshkevich
87dff0a2c7 vmtest: allow building foreign debian rootfs
This would allow building s390x images without access to an IBM Z.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-05-09 12:17:29 -07:00
Ilya Leoshkevich
14777c3784 vmtest: use debian bookworm
A newer iproute2 version is required for MPTCP tests. Use a newer
distro version, which has it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-05-09 12:17:29 -07:00
Andrii Nakryiko
3a4e26307d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   34ba23b44c664792a4308ec37b5788a3162944ec
Checkpoint bpf-next commit: d54d06a4c4bc5d76815d02e4b041b31d9dbb3fef
Baseline bpf commit:        8de8b71b787f38983d414d2dba169a3bfefa668a
Checkpoint bpf commit:      ba3beec2ec1d3b4fd8672ca6e781dac4b3267f6e

Alan Maguire (1):
  libbpf: Usdt aarch64 arg parsing support

Andrii Nakryiko (10):
  libbpf: Support opting out from autoloading BPF programs declaratively
  libbpf: Teach bpf_link_create() to fallback to
    bpf_raw_tracepoint_open()
  libbpf: Fix anonymous type check in CO-RE logic
  libbpf: Drop unhelpful "program too large" guess
  libbpf: Fix logic for finding matching program for CO-RE relocation
  libbpf: Avoid joining .BTF.ext data with BPF programs by section name
  libbpf: Record subprog-resolved CO-RE relocations unconditionally
  libbpf: Refactor CO-RE relo human description formatting routine
  libbpf: Simplify bpf_core_parse_spec() signature
  libbpf: Fix up verifier log for unguarded failed CO-RE relos

Gaosheng Cui (1):
  libbpf: Remove redundant non-null checks on obj_elf

Grant Seltzer (4):
  libbpf: Add error returns to two API functions
  libbpf: Update API functions usage to check error
  libbpf: Add documentation to API functions
  libbpf: Improve libbpf API documentation link position

Kumar Kartikeya Dwivedi (2):
  bpf: Allow storing referenced kptr in map
  libbpf: Add kptr type tag macros to bpf_helpers.h

Pu Lehui (2):
  libbpf: Fix usdt_cookie being cast to 32 bits
  libbpf: Support riscv USDT argument parsing logic

Runqing Yang (1):
  libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old
    kernels

Vladimir Isaev (1):
  libbpf: Add ARC support to bpf_tracing.h

Yuntao Wang (1):
  libbpf: Remove unnecessary type cast

 docs/index.rst           |   3 +-
 include/uapi/linux/bpf.h |  12 ++
 src/bpf.c                |  34 ++++-
 src/bpf_helpers.h        |   7 +
 src/bpf_tracing.h        |  23 +++
 src/btf.c                |   9 +-
 src/libbpf.c             | 322 ++++++++++++++++++++++++++++++---------
 src/libbpf.h             |  82 +++++++++-
 src/libbpf_internal.h    |   9 +-
 src/relo_core.c          | 104 +++++++------
 src/relo_core.h          |   6 +
 src/usdt.c               | 191 ++++++++++++++++++++++-
 12 files changed, 668 insertions(+), 134 deletions(-)

--
2.30.2
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
ef6f1fdfff sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
c3f58eb6cf libbpf: Fix up verifier log for unguarded failed CO-RE relos
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.

Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be property guarded in BPF code such that BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:

  123: (85) call unknown#195896080
  invalid func unknown#195896080

Such incomprehensible log error is due to libbpf "poisoning" BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).

Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:

  5: <invalid CO-RE relocation>
  failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)

This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.

This BPF verifier log fixup is setup to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of code looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.

If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:

  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

... truncating this on program load error isn't too bad (end user can
always increase log size, if it needs to get complete log).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
2c3a55bfe7 libbpf: Simplify bpf_core_parse_spec() signature
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
e2d8a820cb libbpf: Refactor CO-RE relo human description formatting routine
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.

It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
aaaeea6499 libbpf: Record subprog-resolved CO-RE relocations unconditionally
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.

This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.

This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.

This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there will be
a need for this information post load step).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
f2e994e0b7 libbpf: Avoid joining .BTF.ext data with BPF programs by section name
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.

This simplifies corresponding joining logic and let's libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).

Fixes: a3820c481112 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
eb22de1f7d libbpf: Fix logic for finding matching program for CO-RE relocation
Fix the bug in bpf_object__relocate_core() which can lead to finding
invalid matching BPF program when processing CO-RE relocation. IF
matching program is not found, last encountered program will be assumed
to be correct program and thus error detection won't detect the problem.

Fixes: 9c82a63cf370 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
0a901dd1cd libbpf: Drop unhelpful "program too large" guess
libbpf pretends it knows actual limit of BPF program instructions based
on UAPI headers it compiled with. There is neither any guarantee that
UAPI headers match host kernel, nor BPF verifier actually uses
BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier
will emit actual reason for failure in its logs anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
36582ee432 libbpf: Fix anonymous type check in CO-RE logic
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Kumar Kartikeya Dwivedi
e7f46e2cae libbpf: Add kptr type tag macros to bpf_helpers.h
Include convenience definitions:
__kptr:	Unreferenced kptr
__kptr_ref: Referenced kptr

Users can use them to tag the pointer type meant to be used with the new
support directly in the map value definition.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com
2022-04-27 15:19:08 -07:00
Kumar Kartikeya Dwivedi
179ca056b0 bpf: Allow storing referenced kptr in map
Extending the code in previous commits, introduce referenced kptr
support, which needs to be tagged using 'kptr_ref' tag instead. Unlike
unreferenced kptr, referenced kptr have a lot more restrictions. In
addition to the type matching, only a newly introduced bpf_kptr_xchg
helper is allowed to modify the map value at that offset. This transfers
the referenced pointer being stored into the map, releasing the
references state for the program, and returning the old value and
creating new reference state for the returned pointer.

Similar to unreferenced pointer case, return value for this case will
also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
must either be eventually released by calling the corresponding release
function, otherwise it must be transferred into another map.

It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
the value, and obtain the old value if any.

BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future
commit will permit using BPF_LDX for such pointers, but attempt at
making it safe, since the lifetime of object won't be guaranteed.

There are valid reasons to enforce the restriction of permitting only
bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
consistent in face of concurrent modification, and any prior values
contained in the map must also be released before a new one is moved
into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
returns the old value, which the verifier would require the user to
either free or move into another map, and releases the reference held
for the pointer being moved in.

In the future, direct BPF_XCHG instruction may also be permitted to work
like bpf_kptr_xchg helper.

Note that process_kptr_func doesn't have to call
check_helper_mem_access, since we already disallow rdonly/wronly flags
for map, which is what check_map_access_type checks, and we already
ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
so check_map_access is also not required.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
2022-04-27 15:19:08 -07:00
Yuntao Wang
56dff81d46 libbpf: Remove unnecessary type cast
The link variable is already of type 'struct bpf_link *', casting it to
'struct bpf_link *' is redundant, drop it.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
0d4cefc4fc libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open()
Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open() on
older kernels for programs that are attachable through
BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() more unified and
convenient interface for creating bpf_link-based attachments.

With this approach end users can just use bpf_link_create() for
tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to
care about kernel support, as libbpf will handle this transparently. On
the other hand, as newer features (like BPF cookie) are added to
LINK_CREATE interface, they will be readily usable though the same
bpf_link_create() API without any major refactoring from user's
standpoint.

bpf_program__attach_btf_id() is now using bpf_link_create() internally
as well and will take advantaged of this unified interface when BPF
cookie is added for fentry/fexit.

Doing proactive feature detection of LINK_CREATE support for
fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF,
determining some stable and guaranteed to be in all kernels versions
target BTF type (either raw tracepoint or fentry target function),
actually attaching this program and thus potentially affecting the
performance of the host kernel briefly, etc. So instead we are taking
much simpler "lazy" approach of falling back to
bpf_raw_tracepoint_open() call only if initial LINK_CREATE command
fails. For modern kernels this will mean zero added overhead, while
older kernels will incur minimal overhead with a single fast-failing
LINK_CREATE call.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Grant Seltzer
5954a6c4aa libbpf: Improve libbpf API documentation link position
This puts the link for libbpf API documentation into the sidebar
for much easier navigation.

You can preview this change at:

  https://libbpf-test.readthedocs.io/en/latest/

Note that the link is hardcoded to the production version, so you
can see that it self references itself here for now:

  https://libbpf-test.readthedocs.io/en/latest/api.html

This will need to make its way into the libbpf mirror, before being
deployed to libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220422031050.303984-1-grantseltzer@gmail.com
2022-04-27 15:19:08 -07:00
Gaosheng Cui
38be0379c9 libbpf: Remove redundant non-null checks on obj_elf
Obj_elf is already non-null checked at the function entry, so remove
redundant non-null checks on obj_elf.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com
2022-04-27 15:19:08 -07:00
Grant Seltzer
5fa8bb6b42 libbpf: Add documentation to API functions
This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-27 15:19:08 -07:00
Grant Seltzer
c5b91a333e libbpf: Update API functions usage to check error
This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-27 15:19:08 -07:00
Grant Seltzer
8073e03491 libbpf: Add error returns to two API functions
This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-27 15:19:08 -07:00
Pu Lehui
eb2b216081 libbpf: Support riscv USDT argument parsing logic
Add riscv-specific USDT argument specification parsing logic.
riscv USDT argument format is shown below:
- Memory dereference case:
  "size@off(reg)", e.g. "-8@-88(s0)"
- Constant value case:
  "size@val", e.g. "4@5"
- Register read case:
  "size@reg", e.g. "-8@a1"

s8 will be marked as poison while it's a reg of riscv, we need
to alias it in advance. Both RV32 and RV64 have been tested.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com
2022-04-27 15:19:08 -07:00
Pu Lehui
bddd106e80 libbpf: Fix usdt_cookie being cast to 32 bits
The usdt_cookie is defined as __u64, which should not be
used as a long type because it will be cast to 32 bits
in 32-bit platforms.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com
2022-04-27 15:19:08 -07:00
Andrii Nakryiko
e205664ddb libbpf: Support opting out from autoloading BPF programs declaratively
Establish SEC("?abc") naming convention (i.e., adding question mark in
front of otherwise normal section name) that allows to set corresponding
program's autoload property to false. This is effectively just
a declarative way to do bpf_program__set_autoload(prog, false).

Having a way to do this declaratively in BPF code itself is useful and
convenient for various scenarios. E.g., for testing, when BPF object
consists of multiple independent BPF programs that each needs to be
tested separately. Opting out all of them by default and then setting
autoload to true for just one of them at a time simplifies testing code
(see next patch for few conversions in BPF selftests taking advantage of
this new feature).

Another real-world use case is in libbpf-tools for cases when different
BPF programs have to be picked depending on particulars of the host
kernel due to various incompatible changes (like kernel function renames
or signature change, or to pick kprobe vs fentry depending on
corresponding kernel support for the latter). Marking all the different
BPF program candidates as non-autoloaded declaratively makes this more
obvious in BPF source code and allows simpler code in user-space code.

When BPF program marked as SEC("?abc") it is otherwise treated just like
SEC("abc") and bpf_program__section_name() reported will be "abc".

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org
2022-04-27 15:19:08 -07:00
Alan Maguire
557499a13e libbpf: Usdt aarch64 arg parsing support
Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since registers used are x[0-31] and sp. Format is
slightly different compared to x86_64. Possible forms are:

- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
2022-04-27 15:19:08 -07:00
Runqing Yang
ffd4015f3b libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels
Background:
Libbpf automatically replaces calls to BPF bpf_probe_read_{kernel,user}
[_str]() helpers with bpf_probe_read[_str](), if libbpf detects that
kernel doesn't support new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which bpf_probe_read_kernel API is invoked and lets the
kernel checks whether the new API is valid. If the loading fails, libbpf
considers the new API invalid and replaces it with the old API.

static int probe_kern_probe_read_kernel(void)
{
	struct bpf_insn insns[] = {
		BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */
		BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */
		BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
		BPF_EXIT_INSN(),
	};
	int fd, insn_cnt = ARRAY_SIZE(insns);

	fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
                           "GPL", insns, insn_cnt, NULL);
	return probe_fd(fd);
}

Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall, matches the LINUX_VERSION_CODE.
If not matched, the bpf syscall fails. eBPF However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
alwasys fails for old versions. It means that libbpf will replace the
new API with the old one even the kernel supports the new one.

Solution:
After a discussion in [1], the solution is using BPF_PROG_TYPE_TRACEPOINT
program type instead of BPF_PROG_TYPE_KPROBE because kernel does not
enfoce version check for tracepoint programs. I test the patch in old
kernels (4.18 and 4.19) and it works well.

  [0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
  [1] Closes: https://github.com/libbpf/libbpf/issues/473

Signed-off-by: Runqing Yang <rainkin1993@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
2022-04-27 15:19:08 -07:00
Vladimir Isaev
68e7624e9f libbpf: Add ARC support to bpf_tracing.h
Add PT_REGS macros suitable for ARCompact and ARCv2.

Signed-off-by: Vladimir Isaev <isaev@synopsys.com>
Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com
2022-04-27 15:19:08 -07:00
Maxim Mikityanskiy
b221db664f ci: enable synproxy config for all architectures
Enable the following options in Kconfig for x86-64 and s390x:

CONFIG_NETFILTER_SYNPROXY=y
CONFIG_NETFILTER_XT_TARGET_CT=y
CONFIG_NETFILTER_XT_MATCH_STATE=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_SYNPROXY=y
CONFIG_IP_NF_RAW=y

These options are needed to run the selftests for the new BPF SYN cookie
helpers.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
2022-04-27 15:18:22 -07:00
chantra
7bf9ee2dba [rootfs] update rootfs to ship with ethtool
Add `ethtool` as a dependency to the rootfs image.

Tested by running and building the rootfs images with both
`sudo ./mkrootfs_arch.sh`
and
`sudo ./mkrootfs_debian.sh`

and running in qemu with:
```
wget https://libbpf-ci.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0
rootfs_img=rootfs.img kernel_bzimage=vmlinuz-5.5.0

mkdir rootfs
touch rootfs.img
truncate -s 2G rootfs.img
sudo mount -o loop rootfs.img rootfs
cat ~/Downloads/libbpf-vmtest-rootfs-2022.04.25.tar.zst | sudo tar -C rootfs -I zstd -xvf -
sudo install  -m 755 -o root -g root  /dev/stdin rootfs/etc/rcS.d/S50-startup <<'EOF'
ethtool -h
cat /etc/issue
EOF

qemu-system-x86_64 -nodefaults  -display none -serial mon:stdio -enable-kvm -m 4G -drive file="${rootfs_img}",format=raw,index=1,media=disk,if=virtio,cache=none -kernel "${kernel_bzimage}" -append "root=/dev/vda rw console=ttyS0,115200"
```

The last block printed ethtool's help, confirming the presence of
ethtool in the rootfs.

`libbpf-vmtest-rootfs-2022.04.25.tar.zst` was generated and uploaded to S3. INDEX in libbpf/ci needs to be changed to make the CI pick it up.
2022-04-25 16:30:32 -07:00
grantseltzer
533c7666eb Fix downloads formats
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2022-04-22 14:30:27 -07:00
grantseltzer
dea5ae9fc9 Enable downloads feature
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2022-04-19 16:08:19 -07:00
Evgeny Vereshchagin
8bc3e510fc ci: turn off _FORTIFY_SOURCE explicitly
libelf is compiled with _FORTIFY_SOURCE by default and it
isn't compatible with MSan. It was borrowed
from https://github.com/google/oss-fuzz/pull/7422
2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin
14414c6ea5 ci: turn on the alignment check
to catch issues like https://github.com/libbpf/libbpf/issues/391
2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin
ea10235072 ci: point elfutils to a commit where a couple bugs are fixed
Fixes
```
./out/bpf-object-fuzzer: Running 1 inputs 1 time(s) each.
Running: CORPUS/036ff286c13e4590646c7ef59435ec642432da8e
elf_begin.c:232:20: runtime error: member access within misaligned address 0x000001655e71 for type 'Elf64_Shdr', which requires 8 byte alignment
0x000001655e71: note: pointer points here
 00 00 00  7f 45 4c 46 02 02 01 00  00 00 07 fb 00 1d 00 00  6c 69 63 65 42 fb 00 41  00 57 03 00 20
              ^
    #0 0x574d51 in get_shnum /home/libbpf/elfutils/libelf/elf_begin.c:232:20
    #1 0x574d51 in file_read_elf /home/libbpf/elfutils/libelf/elf_begin.c:296:19
    #2 0x569c2c in __libelf_read_mmaped_file /home/libbpf/elfutils/libelf/elf_begin.c:559:14
    #3 0x58e812 in elf_memory /home/libbpf/elfutils/libelf/elf_memory.c:49:10
    #4 0x4905b4 in bpf_object__elf_init /home/libbpf/src/libbpf.c:1255:9
    #5 0x4905b4 in bpf_object_open /home/libbpf/src/libbpf.c:7104:8
    #6 0x49144e in bpf_object__open_mem /home/libbpf/src/libbpf.c:7171:20
    #7 0x483018 in LLVMFuzzerTestOneInput /home/libbpf/fuzz/bpf-object-fuzzer.c:16:8
    #8 0x439389 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x439389)
    #9 0x419e2f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x419e2f)
    #10 0x421aee in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/libbpf/out/bpf-object-fuzzer+0x421aee)
    #11 0x410f96 in main (/home/libbpf/out/bpf-object-fuzzer+0x410f96)
    #12 0x7f153e21255f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
    #13 0x7f153e21260b in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2d60b)
    #14 0x410fe4 in _start (/home/libbpf/out/bpf-object-fuzzer+0x410fe4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior elf_begin.c:232:20 in
```
and
```
./out/bpf-object-fuzzer: Running 1 inputs 1 time(s) each.
Running: CORPUS/446b578d82c47fe177de6fd675f4cb6bae8d1ea9
elf_begin.c:485:40: runtime error: addition of unsigned offset to 0x000002277e70 overflowed to 0x0000021d7e6f
    #0 0x5748f1 in file_read_elf /home/libbpf/elfutils/libelf/elf_begin.c:485:40
    #1 0x569c2c in __libelf_read_mmaped_file /home/libbpf/elfutils/libelf/elf_begin.c:559:14
    #2 0x58e812 in elf_memory /home/libbpf/elfutils/libelf/elf_memory.c:49:10
    #3 0x4905b4 in bpf_object__elf_init /home/libbpf/src/libbpf.c:1255:9
    #4 0x4905b4 in bpf_object_open /home/libbpf/src/libbpf.c:7104:8
    #5 0x49144e in bpf_object__open_mem /home/libbpf/src/libbpf.c:7171:20
    #6 0x483018 in LLVMFuzzerTestOneInput /home/libbpf/fuzz/bpf-object-fuzzer.c:16:8
    #7 0x439389 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x439389)
    #8 0x419e2f in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/home/libbpf/out/bpf-object-fuzzer+0x419e2f)
    #9 0x421aee in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/home/libbpf/out/bpf-object-fuzzer+0x421aee)
    #10 0x410f96 in main (/home/libbpf/out/bpf-object-fuzzer+0x410f96)
    #11 0x7f753e38255f in __libc_start_call_main (/lib64/libc.so.6+0x2d55f)
    #12 0x7f753e38260b in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2d60b)
    #13 0x410fe4 in _start (/home/libbpf/out/bpf-object-fuzzer+0x410fe4)

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior elf_begin.c:485:40 in
```
2022-04-10 18:57:38 -07:00
Evgeny Vereshchagin
f3cc144922 ci: turn off unaligned access in libelf explicitly 2022-04-10 18:57:38 -07:00
Andrii Nakryiko
b69f8ee93e ci: allow usdt selftest on s390x
libbpf now has s390x support for USDT, so enable corresponding selftest.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
bbfb018473 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2d0df01974ce2b59b6f7d5bd3ea58d74f12ddf85
Checkpoint bpf-next commit: 34ba23b44c664792a4308ec37b5788a3162944ec
Baseline bpf commit:        0a210af6d0a0595fef566e7eeb072f10f37774be
Checkpoint bpf commit:      8de8b71b787f38983d414d2dba169a3bfefa668a

Alan Maguire (2):
  libbpf: Improve library identification for uprobe binary path
    resolution
  libbpf: Improve string parsing for uprobe auto-attach

Andrii Nakryiko (5):
  libbpf: Fix use #ifdef instead of #if to avoid compiler warning
  libbpf: Use strlcpy() in path resolution fallback logic
  libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
  libbpf: Don't error out on CO-RE relos for overriden weak subprogs
  libbpf: Use weak hidden modifier for USDT BPF-side API functions

Colin Ian King (1):
  libbpf: Fix spelling mistake "libaries" -> "libraries"

Haowen Bai (1):
  libbpf: Potential NULL dereference in usdt_manager_attach_usdt()

Ilya Leoshkevich (3):
  libbpf: Minor style improvements in USDT code
  libbpf: Make BPF-side of USDT support work on big-endian machines
  libbpf: Add s390-specific USDT arg spec parsing logic

 src/libbpf.c          | 105 ++++++++++++++++++++----------------------
 src/libbpf_internal.h |  11 +++++
 src/usdt.bpf.h        |  13 ++++--
 src/usdt.c            |  79 ++++++++++++++++++++++++++-----
 4 files changed, 136 insertions(+), 72 deletions(-)

--
2.30.2
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
1ce956ab3a libbpf: Use weak hidden modifier for USDT BPF-side API functions
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
5016f30a24 libbpf: Don't error out on CO-RE relos for overriden weak subprogs
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overriden by either previous weak
subprogram instance or by corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.

But libbpf is over-zealous when processing CO-RE relocation to error out
when CO-RE relocation belonging to such eliminated weak subprogram is
encountered. Instead of erroring out on this expected situation, log
debug-level message and skip the relocation.

Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
075c96c298 libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
During BTF fix up for global variables, global variable can be global
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.

This is not the problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.

Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
f044607934 libbpf: Use strlcpy() in path resolution fallback logic
Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
2022-04-09 09:17:51 -07:00
Ilya Leoshkevich
4fd682d358 libbpf: Add s390-specific USDT arg spec parsing logic
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().

Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
2022-04-09 09:17:51 -07:00
Ilya Leoshkevich
3663820dda libbpf: Make BPF-side of USDT support work on big-endian machines
BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of
the actual argument size. On little-endian the relevant argument bits
end up in the lower bits of val, and later on the code that handles
all the argument types expects them to be there.

On big-endian they end up in the upper bits of val, breaking that
expectation. Fix by right-shifting val on big-endian.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com
2022-04-09 09:17:51 -07:00
Ilya Leoshkevich
fcb67a3e70 libbpf: Minor style improvements in USDT code
Fix several typos and references to non-existing headers.
Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with
the rest of the bpf code - see commit 45f2bebc8079 ("libbpf: Fix
endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for
rationale).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com
2022-04-09 09:17:51 -07:00
Andrii Nakryiko
73b8386f2e libbpf: Fix use #ifdef instead of #if to avoid compiler warning
As reported by Naresh:

  perf build errors on i386 [1] on Linux next-20220407 [2]

  usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1181 | #if __x86_64__
        |     ^~~~~~~~~~
  usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1196 | #if __x86_64__
        |     ^~~~~~~~~~
  cc1: all warnings being treated as errors

Use #ifdef instead of #if to avoid this.

Fixes: 4c59e584d158 ("libbpf: Add x86-specific USDT arg spec parsing logic")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org
2022-04-09 09:17:51 -07:00
Haowen Bai
462e3f600a libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
link could be null but still dereference bpf_link__destroy(&link->link)
and it will lead to a null pointer access.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com
2022-04-09 09:17:51 -07:00
Alan Maguire
13fe7fedfa libbpf: Improve string parsing for uprobe auto-attach
For uprobe auto-attach, the parsing can be simplified for the SEC()
name to a single sscanf(); the return value of the sscanf can then
be used to distinguish between sections that simply specify
"u[ret]probe" (and thus cannot auto-attach), those that specify
"u[ret]probe/binary_path:function+offset" etc.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com
2022-04-09 09:17:51 -07:00
Alan Maguire
b974879969 libbpf: Improve library identification for uprobe binary path resolution
In the process of doing path resolution for uprobe attach, libraries are
identified by matching a ".so" substring in the binary_path.
This matches a lot of patterns that do not conform to library.so[.version]
format, so instead match a ".so" _suffix_, and if that fails match a
".so." substring for the versioned library case.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com
2022-04-09 09:17:51 -07:00
Colin Ian King
2b674f2b21 libbpf: Fix spelling mistake "libaries" -> "libraries"
There is a spelling mistake in a pr_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406080835.14879-1-colin.i.king@gmail.com
2022-04-09 09:17:51 -07:00
Hengqi Chen
5810af7446 Makefile: Add usdt.bpf.h to list of HEADERS
Add usdt.bpf.h to HEADERS so that it can be installed
and included by users.

Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>
2022-04-06 20:28:15 -07:00
Andrii Nakryiko
042471d356 ci: blacklist usdt selftest on s390x
libbpf doesn't support USDTs on s390x yet, blacklist corresponding
selftest.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
f7833c0819 ci: ensure CONFIG_DEBUG_INFO_BTF=y by choosing DWARF debug info
With recent upstream changes, the default for debug info is
CONFIG_DEBUG_INFO_NONE=y, which prevents BTF from being generated.
Choose CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y to make sure we do
get DWARF generated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
c562444fb0 Makefile: add usdt.o to list of OBJS
Compile user-space parts of USDT support.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
750c9fb595 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9492450fd28736262dea9143ebb3afc2c131ace1
Checkpoint bpf-next commit: 2d0df01974ce2b59b6f7d5bd3ea58d74f12ddf85
Baseline bpf commit:        6bd0c76bd70447aedfeafa9e1fcc249991d6c678
Checkpoint bpf commit:      0a210af6d0a0595fef566e7eeb072f10f37774be

Alan Maguire (3):
  libbpf: auto-resolve programs/libraries when necessary for uprobes
  libbpf: Support function name-based attach uprobes
  libbpf: Add auto-attach for uprobes based on section name

Andrii Nakryiko (6):
  libbpf: Avoid NULL deref when initializing map BTF info
  libbpf: Add BPF-side of USDT support
  libbpf: Wire up USDT API and bpf_link integration
  libbpf: Add USDT notes parsing and resolution logic
  libbpf: Wire up spec management and other arch-independent USDT logic
  libbpf: Add x86-specific USDT arg spec parsing logic

Anshuman Khandual (1):
  perf: Add irq and exception return branch types

Geliang Tang (1):
  bpf: Sync comments for bpf_get_stack

Haiyue Wang (1):
  bpf: Correct the comment for BTF kind bitfield

Hengqi Chen (1):
  libbpf: Close fd in bpf_object__reuse_map

Ilya Leoshkevich (1):
  libbpf: Support Debian in resolve_full_path()

Yuntao Wang (1):
  libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len)

 include/uapi/linux/bpf.h        |    8 +-
 include/uapi/linux/btf.h        |    4 +-
 include/uapi/linux/perf_event.h |    2 +
 src/btf.c                       |    6 +-
 src/libbpf.c                    |  486 +++++++++++-
 src/libbpf.h                    |   41 +-
 src/libbpf.map                  |    1 +
 src/libbpf_internal.h           |   19 +
 src/usdt.bpf.h                  |  256 +++++++
 src/usdt.c                      | 1280 +++++++++++++++++++++++++++++++
 10 files changed, 2080 insertions(+), 23 deletions(-)
 create mode 100644 src/usdt.bpf.h
 create mode 100644 src/usdt.c

--
2.30.2
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
08cc701fae sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
fa323673c5 libbpf: Add x86-specific USDT arg spec parsing logic
Add x86/x86_64-specific USDT argument specification parsing. Each
architecture will require their own logic, as all this is arch-specific
assembly-based notation. Architectures that libbpf doesn't support for
USDTs will pr_warn() with specific error and return -ENOTSUP.

We use sscanf() as a very powerful and easy to use string parser. Those
spaces in sscanf's format string mean "skip any whitespaces", which is
pretty nifty (and somewhat little known) feature.

All this was tested on little-endian architecture, so bit shifts are
probably off on big-endian, which our CI will hopefully prove.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
876b933999 libbpf: Wire up spec management and other arch-independent USDT logic
Last part of architecture-agnostic user-space USDT handling logic is to
set up BPF spec and, optionally, IP-to-ID maps from user-space.
usdt_manager performs a compact spec ID allocation to utilize
fixed-sized BPF maps as efficiently as possible. We also use hashmap to
deduplicate USDT arg spec strings and map identical strings to single
USDT spec, minimizing the necessary BPF map size. usdt_manager supports
arbitrary sequences of attachment and detachment, both of the same USDT
and multiple different USDTs and internally maintains a free list of
unused spec IDs. bpf_link_usdt's logic is extended with proper setup and
teardown of this spec ID free list and supporting BPF maps.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
406386b441 libbpf: Add USDT notes parsing and resolution logic
Implement architecture-agnostic parts of USDT parsing logic. The code is
the documentation in this case, it's futile to try to succinctly
describe how USDT parsing is done in any sort of concreteness. But
still, USDTs are recorded in special ELF notes section (.note.stapsdt),
where each USDT call site is described separately. Along with USDT
provider and USDT name, each such note contains USDT argument
specification, which uses assembly-like syntax to describe how to fetch
value of USDT argument. USDT arg spec could be just a constant, or
a register, or a register dereference (most common cases in x86_64), but
it technically can be much more complicated cases, like offset relative
to global symbol and stuff like that. One of the later patches will
implement most common subset of this for x86 and x86-64 architectures,
which seems to handle a lot of real-world production application.

USDT arg spec contains a compact encoding allowing usdt.bpf.h from
previous patch to handle the above 3 cases. Instead of recording which
register might be needed, we encode register's offset within struct
pt_regs to simplify BPF-side implementation. USDT argument can be of
different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle
this, libbpf pre-calculates necessary bit shifts to do proper casting
and sign-extension in a short sequences of left and right shifts.

The rest is in the code with sometimes extensive comments and references
to external "documentation" for USDTs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
1b4b798916 libbpf: Wire up USDT API and bpf_link integration
Wire up libbpf USDT support APIs without yet implementing all the
nitty-gritty details of USDT discovery, spec parsing, and BPF map
initialization.

User-visible user-space API is simple and is conceptually very similar
to uprobe API.

bpf_program__attach_usdt() API allows to programmatically attach given
BPF program to a USDT, specified through binary path (executable or
shared lib), USDT provider and name. Also, just like in uprobe case, PID
filter is specified (0 - self, -1 - any process, or specific PID).
Optionally, USDT cookie value can be specified. Such single API
invocation will try to discover given USDT in specified binary and will
use (potentially many) BPF uprobes to attach this program in correct
locations.

Just like any bpf_program__attach_xxx() APIs, bpf_link is returned that
represents this attachment. It is a virtual BPF link that doesn't have
direct kernel object, as it can consist of multiple underlying BPF
uprobe links. As such, attachment is not atomic operation and there can
be brief moment when some USDT call sites are attached while others are
still in the process of attaching. This should be taken into
consideration by user. But bpf_program__attach_usdt() guarantees that
in the case of success all USDT call sites are successfully attached, or
all the successfuly attachments will be detached as soon as some USDT
call sites failed to be attached. So, in theory, there could be cases of
failed bpf_program__attach_usdt() call which did trigger few USDT
program invocations. This is unavoidable due to multi-uprobe nature of
USDT and has to be handled by user, if it's important to create an
illusion of atomicity.

USDT BPF programs themselves are marked in BPF source code as either
SEC("usdt"), in which case they won't be auto-attached through
skeleton's <skel>__attach() method, or it can have a full definition,
which follows the spirit of fully-specified uprobes:
SEC("usdt/<path>:<provider>:<name>"). In the latter case skeleton's
attach method will attempt auto-attachment. Similarly, generic
bpf_program__attach() will have enought information to go off of for
parameterless attachment.

USDT BPF programs are actually uprobes, and as such for kernel they are
marked as BPF_PROG_TYPE_KPROBE.

Another part of this patch is USDT-related feature probing:
  - BPF cookie support detection from user-space;
  - detection of kernel support for auto-refcounting of USDT semaphore.

The latter is optional. If kernel doesn't support such feature and USDT
doesn't rely on USDT semaphores, no error is returned. But if libbpf
detects that USDT requires setting semaphores and kernel doesn't support
this, libbpf errors out with explicit pr_warn() message. Libbpf doesn't
support poking process's memory directly to increment semaphore value,
like BCC does on legacy kernels, due to inherent raciness and danger of
such process memory manipulation. Libbpf let's kernel take care of this
properly or gives up.

Logistically, all the extra USDT-related infrastructure of libbpf is put
into a separate usdt.c file and abstracted behind struct usdt_manager.
Each bpf_object has lazily-initialized usdt_manager pointer, which is
only instantiated if USDT programs are attempted to be attached. Closing
BPF object frees up usdt_manager resources. usdt_manager keeps track of
USDT spec ID assignment and few other small things.

Subsequent patches will fill out remaining missing pieces of USDT
initialization and setup logic.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
f5390e4f07 libbpf: Add BPF-side of USDT support
Add BPF-side implementation of libbpf-provided USDT support. This
consists of single header library, usdt.bpf.h, which is meant to be used
from user's BPF-side source code. This header is added to the list of
installed libbpf header, along bpf_helpers.h and others.

BPF-side implementation consists of two BPF maps:
  - spec map, which contains "a USDT spec" which encodes information
    necessary to be able to fetch USDT arguments and other information
    (argument count, user-provided cookie value, etc) at runtime;
  - IP-to-spec-ID map, which is only used on kernels that don't support
    BPF cookie feature. It allows to lookup spec ID based on the place
    in user application that triggers USDT program.

These maps have default sizes, 256 and 1024, which are chosen
conservatively to not waste a lot of space, but handling a lot of common
cases. But there could be cases when user application needs to either
trace a lot of different USDTs, or USDTs are heavily inlined and their
arguments are located in a lot of differing locations. For such cases it
might be necessary to size those maps up, which libbpf allows to do by
overriding BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT macros.

It is an important aspect to keep in mind. Single USDT (user-space
equivalent of kernel tracepoint) can have multiple USDT "call sites".
That is, single logical USDT is triggered from multiple places in user
application. This can happen due to function inlining. Each such inlined
instance of USDT invocation can have its own unique USDT argument
specification (instructions about the location of the value of each of
USDT arguments). So while USDT looks very similar to usual uprobe or
kernel tracepoint, under the hood it's actually a collection of uprobes,
each potentially needing different spec to know how to fetch arguments.

User-visible API consists of three helper functions:
  - bpf_usdt_arg_cnt(), which returns number of arguments of current USDT;
  - bpf_usdt_arg(), which reads value of specified USDT argument (by
    it's zero-indexed position) and returns it as 64-bit value;
  - bpf_usdt_cookie(), which functions like BPF cookie for USDT
    programs; this is necessary as libbpf doesn't allow specifying actual
    BPF cookie and utilizes it internally for USDT support implementation.

Each bpf_usdt_xxx() APIs expect struct pt_regs * context, passed into
BPF program. On kernels that don't support BPF cookie it is used to
fetch absolute IP address of the underlying uprobe.

usdt.bpf.h also provides BPF_USDT() macro, which functions like
BPF_PROG() and BPF_KPROBE() and allows much more user-friendly way to
get access to USDT arguments, if USDT definition is static and known to
the user. It is expected that majority of use cases won't have to use
bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will cover
all their needs.

Last, usdt.bpf.h is utilizing BPF CO-RE for one single purpose: to
detect kernel support for BPF cookie. If BPF CO-RE dependency is
undesirable, user application can redefine BPF_USDT_HAS_BPF_COOKIE to
either a boolean constant (or equivalently zero and non-zero), or even
point it to its own .rodata variable that can be specified from user's
application user-space code. It is important that
BPF_USDT_HAS_BPF_COOKIE is known to BPF verifier as static value (thus
.rodata and not just .data), as otherwise BPF code will still contain
bpf_get_attach_cookie() BPF helper call and will fail validation at
runtime, if not dead-code eliminated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org
2022-04-06 07:34:58 -07:00
Ilya Leoshkevich
00cd090f81 libbpf: Support Debian in resolve_full_path()
attach_probe selftest fails on Debian-based distros with `failed to
resolve full path for 'libc.so.6'`. The reason is that these distros
embraced multiarch to the point where even for the "main" architecture
they store libc in /lib/<triple>.

This is configured in /etc/ld.so.conf and in theory it's possible to
replicate the loader's parsing and processing logic in libbpf, however
a much simpler solution is to just enumerate the known library paths.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404225020.51029-1-iii@linux.ibm.com
2022-04-06 07:34:58 -07:00
Yuntao Wang
0167a314e7 libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len)
Since core relos is an optional part of the .BTF.ext ELF section, we should
skip parsing it instead of returning -EINVAL if header size is less than
offsetofend(struct btf_ext_header, core_relo_len).

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com
2022-04-06 07:34:58 -07:00
Alan Maguire
8dcb95d509 libbpf: Add auto-attach for uprobes based on section name
Now that u[ret]probes can use name-based specification, it makes
sense to add support for auto-attach based on SEC() definition.
The format proposed is

        SEC("u[ret]probe/binary:[raw_offset|[function_name[+offset]]")

For example, to trace malloc() in libc:

        SEC("uprobe/libc.so.6:malloc")

...or to trace function foo2 in /usr/bin/foo:

        SEC("uprobe//usr/bin/foo:foo2")

Auto-attach is done for all tasks (pid -1).  prog can be an absolute
path or simply a program/library name; in the latter case, we use
PATH/LD_LIBRARY_PATH to resolve the full path, falling back to
standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if
the file is not found via environment-variable specified locations.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-4-git-send-email-alan.maguire@oracle.com
2022-04-06 07:34:58 -07:00
Alan Maguire
d112c9ce24 libbpf: Support function name-based attach uprobes
kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calcualation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The modes of operation supported are then

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-3-git-send-email-alan.maguire@oracle.com
2022-04-06 07:34:58 -07:00
Alan Maguire
4a7fa5b2bc libbpf: auto-resolve programs/libraries when necessary for uprobes
bpf_program__attach_uprobe_opts() requires a binary_path argument
specifying binary to instrument.  Supporting simply specifying
"libc.so.6" or "foo" should be possible too.

Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib.
This allows users to run BPF programs prefixed with
LD_LIBRARY_PATH=/path2/lib while still searching standard locations.
Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin.

Path determination will be useful for auto-attach of BPF uprobe programs
using SEC() definition.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-2-git-send-email-alan.maguire@oracle.com
2022-04-06 07:34:58 -07:00
Haiyue Wang
ff845b85e8 bpf: Correct the comment for BTF kind bitfield
The commit 8fd886911a6a ("bpf: Add BTF_KIND_FLOAT to uapi") has extended
the BTF kind bitfield from 4 to 5 bits, correct the comment.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com
2022-04-06 07:34:58 -07:00
Geliang Tang
fee7b9400a bpf: Sync comments for bpf_get_stack
Commit ee2a098851bf missed updating the comments for helper bpf_get_stack
in tools/include/uapi/linux/bpf.h. Sync it.

Fixes: ee2a098851bf ("bpf: Adjust BPF stack helper functions to accommodate skip > 0")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/ce54617746b7ed5e9ba3b844e55e74cb8a60e0b5.1648110794.git.geliang.tang@suse.com
2022-04-06 07:34:58 -07:00
Hengqi Chen
360ed84faa libbpf: Close fd in bpf_object__reuse_map
pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it
in bpf_object__reuse_map after reuse.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com
2022-04-06 07:34:58 -07:00
Anshuman Khandual
3fbed0f1b2 perf: Add irq and exception return branch types
This expands generic branch type classification by adding two more entries
there in i.e irq and exception return. Also updates the x86 implementation
to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes
branch types reported to user space on x86 platform but it should not be a
problem. The possible scenarios and impacts are enumerated here.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com
2022-04-06 07:34:58 -07:00
Andrii Nakryiko
67a4b14643 ci: remove subprogs from 5.5 whitelist
It seems like it started to cause kernel panic in CI, so drop it from
whitelist.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-19 23:08:50 -07:00
Andrii Nakryiko
7db9ce5fda libbpf: avoid NULL deref when initializing map BTF info
If BPF object doesn't have an BTF info, don't attempt to search for BTF
types describing BPF map key or value layout.

Fixes: 262cfb74ffda ("libbpf: Init btf_{key,value}_type_id on internal map open")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-19 23:08:50 -07:00
Andrii Nakryiko
f1b6bc31a5 ci: update s390x blacklist
Sync s390x blacklist with the one currently used for kernel-patches CI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-19 23:08:50 -07:00
Andrii Nakryiko
3ef1813702 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c344b9fc2108eeaa347c387219886cf87e520e93
Checkpoint bpf-next commit: 9492450fd28736262dea9143ebb3afc2c131ace1
Baseline bpf commit:        18b1ab7aa76bde181bdb1ab19a87fa9523c32f21
Checkpoint bpf commit:      6bd0c76bd70447aedfeafa9e1fcc249991d6c678

Delyan Kratunov (3):
  libbpf: .text routines are subprograms in strict mode
  libbpf: Init btf_{key,value}_type_id on internal map open
  libbpf: Add subskeleton scaffolding

Guo Zhengkui (1):
  libbpf: Fix array_size.cocci warning

Hengqi Chen (1):
  bpf: Fix comment for helper bpf_current_task_under_cgroup()

Jiri Olsa (5):
  bpf: Add multi kprobe link
  bpf: Add cookie support to programs attached with kprobe multi link
  libbpf: Add libbpf_kallsyms_parse function
  libbpf: Add bpf_link_create support for multi kprobes
  libbpf: Add bpf_program__attach_kprobe_multi_opts function

Martin KaFai Lau (1):
  bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename
    s/delivery_time_/tstamp_/

Roberto Sassu (1):
  bpf-lsm: Introduce new helper bpf_ima_file_hash()

Toke Høiland-Jørgensen (2):
  bpf: Add "live packet" mode for XDP in BPF_PROG_RUN
  libbpf: Support batch_size option to bpf_prog_test_run

lic121 (1):
  libbpf: Unmap rings when umem deleted

 include/uapi/linux/bpf.h |  72 +++++---
 src/bpf.c                |  13 +-
 src/bpf.h                |  12 +-
 src/libbpf.c             | 383 ++++++++++++++++++++++++++++++++++-----
 src/libbpf.h             |  52 ++++++
 src/libbpf.map           |   3 +
 src/libbpf_internal.h    |   5 +
 src/libbpf_legacy.h      |   4 +
 src/xsk.c                |  15 +-
 9 files changed, 487 insertions(+), 72 deletions(-)

--
2.30.2
2022-03-19 23:08:50 -07:00
Andrii Nakryiko
d580bc49d1 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-03-19 23:08:50 -07:00
Delyan Kratunov
cc4ef17c78 libbpf: Add subskeleton scaffolding
In symmetry with bpf_object__open_skeleton(),
bpf_object__open_subskeleton() performs the actual walking and linking
of maps, progs, and globals described by bpf_*_skeleton objects.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com
2022-03-19 23:08:50 -07:00
Delyan Kratunov
e7084d4363 libbpf: Init btf_{key,value}_type_id on internal map open
For internal and user maps, look up the key and value btf
types on open() and not load(), so that `bpf_map_btf_value_type_id`
is usable in `bpftool gen`.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com
2022-03-19 23:08:50 -07:00
Delyan Kratunov
c2ec92f0ee libbpf: .text routines are subprograms in strict mode
Currently, libbpf considers a single routine in .text to be a program. This
is particularly confusing when it comes to library objects - a single routine
meant to be used as an extern will instead be considered a bpf_program.

This patch hides this compatibility behavior behind the pre-existing
SEC_NAME strict mode flag.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com
2022-03-19 23:08:50 -07:00
Jiri Olsa
05acce9e03 libbpf: Add bpf_program__attach_kprobe_multi_opts function
Adding bpf_program__attach_kprobe_multi_opts function for attaching
kprobe program to multiple functions.

  struct bpf_link *
  bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
                                        const char *pattern,
                                        const struct bpf_kprobe_multi_opts *opts);

User can specify functions to attach with 'pattern' argument that
allows wildcards (*?' supported) or provide symbols or addresses
directly through opts argument. These 3 options are mutually
exclusive.

When using symbols or addresses, user can also provide cookie value
for each symbol/address that can be retrieved later in bpf program
with bpf_get_attach_cookie helper.

  struct bpf_kprobe_multi_opts {
          size_t sz;
          const char **syms;
          const unsigned long *addrs;
          const __u64 *cookies;
          size_t cnt;
          bool retprobe;
          size_t :0;
  };

Symbols, addresses and cookies are provided through opts object
(syms/addrs/cookies) as array pointers with specified count (cnt).

Each cookie value is paired with provided function address or symbol
with the same array index.

The program can be also attached as return probe if 'retprobe' is set.

For quick usage with NULL opts argument, like:

  bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL)

the 'prog' will be attached as kprobe to 'ksys_*' functions.

Also adding new program sections for automatic attachment:

  kprobe.multi/<symbol_pattern>
  kretprobe.multi/<symbol_pattern>

The symbol_pattern is used as 'pattern' argument in
bpf_program__attach_kprobe_multi_opts function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org
2022-03-19 23:08:50 -07:00
Jiri Olsa
2e6e39ef80 libbpf: Add bpf_link_create support for multi kprobes
Adding new kprobe_multi struct to bpf_link_create_opts object
to pass multiple kprobe data to link_create attr uapi.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org
2022-03-19 23:08:50 -07:00
Jiri Olsa
42f78dd5ac libbpf: Add libbpf_kallsyms_parse function
Move the kallsyms parsing in internal libbpf_kallsyms_parse
function, so it can be used from other places.

It will be used in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org
2022-03-19 23:08:50 -07:00
Jiri Olsa
50ae8c25d2 bpf: Add cookie support to programs attached with kprobe multi link
Adding support to call bpf_get_attach_cookie helper from
kprobe programs attached with kprobe multi link.

The cookie is provided by array of u64 values, where each
value is paired with provided function address or symbol
with the same array index.

When cookie array is provided it's sorted together with
addresses (check bpf_kprobe_multi_cookie_swap). This way
we can find cookie based on the address in
bpf_get_attach_cookie helper.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org
2022-03-19 23:08:50 -07:00
Jiri Olsa
e85e26492d bpf: Add multi kprobe link
Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe
program through fprobe API.

The fprobe API allows to attach probe on multiple functions at once
very fast, because it works on top of ftrace. On the other hand this
limits the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attachment
kprobe to multiple function with new link.

User provides array of addresses or symbols with count to attach the
kprobe program to. The new link_create uapi interface looks like:

  struct {
          __u32           flags;
          __u32           cnt;
          __aligned_u64   syms;
          __aligned_u64   addrs;
  } kprobe_multi;

The flags field allows single BPF_TRACE_KPROBE_MULTI bit to create
return multi kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org
2022-03-19 23:08:50 -07:00
Roberto Sassu
9fb154ee77 bpf-lsm: Introduce new helper bpf_ima_file_hash()
ima_file_hash() has been modified to calculate the measurement of a file on
demand, if it has not been already performed by IMA or the measurement is
not fresh. For compatibility reasons, ima_inode_hash() remains unchanged.

Keep the same approach in eBPF and introduce the new helper
bpf_ima_file_hash() to take advantage of the modified behavior of
ima_file_hash().

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-4-roberto.sassu@huawei.com
2022-03-19 23:08:50 -07:00
Hengqi Chen
34d57cc0eb bpf: Fix comment for helper bpf_current_task_under_cgroup()
Fix the descriptions of the return values of helper bpf_current_task_under_cgroup().

Fixes: c6b5fb8690fa ("bpf: add documentation for eBPF helpers (42-50)")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.com
2022-03-19 23:08:50 -07:00
Martin KaFai Lau
a557610d11 bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/
This patch is to simplify the uapi bpf.h regarding to the tstamp type
and use a similar way as the kernel to describe the value stored
in __sk_buff->tstamp.

My earlier thought was to avoid describing the semantic and
clock base for the rcv timestamp until there is more clarity
on the use case, so the __sk_buff->delivery_time_type naming instead
of __sk_buff->tstamp_type.

With some thoughts, it can reuse the UNSPEC naming.  This patch first
removes BPF_SKB_DELIVERY_TIME_NONE and also

rename BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC
and    BPF_SKB_DELIVERY_TIME_MONO   to BPF_SKB_TSTAMP_DELIVERY_MONO.

The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same:
__sk_buff->tstamp has delivery time in mono clock base.

BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv)
tstamp at ingress and the delivery time at egress.  At egress,
the clock base could be found from skb->sk->sk_clockid.
__sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed.

With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress,
the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type
which was also suggested in the earlier discussion:
https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/

The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type
the same as its kernel skb->tstamp and skb->mono_delivery_time
counter part.

The internal kernel function bpf_skb_convert_dtime_type_read() is then
renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified
with the BPF_SKB_DELIVERY_TIME_NONE gone.  A BPF_ALU32_IMM(BPF_AND)
insn is also saved by using BPF_JMP32_IMM(BPF_JSET).

The bpf helper bpf_skb_set_delivery_time() is also renamed to
bpf_skb_set_tstamp().  The arg name is changed from dtime
to tstamp also.  It only allows setting tstamp 0 for
BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later
if there is use case to change mono delivery time to
non mono.

prog->delivery_time_access is also renamed to prog->tstamp_type_access.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
2022-03-19 23:08:50 -07:00
Toke Høiland-Jørgensen
5ad674a007 libbpf: Support batch_size option to bpf_prog_test_run
Add support for setting the new batch_size parameter to BPF_PROG_TEST_RUN
to libbpf; just add it as an option and pass it through to the kernel.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-4-toke@redhat.com
2022-03-19 23:08:50 -07:00
Toke Høiland-Jørgensen
d647265e4b bpf: Add "live packet" mode for XDP in BPF_PROG_RUN
This adds support for running XDP programs through BPF_PROG_RUN in a mode
that enables live packet processing of the resulting frames. Previous uses
of BPF_PROG_RUN for XDP returned the XDP program return code and the
modified packet data to userspace, which is useful for unit testing of XDP
programs.

The existing BPF_PROG_RUN for XDP allows userspace to set the ingress
ifindex and RXQ number as part of the context object being passed to the
kernel. This patch reuses that code, but adds a new mode with different
semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES
flag.

When running BPF_PROG_RUN in this mode, the XDP program return codes will
be honoured: returning XDP_PASS will result in the frame being injected
into the networking stack as if it came from the selected networking
interface, while returning XDP_TX and XDP_REDIRECT will result in the frame
being transmitted out that interface. XDP_TX is translated into an
XDP_REDIRECT operation to the same interface, since the real XDP_TX action
is only possible from within the network drivers themselves, not from the
process context where BPF_PROG_RUN is executed.

Internally, this new mode of operation creates a page pool instance while
setting up the test run, and feeds pages from that into the XDP program.
The setup cost of this is amortised over the number of repetitions
specified by userspace.

To support the performance testing use case, we further optimise the setup
step so that all pages in the pool are pre-initialised with the packet
data, and pre-computed context and xdp_frame objects stored at the start of
each page. This makes it possible to entirely avoid touching the page
content on each XDP program invocation, and enables sending up to 9
Mpps/core on my test box.

Because the data pages are recycled by the page pool, and the test runner
doesn't re-initialise them for each run, subsequent invocations of the XDP
program will see the packet data in the state it was after the last time it
ran on that particular page. This means that an XDP program that modifies
the packet before redirecting it has to be careful about which assumptions
it makes about the packet content, but that is only an issue for the most
naively written programs.

Enabling the new flag is only allowed when not setting ctx_out and data_out
in the test specification, since using it means frames will be redirected
somewhere else, so they can't be returned.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com
2022-03-19 23:08:50 -07:00
Guo Zhengkui
21cd83a1d1 libbpf: Fix array_size.cocci warning
Fix the following coccicheck warning:
tools/lib/bpf/bpf.c:114:31-32: WARNING: Use ARRAY_SIZE
tools/lib/bpf/xsk.c:484:34-35: WARNING: Use ARRAY_SIZE
tools/lib/bpf/xsk.c:485:35-36: WARNING: Use ARRAY_SIZE

It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220306023426.19324-1-guozhengkui@vivo.com
2022-03-19 23:08:50 -07:00
lic121
6e77ef94f0 libbpf: Unmap rings when umem deleted
xsk_umem__create() does mmap for fill/comp rings, but xsk_umem__delete()
doesn't do the unmap. This works fine for regular cases, because
xsk_socket__delete() does unmap for the rings. But for the case that
xsk_socket__create_shared() fails, umem rings are not unmapped.

fill_save/comp_save are checked to determine if rings have already be
unmapped by xsk. If fill_save and comp_save are NULL, it means that the
rings have already been used by xsk. Then they are supposed to be
unmapped by xsk_socket__delete(). Otherwise, xsk_umem__delete() does the
unmap.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220301132623.GA19995@vscode.7~
2022-03-19 23:08:50 -07:00
Andrii Nakryiko
c84815ee37 ci: enable CONFIG_FPROBE=y for multi-attach kprobe tests
Recently landed multi-attach kprobe functionality expects
CONFIG_FPROBE=y.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-18 00:52:43 -07:00
Mykola Lysenko
4282f3cdec ci: Add troubleshooting steps to s390x setup readme
Related to libbpf CI. Added more information on how
to setup and troubleshoot GitHub action runners for
s390x platform.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
2022-03-17 21:19:03 -07:00
Andrii Nakryiko
3591deb9bc ci: blacklist s390x tests
Blacklist timer_crash_mode as requiring BPF trampoline.

Temporary blacklist sk_lookup due to big-endian problems that haven't
been resolved upstream yet.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
767badc609 Makefile: update libbpf version to 0.8.0
New version cycle, bump LIBBPF_MINOR_VERSION to 8 in Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
8e654d74c4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b75dacaac4650478ed5a9d33975b91b99016daff
Checkpoint bpf-next commit: c344b9fc2108eeaa347c387219886cf87e520e93
Baseline bpf commit:        75134f16e7dd0007aa474b281935c5f42e79f2c8
Checkpoint bpf commit:      18b1ab7aa76bde181bdb1ab19a87fa9523c32f21

Andrii Nakryiko (2):
  libbpf: Allow BPF program auto-attach handlers to bail out
  libbpf: Support custom SEC() handlers

Hangbin Liu (1):
  bonding: add new option ns_ip6_target

Martin KaFai Lau (1):
  bpf: Add __sk_buff->delivery_time_type and
    bpf_skb_set_skb_delivery_time()

Stijn Tintel (1):
  libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning

Xu Kuohai (1):
  libbpf: Skip forward declaration when counting duplicated type names

Yuntao Wang (3):
  libbpf: Remove redundant check in btf_fixup_datasec()
  libbpf: Simplify the find_elf_sec_sz() function
  libbpf: Add a check to ensure that page_cnt is non-zero

 include/uapi/linux/bpf.h     |  41 +++-
 include/uapi/linux/if_link.h |   1 +
 src/btf_dump.c               |   5 +
 src/libbpf.c                 | 388 +++++++++++++++++++++++------------
 src/libbpf.h                 | 109 ++++++++++
 src/libbpf.map               |   6 +
 src/libbpf_version.h         |   2 +-
 7 files changed, 423 insertions(+), 129 deletions(-)

--
2.30.2
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
dac1e23c97 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
dc679587eb libbpf: Support custom SEC() handlers
Allow registering and unregistering custom handlers for BPF program.
This allows user applications and libraries to plug into libbpf's
declarative SEC() definition handling logic. This allows to offload
complex and intricate custom logic into external libraries, but still
provide a great user experience.

One such example is USDT handling library, which has a lot of code and
complexity which doesn't make sense to put into libbpf directly, but it
would be really great for users to be able to specify BPF programs with
something like SEC("usdt/<path-to-binary>:<usdt_provider>:<usdt_name>")
and have correct BPF program type set (BPF_PROGRAM_TYPE_KPROBE, as it is
uprobe) and even support BPF skeleton's auto-attach logic.

In some cases, it might be even good idea to override libbpf's default
handling, like for SEC("perf_event") programs. With custom library, it's
possible to extend logic to support specifying perf event specification
right there in SEC() definition without burdening libbpf with lots of
custom logic or extra library dependecies (e.g., libpfm4). With current
patch it's possible to override libbpf's SEC("perf_event") handling and
specify a completely custom ones.

Further, it's possible to specify a generic fallback handling for any
SEC() that doesn't match any other custom or standard libbpf handlers.
This allows to accommodate whatever legacy use cases there might be, if
necessary.

See doc comments for libbpf_register_prog_handler() and
libbpf_unregister_prog_handler() for detailed semantics.

This patch also bumps libbpf development version to v0.8 and adds new
APIs there.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220305010129.1549719-3-andrii@kernel.org
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
0d834905d8 libbpf: Allow BPF program auto-attach handlers to bail out
Allow some BPF program types to support auto-attach only in subste of
cases. Currently, if some BPF program type specifies attach callback, it
is assumed that during skeleton attach operation all such programs
either successfully attach or entire skeleton attachment fails. If some
program doesn't support auto-attachment from skeleton, such BPF program
types shouldn't have attach callback specified.

This is limiting for cases when, depending on how full the SEC("")
definition is, there could either be enough details to support
auto-attach or there might not be and user has to use some specific API
to provide more details at runtime.

One specific example of such desired behavior might be SEC("uprobe"). If
it's specified as just uprobe auto-attach isn't possible. But if it's
SEC("uprobe/<some_binary>:<some_func>") then there are enough details to
support auto-attach. Note that there is a somewhat subtle difference
between auto-attach behavior of BPF skeleton and using "generic"
bpf_program__attach(prog) (which uses the same attach handlers under the
cover). Skeleton allow some programs within bpf_object to not have
auto-attach implemented and doesn't treat that as an error. Instead such
BPF programs are just skipped during skeleton's (optional) attach step.
bpf_program__attach(), on the other hand, is called when user *expects*
auto-attach to work, so if specified program doesn't implement or
doesn't support auto-attach functionality, that will be treated as an
error.

Another improvement to the way libbpf is handling SEC()s would be to not
require providing dummy kernel function name for kprobe. Currently,
SEC("kprobe/whatever") is necessary even if actual kernel function is
determined by user at runtime and bpf_program__attach_kprobe() is used
to specify it. With changes in this patch, it's possible to support both
SEC("kprobe") and SEC("kprobe/<actual_kernel_function"), while only in
the latter case auto-attach will be performed. In the former one, such
kprobe will be skipped during skeleton attach operation.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220305010129.1549719-2-andrii@kernel.org
2022-03-07 22:16:11 -08:00
Yuntao Wang
0a43bc8905 libbpf: Add a check to ensure that page_cnt is non-zero
The page_cnt parameter is used to specify the number of memory pages
allocated for each per-CPU buffer, it must be non-zero and a power of 2.

Currently, the __perf_buffer__new() function attempts to validate that
the page_cnt is a power of 2 but forgets checking for the case where
page_cnt is zero, we can fix it by replacing 'page_cnt & (page_cnt - 1)'
with 'page_cnt == 0 || (page_cnt & (page_cnt - 1))'.

If so, we also don't need to add a check in perf_buffer__new_v0_6_0() to
make sure that page_cnt is non-zero and the check for zero in
perf_buffer__new_raw_v0_6_0() can also be removed.

The code will be cleaner and more readable.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220303005921.53436-1-ytcoode@gmail.com
2022-03-07 22:16:11 -08:00
Xu Kuohai
5d491d5d07 libbpf: Skip forward declaration when counting duplicated type names
Currently if a declaration appears in the BTF before the definition, the
definition is dumped as a conflicting name, e.g.:

    $ bpftool btf dump file vmlinux format raw | grep "'unix_sock'"
    [81287] FWD 'unix_sock' fwd_kind=struct
    [89336] STRUCT 'unix_sock' size=1024 vlen=14

    $ bpftool btf dump file vmlinux format c | grep "struct unix_sock"
    struct unix_sock;
    struct unix_sock___2 {	<--- conflict, the "___2" is unexpected
		    struct unix_sock___2 *unix_sk;

This causes a compilation error if the dump output is used as a header file.

Fix it by skipping declaration when counting duplicated type names.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220301053250.1464204-2-xukuohai@huawei.com
2022-03-07 22:16:11 -08:00
Stijn Tintel
9b53decb02 libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning
When a BPF map of type BPF_MAP_TYPE_PERF_EVENT_ARRAY doesn't have the
max_entries parameter set, the map will be created with max_entries set
to the number of available CPUs. When we try to reuse such a pinned map,
map_is_reuse_compat will return false, as max_entries in the map
definition differs from max_entries of the existing map, causing the
following error:

  libbpf: couldn't reuse pinned map at '/sys/fs/bpf/m_logging': parameter mismatch

Fix this by overwriting max_entries in the map definition. For this to
work, we need to do this in bpf_object__create_maps, before calling
bpf_object__reuse_map.

Fixes: 57a00f41644f ("libbpf: Add auto-pinning of maps when loading BPF objects")
Signed-off-by: Stijn Tintel <stijn@linux-ipv6.be>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220225152355.315204-1-stijn@linux-ipv6.be
2022-03-07 22:16:11 -08:00
Yuntao Wang
426672106e libbpf: Simplify the find_elf_sec_sz() function
The check in the last return statement is unnecessary, we can just return
the ret variable.

But we can simplify the function further by returning 0 immediately if we
find the section size and -ENOENT otherwise.

Thus we can also remove the ret variable.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220223085244.3058118-1-ytcoode@gmail.com
2022-03-07 22:16:11 -08:00
Yuntao Wang
c85a8bbe9c libbpf: Remove redundant check in btf_fixup_datasec()
The check 't->size && t->size != size' is redundant because if t->size
compares unequal to 0, we will just skip straight to sorting variables.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220220072750.209215-1-ytcoode@gmail.com
2022-03-07 22:16:11 -08:00
Martin KaFai Lau
e7997e49ea bpf: Add __sk_buff->delivery_time_type and bpf_skb_set_skb_delivery_time()
* __sk_buff->delivery_time_type:
This patch adds __sk_buff->delivery_time_type.  It tells if the
delivery_time is stored in __sk_buff->tstamp or not.

It will be most useful for ingress to tell if the __sk_buff->tstamp
has the (rcv) timestamp or delivery_time.  If delivery_time_type
is 0 (BPF_SKB_DELIVERY_TIME_NONE), it has the (rcv) timestamp.

Two non-zero types are defined for the delivery_time_type,
BPF_SKB_DELIVERY_TIME_MONO and BPF_SKB_DELIVERY_TIME_UNSPEC.  For UNSPEC,
it can only happen in egress because only mono delivery_time can be
forwarded to ingress now.  The clock of UNSPEC delivery_time
can be deduced from the skb->sk->sk_clockid which is how
the sch_etf doing it also.

* Provide forwarded delivery_time to tc-bpf@ingress:
With the help of the new delivery_time_type, the tc-bpf has a way
to tell if the __sk_buff->tstamp has the (rcv) timestamp or
the delivery_time.  During bpf load time, the verifier will learn if
the bpf prog has accessed the new __sk_buff->delivery_time_type.
If it does, it means the tc-bpf@ingress is expecting the
skb->tstamp could have the delivery_time.  The kernel will then
read the skb->tstamp as-is during bpf insn rewrite without
checking the skb->mono_delivery_time.  This is done by adding a
new prog->delivery_time_access bit.  The same goes for
writing skb->tstamp.

* bpf_skb_set_delivery_time():
The bpf_skb_set_delivery_time() helper is added to allow setting both
delivery_time and the delivery_time_type at the same time.  If the
tc-bpf does not need to change the delivery_time_type, it can directly
write to the __sk_buff->tstamp as the existing tc-bpf has already been
doing.  It will be most useful at ingress to change the
__sk_buff->tstamp from the (rcv) timestamp to
a mono delivery_time and then bpf_redirect_*().

bpf only has mono clock helper (bpf_ktime_get_ns), and
the current known use case is the mono EDT for fq, and
only mono delivery time can be kept during forward now,
so bpf_skb_set_delivery_time() only supports setting
BPF_SKB_DELIVERY_TIME_MONO.  It can be extended later when use cases
come up and the forwarding path also supports other clock bases.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 22:16:11 -08:00
Hangbin Liu
4c560383a6 bonding: add new option ns_ip6_target
This patch add a new bonding option ns_ip6_target, which correspond
to the arp_ip_target. With this we set IPv6 targets and send IPv6 NS
request to determine the health of the link.

For other related options like the validation, we still use
arp_validate, and will change to ns_validate later.

Note: the sysfs configuration support was removed based on
https://lore.kernel.org/netdev/8863.1645071997@famine

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 22:16:11 -08:00
Andrii Nakryiko
9c44c8a8e0 LICENSE: fix BSD-2-Clause by adding year and authors
Seems like 2015 is the year of the first libbpf commit. So use Lorenz's
suggestion and add "(c) 2015 The Libbpf Authors".

Closes: https://github.com/libbpf/libbpf/issues/461
Reported-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-02-23 17:55:21 -08:00
Andrii Nakryiko
1c173e5fc8 libbpf: fix libbpf.pc generation w.r.t. patch versions
Ensure that libbpf.pc gets full libbpf's version, including patch
releases. Also add some mechanism to ensure that official released
version (e.g., 0.7.1) and the one recorded in libbpf.map (which never
bumps patch version, so will be 0.7.0) are in sync up to major and minor
versions. This should ensure that major mistakes are captured. We'll
still need to be very careful with zeroing out patch version on minor
version bumps.

Closes: https://github.com/libbpf/libbpf/issues/455
Reported-by: Michel Salim <michel@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-02-22 20:06:42 -08:00
Andrii Nakryiko
93c570ca4b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   2e3f7bed28376a1a41ce4a58b7163b586e97a546
Checkpoint bpf-next commit: b75dacaac4650478ed5a9d33975b91b99016daff
Baseline bpf commit:        45ce4b4f9009102cd9f581196d480a59208690c1
Checkpoint bpf commit:      75134f16e7dd0007aa474b281935c5f42e79f2c8

Andrii Nakryiko (1):
  libbpf: Fix memleak in libbpf_netlink_recv()

 src/netlink.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--
2.30.2
2022-02-17 11:33:57 -08:00
Andrii Nakryiko
33201b7ebd libbpf: Fix memleak in libbpf_netlink_recv()
Ensure that libbpf_netlink_recv() frees dynamically allocated buffer in
all code paths.

Fixes: 9c3de619e13e ("libbpf: Use dynamically allocated buffer when receiving netlink messages")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20220217073958.276959-1-andrii@kernel.org
2022-02-17 11:33:57 -08:00
Andrii Nakryiko
6edaacad4f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   8cbf062a250ed52148badf6f3ffd03657dd4a3f0
Checkpoint bpf-next commit: 2e3f7bed28376a1a41ce4a58b7163b586e97a546
Baseline bpf commit:        61d06f01f9710b327a53492e5add9f972eb909b3
Checkpoint bpf commit:      45ce4b4f9009102cd9f581196d480a59208690c1

Mauricio Vásquez (2):
  libbpf: Split bpf_core_apply_relo()
  libbpf: Expose bpf_core_{add,free}_cands() to bpftool

 src/libbpf.c          | 88 ++++++++++++++++++++++++-------------------
 src/libbpf_internal.h |  9 +++++
 src/relo_core.c       | 79 +++++++++++---------------------------
 src/relo_core.h       | 42 ++++++++++++++++++---
 4 files changed, 118 insertions(+), 100 deletions(-)

--
2.30.2
2022-02-16 13:58:30 -08:00
Mauricio Vásquez
af29a83fe2 libbpf: Expose bpf_core_{add,free}_cands() to bpftool
Expose bpf_core_add_cands() and bpf_core_free_cands() to handle
candidates list.

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-3-mauricio@kinvolk.io
2022-02-16 13:58:30 -08:00
Mauricio Vásquez
6387d3900f libbpf: Split bpf_core_apply_relo()
BTFGen needs to run the core relocation logic in order to understand
what are the types involved in a given relocation.

Currently bpf_core_apply_relo() calculates and **applies** a relocation
to an instruction. Having both operations in the same function makes it
difficult to only calculate the relocation without patching the
instruction. This commit splits that logic in two different phases: (1)
calculate the relocation and (2) patch the instruction.

For the first phase bpf_core_apply_relo() is renamed to
bpf_core_calc_relo_insn() who is now only on charge of calculating the
relocation, the second phase uses the already existing
bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
the BTFGen will use only bpf_core_calc_relo_insn().

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220215225856.671072-2-mauricio@kinvolk.io
2022-02-16 13:58:30 -08:00
Andrii Nakryiko
196da61f1d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca
Checkpoint bpf-next commit: 8cbf062a250ed52148badf6f3ffd03657dd4a3f0
Baseline bpf commit:        fe68195daf34d5dddacd3f93dd3eafc4beca3a0e
Checkpoint bpf commit:      61d06f01f9710b327a53492e5add9f972eb909b3

Alexei Starovoitov (1):
  libbpf: Prepare light skeleton for the kernel.

Jakub Sitnicki (1):
  selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup

Marco Elver (1):
  perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit
    architectures

Toke Høiland-Jørgensen (1):
  libbpf: Use dynamically allocated buffer when receiving netlink
    messages

 include/uapi/linux/bpf.h        |   3 +-
 include/uapi/linux/perf_event.h |   2 +
 src/gen_loader.c                |  15 ++-
 src/netlink.c                   |  55 +++++++++-
 src/skel_internal.h             | 185 ++++++++++++++++++++++++++++----
 5 files changed, 234 insertions(+), 26 deletions(-)

--
2.30.2
2022-02-15 22:32:04 -08:00
Marco Elver
db8dc47ce8 perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures
Due to the alignment requirements of siginfo_t, as described in
3ddb3fd8cdb0 ("signal, perf: Fix siginfo_t by avoiding u64 on 32-bit
architectures"), siginfo_t::si_perf_data is limited to an unsigned long.

However, perf_event_attr::sig_data is an u64, to avoid having to deal
with compat conversions. Due to being an u64, it may not immediately be
clear to users that sig_data is truncated on 32 bit architectures.

Add a comment to explicitly point this out, and hopefully help some
users save time by not having to deduce themselves what's happening.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20220131103407.1971678-3-elver@google.com
2022-02-15 22:32:04 -08:00
Toke Høiland-Jørgensen
f7d89c3910 libbpf: Use dynamically allocated buffer when receiving netlink messages
When receiving netlink messages, libbpf was using a statically allocated
stack buffer of 4k bytes. This happened to work fine on systems with a 4k
page size, but on systems with larger page sizes it can lead to truncated
messages. The user-visible impact of this was that libbpf would insist no
XDP program was attached to some interfaces because that bit of the netlink
message got chopped off.

Fix this by switching to a dynamically allocated buffer; we borrow the
approach from iproute2 of using recvmsg() with MSG_PEEK|MSG_TRUNC to get
the actual size of the pending message before receiving it, adjusting the
buffer as necessary. While we're at it, also add retries on interrupted
system calls around the recvmsg() call.

v2:
  - Move peek logic to libbpf_netlink_recv(), don't double free on ENOMEM.

Fixes: 8bbb77b7c7a2 ("libbpf: Add various netlink helpers")
Reported-by: Zhiqian Guan <zhguan@redhat.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20220211234819.612288-1-toke@redhat.com
2022-02-15 22:32:04 -08:00
Alexei Starovoitov
0d6262ad0a libbpf: Prepare light skeleton for the kernel.
Prepare light skeleton to be used in the kernel module and in the user space.
The look and feel of lskel.h is mostly the same with the difference that for
user space the skel->rodata is the same pointer before and after skel_load
operation, while in the kernel the skel->rodata after skel_open and the
skel->rodata after skel_load are different pointers.
Typical usage of skeleton remains the same for kernel and user space:
skel = my_bpf__open();
skel->rodata->my_global_var = init_val;
err = my_bpf__load(skel);
err = my_bpf__attach(skel);
// access skel->rodata->my_global_var;
// access skel->bss->another_var;

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209232001.27490-3-alexei.starovoitov@gmail.com
2022-02-15 22:32:04 -08:00
Jakub Sitnicki
7593fc7a85 selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
Extend the context access tests for sk_lookup prog to cover the surprising
case of a 4-byte load from the remote_port field, where the expected value
is actually shifted by 16 bits.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220209184333.654927-3-jakub@cloudflare.com
2022-02-15 22:32:04 -08:00
Andrii Nakryiko
67f813c8a8 README: add libbpf distro packaging badge
Add badge displaying libbpf's packaging status across various Linux distros.
2022-02-11 21:21:10 -08:00
Andrii Nakryiko
2cd2d03f63 libbpf: Fix libbpf.map inheritance chain for LIBBPF_0.7.0
Ensure that LIBBPF_0.7.0 inherits everything from LIBBPF_0.6.0.

Fixes: dbdd2c7f8cec ("libbpf: Add API to get/set log_level at per-program level")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220211205235.2089104-1-andrii@kernel.org
2022-02-11 13:01:37 -08:00
52 changed files with 141683 additions and 84675 deletions

File diff suppressed because it is too large Load Diff

View File

@@ -10,6 +10,11 @@ sphinx:
builder: html
configuration: docs/conf.py
formats:
- htmlzip
- pdf
- epub
# Optionally set the version of Python and requirements required to build your docs
python:
version: 3.7

View File

@@ -1,130 +0,0 @@
sudo: required
language: bash
dist: focal
services:
- docker
env:
global:
- PROJECT_NAME='libbpf'
- AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")"
- REPO_ROOT="$TRAVIS_BUILD_DIR"
- CI_ROOT="$REPO_ROOT/travis-ci"
- VMTEST_ROOT="$CI_ROOT/vmtest"
addons:
apt:
packages:
- qemu-kvm
- zstd
- binutils-dev
- elfutils
- libcap-dev
- libelf-dev
- libdw-dev
stages:
# Run Coverity periodically instead of for each PR for following reasons:
# 1) Coverity jobs are heavily rate-limited
# 2) Due to security restrictions of encrypted environment variables
# in Travis CI, pull requests made from forks can't access encrypted
# env variables, making Coverity unusable
# See: https://docs.travis-ci.com/user/pull-requests#pull-requests-and-security-restrictions
- name: Coverity
if: type = cron
jobs:
include:
- stage: Builds & Tests
name: Kernel 5.5.0 + selftests
language: bash
env: KERNEL=5.5.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel LATEST + selftests
language: bash
env: KERNEL=LATEST
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel 4.9.0 + selftests
language: bash
env: KERNEL=4.9.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Build
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (clang)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (clang ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_CLANG_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-10)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC10 || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-10 ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC10_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Focal Build
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (arm)
arch: arm64
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (s390x)
arch: s390x
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Focal Build (ppc64le)
arch: ppc64le
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- stage: Coverity
language: bash
env:
# Coverity configuration
# COVERITY_SCAN_TOKEN=xxx
# Encrypted using `travis encrypt --repo libbpf/libbpf COVERITY_SCAN_TOKEN=xxx`
- secure: "I9OsMRHbb82IUivDp+I+w/jEQFOJgBDAqYqf1ollqCM1QhocxMcS9bwIAgfPhdXi2hohV7sRrVMZstahY67FAvJLGxNopi4tAPDIAaIFxgO0yDxMhaTMx5xDfMwlIm2FOP/9gB9BQsd6M7CmoQZgXYwBIv7xd1ooxoQrh2rOK1YrRl7UQu3+c3zPTjDfIYZzR3bFttMqZ9/c4U0v8Ry5IFXrel3hCshndHA1TtttJrUSrILlZcmVc1ch7JIy6zCbCU/2lGv0B/7rWXfF8MT7O9jPtFOhJ1DEcd2zhw2n4j9YT3a8OhtnM61LA6ask632mwCOsxpFLTun7AzuR1Cb5mdPHsxhxnCHcXXARa2mJjem0QG1NhwxwJE8sbRDapojexxCvweYlEN40ofwMDSnj/qNt95XIcrk0tiIhGFx0gVNWvAdmZwx+N4mwGPMTAN0AEOFjpgI+ZdB89m+tL/CbEgE1flc8QxUxJhcp5OhH6yR0z9qYOp0nXIbHsIaCiRvt/7LqFRQfheifztWVz4mdQlCdKS9gcOQ09oKicPevKO1L0Ue3cb7Ug7jOpMs+cdh3XokJtUeYEr1NijMHT9+CTAhhO5RToWXIZRon719z3fwoUBNDREATwVFMlVxqSO/pbYgaKminigYbl785S89YYaZ6E5UvaKRHM6KHKMDszs="
- COVERITY_SCAN_PROJECT_NAME="libbpf"
- COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
- COVERITY_SCAN_BRANCH_PATTERN="$TRAVIS_BRANCH"
# Note: `make -C src/` as a BUILD_COMMAND will not work here
- COVERITY_SCAN_BUILD_COMMAND_PREPEND="cd src/"
- COVERITY_SCAN_BUILD_COMMAND="make"
install:
- sudo echo 'deb-src http://archive.ubuntu.com/ubuntu/ focal main restricted universe multiverse' >>/etc/apt/sources.list
- sudo apt-get update
- sudo apt-get -y build-dep libelf-dev
- sudo apt-get install -y libelf-dev pkg-config
script:
- scripts/coverity.sh || travis_terminate 1
allow_failures:
- env: KERNEL=x.x.x

View File

@@ -1 +1 @@
fe68195daf34d5dddacd3f93dd3eafc4beca3a0e
d28b25a62a47a8c8aa19bd543863aab6717e68c9

View File

@@ -1 +1 @@
dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca
b0d93b44641a83c28014ca38001e85bf6dc8501e

View File

@@ -7,7 +7,7 @@ Usage-Guide:
SPDX-License-Identifier: BSD-2-Clause
License-Text:
Copyright (c) <year> <owner> . All rights reserved.
Copyright (c) 2015 The Libbpf Authors. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

View File

@@ -73,34 +73,6 @@ $ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make install
```
Distributions
=============
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/impish/libbpf)
- [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
- No ties to any specific kernel, transparent handling of older kernels.
Libbpf is designed to be kernel-agnostic and work across multitude of
kernel versions. It has built-in mechanisms to gracefully handle older
kernels, that are missing some of the features, by working around or
gracefully degrading functionality. Thus libbpf is not tied to a specific
kernel version and can/should be packaged and versioned independently.
- Continuous integration testing via
[TravisCI](https://travis-ci.org/libbpf/libbpf).
- Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf)
and [Coverity](https://scan.coverity.com/projects/libbpf).
Package dependencies of libbpf, package names may vary across distros:
- zlib
- libelf
BPF CO-RE (Compile Once Run Everywhere)
=========================================
@@ -154,6 +126,36 @@ use it:
converting some more to both contribute to the BPF community and gain some
more experience with it.
Distributions
=============
Distributions packaging libbpf from this mirror:
- [Fedora](https://src.fedoraproject.org/rpms/libbpf)
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/impish/libbpf)
- [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
- No ties to any specific kernel, transparent handling of older kernels.
Libbpf is designed to be kernel-agnostic and work across multitude of
kernel versions. It has built-in mechanisms to gracefully handle older
kernels, that are missing some of the features, by working around or
gracefully degrading functionality. Thus libbpf is not tied to a specific
kernel version and can/should be packaged and versioned independently.
- Continuous integration testing via
[GitHub Actions](https://github.com/libbpf/libbpf/actions).
- Static code analysis via [LGTM](https://lgtm.com/projects/g/libbpf/libbpf)
and [Coverity](https://scan.coverity.com/projects/libbpf).
Package dependencies of libbpf, package names may vary across distros:
- zlib
- libelf
[![libbpf distro packaging status](https://repology.org/badge/vertical-allrepos/libbpf.svg)](https://repology.org/project/libbpf/versions)
License
=======

View File

@@ -6,14 +6,13 @@ libbpf
.. toctree::
:maxdepth: 1
API Documentation <https://libbpf.readthedocs.io/en/latest/api.html>
libbpf_naming_convention
libbpf_build
This is documentation for libbpf, a userspace library for loading and
interacting with bpf programs.
For API documentation see the `versioned API documentation site <https://libbpf.readthedocs.io/en/latest/api.html>`_.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list.
You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the

View File

@@ -997,6 +997,8 @@ enum bpf_attach_type {
BPF_SK_REUSEPORT_SELECT,
BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
BPF_PERF_EVENT,
BPF_TRACE_KPROBE_MULTI,
BPF_LSM_CGROUP,
__MAX_BPF_ATTACH_TYPE
};
@@ -1011,6 +1013,8 @@ enum bpf_link_type {
BPF_LINK_TYPE_NETNS = 5,
BPF_LINK_TYPE_XDP = 6,
BPF_LINK_TYPE_PERF_EVENT = 7,
BPF_LINK_TYPE_KPROBE_MULTI = 8,
BPF_LINK_TYPE_STRUCT_OPS = 9,
MAX_BPF_LINK_TYPE,
};
@@ -1118,6 +1122,11 @@ enum bpf_link_type {
*/
#define BPF_F_XDP_HAS_FRAGS (1U << 5)
/* link_create.kprobe_multi.flags used in LINK_CREATE command for
* BPF_TRACE_KPROBE_MULTI attach type to create return probe.
*/
#define BPF_F_KPROBE_MULTI_RETURN (1U << 0)
/* When BPF ldimm64's insn[0].src_reg != 0 then this can have
* the following extensions:
*
@@ -1232,6 +1241,8 @@ enum {
/* If set, run the test on the cpu specified by bpf_attr.test.cpu */
#define BPF_F_TEST_RUN_ON_CPU (1U << 0)
/* If set, XDP frames will be transmitted after processing */
#define BPF_F_TEST_XDP_LIVE_FRAMES (1U << 1)
/* type for BPF_ENABLE_STATS */
enum bpf_stats_type {
@@ -1393,6 +1404,7 @@ union bpf_attr {
__aligned_u64 ctx_out;
__u32 flags;
__u32 cpu;
__u32 batch_size;
} test;
struct { /* anonymous struct used by BPF_*_GET_*_ID */
@@ -1420,6 +1432,7 @@ union bpf_attr {
__u32 attach_flags;
__aligned_u64 prog_ids;
__u32 prog_cnt;
__aligned_u64 prog_attach_flags; /* output: per-program attach_flags */
} query;
struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -1472,6 +1485,22 @@ union bpf_attr {
*/
__u64 bpf_cookie;
} perf_event;
struct {
__u32 flags;
__u32 cnt;
__aligned_u64 syms;
__aligned_u64 addrs;
__aligned_u64 cookies;
} kprobe_multi;
struct {
/* this is overlaid with the target_btf_id above. */
__u32 target_btf_id;
/* black box user-provided value passed through
* to BPF program at the execution time and
* accessible through bpf_get_attach_cookie() BPF helper
*/
__u64 cookie;
} tracing;
};
} link_create;
@@ -2299,8 +2328,8 @@ union bpf_attr {
* Return
* The return value depends on the result of the test, and can be:
*
* * 0, if current task belongs to the cgroup2.
* * 1, if current task does not belong to the cgroup2.
* * 1, if current task belongs to the cgroup2.
* * 0, if current task does not belong to the cgroup2.
* * A negative error code, if an error occurred.
*
* long bpf_skb_change_tail(struct sk_buff *skb, u32 len, u64 flags)
@@ -2992,8 +3021,8 @@ union bpf_attr {
*
* # sysctl kernel.perf_event_max_stack=<new value>
* Return
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
* The non-negative copied *buf* length equal to or less than
* *size* on success, or a negative error in case of failure.
*
* long bpf_skb_load_bytes_relative(const void *skb, u32 offset, void *to, u32 len, u32 start_header)
* Description
@@ -3570,10 +3599,11 @@ union bpf_attr {
*
* *iph* points to the start of the IPv4 or IPv6 header, while
* *iph_len* contains **sizeof**\ (**struct iphdr**) or
* **sizeof**\ (**struct ip6hdr**).
* **sizeof**\ (**struct ipv6hdr**).
*
* *th* points to the start of the TCP header, while *th_len*
* contains **sizeof**\ (**struct tcphdr**).
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
* Return
* 0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
* error otherwise.
@@ -3756,10 +3786,11 @@ union bpf_attr {
*
* *iph* points to the start of the IPv4 or IPv6 header, while
* *iph_len* contains **sizeof**\ (**struct iphdr**) or
* **sizeof**\ (**struct ip6hdr**).
* **sizeof**\ (**struct ipv6hdr**).
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header.
* contains the length of the TCP header with options (at least
* **sizeof**\ (**struct tcphdr**)).
* Return
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
@@ -4299,8 +4330,8 @@ union bpf_attr {
*
* # sysctl kernel.perf_event_max_stack=<new value>
* Return
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
* The non-negative copied *buf* length equal to or less than
* *size* on success, or a negative error in case of failure.
*
* long bpf_load_hdr_opt(struct bpf_sock_ops *skops, void *searchby_res, u32 len, u64 flags)
* Description
@@ -5086,6 +5117,216 @@ union bpf_attr {
* Return
* 0 on success, or a negative error in case of failure. On error
* *dst* buffer is zeroed out.
*
* long bpf_skb_set_tstamp(struct sk_buff *skb, u64 tstamp, u32 tstamp_type)
* Description
* Change the __sk_buff->tstamp_type to *tstamp_type*
* and set *tstamp* to the __sk_buff->tstamp together.
*
* If there is no need to change the __sk_buff->tstamp_type,
* the tstamp value can be directly written to __sk_buff->tstamp
* instead.
*
* BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that
* will be kept during bpf_redirect_*(). A non zero
* *tstamp* must be used with the BPF_SKB_TSTAMP_DELIVERY_MONO
* *tstamp_type*.
*
* A BPF_SKB_TSTAMP_UNSPEC *tstamp_type* can only be used
* with a zero *tstamp*.
*
* Only IPv4 and IPv6 skb->protocol are supported.
*
* This function is most useful when it needs to set a
* mono delivery time to __sk_buff->tstamp and then
* bpf_redirect_*() to the egress of an iface. For example,
* changing the (rcv) timestamp in __sk_buff->tstamp at
* ingress to a mono delivery time and then bpf_redirect_*()
* to sch_fq@phy-dev.
* Return
* 0 on success.
* **-EINVAL** for invalid input
* **-EOPNOTSUPP** for unsupported protocol
*
* long bpf_ima_file_hash(struct file *file, void *dst, u32 size)
* Description
* Returns a calculated IMA hash of the *file*.
* If the hash is larger than *size*, then only *size*
* bytes will be copied to *dst*
* Return
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
*
* void *bpf_kptr_xchg(void *map_value, void *ptr)
* Description
* Exchange kptr at pointer *map_value* with *ptr*, and return the
* old value. *ptr* can be NULL, otherwise it must be a referenced
* pointer which will be released when this helper is called.
* Return
* The old value of kptr (which can be NULL). The returned pointer
* if not NULL, is a reference which must be released using its
* corresponding release function, or moved into a BPF map before
* program exit.
*
* void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)
* Description
* Perform a lookup in *percpu map* for an entry associated to
* *key* on *cpu*.
* Return
* Map value associated to *key* on *cpu*, or **NULL** if no entry
* was found or *cpu* is invalid.
*
* struct mptcp_sock *bpf_skc_to_mptcp_sock(void *sk)
* Description
* Dynamically cast a *sk* pointer to a *mptcp_sock* pointer.
* Return
* *sk* if casting is valid, or **NULL** otherwise.
*
* long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Get a dynptr to local memory *data*.
*
* *data* must be a ptr to a map value.
* The maximum *size* supported is DYNPTR_MAX_SIZE.
* *flags* is currently unused.
* Return
* 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
* -EINVAL if flags is not 0.
*
* long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr)
* Description
* Reserve *size* bytes of payload in a ring buffer *ringbuf*
* through the dynptr interface. *flags* must be 0.
*
* Please note that a corresponding bpf_ringbuf_submit_dynptr or
* bpf_ringbuf_discard_dynptr must be called on *ptr*, even if the
* reservation fails. This is enforced by the verifier.
* Return
* 0 on success, or a negative error in case of failure.
*
* void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Submit reserved ring buffer sample, pointed to by *data*,
* through the dynptr interface. This is a no-op if the dynptr is
* invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_submit'.
* Return
* Nothing. Always succeeds.
*
* void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags)
* Description
* Discard reserved ring buffer sample through the dynptr
* interface. This is a no-op if the dynptr is invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_discard'.
* Return
* Nothing. Always succeeds.
*
* long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset)
* Description
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *src*'s data, -EINVAL if *src* is an invalid dynptr.
*
* long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len)
* Description
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
* Return
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
* is a read-only dynptr.
*
* void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
* Description
* Get a pointer to the underlying dynptr data.
*
* *len* must be a statically known value. The returned data slice
* is invalidated whenever the dynptr is invalidated.
* Return
* Pointer to the underlying dynptr data, NULL if the dynptr is
* read-only, if the dynptr is invalid, or if the offset and length
* is out of bounds.
*
* s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th, u32 th_len)
* Description
* Try to issue a SYN cookie for the packet with corresponding
* IPv4/TCP headers, *iph* and *th*, without depending on a
* listening socket.
*
* *iph* points to the IPv4 header.
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
* Return
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
* and the top 16 bits are unused.
*
* On failure, the returned value is one of the following:
*
* **-EINVAL** if *th_len* is invalid.
*
* s64 bpf_tcp_raw_gen_syncookie_ipv6(struct ipv6hdr *iph, struct tcphdr *th, u32 th_len)
* Description
* Try to issue a SYN cookie for the packet with corresponding
* IPv6/TCP headers, *iph* and *th*, without depending on a
* listening socket.
*
* *iph* points to the IPv6 header.
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
* Return
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
* and the top 16 bits are unused.
*
* On failure, the returned value is one of the following:
*
* **-EINVAL** if *th_len* is invalid.
*
* **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
*
* long bpf_tcp_raw_check_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th)
* Description
* Check whether *iph* and *th* contain a valid SYN cookie ACK
* without depending on a listening socket.
*
* *iph* points to the IPv4 header.
*
* *th* points to the TCP header.
* Return
* 0 if *iph* and *th* are a valid SYN cookie ACK.
*
* On failure, the returned value is one of the following:
*
* **-EACCES** if the SYN cookie is not valid.
*
* long bpf_tcp_raw_check_syncookie_ipv6(struct ipv6hdr *iph, struct tcphdr *th)
* Description
* Check whether *iph* and *th* contain a valid SYN cookie ACK
* without depending on a listening socket.
*
* *iph* points to the IPv6 header.
*
* *th* points to the TCP header.
* Return
* 0 if *iph* and *th* are a valid SYN cookie ACK.
*
* On failure, the returned value is one of the following:
*
* **-EACCES** if the SYN cookie is not valid.
*
* **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
*/
#define __BPF_FUNC_MAPPER(FN) \
FN(unspec), \
@@ -5280,6 +5521,22 @@ union bpf_attr {
FN(xdp_load_bytes), \
FN(xdp_store_bytes), \
FN(copy_from_user_task), \
FN(skb_set_tstamp), \
FN(ima_file_hash), \
FN(kptr_xchg), \
FN(map_lookup_percpu_elem), \
FN(skc_to_mptcp_sock), \
FN(dynptr_from_mem), \
FN(ringbuf_reserve_dynptr), \
FN(ringbuf_submit_dynptr), \
FN(ringbuf_discard_dynptr), \
FN(dynptr_read), \
FN(dynptr_write), \
FN(dynptr_data), \
FN(tcp_raw_gen_syncookie_ipv4), \
FN(tcp_raw_gen_syncookie_ipv6), \
FN(tcp_raw_check_syncookie_ipv4), \
FN(tcp_raw_check_syncookie_ipv6), \
/* */
/* integer value in 'imm' field of BPF_CALL instruction selects which helper
@@ -5469,6 +5726,15 @@ union { \
__u64 :64; \
} __attribute__((aligned(8)))
enum {
BPF_SKB_TSTAMP_UNSPEC,
BPF_SKB_TSTAMP_DELIVERY_MONO, /* tstamp has mono delivery time */
/* For any BPF_SKB_TSTAMP_* that the bpf prog cannot handle,
* the bpf prog should handle it like BPF_SKB_TSTAMP_UNSPEC
* and try to deduce it by ingress, egress or skb->sk->sk_clockid.
*/
};
/* user accessible mirror of in-kernel sk_buff.
* new fields can only be added to the end of this structure
*/
@@ -5509,7 +5775,8 @@ struct __sk_buff {
__u32 gso_segs;
__bpf_md_ptr(struct bpf_sock *, sk);
__u32 gso_size;
__u32 :32; /* Padding, future use. */
__u8 tstamp_type;
__u32 :24; /* Padding, future use. */
__u64 hwtstamp;
};
@@ -5523,6 +5790,10 @@ struct bpf_tunnel_key {
__u8 tunnel_ttl;
__u16 tunnel_ext; /* Padding, future use. */
__u32 tunnel_label;
union {
__u32 local_ipv4;
__u32 local_ipv6[4];
};
};
/* user accessible mirror of in-kernel xfrm_state.
@@ -5806,6 +6077,8 @@ struct bpf_prog_info {
__u64 run_cnt;
__u64 recursion_misses;
__u32 verified_insns;
__u32 attach_btf_obj_id;
__u32 attach_btf_id;
} __attribute__((aligned(8)));
struct bpf_map_info {
@@ -6417,6 +6690,11 @@ struct bpf_timer {
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_dynptr {
__u64 :64;
__u64 :64;
} __attribute__((aligned(8)));
struct bpf_sysctl {
__u32 write; /* Sysctl is being read (= 0) or written (= 1).
* Allows 1,2,4-byte read, but no write.
@@ -6453,7 +6731,8 @@ struct bpf_sk_lookup {
__u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */
__u32 remote_ip4; /* Network byte order */
__u32 remote_ip6[4]; /* Network byte order */
__u32 remote_port; /* Network byte order */
__be16 remote_port; /* Network byte order */
__u16 :16; /* Zero padding */
__u32 local_ip4; /* Network byte order */
__u32 local_ip6[4]; /* Network byte order */
__u32 local_port; /* Host byte order */

View File

@@ -33,13 +33,13 @@ struct btf_type {
/* "info" bits arrangement
* bits 0-15: vlen (e.g. # of struct's members)
* bits 16-23: unused
* bits 24-27: kind (e.g. int, ptr, array...etc)
* bits 28-30: unused
* bits 24-28: kind (e.g. int, ptr, array...etc)
* bits 29-30: unused
* bit 31: kind_flag, currently used by
* struct, union and fwd
* struct, union, enum, fwd and enum64
*/
__u32 info;
/* "size" is used by INT, ENUM, STRUCT, UNION and DATASEC.
/* "size" is used by INT, ENUM, STRUCT, UNION, DATASEC and ENUM64.
* "size" tells the size of the type it is describing.
*
* "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
@@ -63,7 +63,7 @@ enum {
BTF_KIND_ARRAY = 3, /* Array */
BTF_KIND_STRUCT = 4, /* Struct */
BTF_KIND_UNION = 5, /* Union */
BTF_KIND_ENUM = 6, /* Enumeration */
BTF_KIND_ENUM = 6, /* Enumeration up to 32-bit values */
BTF_KIND_FWD = 7, /* Forward */
BTF_KIND_TYPEDEF = 8, /* Typedef */
BTF_KIND_VOLATILE = 9, /* Volatile */
@@ -76,6 +76,7 @@ enum {
BTF_KIND_FLOAT = 16, /* Floating point */
BTF_KIND_DECL_TAG = 17, /* Decl Tag */
BTF_KIND_TYPE_TAG = 18, /* Type Tag */
BTF_KIND_ENUM64 = 19, /* Enumeration up to 64-bit values */
NR_BTF_KINDS,
BTF_KIND_MAX = NR_BTF_KINDS - 1,
@@ -186,4 +187,14 @@ struct btf_decl_tag {
__s32 component_idx;
};
/* BTF_KIND_ENUM64 is followed by multiple "struct btf_enum64".
* The exact number of btf_enum64 is stored in the vlen (of the
* info in "struct btf_type").
*/
struct btf_enum64 {
__u32 name_off;
__u32 val_lo32;
__u32 val_hi32;
};
#endif /* _UAPI__LINUX_BTF_H__ */

View File

@@ -348,6 +348,8 @@ enum {
IFLA_PARENT_DEV_NAME,
IFLA_PARENT_DEV_BUS_NAME,
IFLA_GRO_MAX_SIZE,
IFLA_TSO_MAX_SIZE,
IFLA_TSO_MAX_SEGS,
__IFLA_MAX
};
@@ -860,6 +862,7 @@ enum {
IFLA_BOND_PEER_NOTIF_DELAY,
IFLA_BOND_AD_LACP_ACTIVE,
IFLA_BOND_MISSED_MAX,
IFLA_BOND_NS_IP6_TARGET,
__IFLA_BOND_MAX,
};

View File

@@ -251,6 +251,8 @@ enum {
PERF_BR_SYSRET = 8, /* syscall return */
PERF_BR_COND_CALL = 9, /* conditional function call */
PERF_BR_COND_RET = 10, /* conditional function return */
PERF_BR_ERET = 11, /* exception return */
PERF_BR_IRQ = 12, /* irq */
PERF_BR_MAX,
};
@@ -465,6 +467,8 @@ struct perf_event_attr {
/*
* User provided data if sigtrap=1, passed back to user via
* siginfo_t::si_perf_data, e.g. to permit user to identify the event.
* Note, siginfo_t::si_perf_data is long-sized, and sig_data will be
* truncated accordingly on 32 bit architectures.
*/
__u64 sig_data;
};

View File

@@ -17,6 +17,24 @@ mkdir -p "$OUT"
export LIB_FUZZING_ENGINE=${LIB_FUZZING_ENGINE:--fsanitize=fuzzer}
# libelf is compiled with _FORTIFY_SOURCE by default and it
# isn't compatible with MSan. It was borrowed
# from https://github.com/google/oss-fuzz/pull/7422
if [[ "$SANITIZER" == memory ]]; then
CFLAGS+=" -U_FORTIFY_SOURCE"
CXXFLAGS+=" -U_FORTIFY_SOURCE"
fi
# The alignment check is turned off by default on OSS-Fuzz/CFLite so it should be
# turned on explicitly there. It was borrowed from
# https://github.com/google/oss-fuzz/pull/7092
if [[ "$SANITIZER" == undefined ]]; then
additional_ubsan_checks=alignment
UBSAN_FLAGS="-fsanitize=$additional_ubsan_checks -fno-sanitize-recover=$additional_ubsan_checks"
CFLAGS+=" $UBSAN_FLAGS"
CXXFLAGS+=" $UBSAN_FLAGS"
fi
# Ideally libbelf should be built using release tarballs available
# at https://sourceware.org/elfutils/ftp/. Unfortunately sometimes they
# fail to compile (for example, elfutils-0.185 fails to compile with LDFLAGS enabled
@@ -26,7 +44,7 @@ rm -rf elfutils
git clone git://sourceware.org/git/elfutils.git
(
cd elfutils
git checkout 983e86fd89e8bf02f2d27ba5dce5bf078af4ceda
git checkout 83251d4091241acddbdcf16f814e3bc6ef3df49a
git log --oneline -1
# ASan isn't compatible with -Wl,--no-undefined: https://github.com/google/sanitizers/issues/380
@@ -36,6 +54,11 @@ find -name Makefile.am | xargs sed -i 's/,--no-undefined//'
# https://clang.llvm.org/docs/AddressSanitizer.html#usage
sed -i 's/^\(ZDEFS_LDFLAGS=\).*/\1/' configure.ac
if [[ "$SANITIZER" == undefined ]]; then
# That's basicaly what --enable-sanitize-undefined does to turn off unaligned access
# elfutils heavily relies on on i386/x86_64 but without changing compiler flags along the way
sed -i 's/\(check_undefined_val\)=[0-9]/\1=1/' configure.ac
fi
autoreconf -i -f
if ! ./configure --enable-maintainer-mode --disable-debuginfod --disable-libdebuginfod \

View File

@@ -8,10 +8,24 @@ else
msg = @printf ' %-8s %s%s\n' "$(1)" "$(2)" "$(if $(3), $(3))";
endif
LIBBPF_VERSION := $(shell \
grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | \
sort -rV | head -n1 | cut -d'_' -f2)
LIBBPF_MAJOR_VERSION := $(firstword $(subst ., ,$(LIBBPF_VERSION)))
LIBBPF_MAJOR_VERSION := 1
LIBBPF_MINOR_VERSION := 0
LIBBPF_PATCH_VERSION := 0
LIBBPF_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).$(LIBBPF_PATCH_VERSION)
LIBBPF_MAJMIN_VERSION := $(LIBBPF_MAJOR_VERSION).$(LIBBPF_MINOR_VERSION).0
LIBBPF_MAP_VERSION := $(shell grep -oE '^LIBBPF_([0-9.]+)' libbpf.map | sort -rV | head -n1 | cut -d'_' -f2)
ifneq ($(LIBBPF_MAJMIN_VERSION), $(LIBBPF_MAP_VERSION))
$(error Libbpf release ($(LIBBPF_VERSION)) and map ($(LIBBPF_MAP_VERSION)) versions are out of sync!)
endif
define allow-override
$(if $(or $(findstring environment,$(origin $(1))),\
$(findstring command line,$(origin $(1)))),,\
$(eval $(1) = $(2)))
endef
$(call allow-override,CC,$(CROSS_COMPILE)cc)
$(call allow-override,LD,$(CROSS_COMPILE)ld)
TOPDIR = ..
@@ -21,8 +35,9 @@ ALL_CFLAGS := $(INCLUDES)
SHARED_CFLAGS += -fPIC -fvisibility=hidden -DSHARED
CFLAGS ?= -g -O2 -Werror -Wall -std=gnu89
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
ALL_LDFLAGS += $(LDFLAGS)
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 $(EXTRA_CFLAGS)
ALL_LDFLAGS += $(LDFLAGS) $(EXTRA_LDFLAGS)
ifdef NO_PKG_CONFIG
ALL_LDFLAGS += -lelf -lz
else
@@ -35,9 +50,9 @@ OBJDIR ?= .
SHARED_OBJDIR := $(OBJDIR)/sharedobjs
STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o \
btf_dump.o hashmap.o ringbuf.o strset.o linker.o gen_loader.o \
relo_core.o
relo_core.o usdt.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
@@ -49,9 +64,10 @@ ifndef BUILD_STATIC_ONLY
VERSION_SCRIPT := libbpf.map
endif
HEADERS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \
HEADERS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h \
bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \
bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h
bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h \
usdt.bpf.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\
bpf.h bpf_common.h btf.h)
@@ -99,7 +115,7 @@ $(OBJDIR)/libbpf.so.$(LIBBPF_VERSION): $(SHARED_OBJS)
-Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
$^ $(ALL_LDFLAGS) -o $@
$(OBJDIR)/libbpf.pc:
$(OBJDIR)/libbpf.pc: force
$(Q)sed -e "s|@PREFIX@|$(PREFIX)|" \
-e "s|@LIBDIR@|$(LIBDIR_PC)|" \
-e "s|@VERSION@|$(LIBBPF_VERSION)|" \
@@ -152,7 +168,7 @@ clean:
$(call msg,CLEAN)
$(Q)rm -rf *.o *.a *.so *.so.* *.pc $(SHARED_OBJDIR) $(STATIC_OBJDIR)
.PHONY: cscope tags
.PHONY: cscope tags force
cscope:
$(call msg,CSCOPE)
$(Q)ls *.c *.h > cscope.files
@@ -162,3 +178,5 @@ tags:
$(call msg,CTAGS)
$(Q)rm -f TAGS tags
$(Q)ls *.c *.h | xargs $(TAGS_PROG) -a
force:

369
src/bpf.c
View File

@@ -29,6 +29,7 @@
#include <errno.h>
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/kernel.h>
#include <limits.h>
#include <sys/resource.h>
#include "bpf.h"
@@ -111,7 +112,7 @@ int probe_memcg_account(void)
BPF_EMIT_CALL(BPF_FUNC_ktime_get_coarse_ns),
BPF_EXIT_INSN(),
};
size_t insn_cnt = sizeof(insns) / sizeof(insns[0]);
size_t insn_cnt = ARRAY_SIZE(insns);
union bpf_attr attr;
int prog_fd;
@@ -146,10 +147,6 @@ int bump_rlimit_memlock(void)
{
struct rlimit rlim;
/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
return 0;
/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
return 0;
@@ -207,86 +204,6 @@ int bpf_map_create(enum bpf_map_type map_type,
return libbpf_err_errno(fd);
}
int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr)
{
LIBBPF_OPTS(bpf_map_create_opts, p);
p.map_flags = create_attr->map_flags;
p.numa_node = create_attr->numa_node;
p.btf_fd = create_attr->btf_fd;
p.btf_key_type_id = create_attr->btf_key_type_id;
p.btf_value_type_id = create_attr->btf_value_type_id;
p.map_ifindex = create_attr->map_ifindex;
if (create_attr->map_type == BPF_MAP_TYPE_STRUCT_OPS)
p.btf_vmlinux_value_type_id = create_attr->btf_vmlinux_value_type_id;
else
p.inner_map_fd = create_attr->inner_map_fd;
return bpf_map_create(create_attr->map_type, create_attr->name,
create_attr->key_size, create_attr->value_size,
create_attr->max_entries, &p);
}
int bpf_create_map_node(enum bpf_map_type map_type, const char *name,
int key_size, int value_size, int max_entries,
__u32 map_flags, int node)
{
LIBBPF_OPTS(bpf_map_create_opts, opts);
opts.map_flags = map_flags;
if (node >= 0) {
opts.numa_node = node;
opts.map_flags |= BPF_F_NUMA_NODE;
}
return bpf_map_create(map_type, name, key_size, value_size, max_entries, &opts);
}
int bpf_create_map(enum bpf_map_type map_type, int key_size,
int value_size, int max_entries, __u32 map_flags)
{
LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = map_flags);
return bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts);
}
int bpf_create_map_name(enum bpf_map_type map_type, const char *name,
int key_size, int value_size, int max_entries,
__u32 map_flags)
{
LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = map_flags);
return bpf_map_create(map_type, name, key_size, value_size, max_entries, &opts);
}
int bpf_create_map_in_map_node(enum bpf_map_type map_type, const char *name,
int key_size, int inner_map_fd, int max_entries,
__u32 map_flags, int node)
{
LIBBPF_OPTS(bpf_map_create_opts, opts);
opts.inner_map_fd = inner_map_fd;
opts.map_flags = map_flags;
if (node >= 0) {
opts.map_flags |= BPF_F_NUMA_NODE;
opts.numa_node = node;
}
return bpf_map_create(map_type, name, key_size, 4, max_entries, &opts);
}
int bpf_create_map_in_map(enum bpf_map_type map_type, const char *name,
int key_size, int inner_map_fd, int max_entries,
__u32 map_flags)
{
LIBBPF_OPTS(bpf_map_create_opts, opts,
.inner_map_fd = inner_map_fd,
.map_flags = map_flags,
);
return bpf_map_create(map_type, name, key_size, 4, max_entries, &opts);
}
static void *
alloc_zero_tailing_info(const void *orecord, __u32 cnt,
__u32 actual_rec_size, __u32 expected_rec_size)
@@ -312,11 +229,10 @@ alloc_zero_tailing_info(const void *orecord, __u32 cnt,
return info;
}
DEFAULT_VERSION(bpf_prog_load_v0_6_0, bpf_prog_load, LIBBPF_0.6.0)
int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts)
int bpf_prog_load(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts)
{
void *finfo = NULL, *linfo = NULL;
const char *func_info, *line_info;
@@ -463,94 +379,6 @@ done:
return libbpf_err_errno(fd);
}
__attribute__((alias("bpf_load_program_xattr2")))
int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz);
static int bpf_load_program_xattr2(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz)
{
LIBBPF_OPTS(bpf_prog_load_opts, p);
if (!load_attr || !log_buf != !log_buf_sz)
return libbpf_err(-EINVAL);
p.expected_attach_type = load_attr->expected_attach_type;
switch (load_attr->prog_type) {
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_LSM:
p.attach_btf_id = load_attr->attach_btf_id;
break;
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_EXT:
p.attach_btf_id = load_attr->attach_btf_id;
p.attach_prog_fd = load_attr->attach_prog_fd;
break;
default:
p.prog_ifindex = load_attr->prog_ifindex;
p.kern_version = load_attr->kern_version;
}
p.log_level = load_attr->log_level;
p.log_buf = log_buf;
p.log_size = log_buf_sz;
p.prog_btf_fd = load_attr->prog_btf_fd;
p.func_info_rec_size = load_attr->func_info_rec_size;
p.func_info_cnt = load_attr->func_info_cnt;
p.func_info = load_attr->func_info;
p.line_info_rec_size = load_attr->line_info_rec_size;
p.line_info_cnt = load_attr->line_info_cnt;
p.line_info = load_attr->line_info;
p.prog_flags = load_attr->prog_flags;
return bpf_prog_load(load_attr->prog_type, load_attr->name, load_attr->license,
load_attr->insns, load_attr->insns_cnt, &p);
}
int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
size_t insns_cnt, const char *license,
__u32 kern_version, char *log_buf,
size_t log_buf_sz)
{
struct bpf_load_program_attr load_attr;
memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
load_attr.prog_type = type;
load_attr.expected_attach_type = 0;
load_attr.name = NULL;
load_attr.insns = insns;
load_attr.insns_cnt = insns_cnt;
load_attr.license = license;
load_attr.kern_version = kern_version;
return bpf_load_program_xattr2(&load_attr, log_buf, log_buf_sz);
}
int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
size_t insns_cnt, __u32 prog_flags, const char *license,
__u32 kern_version, char *log_buf, size_t log_buf_sz,
int log_level)
{
union bpf_attr attr;
int fd;
bump_rlimit_memlock();
memset(&attr, 0, sizeof(attr));
attr.prog_type = type;
attr.insn_cnt = (__u32)insns_cnt;
attr.insns = ptr_to_u64(insns);
attr.license = ptr_to_u64(license);
attr.log_buf = ptr_to_u64(log_buf);
attr.log_size = log_buf_sz;
attr.log_level = log_level;
log_buf[0] = 0;
attr.kern_version = kern_version;
attr.prog_flags = prog_flags;
fd = sys_bpf_prog_load(&attr, sizeof(attr), PROG_LOAD_ATTEMPTS);
return libbpf_err_errno(fd);
}
int bpf_map_update_elem(int fd, const void *key, const void *value,
__u64 flags)
{
@@ -638,6 +466,20 @@ int bpf_map_delete_elem(int fd, const void *key)
return libbpf_err_errno(ret);
}
int bpf_map_delete_elem_flags(int fd, const void *key, __u64 flags)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.map_fd = fd;
attr.key = ptr_to_u64(key);
attr.flags = flags;
ret = sys_bpf(BPF_MAP_DELETE_ELEM, &attr, sizeof(attr));
return libbpf_err_errno(ret);
}
int bpf_map_get_next_key(int fd, const void *key, void *next_key)
{
union bpf_attr attr;
@@ -816,7 +658,7 @@ int bpf_link_create(int prog_fd, int target_fd,
{
__u32 target_btf_id, iter_info_len;
union bpf_attr attr;
int fd;
int fd, err;
if (!OPTS_VALID(opts, bpf_link_create_opts))
return libbpf_err(-EINVAL);
@@ -853,6 +695,23 @@ int bpf_link_create(int prog_fd, int target_fd,
if (!OPTS_ZEROED(opts, perf_event))
return libbpf_err(-EINVAL);
break;
case BPF_TRACE_KPROBE_MULTI:
attr.link_create.kprobe_multi.flags = OPTS_GET(opts, kprobe_multi.flags, 0);
attr.link_create.kprobe_multi.cnt = OPTS_GET(opts, kprobe_multi.cnt, 0);
attr.link_create.kprobe_multi.syms = ptr_to_u64(OPTS_GET(opts, kprobe_multi.syms, 0));
attr.link_create.kprobe_multi.addrs = ptr_to_u64(OPTS_GET(opts, kprobe_multi.addrs, 0));
attr.link_create.kprobe_multi.cookies = ptr_to_u64(OPTS_GET(opts, kprobe_multi.cookies, 0));
if (!OPTS_ZEROED(opts, kprobe_multi))
return libbpf_err(-EINVAL);
break;
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
case BPF_LSM_MAC:
attr.link_create.tracing.cookie = OPTS_GET(opts, tracing.cookie, 0);
if (!OPTS_ZEROED(opts, tracing))
return libbpf_err(-EINVAL);
break;
default:
if (!OPTS_ZEROED(opts, flags))
return libbpf_err(-EINVAL);
@@ -860,7 +719,37 @@ int bpf_link_create(int prog_fd, int target_fd,
}
proceed:
fd = sys_bpf_fd(BPF_LINK_CREATE, &attr, sizeof(attr));
return libbpf_err_errno(fd);
if (fd >= 0)
return fd;
/* we'll get EINVAL if LINK_CREATE doesn't support attaching fentry
* and other similar programs
*/
err = -errno;
if (err != -EINVAL)
return libbpf_err(err);
/* if user used features not supported by
* BPF_RAW_TRACEPOINT_OPEN command, then just give up immediately
*/
if (attr.link_create.target_fd || attr.link_create.target_btf_id)
return libbpf_err(err);
if (!OPTS_ZEROED(opts, sz))
return libbpf_err(err);
/* otherwise, for few select kinds of programs that can be
* attached using BPF_RAW_TRACEPOINT_OPEN command, try that as
* a fallback for older kernels
*/
switch (attach_type) {
case BPF_TRACE_RAW_TP:
case BPF_LSM_MAC:
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
case BPF_MODIFY_RETURN:
return bpf_raw_tracepoint_open(NULL, prog_fd);
default:
return libbpf_err(err);
}
}
int bpf_link_detach(int link_fd)
@@ -906,80 +795,48 @@ int bpf_iter_create(int link_fd)
return libbpf_err_errno(fd);
}
int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
__u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
int bpf_prog_query_opts(int target_fd,
enum bpf_attach_type type,
struct bpf_prog_query_opts *opts)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.query.target_fd = target_fd;
attr.query.attach_type = type;
attr.query.query_flags = query_flags;
attr.query.prog_cnt = *prog_cnt;
attr.query.prog_ids = ptr_to_u64(prog_ids);
ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
if (attach_flags)
*attach_flags = attr.query.attach_flags;
*prog_cnt = attr.query.prog_cnt;
return libbpf_err_errno(ret);
}
int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
void *data_out, __u32 *size_out, __u32 *retval,
__u32 *duration)
{
union bpf_attr attr;
int ret;
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
attr.test.data_in = ptr_to_u64(data);
attr.test.data_out = ptr_to_u64(data_out);
attr.test.data_size_in = size;
attr.test.repeat = repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
if (size_out)
*size_out = attr.test.data_size_out;
if (retval)
*retval = attr.test.retval;
if (duration)
*duration = attr.test.duration;
return libbpf_err_errno(ret);
}
int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
{
union bpf_attr attr;
int ret;
if (!test_attr->data_out && test_attr->data_size_out > 0)
if (!OPTS_VALID(opts, bpf_prog_query_opts))
return libbpf_err(-EINVAL);
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = test_attr->prog_fd;
attr.test.data_in = ptr_to_u64(test_attr->data_in);
attr.test.data_out = ptr_to_u64(test_attr->data_out);
attr.test.data_size_in = test_attr->data_size_in;
attr.test.data_size_out = test_attr->data_size_out;
attr.test.ctx_in = ptr_to_u64(test_attr->ctx_in);
attr.test.ctx_out = ptr_to_u64(test_attr->ctx_out);
attr.test.ctx_size_in = test_attr->ctx_size_in;
attr.test.ctx_size_out = test_attr->ctx_size_out;
attr.test.repeat = test_attr->repeat;
ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
attr.query.target_fd = target_fd;
attr.query.attach_type = type;
attr.query.query_flags = OPTS_GET(opts, query_flags, 0);
attr.query.prog_cnt = OPTS_GET(opts, prog_cnt, 0);
attr.query.prog_ids = ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL));
test_attr->data_size_out = attr.test.data_size_out;
test_attr->ctx_size_out = attr.test.ctx_size_out;
test_attr->retval = attr.test.retval;
test_attr->duration = attr.test.duration;
ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
OPTS_SET(opts, attach_flags, attr.query.attach_flags);
OPTS_SET(opts, prog_cnt, attr.query.prog_cnt);
return libbpf_err_errno(ret);
}
int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
__u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
{
LIBBPF_OPTS(bpf_prog_query_opts, opts);
int ret;
opts.query_flags = query_flags;
opts.prog_ids = prog_ids;
opts.prog_cnt = *prog_cnt;
ret = bpf_prog_query_opts(target_fd, type, &opts);
if (attach_flags)
*attach_flags = opts.attach_flags;
*prog_cnt = opts.prog_cnt;
return libbpf_err_errno(ret);
}
@@ -994,6 +851,7 @@ int bpf_prog_test_run_opts(int prog_fd, struct bpf_test_run_opts *opts)
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
attr.test.batch_size = OPTS_GET(opts, batch_size, 0);
attr.test.cpu = OPTS_GET(opts, cpu, 0);
attr.test.flags = OPTS_GET(opts, flags, 0);
attr.test.repeat = OPTS_GET(opts, repeat, 0);
@@ -1179,27 +1037,6 @@ int bpf_btf_load(const void *btf_data, size_t btf_size, const struct bpf_btf_loa
return libbpf_err_errno(fd);
}
int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size, bool do_log)
{
LIBBPF_OPTS(bpf_btf_load_opts, opts);
int fd;
retry:
if (do_log && log_buf && log_buf_size) {
opts.log_buf = log_buf;
opts.log_size = log_buf_size;
opts.log_level = 1;
}
fd = bpf_btf_load(btf, btf_size, &opts);
if (fd < 0 && !do_log && log_buf && log_buf_size) {
do_log = true;
goto retry;
}
return libbpf_err_errno(fd);
}
int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, __u32 *buf_len,
__u32 *prog_id, __u32 *fd_type, __u64 *probe_offset,
__u64 *probe_addr)

156
src/bpf.h
View File

@@ -61,48 +61,6 @@ LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
__u32 max_entries,
const struct bpf_map_create_opts *opts);
struct bpf_create_map_attr {
const char *name;
enum bpf_map_type map_type;
__u32 map_flags;
__u32 key_size;
__u32 value_size;
__u32 max_entries;
__u32 numa_node;
__u32 btf_fd;
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 map_ifindex;
union {
__u32 inner_map_fd;
__u32 btf_vmlinux_value_type_id;
};
};
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_node(enum bpf_map_type map_type, const char *name,
int key_size, int value_size,
int max_entries, __u32 map_flags, int node);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_name(enum bpf_map_type map_type, const char *name,
int key_size, int value_size,
int max_entries, __u32 map_flags);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map(enum bpf_map_type map_type, int key_size,
int value_size, int max_entries, __u32 map_flags);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_in_map_node(enum bpf_map_type map_type,
const char *name, int key_size,
int inner_map_fd, int max_entries,
__u32 map_flags, int node);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_in_map(enum bpf_map_type map_type,
const char *name, int key_size,
int inner_map_fd, int max_entries,
__u32 map_flags);
struct bpf_prog_load_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
@@ -145,54 +103,6 @@ LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts);
/* this "specialization" should go away in libbpf 1.0 */
LIBBPF_API int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts);
/* This is an elaborate way to not conflict with deprecated bpf_prog_load()
* API, defined in libbpf.h. Once we hit libbpf 1.0, all this will be gone.
* With this approach, if someone is calling bpf_prog_load() with
* 4 arguments, they will use the deprecated API, which keeps backwards
* compatibility (both source code and binary). If bpf_prog_load() is called
* with 6 arguments, though, it gets redirected to __bpf_prog_load.
* So looking forward to libbpf 1.0 when this hack will be gone and
* __bpf_prog_load() will be called just bpf_prog_load().
*/
#ifndef bpf_prog_load
#define bpf_prog_load(...) ___libbpf_overload(___bpf_prog_load, __VA_ARGS__)
#define ___bpf_prog_load4(file, type, pobj, prog_fd) \
bpf_prog_load_deprecated(file, type, pobj, prog_fd)
#define ___bpf_prog_load6(prog_type, prog_name, license, insns, insn_cnt, opts) \
bpf_prog_load(prog_type, prog_name, license, insns, insn_cnt, opts)
#endif /* bpf_prog_load */
struct bpf_load_program_attr {
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
const char *name;
const struct bpf_insn *insns;
size_t insns_cnt;
const char *license;
union {
__u32 kern_version;
__u32 attach_prog_fd;
};
union {
__u32 prog_ifindex;
__u32 attach_btf_id;
};
__u32 prog_btf_fd;
__u32 func_info_rec_size;
const void *func_info;
__u32 func_info_cnt;
__u32 line_info_rec_size;
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
__u32 prog_flags;
};
/* Flags to direct loading requirements */
#define MAPS_RELAX_COMPAT 0x01
@@ -200,22 +110,6 @@ struct bpf_load_program_attr {
/* Recommended log buffer size */
#define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_load_program(enum bpf_prog_type type,
const struct bpf_insn *insns, size_t insns_cnt,
const char *license, __u32 kern_version,
char *log_buf, size_t log_buf_sz);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,
const struct bpf_insn *insns,
size_t insns_cnt, __u32 prog_flags,
const char *license, __u32 kern_version,
char *log_buf, size_t log_buf_sz,
int log_level);
struct bpf_btf_load_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
@@ -229,10 +123,6 @@ struct bpf_btf_load_opts {
LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
const struct bpf_btf_load_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_btf_load() instead")
LIBBPF_API int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf,
__u32 log_buf_size, bool do_log);
LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,
__u64 flags);
@@ -244,6 +134,7 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
LIBBPF_API int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key,
void *value, __u64 flags);
LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
LIBBPF_API int bpf_map_delete_elem_flags(int fd, const void *key, __u64 flags);
LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
@@ -394,10 +285,6 @@ LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_prog_attach_opts() instead")
LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
enum bpf_attach_type type);
@@ -413,10 +300,20 @@ struct bpf_link_create_opts {
struct {
__u64 bpf_cookie;
} perf_event;
struct {
__u32 flags;
__u32 cnt;
const char **syms;
const unsigned long *addrs;
const __u64 *cookies;
} kprobe_multi;
struct {
__u64 cookie;
} tracing;
};
size_t :0;
};
#define bpf_link_create_opts__last_field perf_event
#define bpf_link_create_opts__last_field kprobe_multi.cookies
LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
@@ -453,17 +350,6 @@ struct bpf_prog_test_run_attr {
* out: length of cxt_out */
};
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr);
/*
* bpf_prog_test_run does not check that data_out is large enough. Consider
* using bpf_prog_test_run_opts instead.
*/
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
__u32 size, void *data_out, __u32 *size_out,
__u32 *retval, __u32 *duration);
LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
@@ -473,9 +359,24 @@ LIBBPF_API int bpf_map_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
struct bpf_prog_query_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 query_flags;
__u32 attach_flags; /* output argument */
__u32 *prog_ids;
__u32 prog_cnt; /* input+output argument */
__u32 *prog_attach_flags;
};
#define bpf_prog_query_opts__last_field prog_attach_flags
LIBBPF_API int bpf_prog_query_opts(int target_fd,
enum bpf_attach_type type,
struct bpf_prog_query_opts *opts);
LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
__u32 query_flags, __u32 *attach_flags,
__u32 *prog_ids, __u32 *prog_cnt);
LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
__u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
@@ -512,8 +413,9 @@ struct bpf_test_run_opts {
__u32 duration; /* out: average per repetition in ns */
__u32 flags;
__u32 cpu;
__u32 batch_size;
};
#define bpf_test_run_opts__last_field cpu
#define bpf_test_run_opts__last_field batch_size
LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
struct bpf_test_run_opts *opts);

View File

@@ -110,21 +110,50 @@ enum bpf_enum_value_kind {
val; \
})
#define ___bpf_field_ref1(field) (field)
#define ___bpf_field_ref2(type, field) (((typeof(type) *)0)->field)
#define ___bpf_field_ref(args...) \
___bpf_apply(___bpf_field_ref, ___bpf_narg(args))(args)
/*
* Convenience macro to check that field actually exists in target kernel's.
* Returns:
* 1, if matching field is present in target kernel;
* 0, if no matching field found.
*
* Supports two forms:
* - field reference through variable access:
* bpf_core_field_exists(p->my_field);
* - field reference through type and field names:
* bpf_core_field_exists(struct my_type, my_field).
*/
#define bpf_core_field_exists(field) \
__builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
#define bpf_core_field_exists(field...) \
__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_EXISTS)
/*
* Convenience macro to get the byte size of a field. Works for integers,
* struct/unions, pointers, arrays, and enums.
*
* Supports two forms:
* - field reference through variable access:
* bpf_core_field_size(p->my_field);
* - field reference through type and field names:
* bpf_core_field_size(struct my_type, my_field).
*/
#define bpf_core_field_size(field) \
__builtin_preserve_field_info(field, BPF_FIELD_BYTE_SIZE)
#define bpf_core_field_size(field...) \
__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_BYTE_SIZE)
/*
* Convenience macro to get field's byte offset.
*
* Supports two forms:
* - field reference through variable access:
* bpf_core_field_offset(p->my_field);
* - field reference through type and field names:
* bpf_core_field_offset(struct my_type, my_field).
*/
#define bpf_core_field_offset(field...) \
__builtin_preserve_field_info(___bpf_field_ref(field), BPF_FIELD_BYTE_OFFSET)
/*
* Convenience macro to get BTF type ID of a specified type, using a local BTF

View File

@@ -38,6 +38,10 @@ struct inode;
struct socket;
struct file;
struct bpf_timer;
struct mptcp_sock;
struct bpf_dynptr;
struct iphdr;
struct ipv6hdr;
/*
* bpf_map_lookup_elem
@@ -961,8 +965,8 @@ static long (*bpf_probe_write_user)(void *dst, const void *src, __u32 len) = (vo
* Returns
* The return value depends on the result of the test, and can be:
*
* * 0, if current task belongs to the cgroup2.
* * 1, if current task does not belong to the cgroup2.
* * 1, if current task belongs to the cgroup2.
* * 0, if current task does not belong to the cgroup2.
* * A negative error code, if an error occurred.
*/
static long (*bpf_current_task_under_cgroup)(void *map, __u32 index) = (void *) 37;
@@ -1752,8 +1756,8 @@ static long (*bpf_skb_get_xfrm_state)(struct __sk_buff *skb, __u32 index, struct
* # sysctl kernel.perf_event_max_stack=<new value>
*
* Returns
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
* The non-negative copied *buf* length equal to or less than
* *size* on success, or a negative error in case of failure.
*/
static long (*bpf_get_stack)(void *ctx, void *buf, __u32 size, __u64 flags) = (void *) 67;
@@ -2461,10 +2465,11 @@ static struct bpf_sock *(*bpf_skc_lookup_tcp)(void *ctx, struct bpf_sock_tuple *
*
* *iph* points to the start of the IPv4 or IPv6 header, while
* *iph_len* contains **sizeof**\ (**struct iphdr**) or
* **sizeof**\ (**struct ip6hdr**).
* **sizeof**\ (**struct ipv6hdr**).
*
* *th* points to the start of the TCP header, while *th_len*
* contains **sizeof**\ (**struct tcphdr**).
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
*
* Returns
* 0 if *iph* and *th* are a valid SYN cookie ACK, or a negative
@@ -2687,10 +2692,11 @@ static long (*bpf_send_signal)(__u32 sig) = (void *) 109;
*
* *iph* points to the start of the IPv4 or IPv6 header, while
* *iph_len* contains **sizeof**\ (**struct iphdr**) or
* **sizeof**\ (**struct ip6hdr**).
* **sizeof**\ (**struct ipv6hdr**).
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header.
* contains the length of the TCP header with options (at least
* **sizeof**\ (**struct tcphdr**)).
*
* Returns
* On success, lower 32 bits hold the generated SYN cookie in
@@ -3305,8 +3311,8 @@ static struct udp6_sock *(*bpf_skc_to_udp6_sock)(void *sk) = (void *) 140;
* # sysctl kernel.perf_event_max_stack=<new value>
*
* Returns
* A non-negative value equal to or less than *size* on success,
* or a negative error in case of failure.
* The non-negative copied *buf* length equal to or less than
* *size* on success, or a negative error in case of failure.
*/
static long (*bpf_get_task_stack)(struct task_struct *task, void *buf, __u32 size, __u64 flags) = (void *) 141;
@@ -4295,4 +4301,278 @@ static long (*bpf_xdp_store_bytes)(struct xdp_md *xdp_md, __u32 offset, void *bu
*/
static long (*bpf_copy_from_user_task)(void *dst, __u32 size, const void *user_ptr, struct task_struct *tsk, __u64 flags) = (void *) 191;
/*
* bpf_skb_set_tstamp
*
* Change the __sk_buff->tstamp_type to *tstamp_type*
* and set *tstamp* to the __sk_buff->tstamp together.
*
* If there is no need to change the __sk_buff->tstamp_type,
* the tstamp value can be directly written to __sk_buff->tstamp
* instead.
*
* BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that
* will be kept during bpf_redirect_*(). A non zero
* *tstamp* must be used with the BPF_SKB_TSTAMP_DELIVERY_MONO
* *tstamp_type*.
*
* A BPF_SKB_TSTAMP_UNSPEC *tstamp_type* can only be used
* with a zero *tstamp*.
*
* Only IPv4 and IPv6 skb->protocol are supported.
*
* This function is most useful when it needs to set a
* mono delivery time to __sk_buff->tstamp and then
* bpf_redirect_*() to the egress of an iface. For example,
* changing the (rcv) timestamp in __sk_buff->tstamp at
* ingress to a mono delivery time and then bpf_redirect_*()
* to sch_fq@phy-dev.
*
* Returns
* 0 on success.
* **-EINVAL** for invalid input
* **-EOPNOTSUPP** for unsupported protocol
*/
static long (*bpf_skb_set_tstamp)(struct __sk_buff *skb, __u64 tstamp, __u32 tstamp_type) = (void *) 192;
/*
* bpf_ima_file_hash
*
* Returns a calculated IMA hash of the *file*.
* If the hash is larger than *size*, then only *size*
* bytes will be copied to *dst*
*
* Returns
* The **hash_algo** is returned on success,
* **-EOPNOTSUP** if the hash calculation failed or **-EINVAL** if
* invalid arguments are passed.
*/
static long (*bpf_ima_file_hash)(struct file *file, void *dst, __u32 size) = (void *) 193;
/*
* bpf_kptr_xchg
*
* Exchange kptr at pointer *map_value* with *ptr*, and return the
* old value. *ptr* can be NULL, otherwise it must be a referenced
* pointer which will be released when this helper is called.
*
* Returns
* The old value of kptr (which can be NULL). The returned pointer
* if not NULL, is a reference which must be released using its
* corresponding release function, or moved into a BPF map before
* program exit.
*/
static void *(*bpf_kptr_xchg)(void *map_value, void *ptr) = (void *) 194;
/*
* bpf_map_lookup_percpu_elem
*
* Perform a lookup in *percpu map* for an entry associated to
* *key* on *cpu*.
*
* Returns
* Map value associated to *key* on *cpu*, or **NULL** if no entry
* was found or *cpu* is invalid.
*/
static void *(*bpf_map_lookup_percpu_elem)(void *map, const void *key, __u32 cpu) = (void *) 195;
/*
* bpf_skc_to_mptcp_sock
*
* Dynamically cast a *sk* pointer to a *mptcp_sock* pointer.
*
* Returns
* *sk* if casting is valid, or **NULL** otherwise.
*/
static struct mptcp_sock *(*bpf_skc_to_mptcp_sock)(void *sk) = (void *) 196;
/*
* bpf_dynptr_from_mem
*
* Get a dynptr to local memory *data*.
*
* *data* must be a ptr to a map value.
* The maximum *size* supported is DYNPTR_MAX_SIZE.
* *flags* is currently unused.
*
* Returns
* 0 on success, -E2BIG if the size exceeds DYNPTR_MAX_SIZE,
* -EINVAL if flags is not 0.
*/
static long (*bpf_dynptr_from_mem)(void *data, __u32 size, __u64 flags, struct bpf_dynptr *ptr) = (void *) 197;
/*
* bpf_ringbuf_reserve_dynptr
*
* Reserve *size* bytes of payload in a ring buffer *ringbuf*
* through the dynptr interface. *flags* must be 0.
*
* Please note that a corresponding bpf_ringbuf_submit_dynptr or
* bpf_ringbuf_discard_dynptr must be called on *ptr*, even if the
* reservation fails. This is enforced by the verifier.
*
* Returns
* 0 on success, or a negative error in case of failure.
*/
static long (*bpf_ringbuf_reserve_dynptr)(void *ringbuf, __u32 size, __u64 flags, struct bpf_dynptr *ptr) = (void *) 198;
/*
* bpf_ringbuf_submit_dynptr
*
* Submit reserved ring buffer sample, pointed to by *data*,
* through the dynptr interface. This is a no-op if the dynptr is
* invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_submit'.
*
* Returns
* Nothing. Always succeeds.
*/
static void (*bpf_ringbuf_submit_dynptr)(struct bpf_dynptr *ptr, __u64 flags) = (void *) 199;
/*
* bpf_ringbuf_discard_dynptr
*
* Discard reserved ring buffer sample through the dynptr
* interface. This is a no-op if the dynptr is invalid/null.
*
* For more information on *flags*, please see
* 'bpf_ringbuf_discard'.
*
* Returns
* Nothing. Always succeeds.
*/
static void (*bpf_ringbuf_discard_dynptr)(struct bpf_dynptr *ptr, __u64 flags) = (void *) 200;
/*
* bpf_dynptr_read
*
* Read *len* bytes from *src* into *dst*, starting from *offset*
* into *src*.
*
* Returns
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *src*'s data, -EINVAL if *src* is an invalid dynptr.
*/
static long (*bpf_dynptr_read)(void *dst, __u32 len, struct bpf_dynptr *src, __u32 offset) = (void *) 201;
/*
* bpf_dynptr_write
*
* Write *len* bytes from *src* into *dst*, starting from *offset*
* into *dst*.
*
* Returns
* 0 on success, -E2BIG if *offset* + *len* exceeds the length
* of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
* is a read-only dynptr.
*/
static long (*bpf_dynptr_write)(struct bpf_dynptr *dst, __u32 offset, void *src, __u32 len) = (void *) 202;
/*
* bpf_dynptr_data
*
* Get a pointer to the underlying dynptr data.
*
* *len* must be a statically known value. The returned data slice
* is invalidated whenever the dynptr is invalidated.
*
* Returns
* Pointer to the underlying dynptr data, NULL if the dynptr is
* read-only, if the dynptr is invalid, or if the offset and length
* is out of bounds.
*/
static void *(*bpf_dynptr_data)(struct bpf_dynptr *ptr, __u32 offset, __u32 len) = (void *) 203;
/*
* bpf_tcp_raw_gen_syncookie_ipv4
*
* Try to issue a SYN cookie for the packet with corresponding
* IPv4/TCP headers, *iph* and *th*, without depending on a
* listening socket.
*
* *iph* points to the IPv4 header.
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
*
* Returns
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
* and the top 16 bits are unused.
*
* On failure, the returned value is one of the following:
*
* **-EINVAL** if *th_len* is invalid.
*/
static __s64 (*bpf_tcp_raw_gen_syncookie_ipv4)(struct iphdr *iph, struct tcphdr *th, __u32 th_len) = (void *) 204;
/*
* bpf_tcp_raw_gen_syncookie_ipv6
*
* Try to issue a SYN cookie for the packet with corresponding
* IPv6/TCP headers, *iph* and *th*, without depending on a
* listening socket.
*
* *iph* points to the IPv6 header.
*
* *th* points to the start of the TCP header, while *th_len*
* contains the length of the TCP header (at least
* **sizeof**\ (**struct tcphdr**)).
*
* Returns
* On success, lower 32 bits hold the generated SYN cookie in
* followed by 16 bits which hold the MSS value for that cookie,
* and the top 16 bits are unused.
*
* On failure, the returned value is one of the following:
*
* **-EINVAL** if *th_len* is invalid.
*
* **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
*/
static __s64 (*bpf_tcp_raw_gen_syncookie_ipv6)(struct ipv6hdr *iph, struct tcphdr *th, __u32 th_len) = (void *) 205;
/*
* bpf_tcp_raw_check_syncookie_ipv4
*
* Check whether *iph* and *th* contain a valid SYN cookie ACK
* without depending on a listening socket.
*
* *iph* points to the IPv4 header.
*
* *th* points to the TCP header.
*
* Returns
* 0 if *iph* and *th* are a valid SYN cookie ACK.
*
* On failure, the returned value is one of the following:
*
* **-EACCES** if the SYN cookie is not valid.
*/
static long (*bpf_tcp_raw_check_syncookie_ipv4)(struct iphdr *iph, struct tcphdr *th) = (void *) 206;
/*
* bpf_tcp_raw_check_syncookie_ipv6
*
* Check whether *iph* and *th* contain a valid SYN cookie ACK
* without depending on a listening socket.
*
* *iph* points to the IPv6 header.
*
* *th* points to the TCP header.
*
* Returns
* 0 if *iph* and *th* are a valid SYN cookie ACK.
*
* On failure, the returned value is one of the following:
*
* **-EACCES** if the SYN cookie is not valid.
*
* **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
*/
static long (*bpf_tcp_raw_check_syncookie_ipv6)(struct ipv6hdr *iph, struct tcphdr *th) = (void *) 207;

View File

@@ -75,6 +75,30 @@
})
#endif
/*
* Compiler (optimization) barrier.
*/
#ifndef barrier
#define barrier() asm volatile("" ::: "memory")
#endif
/* Variable-specific compiler (optimization) barrier. It's a no-op which makes
* compiler believe that there is some black box modification of a given
* variable and thus prevents compiler from making extra assumption about its
* value and potential simplifications and optimizations on this variable.
*
* E.g., compiler might often delay or even omit 32-bit to 64-bit casting of
* a variable, making some code patterns unverifiable. Putting barrier_var()
* in place will ensure that cast is performed before the barrier_var()
* invocation, because compiler has to pessimistically assume that embedded
* asm section might perform some extra operations on that variable.
*
* This is a variable-specific variant of more global barrier().
*/
#ifndef barrier_var
#define barrier_var(var) asm volatile("" : "=r"(var) : "0"(var))
#endif
/*
* Helper macro to throw a compilation error if __bpf_unreachable() gets
* built into the resulting code. This works given BPF back end does not
@@ -149,6 +173,8 @@ enum libbpf_tristate {
#define __kconfig __attribute__((section(".kconfig")))
#define __ksym __attribute__((section(".ksyms")))
#define __kptr __attribute__((btf_type_tag("kptr")))
#define __kptr_ref __attribute__((btf_type_tag("kptr_ref")))
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b

View File

@@ -27,6 +27,9 @@
#elif defined(__TARGET_ARCH_riscv)
#define bpf_target_riscv
#define bpf_target_defined
#elif defined(__TARGET_ARCH_arc)
#define bpf_target_arc
#define bpf_target_defined
#else
/* Fall back to what the compiler says */
@@ -54,6 +57,9 @@
#elif defined(__riscv) && __riscv_xlen == 64
#define bpf_target_riscv
#define bpf_target_defined
#elif defined(__arc__)
#define bpf_target_arc
#define bpf_target_defined
#endif /* no compiler target */
#endif
@@ -233,6 +239,23 @@ struct pt_regs___arm64 {
/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx
#elif defined(bpf_target_arc)
/* arc provides struct user_pt_regs instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x))
#define __PT_PARM1_REG scratch.r0
#define __PT_PARM2_REG scratch.r1
#define __PT_PARM3_REG scratch.r2
#define __PT_PARM4_REG scratch.r3
#define __PT_PARM5_REG scratch.r4
#define __PT_RET_REG scratch.blink
#define __PT_FP_REG __unsupported__
#define __PT_RC_REG scratch.r0
#define __PT_SP_REG scratch.sp
#define __PT_IP_REG scratch.ret
/* arc does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx
#endif
#if defined(bpf_target_defined)

427
src/btf.c
View File

@@ -130,7 +130,7 @@ static inline __u64 ptr_to_u64(const void *ptr)
/* Ensure given dynamically allocated memory region pointed to by *data* with
* capacity of *cap_cnt* elements each taking *elem_sz* bytes has enough
* memory to accomodate *add_cnt* new elements, assuming *cur_cnt* elements
* memory to accommodate *add_cnt* new elements, assuming *cur_cnt* elements
* are already used. At most *max_cnt* elements can be ever allocated.
* If necessary, memory is reallocated and all existing data is copied over,
* new pointer to the memory region is stored at *data, new memory region
@@ -305,6 +305,8 @@ static int btf_type_size(const struct btf_type *t)
return base_size + sizeof(__u32);
case BTF_KIND_ENUM:
return base_size + vlen * sizeof(struct btf_enum);
case BTF_KIND_ENUM64:
return base_size + vlen * sizeof(struct btf_enum64);
case BTF_KIND_ARRAY:
return base_size + sizeof(struct btf_array);
case BTF_KIND_STRUCT:
@@ -334,6 +336,7 @@ static void btf_bswap_type_base(struct btf_type *t)
static int btf_bswap_type_rest(struct btf_type *t)
{
struct btf_var_secinfo *v;
struct btf_enum64 *e64;
struct btf_member *m;
struct btf_array *a;
struct btf_param *p;
@@ -361,6 +364,13 @@ static int btf_bswap_type_rest(struct btf_type *t)
e->val = bswap_32(e->val);
}
return 0;
case BTF_KIND_ENUM64:
for (i = 0, e64 = btf_enum64(t); i < vlen; i++, e64++) {
e64->name_off = bswap_32(e64->name_off);
e64->val_lo32 = bswap_32(e64->val_lo32);
e64->val_hi32 = bswap_32(e64->val_hi32);
}
return 0;
case BTF_KIND_ARRAY:
a = btf_array(t);
a->type = bswap_32(a->type);
@@ -438,11 +448,6 @@ static int btf_parse_type_sec(struct btf *btf)
return 0;
}
__u32 btf__get_nr_types(const struct btf *btf)
{
return btf->start_id + btf->nr_types - 1;
}
__u32 btf__type_cnt(const struct btf *btf)
{
return btf->start_id + btf->nr_types;
@@ -472,9 +477,22 @@ const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id)
static int determine_ptr_size(const struct btf *btf)
{
static const char * const long_aliases[] = {
"long",
"long int",
"int long",
"unsigned long",
"long unsigned",
"unsigned long int",
"unsigned int long",
"long unsigned int",
"long int unsigned",
"int unsigned long",
"int long unsigned",
};
const struct btf_type *t;
const char *name;
int i, n;
int i, j, n;
if (btf->base_btf && btf->base_btf->ptr_sz > 0)
return btf->base_btf->ptr_sz;
@@ -485,15 +503,16 @@ static int determine_ptr_size(const struct btf *btf)
if (!btf_is_int(t))
continue;
if (t->size != 4 && t->size != 8)
continue;
name = btf__name_by_offset(btf, t->name_off);
if (!name)
continue;
if (strcmp(name, "long int") == 0 ||
strcmp(name, "long unsigned int") == 0) {
if (t->size != 4 && t->size != 8)
continue;
return t->size;
for (j = 0; j < ARRAY_SIZE(long_aliases); j++) {
if (strcmp(name, long_aliases[j]) == 0)
return t->size;
}
}
@@ -597,6 +616,7 @@ __s64 btf__resolve_size(const struct btf *btf, __u32 type_id)
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_DATASEC:
case BTF_KIND_FLOAT:
size = t->size;
@@ -644,6 +664,7 @@ int btf__align_of(const struct btf *btf, __u32 id)
switch (kind) {
case BTF_KIND_INT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_FLOAT:
return min(btf_ptr_sz(btf), (size_t)t->size);
case BTF_KIND_PTR:
@@ -1382,92 +1403,6 @@ struct btf *btf__load_from_kernel_by_id(__u32 id)
return btf__load_from_kernel_by_id_split(id, NULL);
}
int btf__get_from_id(__u32 id, struct btf **btf)
{
struct btf *res;
int err;
*btf = NULL;
res = btf__load_from_kernel_by_id(id);
err = libbpf_get_error(res);
if (err)
return libbpf_err(err);
*btf = res;
return 0;
}
int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
__u32 expected_key_size, __u32 expected_value_size,
__u32 *key_type_id, __u32 *value_type_id)
{
const struct btf_type *container_type;
const struct btf_member *key, *value;
const size_t max_name = 256;
char container_name[max_name];
__s64 key_size, value_size;
__s32 container_id;
if (snprintf(container_name, max_name, "____btf_map_%s", map_name) == max_name) {
pr_warn("map:%s length of '____btf_map_%s' is too long\n",
map_name, map_name);
return libbpf_err(-EINVAL);
}
container_id = btf__find_by_name(btf, container_name);
if (container_id < 0) {
pr_debug("map:%s container_name:%s cannot be found in BTF. Missing BPF_ANNOTATE_KV_PAIR?\n",
map_name, container_name);
return libbpf_err(container_id);
}
container_type = btf__type_by_id(btf, container_id);
if (!container_type) {
pr_warn("map:%s cannot find BTF type for container_id:%u\n",
map_name, container_id);
return libbpf_err(-EINVAL);
}
if (!btf_is_struct(container_type) || btf_vlen(container_type) < 2) {
pr_warn("map:%s container_name:%s is an invalid container struct\n",
map_name, container_name);
return libbpf_err(-EINVAL);
}
key = btf_members(container_type);
value = key + 1;
key_size = btf__resolve_size(btf, key->type);
if (key_size < 0) {
pr_warn("map:%s invalid BTF key_type_size\n", map_name);
return libbpf_err(key_size);
}
if (expected_key_size != key_size) {
pr_warn("map:%s btf_key_type_size:%u != map_def_key_size:%u\n",
map_name, (__u32)key_size, expected_key_size);
return libbpf_err(-EINVAL);
}
value_size = btf__resolve_size(btf, value->type);
if (value_size < 0) {
pr_warn("map:%s invalid BTF value_type_size\n", map_name);
return libbpf_err(value_size);
}
if (expected_value_size != value_size) {
pr_warn("map:%s btf_value_type_size:%u != map_def_value_size:%u\n",
map_name, (__u32)value_size, expected_value_size);
return libbpf_err(-EINVAL);
}
*key_type_id = key->type;
*value_type_id = value->type;
return 0;
}
static void btf_invalidate_raw_data(struct btf *btf)
{
if (btf->raw_data) {
@@ -2115,20 +2050,8 @@ int btf__add_field(struct btf *btf, const char *name, int type_id,
return 0;
}
/*
* Append new BTF_KIND_ENUM type with:
* - *name* - name of the enum, can be NULL or empty for anonymous enums;
* - *byte_sz* - size of the enum, in bytes.
*
* Enum initially has no enum values in it (and corresponds to enum forward
* declaration). Enumerator values can be added by btf__add_enum_value()
* immediately after btf__add_enum() succeeds.
*
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
static int btf_add_enum_common(struct btf *btf, const char *name, __u32 byte_sz,
bool is_signed, __u8 kind)
{
struct btf_type *t;
int sz, name_off = 0;
@@ -2153,12 +2076,34 @@ int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
/* start out with vlen=0; it will be adjusted when adding enum values */
t->name_off = name_off;
t->info = btf_type_info(BTF_KIND_ENUM, 0, 0);
t->info = btf_type_info(kind, 0, is_signed);
t->size = byte_sz;
return btf_commit_type(btf, sz);
}
/*
* Append new BTF_KIND_ENUM type with:
* - *name* - name of the enum, can be NULL or empty for anonymous enums;
* - *byte_sz* - size of the enum, in bytes.
*
* Enum initially has no enum values in it (and corresponds to enum forward
* declaration). Enumerator values can be added by btf__add_enum_value()
* immediately after btf__add_enum() succeeds.
*
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
{
/*
* set the signedness to be unsigned, it will change to signed
* if any later enumerator is negative.
*/
return btf_add_enum_common(btf, name, byte_sz, false, BTF_KIND_ENUM);
}
/*
* Append new enum value for the current ENUM type with:
* - *name* - name of the enumerator value, can't be NULL or empty;
@@ -2206,6 +2151,82 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value)
t = btf_last_type(btf);
btf_type_inc_vlen(t);
/* if negative value, set signedness to signed */
if (value < 0)
t->info = btf_type_info(btf_kind(t), btf_vlen(t), true);
btf->hdr->type_len += sz;
btf->hdr->str_off += sz;
return 0;
}
/*
* Append new BTF_KIND_ENUM64 type with:
* - *name* - name of the enum, can be NULL or empty for anonymous enums;
* - *byte_sz* - size of the enum, in bytes.
* - *is_signed* - whether the enum values are signed or not;
*
* Enum initially has no enum values in it (and corresponds to enum forward
* declaration). Enumerator values can be added by btf__add_enum64_value()
* immediately after btf__add_enum64() succeeds.
*
* Returns:
* - >0, type ID of newly added BTF type;
* - <0, on error.
*/
int btf__add_enum64(struct btf *btf, const char *name, __u32 byte_sz,
bool is_signed)
{
return btf_add_enum_common(btf, name, byte_sz, is_signed,
BTF_KIND_ENUM64);
}
/*
* Append new enum value for the current ENUM64 type with:
* - *name* - name of the enumerator value, can't be NULL or empty;
* - *value* - integer value corresponding to enum value *name*;
* Returns:
* - 0, on success;
* - <0, on error.
*/
int btf__add_enum64_value(struct btf *btf, const char *name, __u64 value)
{
struct btf_enum64 *v;
struct btf_type *t;
int sz, name_off;
/* last type should be BTF_KIND_ENUM64 */
if (btf->nr_types == 0)
return libbpf_err(-EINVAL);
t = btf_last_type(btf);
if (!btf_is_enum64(t))
return libbpf_err(-EINVAL);
/* non-empty name */
if (!name || !name[0])
return libbpf_err(-EINVAL);
/* decompose and invalidate raw data */
if (btf_ensure_modifiable(btf))
return libbpf_err(-ENOMEM);
sz = sizeof(struct btf_enum64);
v = btf_add_type_mem(btf, sz);
if (!v)
return libbpf_err(-ENOMEM);
name_off = btf__add_str(btf, name);
if (name_off < 0)
return name_off;
v->name_off = name_off;
v->val_lo32 = (__u32)value;
v->val_hi32 = value >> 32;
/* update parent type's vlen */
t = btf_last_type(btf);
btf_type_inc_vlen(t);
btf->hdr->type_len += sz;
btf->hdr->str_off += sz;
return 0;
@@ -2626,6 +2647,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
const struct btf_ext_info_sec *sinfo;
struct btf_ext_info *ext_info;
__u32 info_left, record_size;
size_t sec_cnt = 0;
/* The start of the info sec (including the __u32 record_size). */
void *info;
@@ -2689,8 +2711,7 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
return -EINVAL;
}
total_record_size = sec_hdrlen +
(__u64)num_records * record_size;
total_record_size = sec_hdrlen + (__u64)num_records * record_size;
if (info_left < total_record_size) {
pr_debug("%s section has incorrect num_records in .BTF.ext\n",
ext_sec->desc);
@@ -2699,12 +2720,14 @@ static int btf_ext_setup_info(struct btf_ext *btf_ext,
info_left -= total_record_size;
sinfo = (void *)sinfo + total_record_size;
sec_cnt++;
}
ext_info = ext_sec->ext_info;
ext_info->len = ext_sec->len - sizeof(__u32);
ext_info->rec_size = record_size;
ext_info->info = info + sizeof(__u32);
ext_info->sec_cnt = sec_cnt;
return 0;
}
@@ -2788,6 +2811,9 @@ void btf_ext__free(struct btf_ext *btf_ext)
{
if (IS_ERR_OR_NULL(btf_ext))
return;
free(btf_ext->func_info.sec_idxs);
free(btf_ext->line_info.sec_idxs);
free(btf_ext->core_relo_info.sec_idxs);
free(btf_ext->data);
free(btf_ext);
}
@@ -2826,10 +2852,8 @@ struct btf_ext *btf_ext__new(const __u8 *data, __u32 size)
if (err)
goto done;
if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len)) {
err = -EINVAL;
goto done;
}
if (btf_ext->hdr->hdr_len < offsetofend(struct btf_ext_header, core_relo_len))
goto done; /* skip core relos parsing */
err = btf_ext_setup_core_relos(btf_ext);
if (err)
@@ -2850,81 +2874,6 @@ const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size)
return btf_ext->data;
}
static int btf_ext_reloc_info(const struct btf *btf,
const struct btf_ext_info *ext_info,
const char *sec_name, __u32 insns_cnt,
void **info, __u32 *cnt)
{
__u32 sec_hdrlen = sizeof(struct btf_ext_info_sec);
__u32 i, record_size, existing_len, records_len;
struct btf_ext_info_sec *sinfo;
const char *info_sec_name;
__u64 remain_len;
void *data;
record_size = ext_info->rec_size;
sinfo = ext_info->info;
remain_len = ext_info->len;
while (remain_len > 0) {
records_len = sinfo->num_info * record_size;
info_sec_name = btf__name_by_offset(btf, sinfo->sec_name_off);
if (strcmp(info_sec_name, sec_name)) {
remain_len -= sec_hdrlen + records_len;
sinfo = (void *)sinfo + sec_hdrlen + records_len;
continue;
}
existing_len = (*cnt) * record_size;
data = realloc(*info, existing_len + records_len);
if (!data)
return libbpf_err(-ENOMEM);
memcpy(data + existing_len, sinfo->data, records_len);
/* adjust insn_off only, the rest data will be passed
* to the kernel.
*/
for (i = 0; i < sinfo->num_info; i++) {
__u32 *insn_off;
insn_off = data + existing_len + (i * record_size);
*insn_off = *insn_off / sizeof(struct bpf_insn) + insns_cnt;
}
*info = data;
*cnt += sinfo->num_info;
return 0;
}
return libbpf_err(-ENOENT);
}
int btf_ext__reloc_func_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **func_info, __u32 *cnt)
{
return btf_ext_reloc_info(btf, &btf_ext->func_info, sec_name,
insns_cnt, func_info, cnt);
}
int btf_ext__reloc_line_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **line_info, __u32 *cnt)
{
return btf_ext_reloc_info(btf, &btf_ext->line_info, sec_name,
insns_cnt, line_info, cnt);
}
__u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext)
{
return btf_ext->func_info.rec_size;
}
__u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext)
{
return btf_ext->line_info.rec_size;
}
struct btf_dedup;
static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts);
@@ -3074,9 +3023,7 @@ static int btf_dedup_remap_types(struct btf_dedup *d);
* deduplicating structs/unions is described in greater details in comments for
* `btf_dedup_is_equiv` function.
*/
DEFAULT_VERSION(btf__dedup_v0_6_0, btf__dedup, LIBBPF_0.6.0)
int btf__dedup_v0_6_0(struct btf *btf, const struct btf_dedup_opts *opts)
int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts)
{
struct btf_dedup *d;
int err;
@@ -3136,19 +3083,6 @@ done:
return libbpf_err(err);
}
COMPAT_VERSION(btf__dedup_deprecated, btf__dedup, LIBBPF_0.0.2)
int btf__dedup_deprecated(struct btf *btf, struct btf_ext *btf_ext, const void *unused_opts)
{
LIBBPF_OPTS(btf_dedup_opts, opts, .btf_ext = btf_ext);
if (unused_opts) {
pr_warn("please use new version of btf__dedup() that supports options\n");
return libbpf_err(-ENOTSUP);
}
return btf__dedup(btf, &opts);
}
#define BTF_UNPROCESSED_ID ((__u32)-1)
#define BTF_IN_PROGRESS_ID ((__u32)-2)
@@ -3467,7 +3401,7 @@ static bool btf_equal_int_tag(struct btf_type *t1, struct btf_type *t2)
return info1 == info2;
}
/* Calculate type signature hash of ENUM. */
/* Calculate type signature hash of ENUM/ENUM64. */
static long btf_hash_enum(struct btf_type *t)
{
long h;
@@ -3501,9 +3435,31 @@ static bool btf_equal_enum(struct btf_type *t1, struct btf_type *t2)
return true;
}
static bool btf_equal_enum64(struct btf_type *t1, struct btf_type *t2)
{
const struct btf_enum64 *m1, *m2;
__u16 vlen;
int i;
if (!btf_equal_common(t1, t2))
return false;
vlen = btf_vlen(t1);
m1 = btf_enum64(t1);
m2 = btf_enum64(t2);
for (i = 0; i < vlen; i++) {
if (m1->name_off != m2->name_off || m1->val_lo32 != m2->val_lo32 ||
m1->val_hi32 != m2->val_hi32)
return false;
m1++;
m2++;
}
return true;
}
static inline bool btf_is_enum_fwd(struct btf_type *t)
{
return btf_is_enum(t) && btf_vlen(t) == 0;
return btf_is_any_enum(t) && btf_vlen(t) == 0;
}
static bool btf_compat_enum(struct btf_type *t1, struct btf_type *t2)
@@ -3516,6 +3472,17 @@ static bool btf_compat_enum(struct btf_type *t1, struct btf_type *t2)
t1->size == t2->size;
}
static bool btf_compat_enum64(struct btf_type *t1, struct btf_type *t2)
{
if (!btf_is_enum_fwd(t1) && !btf_is_enum_fwd(t2))
return btf_equal_enum64(t1, t2);
/* ignore vlen when comparing */
return t1->name_off == t2->name_off &&
(t1->info & ~0xffff) == (t2->info & ~0xffff) &&
t1->size == t2->size;
}
/*
* Calculate type signature hash of STRUCT/UNION, ignoring referenced type IDs,
* as referenced type IDs equivalence is established separately during type
@@ -3728,6 +3695,7 @@ static int btf_dedup_prep(struct btf_dedup *d)
h = btf_hash_int_decl_tag(t);
break;
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
h = btf_hash_enum(t);
break;
case BTF_KIND_STRUCT:
@@ -3817,6 +3785,27 @@ static int btf_dedup_prim_type(struct btf_dedup *d, __u32 type_id)
}
break;
case BTF_KIND_ENUM64:
h = btf_hash_enum(t);
for_each_dedup_cand(d, hash_entry, h) {
cand_id = (__u32)(long)hash_entry->value;
cand = btf_type_by_id(d->btf, cand_id);
if (btf_equal_enum64(t, cand)) {
new_id = cand_id;
break;
}
if (btf_compat_enum64(t, cand)) {
if (btf_is_enum_fwd(t)) {
/* resolve fwd to full enum */
new_id = cand_id;
break;
}
/* resolve canonical enum fwd to full enum */
d->map[cand_id] = type_id;
}
}
break;
case BTF_KIND_FWD:
case BTF_KIND_FLOAT:
h = btf_hash_common(t);
@@ -4112,6 +4101,9 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
case BTF_KIND_ENUM:
return btf_compat_enum(cand_type, canon_type);
case BTF_KIND_ENUM64:
return btf_compat_enum64(cand_type, canon_type);
case BTF_KIND_FWD:
case BTF_KIND_FLOAT:
return btf_equal_common(cand_type, canon_type);
@@ -4714,6 +4706,7 @@ int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ct
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
return 0;
case BTF_KIND_FWD:
@@ -4808,6 +4801,16 @@ int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ct
}
break;
}
case BTF_KIND_ENUM64: {
struct btf_enum64 *m = btf_enum64(t);
for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
err = visit(&m->name_off, ctx);
if (err)
return err;
}
break;
}
case BTF_KIND_FUNC_PROTO: {
struct btf_param *m = btf_params(t);

118
src/btf.h
View File

@@ -120,20 +120,12 @@ LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id);
LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead")
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "intended for internal libbpf use only")
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead")
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API int btf__load_into_kernel(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
const char *type_name);
LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
const char *type_name, __u32 kind);
LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__type_cnt() instead; note that btf__get_nr_types() == btf__type_cnt() - 1")
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API __u32 btf__type_cnt(const struct btf *btf);
LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
@@ -150,29 +142,10 @@ LIBBPF_API void btf__set_fd(struct btf *btf, int fd);
LIBBPF_API const void *btf__raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_DEPRECATED_SINCE(0, 7, "this API is not necessary when BTF-defined maps are used")
LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
__u32 expected_key_size,
__u32 expected_value_size,
__u32 *key_type_id, __u32 *value_type_id);
LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);
LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
int btf_ext__reloc_func_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **func_info, __u32 *cnt);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
int btf_ext__reloc_line_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **line_info, __u32 *cnt);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info is deprecated; write custom func_info parsing to fetch rec_size")
__u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info is deprecated; write custom line_info parsing to fetch rec_size")
__u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
@@ -215,6 +188,8 @@ LIBBPF_API int btf__add_field(struct btf *btf, const char *name, int field_type_
/* enum construction APIs */
LIBBPF_API int btf__add_enum(struct btf *btf, const char *name, __u32 bytes_sz);
LIBBPF_API int btf__add_enum_value(struct btf *btf, const char *name, __s64 value);
LIBBPF_API int btf__add_enum64(struct btf *btf, const char *name, __u32 bytes_sz, bool is_signed);
LIBBPF_API int btf__add_enum64_value(struct btf *btf, const char *name, __u64 value);
enum btf_fwd_kind {
BTF_FWD_STRUCT = 0,
@@ -257,22 +232,12 @@ struct btf_dedup_opts {
LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
LIBBPF_API int btf__dedup_v0_6_0(struct btf *btf, const struct btf_dedup_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__dedup() instead")
LIBBPF_API int btf__dedup_deprecated(struct btf *btf, struct btf_ext *btf_ext, const void *opts);
#define btf__dedup(...) ___libbpf_overload(___btf_dedup, __VA_ARGS__)
#define ___btf_dedup3(btf, btf_ext, opts) btf__dedup_deprecated(btf, btf_ext, opts)
#define ___btf_dedup2(btf, opts) btf__dedup(btf, opts)
struct btf_dump;
struct btf_dump_opts {
union {
size_t sz;
void *ctx; /* DEPRECATED: will be gone in v1.0 */
};
size_t sz;
};
#define btf_dump_opts__last_field sz
typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);
@@ -281,51 +246,6 @@ LIBBPF_API struct btf_dump *btf_dump__new(const struct btf *btf,
void *ctx,
const struct btf_dump_opts *opts);
LIBBPF_API struct btf_dump *btf_dump__new_v0_6_0(const struct btf *btf,
btf_dump_printf_fn_t printf_fn,
void *ctx,
const struct btf_dump_opts *opts);
LIBBPF_API struct btf_dump *btf_dump__new_deprecated(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn);
/* Choose either btf_dump__new() or btf_dump__new_deprecated() based on the
* type of 4th argument. If it's btf_dump's print callback, use deprecated
* API; otherwise, choose the new btf_dump__new(). ___libbpf_override()
* doesn't work here because both variants have 4 input arguments.
*
* (void *) casts are necessary to avoid compilation warnings about type
* mismatches, because even though __builtin_choose_expr() only ever evaluates
* one side the other side still has to satisfy type constraints (this is
* compiler implementation limitation which might be lifted eventually,
* according to the documentation). So passing struct btf_ext in place of
* btf_dump_printf_fn_t would be generating compilation warning. Casting to
* void * avoids this issue.
*
* Also, two type compatibility checks for a function and function pointer are
* required because passing function reference into btf_dump__new() as
* btf_dump__new(..., my_callback, ...) and as btf_dump__new(...,
* &my_callback, ...) (not explicit ampersand in the latter case) actually
* differs as far as __builtin_types_compatible_p() is concerned. Thus two
* checks are combined to detect callback argument.
*
* The rest works just like in case of ___libbpf_override() usage with symbol
* versioning.
*
* C++ compilers don't support __builtin_types_compatible_p(), so at least
* don't screw up compilation for them and let C++ users pick btf_dump__new
* vs btf_dump__new_deprecated explicitly.
*/
#ifndef __cplusplus
#define btf_dump__new(a1, a2, a3, a4) __builtin_choose_expr( \
__builtin_types_compatible_p(typeof(a4), btf_dump_printf_fn_t) || \
__builtin_types_compatible_p(typeof(a4), void(void *, const char *, va_list)), \
btf_dump__new_deprecated((void *)a1, (void *)a2, (void *)a3, (void *)a4), \
btf_dump__new((void *)a1, (void *)a2, (void *)a3, (void *)a4))
#endif
LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
@@ -393,9 +313,10 @@ btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
#ifndef BTF_KIND_FLOAT
#define BTF_KIND_FLOAT 16 /* Floating point */
#endif
/* The kernel header switched to enums, so these two were never #defined */
/* The kernel header switched to enums, so the following were never #defined */
#define BTF_KIND_DECL_TAG 17 /* Decl Tag */
#define BTF_KIND_TYPE_TAG 18 /* Type Tag */
#define BTF_KIND_ENUM64 19 /* Enum for up-to 64bit values */
static inline __u16 btf_kind(const struct btf_type *t)
{
@@ -454,6 +375,11 @@ static inline bool btf_is_enum(const struct btf_type *t)
return btf_kind(t) == BTF_KIND_ENUM;
}
static inline bool btf_is_enum64(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_ENUM64;
}
static inline bool btf_is_fwd(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FWD;
@@ -524,6 +450,18 @@ static inline bool btf_is_type_tag(const struct btf_type *t)
return btf_kind(t) == BTF_KIND_TYPE_TAG;
}
static inline bool btf_is_any_enum(const struct btf_type *t)
{
return btf_is_enum(t) || btf_is_enum64(t);
}
static inline bool btf_kind_core_compat(const struct btf_type *t1,
const struct btf_type *t2)
{
return btf_kind(t1) == btf_kind(t2) ||
(btf_is_any_enum(t1) && btf_is_any_enum(t2));
}
static inline __u8 btf_int_encoding(const struct btf_type *t)
{
return BTF_INT_ENCODING(*(__u32 *)(t + 1));
@@ -549,6 +487,16 @@ static inline struct btf_enum *btf_enum(const struct btf_type *t)
return (struct btf_enum *)(t + 1);
}
static inline struct btf_enum64 *btf_enum64(const struct btf_type *t)
{
return (struct btf_enum64 *)(t + 1);
}
static inline __u64 btf_enum64_value(const struct btf_enum64 *e)
{
return ((__u64)e->val_hi32 << 32) | e->val_lo32;
}
static inline struct btf_member *btf_members(const struct btf_type *t)
{
return (struct btf_member *)(t + 1);

View File

@@ -144,15 +144,17 @@ static void btf_dump_printf(const struct btf_dump *d, const char *fmt, ...)
static int btf_dump_mark_referenced(struct btf_dump *d);
static int btf_dump_resize(struct btf_dump *d);
DEFAULT_VERSION(btf_dump__new_v0_6_0, btf_dump__new, LIBBPF_0.6.0)
struct btf_dump *btf_dump__new_v0_6_0(const struct btf *btf,
btf_dump_printf_fn_t printf_fn,
void *ctx,
const struct btf_dump_opts *opts)
struct btf_dump *btf_dump__new(const struct btf *btf,
btf_dump_printf_fn_t printf_fn,
void *ctx,
const struct btf_dump_opts *opts)
{
struct btf_dump *d;
int err;
if (!OPTS_VALID(opts, btf_dump_opts))
return libbpf_err_ptr(-EINVAL);
if (!printf_fn)
return libbpf_err_ptr(-EINVAL);
@@ -188,17 +190,6 @@ err:
return libbpf_err_ptr(err);
}
COMPAT_VERSION(btf_dump__new_deprecated, btf_dump__new, LIBBPF_0.0.4)
struct btf_dump *btf_dump__new_deprecated(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn)
{
if (!printf_fn)
return libbpf_err_ptr(-EINVAL);
return btf_dump__new_v0_6_0(btf, printf_fn, opts ? opts->ctx : NULL, opts);
}
static int btf_dump_resize(struct btf_dump *d)
{
int err, last_id = btf__type_cnt(d->btf) - 1;
@@ -318,6 +309,7 @@ static int btf_dump_mark_referenced(struct btf_dump *d)
switch (btf_kind(t)) {
case BTF_KIND_INT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_FWD:
case BTF_KIND_FLOAT:
break;
@@ -538,6 +530,7 @@ static int btf_dump_order_type(struct btf_dump *d, __u32 id, bool through_ptr)
return 1;
}
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_FWD:
/*
* non-anonymous or non-referenced enums are top-level
@@ -739,6 +732,7 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id)
tstate->emit_state = EMITTED;
break;
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
if (top_level_def) {
btf_dump_emit_enum_def(d, id, t, 0);
btf_dump_printf(d, ";\n\n");
@@ -989,38 +983,81 @@ static void btf_dump_emit_enum_fwd(struct btf_dump *d, __u32 id,
btf_dump_printf(d, "enum %s", btf_dump_type_name(d, id));
}
static void btf_dump_emit_enum32_val(struct btf_dump *d,
const struct btf_type *t,
int lvl, __u16 vlen)
{
const struct btf_enum *v = btf_enum(t);
bool is_signed = btf_kflag(t);
const char *fmt_str;
const char *name;
size_t dup_cnt;
int i;
for (i = 0; i < vlen; i++, v++) {
name = btf_name_of(d, v->name_off);
/* enumerators share namespace with typedef idents */
dup_cnt = btf_dump_name_dups(d, d->ident_names, name);
if (dup_cnt > 1) {
fmt_str = is_signed ? "\n%s%s___%zd = %d," : "\n%s%s___%zd = %u,";
btf_dump_printf(d, fmt_str, pfx(lvl + 1), name, dup_cnt, v->val);
} else {
fmt_str = is_signed ? "\n%s%s = %d," : "\n%s%s = %u,";
btf_dump_printf(d, fmt_str, pfx(lvl + 1), name, v->val);
}
}
}
static void btf_dump_emit_enum64_val(struct btf_dump *d,
const struct btf_type *t,
int lvl, __u16 vlen)
{
const struct btf_enum64 *v = btf_enum64(t);
bool is_signed = btf_kflag(t);
const char *fmt_str;
const char *name;
size_t dup_cnt;
__u64 val;
int i;
for (i = 0; i < vlen; i++, v++) {
name = btf_name_of(d, v->name_off);
dup_cnt = btf_dump_name_dups(d, d->ident_names, name);
val = btf_enum64_value(v);
if (dup_cnt > 1) {
fmt_str = is_signed ? "\n%s%s___%zd = %lldLL,"
: "\n%s%s___%zd = %lluULL,";
btf_dump_printf(d, fmt_str,
pfx(lvl + 1), name, dup_cnt,
(unsigned long long)val);
} else {
fmt_str = is_signed ? "\n%s%s = %lldLL,"
: "\n%s%s = %lluULL,";
btf_dump_printf(d, fmt_str,
pfx(lvl + 1), name,
(unsigned long long)val);
}
}
}
static void btf_dump_emit_enum_def(struct btf_dump *d, __u32 id,
const struct btf_type *t,
int lvl)
{
const struct btf_enum *v = btf_enum(t);
__u16 vlen = btf_vlen(t);
const char *name;
size_t dup_cnt;
int i;
btf_dump_printf(d, "enum%s%s",
t->name_off ? " " : "",
btf_dump_type_name(d, id));
if (vlen) {
btf_dump_printf(d, " {");
for (i = 0; i < vlen; i++, v++) {
name = btf_name_of(d, v->name_off);
/* enumerators share namespace with typedef idents */
dup_cnt = btf_dump_name_dups(d, d->ident_names, name);
if (dup_cnt > 1) {
btf_dump_printf(d, "\n%s%s___%zu = %u,",
pfx(lvl + 1), name, dup_cnt,
(__u32)v->val);
} else {
btf_dump_printf(d, "\n%s%s = %u,",
pfx(lvl + 1), name,
(__u32)v->val);
}
}
btf_dump_printf(d, "\n%s}", pfx(lvl));
}
if (!vlen)
return;
btf_dump_printf(d, " {");
if (btf_is_enum(t))
btf_dump_emit_enum32_val(d, t, lvl, vlen);
else
btf_dump_emit_enum64_val(d, t, lvl, vlen);
btf_dump_printf(d, "\n%s}", pfx(lvl));
}
static void btf_dump_emit_fwd_def(struct btf_dump *d, __u32 id,
@@ -1178,6 +1215,7 @@ skip_mod:
break;
case BTF_KIND_INT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_FWD:
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
@@ -1312,6 +1350,7 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
btf_dump_emit_struct_fwd(d, id, t);
break;
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
btf_dump_emit_mods(d, decls);
/* inline anonymous enum */
if (t->name_off == 0 && !d->skip_anon_defs)
@@ -1505,6 +1544,11 @@ static const char *btf_dump_resolve_name(struct btf_dump *d, __u32 id,
if (s->name_resolved)
return *cached_name ? *cached_name : orig_name;
if (btf_is_fwd(t) || (btf_is_enum(t) && btf_vlen(t) == 0)) {
s->name_resolved = 1;
return orig_name;
}
dup_cnt = btf_dump_name_dups(d, name_map, orig_name);
if (dup_cnt > 1) {
const size_t max_len = 256;
@@ -1983,7 +2027,8 @@ static int btf_dump_get_enum_value(struct btf_dump *d,
__u32 id,
__s64 *value)
{
/* handle unaligned enum value */
bool is_signed = btf_kflag(t);
if (!ptr_is_aligned(d->btf, id, data)) {
__u64 val;
int err;
@@ -2000,13 +2045,13 @@ static int btf_dump_get_enum_value(struct btf_dump *d,
*value = *(__s64 *)data;
return 0;
case 4:
*value = *(__s32 *)data;
*value = is_signed ? *(__s32 *)data : *(__u32 *)data;
return 0;
case 2:
*value = *(__s16 *)data;
*value = is_signed ? *(__s16 *)data : *(__u16 *)data;
return 0;
case 1:
*value = *(__s8 *)data;
*value = is_signed ? *(__s8 *)data : *(__u8 *)data;
return 0;
default:
pr_warn("unexpected size %d for enum, id:[%u]\n", t->size, id);
@@ -2019,7 +2064,7 @@ static int btf_dump_enum_data(struct btf_dump *d,
__u32 id,
const void *data)
{
const struct btf_enum *e;
bool is_signed;
__s64 value;
int i, err;
@@ -2027,14 +2072,31 @@ static int btf_dump_enum_data(struct btf_dump *d,
if (err)
return err;
for (i = 0, e = btf_enum(t); i < btf_vlen(t); i++, e++) {
if (value != e->val)
continue;
btf_dump_type_values(d, "%s", btf_name_of(d, e->name_off));
return 0;
}
is_signed = btf_kflag(t);
if (btf_is_enum(t)) {
const struct btf_enum *e;
btf_dump_type_values(d, "%d", value);
for (i = 0, e = btf_enum(t); i < btf_vlen(t); i++, e++) {
if (value != e->val)
continue;
btf_dump_type_values(d, "%s", btf_name_of(d, e->name_off));
return 0;
}
btf_dump_type_values(d, is_signed ? "%d" : "%u", value);
} else {
const struct btf_enum64 *e;
for (i = 0, e = btf_enum64(t); i < btf_vlen(t); i++, e++) {
if (value != btf_enum64_value(e))
continue;
btf_dump_type_values(d, "%s", btf_name_of(d, e->name_off));
return 0;
}
btf_dump_type_values(d, is_signed ? "%lldLL" : "%lluULL",
(unsigned long long)value);
}
return 0;
}
@@ -2094,6 +2156,7 @@ static int btf_dump_type_data_check_overflow(struct btf_dump *d,
case BTF_KIND_FLOAT:
case BTF_KIND_PTR:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
if (data + bits_offset / 8 + size > d->typed_dump->data_end)
return -E2BIG;
break;
@@ -2198,6 +2261,7 @@ static int btf_dump_type_data_check_zero(struct btf_dump *d,
return -ENODATA;
}
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
err = btf_dump_get_enum_value(d, t, data, id, &value);
if (err)
return err;
@@ -2270,6 +2334,7 @@ static int btf_dump_dump_type_data(struct btf_dump *d,
err = btf_dump_struct_data(d, t, id, data);
break;
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
/* handle bitfield and int enum values */
if (bit_sz) {
__u64 print_num;

View File

@@ -1043,18 +1043,27 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
value = add_data(gen, pvalue, value_size);
key = add_data(gen, &zero, sizeof(zero));
/* if (map_desc[map_idx].initial_value)
* copy_from_user(value, initial_value, value_size);
/* if (map_desc[map_idx].initial_value) {
* if (ctx->flags & BPF_SKEL_KERNEL)
* bpf_probe_read_kernel(value, value_size, initial_value);
* else
* bpf_copy_from_user(value, value_size, initial_value);
* }
*/
emit(gen, BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_6,
sizeof(struct bpf_loader_ctx) +
sizeof(struct bpf_map_desc) * map_idx +
offsetof(struct bpf_map_desc, initial_value)));
emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0, 4));
emit(gen, BPF_JMP_IMM(BPF_JEQ, BPF_REG_3, 0, 8));
emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE,
0, 0, 0, value));
emit(gen, BPF_MOV64_IMM(BPF_REG_2, value_size));
emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6,
offsetof(struct bpf_loader_ctx, flags)));
emit(gen, BPF_JMP_IMM(BPF_JSET, BPF_REG_0, BPF_SKEL_KERNEL, 2));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_copy_from_user));
emit(gen, BPF_JMP_IMM(BPF_JA, 0, 0, 1));
emit(gen, BPF_EMIT_CALL(BPF_FUNC_probe_read_kernel));
map_update_attr = add_data(gen, &attr, attr_size);
move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,29 +1,14 @@
LIBBPF_0.0.1 {
global:
bpf_btf_get_fd_by_id;
bpf_create_map;
bpf_create_map_in_map;
bpf_create_map_in_map_node;
bpf_create_map_name;
bpf_create_map_node;
bpf_create_map_xattr;
bpf_load_btf;
bpf_load_program;
bpf_load_program_xattr;
bpf_map__btf_key_type_id;
bpf_map__btf_value_type_id;
bpf_map__def;
bpf_map__fd;
bpf_map__is_offload_neutral;
bpf_map__name;
bpf_map__next;
bpf_map__pin;
bpf_map__prev;
bpf_map__priv;
bpf_map__reuse_fd;
bpf_map__set_ifindex;
bpf_map__set_inner_map_fd;
bpf_map__set_priv;
bpf_map__unpin;
bpf_map_delete_elem;
bpf_map_get_fd_by_id;
@@ -38,79 +23,37 @@ LIBBPF_0.0.1 {
bpf_object__btf_fd;
bpf_object__close;
bpf_object__find_map_by_name;
bpf_object__find_map_by_offset;
bpf_object__find_program_by_title;
bpf_object__kversion;
bpf_object__load;
bpf_object__name;
bpf_object__next;
bpf_object__open;
bpf_object__open_buffer;
bpf_object__open_xattr;
bpf_object__pin;
bpf_object__pin_maps;
bpf_object__pin_programs;
bpf_object__priv;
bpf_object__set_priv;
bpf_object__unload;
bpf_object__unpin_maps;
bpf_object__unpin_programs;
bpf_perf_event_read_simple;
bpf_prog_attach;
bpf_prog_detach;
bpf_prog_detach2;
bpf_prog_get_fd_by_id;
bpf_prog_get_next_id;
bpf_prog_load;
bpf_prog_load_xattr;
bpf_prog_query;
bpf_prog_test_run;
bpf_prog_test_run_xattr;
bpf_program__fd;
bpf_program__is_kprobe;
bpf_program__is_perf_event;
bpf_program__is_raw_tracepoint;
bpf_program__is_sched_act;
bpf_program__is_sched_cls;
bpf_program__is_socket_filter;
bpf_program__is_tracepoint;
bpf_program__is_xdp;
bpf_program__load;
bpf_program__next;
bpf_program__nth_fd;
bpf_program__pin;
bpf_program__pin_instance;
bpf_program__prev;
bpf_program__priv;
bpf_program__set_expected_attach_type;
bpf_program__set_ifindex;
bpf_program__set_kprobe;
bpf_program__set_perf_event;
bpf_program__set_prep;
bpf_program__set_priv;
bpf_program__set_raw_tracepoint;
bpf_program__set_sched_act;
bpf_program__set_sched_cls;
bpf_program__set_socket_filter;
bpf_program__set_tracepoint;
bpf_program__set_type;
bpf_program__set_xdp;
bpf_program__title;
bpf_program__unload;
bpf_program__unpin;
bpf_program__unpin_instance;
bpf_prog_linfo__free;
bpf_prog_linfo__new;
bpf_prog_linfo__lfind_addr_func;
bpf_prog_linfo__lfind;
bpf_raw_tracepoint_open;
bpf_set_link_xdp_fd;
bpf_task_fd_query;
bpf_verify_program;
btf__fd;
btf__find_by_name;
btf__free;
btf__get_from_id;
btf__name_by_offset;
btf__new;
btf__resolve_size;
@@ -127,48 +70,24 @@ LIBBPF_0.0.1 {
LIBBPF_0.0.2 {
global:
bpf_probe_helper;
bpf_probe_map_type;
bpf_probe_prog_type;
bpf_map__resize;
bpf_map_lookup_elem_flags;
bpf_object__btf;
bpf_object__find_map_fd_by_name;
bpf_get_link_xdp_id;
btf__dedup;
btf__get_map_kv_tids;
btf__get_nr_types;
btf__get_raw_data;
btf__load;
btf_ext__free;
btf_ext__func_info_rec_size;
btf_ext__get_raw_data;
btf_ext__line_info_rec_size;
btf_ext__new;
btf_ext__reloc_func_info;
btf_ext__reloc_line_info;
xsk_umem__create;
xsk_socket__create;
xsk_umem__delete;
xsk_socket__delete;
xsk_umem__fd;
xsk_socket__fd;
bpf_program__get_prog_info_linear;
bpf_program__bpil_addr_to_offs;
bpf_program__bpil_offs_to_addr;
} LIBBPF_0.0.1;
LIBBPF_0.0.3 {
global:
bpf_map__is_internal;
bpf_map_freeze;
btf__finalize_data;
} LIBBPF_0.0.2;
LIBBPF_0.0.4 {
global:
bpf_link__destroy;
bpf_object__load_xattr;
bpf_program__attach_kprobe;
bpf_program__attach_perf_event;
bpf_program__attach_raw_tracepoint;
@@ -176,14 +95,10 @@ LIBBPF_0.0.4 {
bpf_program__attach_uprobe;
btf_dump__dump_type;
btf_dump__free;
btf_dump__new;
btf__parse_elf;
libbpf_num_possible_cpus;
perf_buffer__free;
perf_buffer__new;
perf_buffer__new_raw;
perf_buffer__poll;
xsk_umem__create;
} LIBBPF_0.0.3;
LIBBPF_0.0.5 {
@@ -193,7 +108,6 @@ LIBBPF_0.0.5 {
LIBBPF_0.0.6 {
global:
bpf_get_link_xdp_info;
bpf_map__get_pin_path;
bpf_map__is_pinned;
bpf_map__set_pin_path;
@@ -202,9 +116,6 @@ LIBBPF_0.0.6 {
bpf_program__attach_trace;
bpf_program__get_expected_attach_type;
bpf_program__get_type;
bpf_program__is_tracing;
bpf_program__set_tracing;
bpf_program__size;
btf__find_by_name_kind;
libbpf_find_vmlinux_btf_id;
} LIBBPF_0.0.5;
@@ -224,14 +135,8 @@ LIBBPF_0.0.7 {
bpf_object__detach_skeleton;
bpf_object__load_skeleton;
bpf_object__open_skeleton;
bpf_probe_large_insn_limit;
bpf_prog_attach_xattr;
bpf_program__attach;
bpf_program__name;
bpf_program__is_extension;
bpf_program__is_struct_ops;
bpf_program__set_extension;
bpf_program__set_struct_ops;
btf__align_of;
libbpf_find_kernel_btf;
} LIBBPF_0.0.6;
@@ -250,10 +155,7 @@ LIBBPF_0.0.8 {
bpf_prog_attach_opts;
bpf_program__attach_cgroup;
bpf_program__attach_lsm;
bpf_program__is_lsm;
bpf_program__set_attach_target;
bpf_program__set_lsm;
bpf_set_link_xdp_fd_opts;
} LIBBPF_0.0.7;
LIBBPF_0.0.9 {
@@ -291,9 +193,7 @@ LIBBPF_0.1.0 {
bpf_map__value_size;
bpf_program__attach_xdp;
bpf_program__autoload;
bpf_program__is_sk_lookup;
bpf_program__set_autoload;
bpf_program__set_sk_lookup;
btf__parse;
btf__parse_raw;
btf__pointer_size;
@@ -336,7 +236,6 @@ LIBBPF_0.2.0 {
perf_buffer__buffer_fd;
perf_buffer__epoll_fd;
perf_buffer__consume_buffer;
xsk_socket__create_shared;
} LIBBPF_0.1.0;
LIBBPF_0.3.0 {
@@ -348,8 +247,6 @@ LIBBPF_0.3.0 {
btf__new_empty_split;
btf__new_split;
ring_buffer__epoll_fd;
xsk_setup_xdp_prog;
xsk_socket__update_xskmap;
} LIBBPF_0.2.0;
LIBBPF_0.4.0 {
@@ -397,7 +294,6 @@ LIBBPF_0.6.0 {
bpf_object__next_program;
bpf_object__prev_map;
bpf_object__prev_program;
bpf_prog_load_deprecated;
bpf_prog_load;
bpf_program__flags;
bpf_program__insn_cnt;
@@ -407,18 +303,14 @@ LIBBPF_0.6.0 {
btf__add_decl_tag;
btf__add_type_tag;
btf__dedup;
btf__dedup_deprecated;
btf__raw_data;
btf__type_cnt;
btf_dump__new;
btf_dump__new_deprecated;
libbpf_major_version;
libbpf_minor_version;
libbpf_version_string;
perf_buffer__new;
perf_buffer__new_deprecated;
perf_buffer__new_raw;
perf_buffer__new_raw_deprecated;
} LIBBPF_0.5.0;
LIBBPF_0.7.0 {
@@ -434,8 +326,40 @@ LIBBPF_0.7.0 {
bpf_xdp_detach;
bpf_xdp_query;
bpf_xdp_query_id;
btf_ext__raw_data;
libbpf_probe_bpf_helper;
libbpf_probe_bpf_map_type;
libbpf_probe_bpf_prog_type;
libbpf_set_memlock_rlim_max;
libbpf_set_memlock_rlim;
} LIBBPF_0.6.0;
LIBBPF_0.8.0 {
global:
bpf_map__autocreate;
bpf_map__get_next_key;
bpf_map__delete_elem;
bpf_map__lookup_and_delete_elem;
bpf_map__lookup_elem;
bpf_map__set_autocreate;
bpf_map__update_elem;
bpf_map_delete_elem_flags;
bpf_object__destroy_subskeleton;
bpf_object__open_subskeleton;
bpf_program__attach_kprobe_multi_opts;
bpf_program__attach_trace_opts;
bpf_program__attach_usdt;
bpf_program__set_insns;
libbpf_register_prog_handler;
libbpf_unregister_prog_handler;
} LIBBPF_0.7.0;
LIBBPF_1.0.0 {
global:
bpf_prog_query_opts;
btf__add_enum64;
btf__add_enum64_value;
libbpf_bpf_attach_type_str;
libbpf_bpf_link_type_str;
libbpf_bpf_map_type_str;
libbpf_bpf_prog_type_str;
};

View File

@@ -30,20 +30,10 @@
/* Add checks for other versions below when planning deprecation of API symbols
* with the LIBBPF_DEPRECATED_SINCE macro.
*/
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 6)
#define __LIBBPF_MARK_DEPRECATED_0_6(X) X
#if __LIBBPF_CURRENT_VERSION_GEQ(1, 0)
#define __LIBBPF_MARK_DEPRECATED_1_0(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_6(X)
#endif
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7)
#define __LIBBPF_MARK_DEPRECATED_0_7(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_7(X)
#endif
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 8)
#define __LIBBPF_MARK_DEPRECATED_0_8(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_8(X)
#define __LIBBPF_MARK_DEPRECATED_1_0(X)
#endif
/* This set of internal macros allows to do "function overloading" based on

View File

@@ -15,7 +15,6 @@
#include <linux/err.h>
#include <fcntl.h>
#include <unistd.h>
#include "libbpf_legacy.h"
#include "relo_core.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
@@ -103,6 +102,17 @@
#define str_has_pfx(str, pfx) \
(strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0)
/* suffix check */
static inline bool str_has_sfx(const char *str, const char *sfx)
{
size_t str_len = strlen(str);
size_t sfx_len = strlen(sfx);
if (sfx_len <= str_len)
return strcmp(str + str_len - sfx_len, sfx);
return false;
}
/* Symbol versioning is different between static and shared library.
* Properly versioned symbols are needed for shared library, but
* only the symbol of the new version is needed for static library.
@@ -148,6 +158,15 @@ do { \
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
struct bpf_link {
int (*detach)(struct bpf_link *link);
void (*dealloc)(struct bpf_link *link);
char *pin_path; /* NULL, if not pinned */
int fd; /* hook FD, -1 if not applicable */
bool disconnected;
};
/*
* Re-implement glibc's reallocarray() for libbpf internal-only use.
* reallocarray(), unfortunately, is not available in all versions of glibc,
@@ -329,6 +348,10 @@ enum kern_feature_id {
FEAT_BTF_TYPE_TAG,
/* memcg-based accounting for BPF maps and progs */
FEAT_MEMCG_ACCOUNT,
/* BPF cookie (bpf_get_attach_cookie() BPF helper) support */
FEAT_BPF_COOKIE,
/* BTF_KIND_ENUM64 support and BTF_KIND_ENUM kflag support */
FEAT_BTF_ENUM64,
__FEAT_CNT,
};
@@ -354,6 +377,13 @@ struct btf_ext_info {
void *info;
__u32 rec_size;
__u32 len;
/* optional (maintained internally by libbpf) mapping between .BTF.ext
* section and corresponding ELF section. This is used to join
* information like CO-RE relocation records with corresponding BPF
* programs defined in ELF sections
*/
__u32 *sec_idxs;
int sec_cnt;
};
#define for_each_btf_ext_sec(seg, sec) \
@@ -447,7 +477,10 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
__u32 kind);
extern enum libbpf_strict_mode libbpf_mode;
typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
const char *sym_name, void *ctx);
int libbpf_kallsyms_parse(kallsyms_cb_t cb, void *arg);
/* handle direct returned errors */
static inline int libbpf_err(int ret)
@@ -462,12 +495,8 @@ static inline int libbpf_err(int ret)
*/
static inline int libbpf_err_errno(int ret)
{
if (libbpf_mode & LIBBPF_STRICT_DIRECT_ERRS)
/* errno is already assumed to be set on error */
return ret < 0 ? -errno : ret;
/* legacy: on error return -1 directly and don't touch errno */
return ret;
/* errno is already assumed to be set on error */
return ret < 0 ? -errno : ret;
}
/* handle error for pointer-returning APIs, err is assumed to be < 0 always */
@@ -475,12 +504,7 @@ static inline void *libbpf_err_ptr(int err)
{
/* set errno on error, this doesn't break anything */
errno = -err;
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return NULL;
/* legacy: encode err as ptr */
return ERR_PTR(err);
return NULL;
}
/* handle pointer-returning APIs' error handling */
@@ -490,11 +514,7 @@ static inline void *libbpf_ptr(void *ret)
if (IS_ERR(ret))
errno = -PTR_ERR(ret);
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return IS_ERR(ret) ? NULL : ret;
/* legacy: pass-through original pointer */
return ret;
return IS_ERR(ret) ? NULL : ret;
}
static inline bool str_is_empty(const char *s)
@@ -529,4 +549,26 @@ static inline int ensure_good_fd(int fd)
return fd;
}
/* The following two functions are exposed to bpftool */
int bpf_core_add_cands(struct bpf_core_cand *local_cand,
size_t local_essent_len,
const struct btf *targ_btf,
const char *targ_btf_name,
int targ_start_id,
struct bpf_core_cand_list *cands);
void bpf_core_free_cands(struct bpf_core_cand_list *cands);
struct usdt_manager *usdt_manager_new(struct bpf_object *obj);
void usdt_manager_free(struct usdt_manager *man);
struct bpf_link * usdt_manager_attach_usdt(struct usdt_manager *man,
const struct bpf_program *prog,
pid_t pid, const char *path,
const char *usdt_provider, const char *usdt_name,
__u64 usdt_cookie);
static inline bool is_pow_of_2(size_t x)
{
return x && (x & (x - 1)) == 0;
}
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

View File

@@ -20,6 +20,11 @@
extern "C" {
#endif
/* As of libbpf 1.0 libbpf_set_strict_mode() and enum libbpf_struct_mode have
* no effect. But they are left in libbpf_legacy.h so that applications that
* prepared for libbpf 1.0 before final release by using
* libbpf_set_strict_mode() still work with libbpf 1.0+ without any changes.
*/
enum libbpf_strict_mode {
/* Turn on all supported strict features of libbpf to simulate libbpf
* v1.0 behavior.
@@ -54,6 +59,10 @@ enum libbpf_strict_mode {
*
* Note, in this mode the program pin path will be based on the
* function name instead of section name.
*
* Additionally, routines in the .text section are always considered
* sub-programs. Legacy behavior allows for a single routine in .text
* to be a program.
*/
LIBBPF_STRICT_SEC_NAME = 0x04,
/*
@@ -67,8 +76,8 @@ enum libbpf_strict_mode {
* first BPF program or map creation operation. This is done only if
* kernel is too old to support memcg-based memory accounting for BPF
* subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY,
* but it can be overriden with libbpf_set_memlock_rlim_max() API.
* Note that libbpf_set_memlock_rlim_max() needs to be called before
* but it can be overriden with libbpf_set_memlock_rlim() API.
* Note that libbpf_set_memlock_rlim() needs to be called before
* the very first bpf_prog_load(), bpf_map_create() or bpf_object__load()
* operation.
*/
@@ -84,6 +93,25 @@ enum libbpf_strict_mode {
LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
/**
* @brief **libbpf_get_error()** extracts the error code from the passed
* pointer
* @param ptr pointer returned from libbpf API function
* @return error code; or 0 if no error occured
*
* Note, as of libbpf 1.0 this function is not necessary and not recommended
* to be used. Libbpf doesn't return error code embedded into the pointer
* itself. Instead, NULL is returned on error and error code is passed through
* thread-local errno variable. **libbpf_get_error()** is just returning -errno
* value if it receives NULL, which is correct only if errno hasn't been
* modified between libbpf API call and corresponding **libbpf_get_error()**
* call. Prefer to check return for NULL and use errno directly.
*
* This API is left in libbpf 1.0 to allow applications that were 1.0-ready
* before final libbpf 1.0 without needing to change them.
*/
LIBBPF_API long libbpf_get_error(const void *ptr);
#define DECLARE_LIBBPF_OPTS LIBBPF_OPTS
/* "Discouraged" APIs which don't follow consistent libbpf naming patterns.

View File

@@ -17,47 +17,14 @@
#include "libbpf.h"
#include "libbpf_internal.h"
static bool grep(const char *buffer, const char *pattern)
{
return !!strstr(buffer, pattern);
}
static int get_vendor_id(int ifindex)
{
char ifname[IF_NAMESIZE], path[64], buf[8];
ssize_t len;
int fd;
if (!if_indextoname(ifindex, ifname))
return -1;
snprintf(path, sizeof(path), "/sys/class/net/%s/device/vendor", ifname);
fd = open(path, O_RDONLY | O_CLOEXEC);
if (fd < 0)
return -1;
len = read(fd, buf, sizeof(buf));
close(fd);
if (len < 0)
return -1;
if (len >= (ssize_t)sizeof(buf))
return -1;
buf[len] = '\0';
return strtol(buf, NULL, 0);
}
static int probe_prog_load(enum bpf_prog_type prog_type,
const struct bpf_insn *insns, size_t insns_cnt,
char *log_buf, size_t log_buf_sz,
__u32 ifindex)
char *log_buf, size_t log_buf_sz)
{
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.log_buf = log_buf,
.log_size = log_buf_sz,
.log_level = log_buf ? 1 : 0,
.prog_ifindex = ifindex,
);
int fd, err, exp_err = 0;
const char *exp_msg = NULL;
@@ -161,31 +128,10 @@ int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts)
if (opts)
return libbpf_err(-EINVAL);
ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0, 0);
ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0);
return libbpf_err(ret);
}
bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
{
struct bpf_insn insns[2] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN()
};
/* prefer libbpf_probe_bpf_prog_type() unless offload is requested */
if (ifindex == 0)
return libbpf_probe_bpf_prog_type(prog_type, NULL) == 1;
if (ifindex && prog_type == BPF_PROG_TYPE_SCHED_CLS)
/* nfp returns -EINVAL on exit(0) with TC offload */
insns[0].imm = 2;
errno = 0;
probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
return errno != EINVAL && errno != EOPNOTSUPP;
}
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len)
{
@@ -242,15 +188,13 @@ static int load_local_storage_btf(void)
strs, sizeof(strs));
}
static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex)
static int probe_map_create(enum bpf_map_type map_type)
{
LIBBPF_OPTS(bpf_map_create_opts, opts);
int key_size, value_size, max_entries;
__u32 btf_key_type_id = 0, btf_value_type_id = 0;
int fd = -1, btf_fd = -1, fd_inner = -1, exp_err = 0, err;
opts.map_ifindex = ifindex;
key_size = sizeof(__u32);
value_size = sizeof(__u32);
max_entries = 1;
@@ -326,12 +270,6 @@ static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex)
if (map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS ||
map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
/* TODO: probe for device, once libbpf has a function to create
* map-in-map for offload
*/
if (ifindex)
goto cleanup;
fd_inner = bpf_map_create(BPF_MAP_TYPE_HASH, NULL,
sizeof(__u32), sizeof(__u32), 1, NULL);
if (fd_inner < 0)
@@ -370,15 +308,10 @@ int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts)
if (opts)
return libbpf_err(-EINVAL);
ret = probe_map_create(map_type, 0);
ret = probe_map_create(map_type);
return libbpf_err(ret);
}
bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
{
return probe_map_create(map_type, ifindex) == 1;
}
int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helper_id,
const void *opts)
{
@@ -407,7 +340,7 @@ int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helpe
}
buf[0] = '\0';
ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf), 0);
ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf));
if (ret < 0)
return libbpf_err(ret);
@@ -427,51 +360,3 @@ int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helpe
return 0;
return 1; /* assume supported */
}
bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
__u32 ifindex)
{
struct bpf_insn insns[2] = {
BPF_EMIT_CALL(id),
BPF_EXIT_INSN()
};
char buf[4096] = {};
bool res;
probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), ifindex);
res = !grep(buf, "invalid func ") && !grep(buf, "unknown func ");
if (ifindex) {
switch (get_vendor_id(ifindex)) {
case 0x19ee: /* Netronome specific */
res = res && !grep(buf, "not supported by FW") &&
!grep(buf, "unsupported function id");
break;
default:
break;
}
}
return res;
}
/*
* Probe for availability of kernel commit (5.3):
*
* c04c0d2b968a ("bpf: increase complexity limit and maximum program size")
*/
bool bpf_probe_large_insn_limit(__u32 ifindex)
{
struct bpf_insn insns[BPF_MAXINSNS + 1];
int i;
for (i = 0; i < BPF_MAXINSNS; i++)
insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1);
insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
errno = 0;
probe_prog_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
return errno != E2BIG && errno != EINVAL;
}

View File

@@ -3,7 +3,7 @@
#ifndef __LIBBPF_VERSION_H
#define __LIBBPF_VERSION_H
#define LIBBPF_MAJOR_VERSION 0
#define LIBBPF_MINOR_VERSION 7
#define LIBBPF_MAJOR_VERSION 1
#define LIBBPF_MINOR_VERSION 0
#endif /* __LIBBPF_VERSION_H */

View File

@@ -697,11 +697,6 @@ static int linker_load_obj_file(struct bpf_linker *linker, const char *filename,
return err;
}
static bool is_pow_of_2(size_t x)
{
return x && (x & (x - 1)) == 0;
}
static int linker_sanity_check_elf(struct src_obj *obj)
{
struct src_sec *sec;
@@ -1340,6 +1335,7 @@ recur:
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
case BTF_KIND_FWD:
case BTF_KIND_FUNC:
case BTF_KIND_VAR:
@@ -1362,6 +1358,7 @@ recur:
case BTF_KIND_INT:
case BTF_KIND_FLOAT:
case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
/* ignore encoding for int and enum values for enum */
if (t1->size != t2->size) {
pr_warn("global '%s': incompatible %s '%s' size %u and %u\n",

View File

@@ -27,6 +27,14 @@ typedef int (*libbpf_dump_nlmsg_t)(void *cookie, void *msg, struct nlattr **tb);
typedef int (*__dump_nlmsg_t)(struct nlmsghdr *nlmsg, libbpf_dump_nlmsg_t,
void *cookie);
struct xdp_link_info {
__u32 prog_id;
__u32 drv_prog_id;
__u32 hw_prog_id;
__u32 skb_prog_id;
__u8 attach_mode;
};
struct xdp_id_md {
int ifindex;
__u32 flags;
@@ -87,29 +95,75 @@ enum {
NL_DONE,
};
static int netlink_recvmsg(int sock, struct msghdr *mhdr, int flags)
{
int len;
do {
len = recvmsg(sock, mhdr, flags);
} while (len < 0 && (errno == EINTR || errno == EAGAIN));
if (len < 0)
return -errno;
return len;
}
static int alloc_iov(struct iovec *iov, int len)
{
void *nbuf;
nbuf = realloc(iov->iov_base, len);
if (!nbuf)
return -ENOMEM;
iov->iov_base = nbuf;
iov->iov_len = len;
return 0;
}
static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq,
__dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
void *cookie)
{
struct iovec iov = {};
struct msghdr mhdr = {
.msg_iov = &iov,
.msg_iovlen = 1,
};
bool multipart = true;
struct nlmsgerr *err;
struct nlmsghdr *nh;
char buf[4096];
int len, ret;
ret = alloc_iov(&iov, 4096);
if (ret)
goto done;
while (multipart) {
start:
multipart = false;
len = recv(sock, buf, sizeof(buf), 0);
len = netlink_recvmsg(sock, &mhdr, MSG_PEEK | MSG_TRUNC);
if (len < 0) {
ret = -errno;
ret = len;
goto done;
}
if (len > iov.iov_len) {
ret = alloc_iov(&iov, len);
if (ret)
goto done;
}
len = netlink_recvmsg(sock, &mhdr, 0);
if (len < 0) {
ret = len;
goto done;
}
if (len == 0)
break;
for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
for (nh = (struct nlmsghdr *)iov.iov_base; NLMSG_OK(nh, len);
nh = NLMSG_NEXT(nh, len)) {
if (nh->nlmsg_pid != nl_pid) {
ret = -LIBBPF_ERRNO__WRNGPID;
@@ -130,7 +184,8 @@ start:
libbpf_nla_dump_errormsg(nh);
goto done;
case NLMSG_DONE:
return 0;
ret = 0;
goto done;
default:
break;
}
@@ -142,15 +197,17 @@ start:
case NL_NEXT:
goto start;
case NL_DONE:
return 0;
ret = 0;
goto done;
default:
return ret;
goto done;
}
}
}
}
ret = 0;
done:
free(iov.iov_base);
return ret;
}
@@ -239,31 +296,6 @@ int bpf_xdp_detach(int ifindex, __u32 flags, const struct bpf_xdp_attach_opts *o
return bpf_xdp_attach(ifindex, -1, flags, opts);
}
int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts)
{
int old_fd = -1, ret;
if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
return libbpf_err(-EINVAL);
if (OPTS_HAS(opts, old_fd)) {
old_fd = OPTS_GET(opts, old_fd, -1);
flags |= XDP_FLAGS_REPLACE;
}
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags);
return libbpf_err(ret);
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
{
int ret;
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
return libbpf_err(ret);
}
static int __dump_link_nlmsg(struct nlmsghdr *nlh,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
{
@@ -364,30 +396,6 @@ int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts)
return 0;
}
int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags)
{
LIBBPF_OPTS(bpf_xdp_query_opts, opts);
size_t sz;
int err;
if (!info_size)
return libbpf_err(-EINVAL);
err = bpf_xdp_query(ifindex, flags, &opts);
if (err)
return libbpf_err(err);
/* struct xdp_link_info field layout matches struct bpf_xdp_query_opts
* layout after sz field
*/
sz = min(info_size, offsetofend(struct xdp_link_info, attach_mode));
memcpy(info, &opts.prog_id, sz);
memset((void *)info + sz, 0, info_size - sz);
return 0;
}
int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
{
LIBBPF_OPTS(bpf_xdp_query_opts, opts);
@@ -414,11 +422,6 @@ int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
{
return bpf_xdp_query_id(ifindex, flags, prog_id);
}
typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);
static int clsact_config(struct libbpf_nla_req *req)

View File

@@ -141,6 +141,86 @@ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
}
}
int __bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
const struct btf *targ_btf, __u32 targ_id, int level)
{
const struct btf_type *local_type, *targ_type;
int depth = 32; /* max recursion depth */
/* caller made sure that names match (ignoring flavor suffix) */
local_type = btf_type_by_id(local_btf, local_id);
targ_type = btf_type_by_id(targ_btf, targ_id);
if (!btf_kind_core_compat(local_type, targ_type))
return 0;
recur:
depth--;
if (depth < 0)
return -EINVAL;
local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id);
targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
if (!local_type || !targ_type)
return -EINVAL;
if (!btf_kind_core_compat(local_type, targ_type))
return 0;
switch (btf_kind(local_type)) {
case BTF_KIND_UNKN:
case BTF_KIND_STRUCT:
case BTF_KIND_UNION:
case BTF_KIND_ENUM:
case BTF_KIND_FWD:
case BTF_KIND_ENUM64:
return 1;
case BTF_KIND_INT:
/* just reject deprecated bitfield-like integers; all other
* integers are by default compatible between each other
*/
return btf_int_offset(local_type) == 0 && btf_int_offset(targ_type) == 0;
case BTF_KIND_PTR:
local_id = local_type->type;
targ_id = targ_type->type;
goto recur;
case BTF_KIND_ARRAY:
local_id = btf_array(local_type)->type;
targ_id = btf_array(targ_type)->type;
goto recur;
case BTF_KIND_FUNC_PROTO: {
struct btf_param *local_p = btf_params(local_type);
struct btf_param *targ_p = btf_params(targ_type);
__u16 local_vlen = btf_vlen(local_type);
__u16 targ_vlen = btf_vlen(targ_type);
int i, err;
if (local_vlen != targ_vlen)
return 0;
for (i = 0; i < local_vlen; i++, local_p++, targ_p++) {
if (level <= 0)
return -EINVAL;
skip_mods_and_typedefs(local_btf, local_p->type, &local_id);
skip_mods_and_typedefs(targ_btf, targ_p->type, &targ_id);
err = __bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id,
level - 1);
if (err <= 0)
return err;
}
/* tail recurse for return type check */
skip_mods_and_typedefs(local_btf, local_type->type, &local_id);
skip_mods_and_typedefs(targ_btf, targ_type->type, &targ_id);
goto recur;
}
default:
pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n",
btf_kind_str(local_type), local_id, targ_id);
return 0;
}
}
/*
* Turn bpf_core_relo into a low- and high-level spec representation,
* validating correctness along the way, as well as calculating resulting
@@ -167,7 +247,7 @@ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
* just a parsed access string representation): [0, 1, 2, 3].
*
* High-level spec will capture only 3 points:
* - intial zero-index access by pointer (&s->... is the same as &s[0]...);
* - initial zero-index access by pointer (&s->... is the same as &s[0]...);
* - field 'a' access (corresponds to '2' in low-level spec);
* - array element #3 access (corresponds to '3' in low-level spec).
*
@@ -178,29 +258,28 @@ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
* Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use access
* string to specify enumerator's value index that need to be relocated.
*/
static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
__u32 type_id,
const char *spec_str,
enum bpf_core_relo_kind relo_kind,
struct bpf_core_spec *spec)
int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
const struct bpf_core_relo *relo,
struct bpf_core_spec *spec)
{
int access_idx, parsed_len, i;
struct bpf_core_accessor *acc;
const struct btf_type *t;
const char *name;
__u32 id;
const char *name, *spec_str;
__u32 id, name_off;
__s64 sz;
spec_str = btf__name_by_offset(btf, relo->access_str_off);
if (str_is_empty(spec_str) || *spec_str == ':')
return -EINVAL;
memset(spec, 0, sizeof(*spec));
spec->btf = btf;
spec->root_type_id = type_id;
spec->relo_kind = relo_kind;
spec->root_type_id = relo->type_id;
spec->relo_kind = relo->kind;
/* type-based relocations don't have a field access string */
if (core_relo_is_type_based(relo_kind)) {
if (core_relo_is_type_based(relo->kind)) {
if (strcmp(spec_str, "0"))
return -EINVAL;
return 0;
@@ -221,7 +300,7 @@ static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
if (spec->raw_len == 0)
return -EINVAL;
t = skip_mods_and_typedefs(btf, type_id, &id);
t = skip_mods_and_typedefs(btf, relo->type_id, &id);
if (!t)
return -EINVAL;
@@ -231,16 +310,18 @@ static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
acc->idx = access_idx;
spec->len++;
if (core_relo_is_enumval_based(relo_kind)) {
if (!btf_is_enum(t) || spec->raw_len > 1 || access_idx >= btf_vlen(t))
if (core_relo_is_enumval_based(relo->kind)) {
if (!btf_is_any_enum(t) || spec->raw_len > 1 || access_idx >= btf_vlen(t))
return -EINVAL;
/* record enumerator name in a first accessor */
acc->name = btf__name_by_offset(btf, btf_enum(t)[access_idx].name_off);
name_off = btf_is_enum(t) ? btf_enum(t)[access_idx].name_off
: btf_enum64(t)[access_idx].name_off;
acc->name = btf__name_by_offset(btf, name_off);
return 0;
}
if (!core_relo_is_field_based(relo_kind))
if (!core_relo_is_field_based(relo->kind))
return -EINVAL;
sz = btf__resolve_size(btf, id);
@@ -301,7 +382,7 @@ static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
spec->bit_offset += access_idx * sz * 8;
} else {
pr_warn("prog '%s': relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n",
prog_name, type_id, spec_str, i, id, btf_kind_str(t));
prog_name, relo->type_id, spec_str, i, id, btf_kind_str(t));
return -EINVAL;
}
}
@@ -341,7 +422,7 @@ recur:
if (btf_is_composite(local_type) && btf_is_composite(targ_type))
return 1;
if (btf_kind(local_type) != btf_kind(targ_type))
if (!btf_kind_core_compat(local_type, targ_type))
return 0;
switch (btf_kind(local_type)) {
@@ -349,6 +430,7 @@ recur:
case BTF_KIND_FLOAT:
return 1;
case BTF_KIND_FWD:
case BTF_KIND_ENUM64:
case BTF_KIND_ENUM: {
const char *local_name, *targ_name;
size_t local_len, targ_len;
@@ -478,6 +560,7 @@ static int bpf_core_spec_match(struct bpf_core_spec *local_spec,
const struct bpf_core_accessor *local_acc;
struct bpf_core_accessor *targ_acc;
int i, sz, matched;
__u32 name_off;
memset(targ_spec, 0, sizeof(*targ_spec));
targ_spec->btf = targ_btf;
@@ -495,18 +578,22 @@ static int bpf_core_spec_match(struct bpf_core_spec *local_spec,
if (core_relo_is_enumval_based(local_spec->relo_kind)) {
size_t local_essent_len, targ_essent_len;
const struct btf_enum *e;
const char *targ_name;
/* has to resolve to an enum */
targ_type = skip_mods_and_typedefs(targ_spec->btf, targ_id, &targ_id);
if (!btf_is_enum(targ_type))
if (!btf_is_any_enum(targ_type))
return 0;
local_essent_len = bpf_core_essential_name_len(local_acc->name);
for (i = 0, e = btf_enum(targ_type); i < btf_vlen(targ_type); i++, e++) {
targ_name = btf__name_by_offset(targ_spec->btf, e->name_off);
for (i = 0; i < btf_vlen(targ_type); i++) {
if (btf_is_enum(targ_type))
name_off = btf_enum(targ_type)[i].name_off;
else
name_off = btf_enum64(targ_type)[i].name_off;
targ_name = btf__name_by_offset(targ_spec->btf, name_off);
targ_essent_len = bpf_core_essential_name_len(targ_name);
if (targ_essent_len != local_essent_len)
continue;
@@ -584,7 +671,7 @@ static int bpf_core_spec_match(struct bpf_core_spec *local_spec,
static int bpf_core_calc_field_relo(const char *prog_name,
const struct bpf_core_relo *relo,
const struct bpf_core_spec *spec,
__u32 *val, __u32 *field_sz, __u32 *type_id,
__u64 *val, __u32 *field_sz, __u32 *type_id,
bool *validate)
{
const struct bpf_core_accessor *acc;
@@ -681,8 +768,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
*val = byte_sz;
break;
case BPF_CORE_FIELD_SIGNED:
/* enums will be assumed unsigned */
*val = btf_is_enum(mt) ||
*val = (btf_is_any_enum(mt) && BTF_INFO_KFLAG(mt->info)) ||
(btf_int_encoding(mt) & BTF_INT_SIGNED);
if (validate)
*validate = true; /* signedness is never ambiguous */
@@ -709,7 +795,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo,
const struct bpf_core_spec *spec,
__u32 *val, bool *validate)
__u64 *val, bool *validate)
{
__s64 sz;
@@ -752,10 +838,9 @@ static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo,
static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
const struct bpf_core_spec *spec,
__u32 *val)
__u64 *val)
{
const struct btf_type *t;
const struct btf_enum *e;
switch (relo->kind) {
case BPF_CORE_ENUMVAL_EXISTS:
@@ -765,8 +850,10 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
if (!spec)
return -EUCLEAN; /* request instruction poisoning */
t = btf_type_by_id(spec->btf, spec->spec[0].type_id);
e = btf_enum(t) + spec->spec[0].idx;
*val = e->val;
if (btf_is_enum(t))
*val = btf_enum(t)[spec->spec[0].idx].val;
else
*val = btf_enum64_value(btf_enum64(t) + spec->spec[0].idx);
break;
default:
return -EOPNOTSUPP;
@@ -775,31 +862,6 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
return 0;
}
struct bpf_core_relo_res
{
/* expected value in the instruction, unless validate == false */
__u32 orig_val;
/* new value that needs to be patched up to */
__u32 new_val;
/* relocation unsuccessful, poison instruction, but don't fail load */
bool poison;
/* some relocations can't be validated against orig_val */
bool validate;
/* for field byte offset relocations or the forms:
* *(T *)(rX + <off>) = rY
* rX = *(T *)(rY + <off>),
* we remember original and resolved field size to adjust direct
* memory loads of pointers and integers; this is necessary for 32-bit
* host kernel architectures, but also allows to automatically
* relocate fields that were resized from, e.g., u32 to u64, etc.
*/
bool fail_memsz_adjust;
__u32 orig_sz;
__u32 orig_type_id;
__u32 new_sz;
__u32 new_type_id;
};
/* Calculate original and target relocation values, given local and target
* specs and relocation kind. These values are calculated for each candidate.
* If there are multiple candidates, resulting values should all be consistent
@@ -951,11 +1013,11 @@ static int insn_bytes_to_bpf_size(__u32 sz)
* 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64};
* 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}.
*/
static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
int insn_idx, const struct bpf_core_relo *relo,
int relo_idx, const struct bpf_core_relo_res *res)
int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
int insn_idx, const struct bpf_core_relo *relo,
int relo_idx, const struct bpf_core_relo_res *res)
{
__u32 orig_val, new_val;
__u64 orig_val, new_val;
__u8 class;
class = BPF_CLASS(insn->code);
@@ -980,28 +1042,30 @@ poison:
if (BPF_SRC(insn->code) != BPF_K)
return -EINVAL;
if (res->validate && insn->imm != orig_val) {
pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %u -> %u\n",
pr_warn("prog '%s': relo #%d: unexpected insn #%d (ALU/ALU64) value: got %u, exp %llu -> %llu\n",
prog_name, relo_idx,
insn_idx, insn->imm, orig_val, new_val);
insn_idx, insn->imm, (unsigned long long)orig_val,
(unsigned long long)new_val);
return -EINVAL;
}
orig_val = insn->imm;
insn->imm = new_val;
pr_debug("prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %u -> %u\n",
pr_debug("prog '%s': relo #%d: patched insn #%d (ALU/ALU64) imm %llu -> %llu\n",
prog_name, relo_idx, insn_idx,
orig_val, new_val);
(unsigned long long)orig_val, (unsigned long long)new_val);
break;
case BPF_LDX:
case BPF_ST:
case BPF_STX:
if (res->validate && insn->off != orig_val) {
pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %u -> %u\n",
prog_name, relo_idx, insn_idx, insn->off, orig_val, new_val);
pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDX/ST/STX) value: got %u, exp %llu -> %llu\n",
prog_name, relo_idx, insn_idx, insn->off, (unsigned long long)orig_val,
(unsigned long long)new_val);
return -EINVAL;
}
if (new_val > SHRT_MAX) {
pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %u\n",
prog_name, relo_idx, insn_idx, new_val);
pr_warn("prog '%s': relo #%d: insn #%d (LDX/ST/STX) value too big: %llu\n",
prog_name, relo_idx, insn_idx, (unsigned long long)new_val);
return -ERANGE;
}
if (res->fail_memsz_adjust) {
@@ -1013,8 +1077,9 @@ poison:
orig_val = insn->off;
insn->off = new_val;
pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %u -> %u\n",
prog_name, relo_idx, insn_idx, orig_val, new_val);
pr_debug("prog '%s': relo #%d: patched insn #%d (LDX/ST/STX) off %llu -> %llu\n",
prog_name, relo_idx, insn_idx, (unsigned long long)orig_val,
(unsigned long long)new_val);
if (res->new_sz != res->orig_sz) {
int insn_bytes_sz, insn_bpf_sz;
@@ -1050,20 +1115,20 @@ poison:
return -EINVAL;
}
imm = insn[0].imm + ((__u64)insn[1].imm << 32);
imm = (__u32)insn[0].imm | ((__u64)insn[1].imm << 32);
if (res->validate && imm != orig_val) {
pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %u -> %u\n",
pr_warn("prog '%s': relo #%d: unexpected insn #%d (LDIMM64) value: got %llu, exp %llu -> %llu\n",
prog_name, relo_idx,
insn_idx, (unsigned long long)imm,
orig_val, new_val);
(unsigned long long)orig_val, (unsigned long long)new_val);
return -EINVAL;
}
insn[0].imm = new_val;
insn[1].imm = 0; /* currently only 32-bit values are supported */
pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %u\n",
insn[1].imm = new_val >> 32;
pr_debug("prog '%s': relo #%d: patched insn #%d (LDIMM64) imm64 %llu -> %llu\n",
prog_name, relo_idx, insn_idx,
(unsigned long long)imm, new_val);
(unsigned long long)imm, (unsigned long long)new_val);
break;
}
default:
@@ -1080,55 +1145,82 @@ poison:
* [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>,
* where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b
*/
static void bpf_core_dump_spec(const char *prog_name, int level, const struct bpf_core_spec *spec)
int bpf_core_format_spec(char *buf, size_t buf_sz, const struct bpf_core_spec *spec)
{
const struct btf_type *t;
const struct btf_enum *e;
const char *s;
__u32 type_id;
int i;
int i, len = 0;
#define append_buf(fmt, args...) \
({ \
int r; \
r = snprintf(buf, buf_sz, fmt, ##args); \
len += r; \
if (r >= buf_sz) \
r = buf_sz; \
buf += r; \
buf_sz -= r; \
})
type_id = spec->root_type_id;
t = btf_type_by_id(spec->btf, type_id);
s = btf__name_by_offset(spec->btf, t->name_off);
libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s);
append_buf("<%s> [%u] %s %s",
core_relo_kind_str(spec->relo_kind),
type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s);
if (core_relo_is_type_based(spec->relo_kind))
return;
return len;
if (core_relo_is_enumval_based(spec->relo_kind)) {
t = skip_mods_and_typedefs(spec->btf, type_id, NULL);
e = btf_enum(t) + spec->raw_spec[0];
s = btf__name_by_offset(spec->btf, e->name_off);
if (btf_is_enum(t)) {
const struct btf_enum *e;
const char *fmt_str;
libbpf_print(level, "::%s = %u", s, e->val);
return;
e = btf_enum(t) + spec->raw_spec[0];
s = btf__name_by_offset(spec->btf, e->name_off);
fmt_str = BTF_INFO_KFLAG(t->info) ? "::%s = %d" : "::%s = %u";
append_buf(fmt_str, s, e->val);
} else {
const struct btf_enum64 *e;
const char *fmt_str;
e = btf_enum64(t) + spec->raw_spec[0];
s = btf__name_by_offset(spec->btf, e->name_off);
fmt_str = BTF_INFO_KFLAG(t->info) ? "::%s = %lld" : "::%s = %llu";
append_buf(fmt_str, s, (unsigned long long)btf_enum64_value(e));
}
return len;
}
if (core_relo_is_field_based(spec->relo_kind)) {
for (i = 0; i < spec->len; i++) {
if (spec->spec[i].name)
libbpf_print(level, ".%s", spec->spec[i].name);
append_buf(".%s", spec->spec[i].name);
else if (i > 0 || spec->spec[i].idx > 0)
libbpf_print(level, "[%u]", spec->spec[i].idx);
append_buf("[%u]", spec->spec[i].idx);
}
libbpf_print(level, " (");
append_buf(" (");
for (i = 0; i < spec->raw_len; i++)
libbpf_print(level, "%s%d", i == 0 ? "" : ":", spec->raw_spec[i]);
append_buf("%s%d", i == 0 ? "" : ":", spec->raw_spec[i]);
if (spec->bit_offset % 8)
libbpf_print(level, " @ offset %u.%u)",
spec->bit_offset / 8, spec->bit_offset % 8);
append_buf(" @ offset %u.%u)", spec->bit_offset / 8, spec->bit_offset % 8);
else
libbpf_print(level, " @ offset %u)", spec->bit_offset / 8);
return;
append_buf(" @ offset %u)", spec->bit_offset / 8);
return len;
}
return len;
#undef append_buf
}
/*
* CO-RE relocate single instruction.
* Calculate CO-RE relocation target result.
*
* The outline and important points of the algorithm:
* 1. For given local type, find corresponding candidate target types.
@@ -1159,11 +1251,11 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp
* 3. It is supported and expected that there might be multiple flavors
* matching the spec. As long as all the specs resolve to the same set of
* offsets across all candidates, there is no error. If there is any
* ambiguity, CO-RE relocation will fail. This is necessary to accomodate
* imprefection of BTF deduplication, which can cause slight duplication of
* ambiguity, CO-RE relocation will fail. This is necessary to accommodate
* imperfection of BTF deduplication, which can cause slight duplication of
* the same BTF type, if some directly or indirectly referenced (by
* pointer) type gets resolved to different actual types in different
* object files. If such situation occurs, deduplicated BTF will end up
* object files. If such a situation occurs, deduplicated BTF will end up
* with two (or more) structurally identical types, which differ only in
* types they refer to through pointer. This should be OK in most cases and
* is not an error.
@@ -1177,22 +1269,22 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp
* between multiple relocations for the same type ID and is updated as some
* of the candidates are pruned due to structural incompatibility.
*/
int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
int insn_idx,
const struct bpf_core_relo *relo,
int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands,
struct bpf_core_spec *specs_scratch)
int bpf_core_calc_relo_insn(const char *prog_name,
const struct bpf_core_relo *relo,
int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands,
struct bpf_core_spec *specs_scratch,
struct bpf_core_relo_res *targ_res)
{
struct bpf_core_spec *local_spec = &specs_scratch[0];
struct bpf_core_spec *cand_spec = &specs_scratch[1];
struct bpf_core_spec *targ_spec = &specs_scratch[2];
struct bpf_core_relo_res cand_res, targ_res;
struct bpf_core_relo_res cand_res;
const struct btf_type *local_type;
const char *local_name;
__u32 local_id;
const char *spec_str;
char spec_buf[256];
int i, j, err;
local_id = relo->type_id;
@@ -1201,38 +1293,34 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
if (!local_name)
return -EINVAL;
spec_str = btf__name_by_offset(local_btf, relo->access_str_off);
if (str_is_empty(spec_str))
return -EINVAL;
err = bpf_core_parse_spec(prog_name, local_btf, local_id, spec_str,
relo->kind, local_spec);
err = bpf_core_parse_spec(prog_name, local_btf, relo, local_spec);
if (err) {
const char *spec_str;
spec_str = btf__name_by_offset(local_btf, relo->access_str_off);
pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
prog_name, relo_idx, local_id, btf_kind_str(local_type),
str_is_empty(local_name) ? "<anon>" : local_name,
spec_str, err);
spec_str ?: "<?>", err);
return -EINVAL;
}
pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name,
relo_idx, core_relo_kind_str(relo->kind), relo->kind);
bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, local_spec);
libbpf_print(LIBBPF_DEBUG, "\n");
bpf_core_format_spec(spec_buf, sizeof(spec_buf), local_spec);
pr_debug("prog '%s': relo #%d: %s\n", prog_name, relo_idx, spec_buf);
/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) {
/* bpf_insn's imm value could get out of sync during linking */
memset(&targ_res, 0, sizeof(targ_res));
targ_res.validate = false;
targ_res.poison = false;
targ_res.orig_val = local_spec->root_type_id;
targ_res.new_val = local_spec->root_type_id;
goto patch_insn;
memset(targ_res, 0, sizeof(*targ_res));
targ_res->validate = false;
targ_res->poison = false;
targ_res->orig_val = local_spec->root_type_id;
targ_res->new_val = local_spec->root_type_id;
return 0;
}
/* libbpf doesn't support candidate search for anonymous types */
if (str_is_empty(spec_str)) {
if (str_is_empty(local_name)) {
pr_warn("prog '%s': relo #%d: <%s> (%d) relocation doesn't support anonymous types\n",
prog_name, relo_idx, core_relo_kind_str(relo->kind), relo->kind);
return -EOPNOTSUPP;
@@ -1242,17 +1330,15 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
err = bpf_core_spec_match(local_spec, cands->cands[i].btf,
cands->cands[i].id, cand_spec);
if (err < 0) {
pr_warn("prog '%s': relo #%d: error matching candidate #%d ",
prog_name, relo_idx, i);
bpf_core_dump_spec(prog_name, LIBBPF_WARN, cand_spec);
libbpf_print(LIBBPF_WARN, ": %d\n", err);
bpf_core_format_spec(spec_buf, sizeof(spec_buf), cand_spec);
pr_warn("prog '%s': relo #%d: error matching candidate #%d %s: %d\n ",
prog_name, relo_idx, i, spec_buf, err);
return err;
}
pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name,
relo_idx, err == 0 ? "non-matching" : "matching", i);
bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, cand_spec);
libbpf_print(LIBBPF_DEBUG, "\n");
bpf_core_format_spec(spec_buf, sizeof(spec_buf), cand_spec);
pr_debug("prog '%s': relo #%d: %s candidate #%d %s\n", prog_name,
relo_idx, err == 0 ? "non-matching" : "matching", i, spec_buf);
if (err == 0)
continue;
@@ -1262,7 +1348,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
return err;
if (j == 0) {
targ_res = cand_res;
*targ_res = cand_res;
*targ_spec = *cand_spec;
} else if (cand_spec->bit_offset != targ_spec->bit_offset) {
/* if there are many field relo candidates, they
@@ -1272,15 +1358,18 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
prog_name, relo_idx, cand_spec->bit_offset,
targ_spec->bit_offset);
return -EINVAL;
} else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) {
} else if (cand_res.poison != targ_res->poison ||
cand_res.new_val != targ_res->new_val) {
/* all candidates should result in the same relocation
* decision and value, otherwise it's dangerous to
* proceed due to ambiguity
*/
pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n",
pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %llu != %s %llu\n",
prog_name, relo_idx,
cand_res.poison ? "failure" : "success", cand_res.new_val,
targ_res.poison ? "failure" : "success", targ_res.new_val);
cand_res.poison ? "failure" : "success",
(unsigned long long)cand_res.new_val,
targ_res->poison ? "failure" : "success",
(unsigned long long)targ_res->new_val);
return -EINVAL;
}
@@ -1314,19 +1403,10 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
prog_name, relo_idx);
/* calculate single target relo result explicitly */
err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, &targ_res);
err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, targ_res);
if (err)
return err;
}
patch_insn:
/* bpf_core_patch_insn() should know how to handle missing targ_spec */
err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res);
if (err) {
pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n",
prog_name, relo_idx, relo->insn_off / 8, err);
return -EINVAL;
}
return 0;
}

View File

@@ -44,14 +44,52 @@ struct bpf_core_spec {
__u32 bit_offset;
};
int bpf_core_apply_relo_insn(const char *prog_name,
struct bpf_insn *insn, int insn_idx,
const struct bpf_core_relo *relo, int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands,
struct bpf_core_spec *specs_scratch);
struct bpf_core_relo_res {
/* expected value in the instruction, unless validate == false */
__u64 orig_val;
/* new value that needs to be patched up to */
__u64 new_val;
/* relocation unsuccessful, poison instruction, but don't fail load */
bool poison;
/* some relocations can't be validated against orig_val */
bool validate;
/* for field byte offset relocations or the forms:
* *(T *)(rX + <off>) = rY
* rX = *(T *)(rY + <off>),
* we remember original and resolved field size to adjust direct
* memory loads of pointers and integers; this is necessary for 32-bit
* host kernel architectures, but also allows to automatically
* relocate fields that were resized from, e.g., u32 to u64, etc.
*/
bool fail_memsz_adjust;
__u32 orig_sz;
__u32 orig_type_id;
__u32 new_sz;
__u32 new_type_id;
};
int __bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
const struct btf *targ_btf, __u32 targ_id, int level);
int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
const struct btf *targ_btf, __u32 targ_id);
size_t bpf_core_essential_name_len(const char *name);
int bpf_core_calc_relo_insn(const char *prog_name,
const struct bpf_core_relo *relo, int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands,
struct bpf_core_spec *specs_scratch,
struct bpf_core_relo_res *targ_res);
int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
int insn_idx, const struct bpf_core_relo *relo,
int relo_idx, const struct bpf_core_relo_res *res);
int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
const struct bpf_core_relo *relo,
struct bpf_core_spec *spec);
int bpf_core_format_spec(char *buf, size_t buf_sz, const struct bpf_core_spec *spec);
#endif

View File

@@ -3,9 +3,19 @@
#ifndef __SKEL_INTERNAL_H
#define __SKEL_INTERNAL_H
#ifdef __KERNEL__
#include <linux/fdtable.h>
#include <linux/mm.h>
#include <linux/mman.h>
#include <linux/slab.h>
#include <linux/bpf.h>
#else
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <stdlib.h>
#include "bpf.h"
#endif
#ifndef __NR_bpf
# if defined(__mips__) && defined(_ABIO32)
@@ -25,24 +35,23 @@
* requested during loader program generation.
*/
struct bpf_map_desc {
union {
/* input for the loader prog */
struct {
__aligned_u64 initial_value;
__u32 max_entries;
};
/* output of the loader prog */
struct {
int map_fd;
};
};
/* output of the loader prog */
int map_fd;
/* input for the loader prog */
__u32 max_entries;
__aligned_u64 initial_value;
};
struct bpf_prog_desc {
int prog_fd;
};
enum {
BPF_SKEL_KERNEL = (1ULL << 0),
};
struct bpf_loader_ctx {
size_t sz;
__u32 sz;
__u32 flags;
__u32 log_level;
__u32 log_size;
__u64 log_buf;
@@ -57,12 +66,144 @@ struct bpf_load_and_run_opts {
const char *errstr;
};
long bpf_sys_bpf(__u32 cmd, void *attr, __u32 attr_size);
static inline int skel_sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
unsigned int size)
{
#ifdef __KERNEL__
return bpf_sys_bpf(cmd, attr, size);
#else
return syscall(__NR_bpf, cmd, attr, size);
#endif
}
#ifdef __KERNEL__
static inline int close(int fd)
{
return close_fd(fd);
}
static inline void *skel_alloc(size_t size)
{
struct bpf_loader_ctx *ctx = kzalloc(size, GFP_KERNEL);
if (!ctx)
return NULL;
ctx->flags |= BPF_SKEL_KERNEL;
return ctx;
}
static inline void skel_free(const void *p)
{
kfree(p);
}
/* skel->bss/rodata maps are populated the following way:
*
* For kernel use:
* skel_prep_map_data() allocates kernel memory that kernel module can directly access.
* Generated lskel stores the pointer in skel->rodata and in skel->maps.rodata.initial_value.
* The loader program will perform probe_read_kernel() from maps.rodata.initial_value.
* skel_finalize_map_data() sets skel->rodata to point to actual value in a bpf map and
* does maps.rodata.initial_value = ~0ULL to signal skel_free_map_data() that kvfree
* is not nessary.
*
* For user space:
* skel_prep_map_data() mmaps anon memory into skel->rodata that can be accessed directly.
* Generated lskel stores the pointer in skel->rodata and in skel->maps.rodata.initial_value.
* The loader program will perform copy_from_user() from maps.rodata.initial_value.
* skel_finalize_map_data() remaps bpf array map value from the kernel memory into
* skel->rodata address.
*
* The "bpftool gen skeleton -L" command generates lskel.h that is suitable for
* both kernel and user space. The generated loader program does
* either bpf_probe_read_kernel() or bpf_copy_from_user() from initial_value
* depending on bpf_loader_ctx->flags.
*/
static inline void skel_free_map_data(void *p, __u64 addr, size_t sz)
{
if (addr != ~0ULL)
kvfree(p);
/* When addr == ~0ULL the 'p' points to
* ((struct bpf_array *)map)->value. See skel_finalize_map_data.
*/
}
static inline void *skel_prep_map_data(const void *val, size_t mmap_sz, size_t val_sz)
{
void *addr;
addr = kvmalloc(val_sz, GFP_KERNEL);
if (!addr)
return NULL;
memcpy(addr, val, val_sz);
return addr;
}
static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int flags, int fd)
{
struct bpf_map *map;
void *addr = NULL;
kvfree((void *) (long) *init_val);
*init_val = ~0ULL;
/* At this point bpf_load_and_run() finished without error and
* 'fd' is a valid bpf map FD. All sanity checks below should succeed.
*/
map = bpf_map_get(fd);
if (IS_ERR(map))
return NULL;
if (map->map_type != BPF_MAP_TYPE_ARRAY)
goto out;
addr = ((struct bpf_array *)map)->value;
/* the addr stays valid, since FD is not closed */
out:
bpf_map_put(map);
return addr;
}
#else
static inline void *skel_alloc(size_t size)
{
return calloc(1, size);
}
static inline void skel_free(void *p)
{
free(p);
}
static inline void skel_free_map_data(void *p, __u64 addr, size_t sz)
{
munmap(p, sz);
}
static inline void *skel_prep_map_data(const void *val, size_t mmap_sz, size_t val_sz)
{
void *addr;
addr = mmap(NULL, mmap_sz, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
if (addr == (void *) -1)
return NULL;
memcpy(addr, val, val_sz);
return addr;
}
static inline void *skel_finalize_map_data(__u64 *init_val, size_t mmap_sz, int flags, int fd)
{
void *addr;
addr = mmap((void *) (long) *init_val, mmap_sz, flags, MAP_SHARED | MAP_FIXED, fd, 0);
if (addr == (void *) -1)
return NULL;
return addr;
}
#endif
static inline int skel_closenz(int fd)
{
if (fd > 0)
@@ -136,22 +277,28 @@ static inline int skel_link_create(int prog_fd, int target_fd,
return skel_sys_bpf(BPF_LINK_CREATE, &attr, attr_sz);
}
#ifdef __KERNEL__
#define set_err
#else
#define set_err err = -errno
#endif
static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
{
int map_fd = -1, prog_fd = -1, key = 0, err;
union bpf_attr attr;
map_fd = skel_map_create(BPF_MAP_TYPE_ARRAY, "__loader.map", 4, opts->data_sz, 1);
err = map_fd = skel_map_create(BPF_MAP_TYPE_ARRAY, "__loader.map", 4, opts->data_sz, 1);
if (map_fd < 0) {
opts->errstr = "failed to create loader map";
err = -errno;
set_err;
goto out;
}
err = skel_map_update_elem(map_fd, &key, opts->data, 0);
if (err < 0) {
opts->errstr = "failed to update loader map";
err = -errno;
set_err;
goto out;
}
@@ -166,10 +313,10 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
attr.log_size = opts->ctx->log_size;
attr.log_buf = opts->ctx->log_buf;
attr.prog_flags = BPF_F_SLEEPABLE;
prog_fd = skel_sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
err = prog_fd = skel_sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
if (prog_fd < 0) {
opts->errstr = "failed to load loader prog";
err = -errno;
set_err;
goto out;
}
@@ -181,10 +328,12 @@ static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
if (err < 0 || (int)attr.test.retval < 0) {
opts->errstr = "failed to execute loader prog";
if (err < 0) {
err = -errno;
set_err;
} else {
err = (int)attr.test.retval;
#ifndef __KERNEL__
errno = -err;
#endif
}
goto out;
}

259
src/usdt.bpf.h Normal file
View File

@@ -0,0 +1,259 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
#ifndef __USDT_BPF_H__
#define __USDT_BPF_H__
#include <linux/errno.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
/* Below types and maps are internal implementation details of libbpf's USDT
* support and are subjects to change. Also, bpf_usdt_xxx() API helpers should
* be considered an unstable API as well and might be adjusted based on user
* feedback from using libbpf's USDT support in production.
*/
/* User can override BPF_USDT_MAX_SPEC_CNT to change default size of internal
* map that keeps track of USDT argument specifications. This might be
* necessary if there are a lot of USDT attachments.
*/
#ifndef BPF_USDT_MAX_SPEC_CNT
#define BPF_USDT_MAX_SPEC_CNT 256
#endif
/* User can override BPF_USDT_MAX_IP_CNT to change default size of internal
* map that keeps track of IP (memory address) mapping to USDT argument
* specification.
* Note, if kernel supports BPF cookies, this map is not used and could be
* resized all the way to 1 to save a bit of memory.
*/
#ifndef BPF_USDT_MAX_IP_CNT
#define BPF_USDT_MAX_IP_CNT (4 * BPF_USDT_MAX_SPEC_CNT)
#endif
/* We use BPF CO-RE to detect support for BPF cookie from BPF side. This is
* the only dependency on CO-RE, so if it's undesirable, user can override
* BPF_USDT_HAS_BPF_COOKIE to specify whether to BPF cookie is supported or not.
*/
#ifndef BPF_USDT_HAS_BPF_COOKIE
#define BPF_USDT_HAS_BPF_COOKIE \
bpf_core_enum_value_exists(enum bpf_func_id___usdt, BPF_FUNC_get_attach_cookie___usdt)
#endif
enum __bpf_usdt_arg_type {
BPF_USDT_ARG_CONST,
BPF_USDT_ARG_REG,
BPF_USDT_ARG_REG_DEREF,
};
struct __bpf_usdt_arg_spec {
/* u64 scalar interpreted depending on arg_type, see below */
__u64 val_off;
/* arg location case, see bpf_udst_arg() for details */
enum __bpf_usdt_arg_type arg_type;
/* offset of referenced register within struct pt_regs */
short reg_off;
/* whether arg should be interpreted as signed value */
bool arg_signed;
/* number of bits that need to be cleared and, optionally,
* sign-extended to cast arguments that are 1, 2, or 4 bytes
* long into final 8-byte u64/s64 value returned to user
*/
char arg_bitshift;
};
/* should match USDT_MAX_ARG_CNT in usdt.c exactly */
#define BPF_USDT_MAX_ARG_CNT 12
struct __bpf_usdt_spec {
struct __bpf_usdt_arg_spec args[BPF_USDT_MAX_ARG_CNT];
__u64 usdt_cookie;
short arg_cnt;
};
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, BPF_USDT_MAX_SPEC_CNT);
__type(key, int);
__type(value, struct __bpf_usdt_spec);
} __bpf_usdt_specs SEC(".maps") __weak;
struct {
__uint(type, BPF_MAP_TYPE_HASH);
__uint(max_entries, BPF_USDT_MAX_IP_CNT);
__type(key, long);
__type(value, __u32);
} __bpf_usdt_ip_to_spec_id SEC(".maps") __weak;
/* don't rely on user's BPF code to have latest definition of bpf_func_id */
enum bpf_func_id___usdt {
BPF_FUNC_get_attach_cookie___usdt = 0xBAD, /* value doesn't matter */
};
static __always_inline
int __bpf_usdt_spec_id(struct pt_regs *ctx)
{
if (!BPF_USDT_HAS_BPF_COOKIE) {
long ip = PT_REGS_IP(ctx);
int *spec_id_ptr;
spec_id_ptr = bpf_map_lookup_elem(&__bpf_usdt_ip_to_spec_id, &ip);
return spec_id_ptr ? *spec_id_ptr : -ESRCH;
}
return bpf_get_attach_cookie(ctx);
}
/* Return number of USDT arguments defined for currently traced USDT. */
__weak __hidden
int bpf_usdt_arg_cnt(struct pt_regs *ctx)
{
struct __bpf_usdt_spec *spec;
int spec_id;
spec_id = __bpf_usdt_spec_id(ctx);
if (spec_id < 0)
return -ESRCH;
spec = bpf_map_lookup_elem(&__bpf_usdt_specs, &spec_id);
if (!spec)
return -ESRCH;
return spec->arg_cnt;
}
/* Fetch USDT argument #*arg_num* (zero-indexed) and put its value into *res.
* Returns 0 on success; negative error, otherwise.
* On error *res is guaranteed to be set to zero.
*/
__weak __hidden
int bpf_usdt_arg(struct pt_regs *ctx, __u64 arg_num, long *res)
{
struct __bpf_usdt_spec *spec;
struct __bpf_usdt_arg_spec *arg_spec;
unsigned long val;
int err, spec_id;
*res = 0;
spec_id = __bpf_usdt_spec_id(ctx);
if (spec_id < 0)
return -ESRCH;
spec = bpf_map_lookup_elem(&__bpf_usdt_specs, &spec_id);
if (!spec)
return -ESRCH;
if (arg_num >= BPF_USDT_MAX_ARG_CNT || arg_num >= spec->arg_cnt)
return -ENOENT;
arg_spec = &spec->args[arg_num];
switch (arg_spec->arg_type) {
case BPF_USDT_ARG_CONST:
/* Arg is just a constant ("-4@$-9" in USDT arg spec).
* value is recorded in arg_spec->val_off directly.
*/
val = arg_spec->val_off;
break;
case BPF_USDT_ARG_REG:
/* Arg is in a register (e.g, "8@%rax" in USDT arg spec),
* so we read the contents of that register directly from
* struct pt_regs. To keep things simple user-space parts
* record offsetof(struct pt_regs, <regname>) in arg_spec->reg_off.
*/
err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
if (err)
return err;
break;
case BPF_USDT_ARG_REG_DEREF:
/* Arg is in memory addressed by register, plus some offset
* (e.g., "-4@-1204(%rbp)" in USDT arg spec). Register is
* identified like with BPF_USDT_ARG_REG case, and the offset
* is in arg_spec->val_off. We first fetch register contents
* from pt_regs, then do another user-space probe read to
* fetch argument value itself.
*/
err = bpf_probe_read_kernel(&val, sizeof(val), (void *)ctx + arg_spec->reg_off);
if (err)
return err;
err = bpf_probe_read_user(&val, sizeof(val), (void *)val + arg_spec->val_off);
if (err)
return err;
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
val >>= arg_spec->arg_bitshift;
#endif
break;
default:
return -EINVAL;
}
/* cast arg from 1, 2, or 4 bytes to final 8 byte size clearing
* necessary upper arg_bitshift bits, with sign extension if argument
* is signed
*/
val <<= arg_spec->arg_bitshift;
if (arg_spec->arg_signed)
val = ((long)val) >> arg_spec->arg_bitshift;
else
val = val >> arg_spec->arg_bitshift;
*res = val;
return 0;
}
/* Retrieve user-specified cookie value provided during attach as
* bpf_usdt_opts.usdt_cookie. This serves the same purpose as BPF cookie
* returned by bpf_get_attach_cookie(). Libbpf's support for USDT is itself
* utilizing BPF cookies internally, so user can't use BPF cookie directly
* for USDT programs and has to use bpf_usdt_cookie() API instead.
*/
__weak __hidden
long bpf_usdt_cookie(struct pt_regs *ctx)
{
struct __bpf_usdt_spec *spec;
int spec_id;
spec_id = __bpf_usdt_spec_id(ctx);
if (spec_id < 0)
return 0;
spec = bpf_map_lookup_elem(&__bpf_usdt_specs, &spec_id);
if (!spec)
return 0;
return spec->usdt_cookie;
}
/* we rely on ___bpf_apply() and ___bpf_narg() macros already defined in bpf_tracing.h */
#define ___bpf_usdt_args0() ctx
#define ___bpf_usdt_args1(x) ___bpf_usdt_args0(), ({ long _x; bpf_usdt_arg(ctx, 0, &_x); (void *)_x; })
#define ___bpf_usdt_args2(x, args...) ___bpf_usdt_args1(args), ({ long _x; bpf_usdt_arg(ctx, 1, &_x); (void *)_x; })
#define ___bpf_usdt_args3(x, args...) ___bpf_usdt_args2(args), ({ long _x; bpf_usdt_arg(ctx, 2, &_x); (void *)_x; })
#define ___bpf_usdt_args4(x, args...) ___bpf_usdt_args3(args), ({ long _x; bpf_usdt_arg(ctx, 3, &_x); (void *)_x; })
#define ___bpf_usdt_args5(x, args...) ___bpf_usdt_args4(args), ({ long _x; bpf_usdt_arg(ctx, 4, &_x); (void *)_x; })
#define ___bpf_usdt_args6(x, args...) ___bpf_usdt_args5(args), ({ long _x; bpf_usdt_arg(ctx, 5, &_x); (void *)_x; })
#define ___bpf_usdt_args7(x, args...) ___bpf_usdt_args6(args), ({ long _x; bpf_usdt_arg(ctx, 6, &_x); (void *)_x; })
#define ___bpf_usdt_args8(x, args...) ___bpf_usdt_args7(args), ({ long _x; bpf_usdt_arg(ctx, 7, &_x); (void *)_x; })
#define ___bpf_usdt_args9(x, args...) ___bpf_usdt_args8(args), ({ long _x; bpf_usdt_arg(ctx, 8, &_x); (void *)_x; })
#define ___bpf_usdt_args10(x, args...) ___bpf_usdt_args9(args), ({ long _x; bpf_usdt_arg(ctx, 9, &_x); (void *)_x; })
#define ___bpf_usdt_args11(x, args...) ___bpf_usdt_args10(args), ({ long _x; bpf_usdt_arg(ctx, 10, &_x); (void *)_x; })
#define ___bpf_usdt_args12(x, args...) ___bpf_usdt_args11(args), ({ long _x; bpf_usdt_arg(ctx, 11, &_x); (void *)_x; })
#define ___bpf_usdt_args(args...) ___bpf_apply(___bpf_usdt_args, ___bpf_narg(args))(args)
/*
* BPF_USDT serves the same purpose for USDT handlers as BPF_PROG for
* tp_btf/fentry/fexit BPF programs and BPF_KPROBE for kprobes.
* Original struct pt_regs * context is preserved as 'ctx' argument.
*/
#define BPF_USDT(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_usdt_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
#endif /* __USDT_BPF_H__ */

1521
src/usdt.c Normal file

File diff suppressed because it is too large Load Diff

1249
src/xsk.c

File diff suppressed because it is too large Load Diff

336
src/xsk.h
View File

@@ -1,336 +0,0 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* AF_XDP user-space access library.
*
* Copyright (c) 2018 - 2019 Intel Corporation.
* Copyright (c) 2019 Facebook
*
* Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
*/
#ifndef __LIBBPF_XSK_H
#define __LIBBPF_XSK_H
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <linux/if_xdp.h>
#include "libbpf.h"
#ifdef __cplusplus
extern "C" {
#endif
/* This whole API has been deprecated and moved to libxdp that can be found at
* https://github.com/xdp-project/xdp-tools. The APIs are exactly the same so
* it should just be linking with libxdp instead of libbpf for this set of
* functionality. If not, please submit a bug report on the aforementioned page.
*/
/* Load-Acquire Store-Release barriers used by the XDP socket
* library. The following macros should *NOT* be considered part of
* the xsk.h API, and is subject to change anytime.
*
* LIBRARY INTERNAL
*/
#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x)
#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v)
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile("" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile("" : : : "memory"); \
___p1; \
})
#elif defined(__aarch64__)
# define libbpf_smp_store_release(p, v) \
asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1; \
asm volatile ("ldar %w0, %1" \
: "=r" (___p1) : "Q" (*p) : "memory"); \
___p1; \
})
#elif defined(__riscv)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile ("fence rw,w" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile ("fence r,rw" : : : "memory"); \
___p1; \
})
#endif
#ifndef libbpf_smp_store_release
#define libbpf_smp_store_release(p, v) \
do { \
__sync_synchronize(); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
#endif
#ifndef libbpf_smp_load_acquire
#define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
__sync_synchronize(); \
___p1; \
})
#endif
/* LIBRARY INTERNAL -- END */
/* Do not access these members directly. Use the functions below. */
#define DEFINE_XSK_RING(name) \
struct name { \
__u32 cached_prod; \
__u32 cached_cons; \
__u32 mask; \
__u32 size; \
__u32 *producer; \
__u32 *consumer; \
void *ring; \
__u32 *flags; \
}
DEFINE_XSK_RING(xsk_ring_prod);
DEFINE_XSK_RING(xsk_ring_cons);
/* For a detailed explanation on the memory barriers associated with the
* ring, please take a look at net/xdp/xsk_queue.h.
*/
struct xsk_umem;
struct xsk_socket;
static inline __u64 *xsk_ring_prod__fill_addr(struct xsk_ring_prod *fill,
__u32 idx)
{
__u64 *addrs = (__u64 *)fill->ring;
return &addrs[idx & fill->mask];
}
static inline const __u64 *
xsk_ring_cons__comp_addr(const struct xsk_ring_cons *comp, __u32 idx)
{
const __u64 *addrs = (const __u64 *)comp->ring;
return &addrs[idx & comp->mask];
}
static inline struct xdp_desc *xsk_ring_prod__tx_desc(struct xsk_ring_prod *tx,
__u32 idx)
{
struct xdp_desc *descs = (struct xdp_desc *)tx->ring;
return &descs[idx & tx->mask];
}
static inline const struct xdp_desc *
xsk_ring_cons__rx_desc(const struct xsk_ring_cons *rx, __u32 idx)
{
const struct xdp_desc *descs = (const struct xdp_desc *)rx->ring;
return &descs[idx & rx->mask];
}
static inline int xsk_ring_prod__needs_wakeup(const struct xsk_ring_prod *r)
{
return *r->flags & XDP_RING_NEED_WAKEUP;
}
static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
{
__u32 free_entries = r->cached_cons - r->cached_prod;
if (free_entries >= nb)
return free_entries;
/* Refresh the local tail pointer.
* cached_cons is r->size bigger than the real consumer pointer so
* that this addition can be avoided in the more frequently
* executed code that computs free_entries in the beginning of
* this function. Without this optimization it whould have been
* free_entries = r->cached_prod - r->cached_cons + r->size.
*/
r->cached_cons = libbpf_smp_load_acquire(r->consumer);
r->cached_cons += r->size;
return r->cached_cons - r->cached_prod;
}
static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
{
__u32 entries = r->cached_prod - r->cached_cons;
if (entries == 0) {
r->cached_prod = libbpf_smp_load_acquire(r->producer);
entries = r->cached_prod - r->cached_cons;
}
return (entries > nb) ? nb : entries;
}
static inline __u32 xsk_ring_prod__reserve(struct xsk_ring_prod *prod, __u32 nb, __u32 *idx)
{
if (xsk_prod_nb_free(prod, nb) < nb)
return 0;
*idx = prod->cached_prod;
prod->cached_prod += nb;
return nb;
}
static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
{
/* Make sure everything has been written to the ring before indicating
* this to the kernel by writing the producer pointer.
*/
libbpf_smp_store_release(prod->producer, *prod->producer + nb);
}
static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
{
__u32 entries = xsk_cons_nb_avail(cons, nb);
if (entries > 0) {
*idx = cons->cached_cons;
cons->cached_cons += entries;
}
return entries;
}
static inline void xsk_ring_cons__cancel(struct xsk_ring_cons *cons, __u32 nb)
{
cons->cached_cons -= nb;
}
static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
{
/* Make sure data has been read before indicating we are done
* with the entries by updating the consumer pointer.
*/
libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
}
static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
{
return &((char *)umem_area)[addr];
}
static inline __u64 xsk_umem__extract_addr(__u64 addr)
{
return addr & XSK_UNALIGNED_BUF_ADDR_MASK;
}
static inline __u64 xsk_umem__extract_offset(__u64 addr)
{
return addr >> XSK_UNALIGNED_BUF_OFFSET_SHIFT;
}
static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
{
return xsk_umem__extract_addr(addr) + xsk_umem__extract_offset(addr);
}
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__fd(const struct xsk_umem *umem);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__fd(const struct xsk_socket *xsk);
#define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
#define XSK_RING_PROD__DEFAULT_NUM_DESCS 2048
#define XSK_UMEM__DEFAULT_FRAME_SHIFT 12 /* 4096 bytes */
#define XSK_UMEM__DEFAULT_FRAME_SIZE (1 << XSK_UMEM__DEFAULT_FRAME_SHIFT)
#define XSK_UMEM__DEFAULT_FRAME_HEADROOM 0
#define XSK_UMEM__DEFAULT_FLAGS 0
struct xsk_umem_config {
__u32 fill_size;
__u32 comp_size;
__u32 frame_size;
__u32 frame_headroom;
__u32 flags;
};
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
/* Flags for the libbpf_flags field. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
struct xsk_socket_config {
__u32 rx_size;
__u32 tx_size;
__u32 libbpf_flags;
__u32 xdp_flags;
__u16 bind_flags;
};
/* Set config to NULL to get the default configuration. */
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create_v0_0_2(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create_v0_0_4(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__create(struct xsk_socket **xsk,
const char *ifname, __u32 queue_id,
struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
const struct xsk_socket_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
__u32 queue_id, struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_socket_config *config);
/* Returns 0 for success and -EBUSY if the umem is still in use. */
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__delete(struct xsk_umem *umem);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
void xsk_socket__delete(struct xsk_socket *xsk);
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* __LIBBPF_XSK_H */

View File

@@ -7,7 +7,8 @@ ENV_VARS="${ENV_VARS:-}"
DOCKER_RUN="${DOCKER_RUN:-docker run}"
REPO_ROOT="${REPO_ROOT:-$PWD}"
ADDITIONAL_DEPS=(clang pkg-config gcc-10)
CFLAGS="-g -O2 -Werror -Wall"
EXTRA_CFLAGS=""
EXTRA_LDFLAGS=""
function info() {
echo -e "\033[33;1m$1\033[0m"
@@ -55,17 +56,17 @@ for phase in "${PHASES[@]}"; do
elif [[ "$phase" = *"GCC10"* ]]; then
ENV_VARS="-e CC=gcc-10 -e CXX=g++-10"
CC="gcc-10"
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
else
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
EXTRA_CFLAGS="${EXTRA_CFLAGS} -Wno-stringop-truncation"
fi
if [[ "$phase" = *"ASAN"* ]]; then
CFLAGS="${CFLAGS} -fsanitize=address,undefined"
EXTRA_CFLAGS="${EXTRA_CFLAGS} -fsanitize=address,undefined"
EXTRA_LDFLAGS="${EXTRA_LDFLAGS} -fsanitize=address,undefined"
fi
docker_exec mkdir build install
docker_exec ${CC} --version
info "build"
docker_exec make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec make -j$((4*$(nproc))) EXTRA_CFLAGS="${EXTRA_CFLAGS}" EXTRA_LDFLAGS="${EXTRA_LDFLAGS}" -C ./src -B OBJDIR=../build
info "ldd build/libbpf.so:"
docker_exec ldd build/libbpf.so
if ! docker_exec ldd build/libbpf.so | grep -q libelf; then
@@ -75,7 +76,7 @@ for phase in "${PHASES[@]}"; do
info "install"
docker_exec make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
info "link binary"
docker_exec bash -c "CFLAGS=\"${CFLAGS}\" ./travis-ci/managers/test_compile.sh"
docker_exec bash -c "EXTRA_CFLAGS=\"${EXTRA_CFLAGS}\" EXTRA_LDFLAGS=\"${EXTRA_LDFLAGS}\" ./travis-ci/managers/test_compile.sh"
;;
CLEANUP)
info "Cleanup phase"

View File

@@ -1,7 +1,8 @@
#!/bin/bash
set -euox pipefail
CFLAGS=${CFLAGS:-}
EXTRA_CFLAGS=${EXTRA_CFLAGS:-}
EXTRA_LDFLAGS=${EXTRA_LDFLAGS:-}
cat << EOF > main.c
#include <bpf/libbpf.h>
@@ -11,4 +12,4 @@ int main() {
EOF
# static linking
${CC:-cc} ${CFLAGS} -o main -I./install/usr/include main.c ./build/libbpf.a -lelf -lz
${CC:-cc} ${EXTRA_CFLAGS} ${EXTRA_LDFLAGS} -o main -I./include/uapi -I./install/usr/include main.c ./build/libbpf.a -lelf -lz

View File

@@ -10,14 +10,15 @@ source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined -Wno-stringop-truncation"
EXTRA_CFLAGS="-Werror -Wall -fsanitize=address,undefined"
EXTRA_LDFLAGS="-Werror -Wall -fsanitize=address,undefined"
mkdir build install
cc --version
make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
make -j$((4*$(nproc))) EXTRA_CFLAGS="${EXTRA_CFLAGS}" EXTRA_LDFLAGS="${EXTRA_LDFLAGS}" -C ./src -B OBJDIR=../build
ldd build/libbpf.so
if ! ldd build/libbpf.so | grep -q libelf; then
echo "FAIL: No reference to libelf.so in libbpf.so!"
exit 1
fi
make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
CFLAGS=${CFLAGS} $(dirname $0)/test_compile.sh
EXTRA_CFLAGS=${EXTRA_CFLAGS} EXTRA_LDFLAGS=${EXTRA_LDFLAGS} $(dirname $0)/test_compile.sh

View File

@@ -80,6 +80,7 @@ packages=(
# selftests test_progs dependencies.
binutils
elfutils
ethtool
glibc
iproute2
# selftests test_verifier dependencies.

View File

@@ -2,6 +2,11 @@
# This script builds a Debian root filesystem image for testing libbpf in a
# virtual machine. Requires debootstrap >= 1.0.95 and zstd.
# Use e.g. ./mkrootfs_debian.sh --arch=s390x to generate a rootfs for a
# foreign architecture. Requires configured binfmt_misc, e.g. using
# Debian/Ubuntu's qemu-user-binfmt package or
# https://github.com/multiarch/qemu-user-static.
set -e -u -x -o pipefail
# Check whether we are root now in order to avoid confusing errors later.
@@ -15,8 +20,20 @@ root=$(mktemp -d -p "$PWD")
trap 'rm -r "$root"' EXIT
# Install packages.
packages=binutils,busybox,elfutils,iproute2,libcap2,libelf1,strace,zlib1g
debootstrap --include="$packages" --variant=minbase bullseye "$root"
packages=(
binutils
busybox
elfutils
ethtool
iproute2
iptables
libcap2
libelf1
strace
zlib1g
)
packages=$(IFS=, && echo "${packages[*]}")
debootstrap --include="$packages" --variant=minbase "$@" bookworm "$root"
# Remove the init scripts (tests use their own). Also remove various
# unnecessary files in order to save space.
@@ -26,11 +43,6 @@ rm -rf \
"$root"/var/cache/apt/archives/* \
"$root"/var/lib/apt/lists/*
# Save some more space by removing coreutils - the tests use busybox. Before
# doing that, delete the buggy postrm script, which uses the rm command.
rm -f "$root/var/lib/dpkg/info/coreutils.postrm"
chroot "$root" dpkg --remove --force-remove-essential coreutils
# Apply common tweaks.
"$(dirname "$0")"/mkrootfs_tweak.sh "$root"

View File

@@ -1,16 +1,25 @@
# IBM Z self-hosted builder
libbpf CI uses an IBM-provided z15 self-hosted builder. There are no IBM Z
builds of GitHub Actions runner, and stable qemu-user has problems with .NET
builds of GitHub (GH) Actions runner, and stable qemu-user has problems with .NET
apps, so the builder runs the x86_64 runner version with qemu-user built from
the master branch.
We are currently supporting runners for the following repositories:
* libbpf/libbpf
* kernel-patches/bpf
* kernel-patches/vmtest
Below instructions are directly applicable to libbpf, and require minor
modifications for kernel-patches repos. Currently, qemu-user-static Docker
image is shared between all GitHub runners, but separate actions-runner-\*
service / Docker image is created for each runner type.
## Configuring the builder.
### Install prerequisites.
```
$ sudo dnf install docker # RHEL
$ sudo apt install -y docker.io # Ubuntu
```
@@ -35,6 +44,10 @@ for details.
### Autostart the x86_64 emulation support.
This step is important, you would not be able to build docker container
without having this service running. If container build fails, make sure
service is running properly.
```
$ sudo systemctl enable --now qemu-user-static
```
@@ -72,3 +85,23 @@ $ sudo systemctl stop actions-runner-libbpf
$ sudo docker rm -f actions-runner-libbpf
$ sudo docker volume rm actions-runner-libbpf
```
## Troubleshooting
In order to check if service is running, use the following command:
```
$ sudo systemctl status <service name>
```
In order to get logs for service:
```
$ journalctl -u <service name>
```
In order to check which containers are currently active:
```
$ sudo docker ps
```

View File

@@ -3,3 +3,4 @@ get_stack_raw_tp # spams with kernel warnings until next bpf -> bpf-next merg
stacktrace_build_id_nmi
stacktrace_build_id
task_fd_query_rawtp
varlen

View File

@@ -25,11 +25,14 @@ ksyms_module_libbpf # JIT does not support calling kernel f
ksyms_module_lskel # test_ksyms_module_lskel__open_and_load unexpected error: -9 (?)
modify_return # modify_return attach failed: -524 (trampoline)
module_attach # skel_attach skeleton attach failed: -524 (trampoline)
mptcp
kprobe_multi_test # relies on fentry
netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
probe_user # check_kprobe_res wrong kprobe res from probe read (?)
recursion # skel_attach unexpected error: -524 (trampoline)
ringbuf # skel_load skeleton load failed (?)
sk_assign # Can't read on server: Invalid argument (?)
sk_lookup # endianness problem
sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (trampoline)
skc_to_unix_sock # could not attach BPF object unexpected error: -524 (trampoline)
socket_cookie # prog_attach unexpected error: -524 (trampoline)
@@ -44,6 +47,7 @@ test_lsm # failed to find kernel BTF type ID of
test_overhead # attach_fentry unexpected error: -524 (trampoline)
test_profiler # unknown func bpf_probe_read_str#45 (overlapping)
timer # failed to auto-attach program 'test1': -524 (trampoline)
timer_crash # trampoline
timer_mim # failed to auto-attach program 'test1': -524 (trampoline)
trace_ext # failed to auto-attach program 'test_pkt_md_access_new': -524 (trampoline)
trace_printk # trace_printk__load unexpected error: -2 (errno 2) (?)
@@ -54,3 +58,10 @@ vmlinux # failed to auto-attach program 'handle
xdp_adjust_tail # case-128 err 0 errno 28 retval 1 size 128 expect-size 3520 (?)
xdp_bonding # failed to auto-attach program 'trace_on_entry': -524 (trampoline)
xdp_bpf2bpf # failed to auto-attach program 'trace_on_entry': -524 (trampoline)
map_kptr # failed to open_and_load program: -524 (trampoline)
bpf_cookie # failed to open_and_load program: -524 (trampoline)
xdp_do_redirect # prog_run_max_size unexpected error: -22 (errno 22)
send_signal # intermittently fails to receive signal
select_reuseport # intermittently fails on new s390x setup
xdp_synproxy # JIT does not support calling kernel function (kfunc)
unpriv_bpf_disabled # fentry

View File

@@ -493,7 +493,7 @@ CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_MODULE_SIG is not set
CONFIG_MODULE_SIG=y
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
@@ -764,7 +764,8 @@ CONFIG_IPV6_SEG6_BPF=y
# CONFIG_IPV6_RPL_LWTUNNEL is not set
# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
# CONFIG_NETLABEL is not set
# CONFIG_MPTCP is not set
CONFIG_MPTCP=y
CONFIG_MPTCP_IPV6=y
# CONFIG_NETWORK_SECMARK is not set
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
CONFIG_NETFILTER=y
@@ -784,6 +785,7 @@ CONFIG_NETFILTER_NETLINK=y
# CONFIG_NETFILTER_NETLINK_OSF is not set
CONFIG_NF_CONNTRACK=y
# CONFIG_NF_LOG_SYSLOG is not set
CONFIG_NETFILTER_SYNPROXY=y
CONFIG_NF_TABLES=y
# CONFIG_NF_TABLES_INET is not set
# CONFIG_NF_TABLES_NETDEV is not set
@@ -814,6 +816,7 @@ CONFIG_NETFILTER_XT_MARK=y
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
CONFIG_NETFILTER_XT_TARGET_CT=y
# CONFIG_NETFILTER_XT_TARGET_HMARK is not set
# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
# CONFIG_NETFILTER_XT_TARGET_LOG is not set
@@ -858,6 +861,7 @@ CONFIG_NETFILTER_XT_MATCH_BPF=y
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
CONFIG_NETFILTER_XT_MATCH_STATE=y
# CONFIG_NETFILTER_XT_MATCH_STATISTIC is not set
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
@@ -885,9 +889,10 @@ CONFIG_IP_NF_IPTABLES=y
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_TTL is not set
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_SYNPROXY=y
# CONFIG_IP_NF_TARGET_REJECT is not set
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_RAW=y
# CONFIG_IP_NF_SECURITY is not set
# CONFIG_IP_NF_ARPTABLES is not set
# end of IP: Netfilter Configuration
@@ -2586,6 +2591,7 @@ CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_FPROBE=y
CONFIG_FUNCTION_PROFILER=y
CONFIG_STACK_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set

View File

@@ -721,7 +721,7 @@ CONFIG_MODULE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_ASM_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_MODULE_SIG is not set
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_COMPRESS is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
# CONFIG_UNUSED_SYMBOLS is not set
@@ -961,7 +961,8 @@ CONFIG_IPV6_SEG6_LWTUNNEL=y
CONFIG_IPV6_SEG6_BPF=y
# CONFIG_IPV6_RPL_LWTUNNEL is not set
CONFIG_NETLABEL=y
# CONFIG_MPTCP is not set
CONFIG_MPTCP=y
CONFIG_MPTCP_IPV6=y
CONFIG_NETWORK_SECMARK=y
CONFIG_NET_PTP_CLASSIFY=y
# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
@@ -979,6 +980,7 @@ CONFIG_NETFILTER_NETLINK_LOG=y
# CONFIG_NETFILTER_NETLINK_OSF is not set
CONFIG_NF_CONNTRACK=y
# CONFIG_NF_LOG_NETDEV is not set
CONFIG_NETFILTER_SYNPROXY=y
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=y
@@ -992,6 +994,7 @@ CONFIG_NETFILTER_XTABLES=y
#
# CONFIG_NETFILTER_XT_TARGET_AUDIT is not set
# CONFIG_NETFILTER_XT_TARGET_CLASSIFY is not set
CONFIG_NETFILTER_XT_TARGET_CT=y
# CONFIG_NETFILTER_XT_TARGET_HMARK is not set
# CONFIG_NETFILTER_XT_TARGET_IDLETIMER is not set
# CONFIG_NETFILTER_XT_TARGET_LOG is not set
@@ -1037,6 +1040,7 @@ CONFIG_NETFILTER_XT_MATCH_BPF=y
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
# CONFIG_NETFILTER_XT_MATCH_SOCKET is not set
CONFIG_NETFILTER_XT_MATCH_STATE=y
CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
# CONFIG_NETFILTER_XT_MATCH_STRING is not set
# CONFIG_NETFILTER_XT_MATCH_TCPMSS is not set
@@ -1061,9 +1065,10 @@ CONFIG_IP_NF_IPTABLES=y
# CONFIG_IP_NF_MATCH_AH is not set
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_TTL is not set
# CONFIG_IP_NF_FILTER is not set
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_SYNPROXY=y
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_RAW=y
# CONFIG_IP_NF_SECURITY is not set
# CONFIG_IP_NF_ARPTABLES is not set
# end of IP: Netfilter Configuration
@@ -1211,7 +1216,7 @@ CONFIG_MPLS=y
# CONFIG_NET_NSH is not set
# CONFIG_HSR is not set
# CONFIG_NET_SWITCHDEV is not set
# CONFIG_NET_L3_MASTER_DEV is not set
CONFIG_NET_L3_MASTER_DEV=y
# CONFIG_QRTR is not set
# CONFIG_NET_NCSI is not set
CONFIG_RPS=y
@@ -1509,6 +1514,7 @@ CONFIG_TUN=y
CONFIG_VETH=y
CONFIG_VIRTIO_NET=y
# CONFIG_NLMON is not set
CONFIG_NET_VRF=y
# CONFIG_ARCNET is not set
#
@@ -2793,6 +2799,7 @@ CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_COMPRESSED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
# CONFIG_DEBUG_INFO_DWARF4 is not set
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_DEBUG_INFO_BTF=y
# CONFIG_GDB_SCRIPTS is not set
CONFIG_ENABLE_MUST_CHECK=y
@@ -2976,6 +2983,7 @@ CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_FPROBE=y
# CONFIG_FUNCTION_PROFILER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set

View File

@@ -1,4 +1,4 @@
attach_probe
# attach_probe
autoload
bpf_verif_scale
cgroup_attach_autodetach
@@ -10,7 +10,6 @@ core_reloc
core_retro
cpu_mask
endian
fexit_stress
get_branch_snapshot
get_stackid_cannot_attach
global_data
@@ -43,13 +42,13 @@ spinlock
stacktrace_map
stacktrace_map_raw_tp
static_linked
subprogs
task_fd_query_rawtp
task_fd_query_tp
tc_bpf
tcp_estats
tcp_rtt
tp_attach_query
usdt/urand_pid_attach
xdp
xdp_info
xdp_noinline