Compare commits

...

609 Commits

Author SHA1 Message Date
thiagoftsm
74c16e9a0c Merge branch 'libbpf:master' into master 2022-02-11 14:34:02 +00:00
Andrii Nakryiko
528094c0c1 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   227a0713b319e7a8605312dee1c97c97a719a9fc
Checkpoint bpf-next commit: dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca
Baseline bpf commit:        77b1b8b43ec3c060ecf7e926a92b0f8772171046
Checkpoint bpf commit:      fe68195daf34d5dddacd3f93dd3eafc4beca3a0e

Andrii Nakryiko (1):
  libbpf: Fix compilation warning due to mismatched printf format

Dan Carpenter (1):
  libbpf: Fix signedness bug in btf_dump_array_data()

Hengqi Chen (1):
  libbpf: Add BPF_KPROBE_SYSCALL macro

Ilya Leoshkevich (7):
  libbpf: Add PT_REGS_SYSCALL_REGS macro
  libbpf: Fix accessing syscall arguments on powerpc
  libbpf: Fix riscv register names
  libbpf: Fix accessing syscall arguments on riscv
  libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
  libbpf: Fix accessing the first syscall argument on arm64
  libbpf: Fix accessing the first syscall argument on s390

Mauricio Vásquez (1):
  libbpf: Remove mode check in libbpf_set_strict_mode()

 src/bpf_tracing.h | 85 +++++++++++++++++++++++++++++++++++++++++------
 src/btf_dump.c    |  6 ++--
 src/libbpf.c      |  8 -----
 3 files changed, 79 insertions(+), 20 deletions(-)

--
2.30.2
2022-02-09 09:48:32 -08:00
Andrii Nakryiko
37493e639f libbpf: Fix compilation warning due to mismatched printf format
On ppc64le architecture __s64 is long int and requires %ld. Cast to
ssize_t and use %zd to avoid architecture-specific specifiers.

Fixes: 4172843ed4a3 ("libbpf: Fix signedness bug in btf_dump_array_data()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220209063909.1268319-1-andrii@kernel.org
2022-02-09 09:48:32 -08:00
Hengqi Chen
b0a3b9e8fe libbpf: Add BPF_KPROBE_SYSCALL macro
Add syscall-specific variant of BPF_KPROBE named BPF_KPROBE_SYSCALL ([0]).
The new macro hides the underlying way of getting syscall input arguments.
With the new macro, the following code:

    SEC("kprobe/__x64_sys_close")
    int BPF_KPROBE(do_sys_close, struct pt_regs *regs)
    {
        int fd;

        fd = PT_REGS_PARM1_CORE(regs);
        /* do something with fd */
    }

can be written as:

    SEC("kprobe/__x64_sys_close")
    int BPF_KPROBE_SYSCALL(do_sys_close, int fd)
    {
        /* do something with fd */
    }

  [0] Closes: https://github.com/libbpf/libbpf/issues/425

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220207143134.2977852-2-hengqi.chen@gmail.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
f7e08b4a8f libbpf: Fix accessing the first syscall argument on s390
On s390, the first syscall argument should be accessed via orig_gpr2
(see arch/s390/include/asm/syscall.h). Currently gpr[2] is used
instead, leading to bpf_syscall_macro test failure.

orig_gpr2 cannot be added to user_pt_regs, since its layout is a part
of the ABI. Therefore provide access to it only through
PT_REGS_PARM1_CORE_SYSCALL() by using a struct pt_regs flavor.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-11-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
9f6e3a7a59 libbpf: Fix accessing the first syscall argument on arm64
On arm64, the first syscall argument should be accessed via orig_x0
(see arch/arm64/include/asm/syscall.h). Currently regs[0] is used
instead, leading to bpf_syscall_macro test failure.

orig_x0 cannot be added to struct user_pt_regs, since its layout is a
part of the ABI. Therefore provide access to it only through
PT_REGS_PARM1_CORE_SYSCALL() by using a struct pt_regs flavor.

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-10-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
f1a756d793 libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
arm64 and s390 need a special way to access the first syscall argument.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-9-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
32c19d8505 libbpf: Fix accessing syscall arguments on riscv
riscv does not select ARCH_HAS_SYSCALL_WRAPPER, so its syscall
handlers take "unpacked" syscall arguments. Indicate this to libbpf
using PT_REGS_SYSCALL_REGS macro.

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-7-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
497ec1d35c libbpf: Fix riscv register names
riscv registers are accessed via struct user_regs_struct, not struct
pt_regs. The program counter member in this struct is called pc, not
epc. The frame pointer is called s0, not fp.

Fixes: 3cc31d794097 ("libbpf: Normalize PT_REGS_xxx() macro definitions")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-6-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
8a28842a20 libbpf: Fix accessing syscall arguments on powerpc
powerpc does not select ARCH_HAS_SYSCALL_WRAPPER, so its syscall
handlers take "unpacked" syscall arguments. Indicate this to libbpf
using PT_REGS_SYSCALL_REGS macro.

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-5-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Ilya Leoshkevich
50b4d99bbc libbpf: Add PT_REGS_SYSCALL_REGS macro
Architectures that select ARCH_HAS_SYSCALL_WRAPPER pass a pointer to
struct pt_regs to syscall handlers, others unpack it into individual
function parameters. Introduce a macro to describe what a particular
arch does.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220209021745.2215452-3-iii@linux.ibm.com
2022-02-09 09:48:32 -08:00
Dan Carpenter
f5bd7054f9 libbpf: Fix signedness bug in btf_dump_array_data()
The btf__resolve_size() function returns negative error codes so
"elem_size" must be signed for the error handling to work.

Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220208071552.GB10495@kili
2022-02-09 09:48:32 -08:00
Mauricio Vásquez
1ceec54cb0 libbpf: Remove mode check in libbpf_set_strict_mode()
libbpf_set_strict_mode() checks that the passed mode doesn't contain
extra bits for LIBBPF_STRICT_* flags that don't exist yet.

It makes it difficult for applications to disable some strict flags as
something like "LIBBPF_STRICT_ALL & ~LIBBPF_STRICT_MAP_DEFINITIONS"
is rejected by this check and they have to use a rather complicated
formula to calculate it.[0]

One possibility is to change LIBBPF_STRICT_ALL to only contain the bits
of all existing LIBBPF_STRICT_* flags instead of 0xffffffff. However
it's not possible because the idea is that applications compiled against
older libbpf_legacy.h would still be opting into latest
LIBBPF_STRICT_ALL features.[1]

The other possibility is to remove that check so something like
"LIBBPF_STRICT_ALL & ~LIBBPF_STRICT_MAP_DEFINITIONS" is allowed. It's
what this commit does.

[0]: https://lore.kernel.org/bpf/20220204220435.301896-1-mauricio@kinvolk.io/
[1]: https://lore.kernel.org/bpf/CAEf4BzaTWa9fELJLh+bxnOb0P1EMQmaRbJVG0L+nXZdy0b8G3Q@mail.gmail.com/

Fixes: 93b8952d223a ("libbpf: deprecate legacy BPF map definitions")
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220207145052.124421-2-mauricio@kinvolk.io
2022-02-09 09:48:32 -08:00
Andrii Nakryiko
d6783c28b4 README: update Ubuntu package link
Point to libbpf package in Ubuntu Impish, so it doesn't expire soon.

Closes: https://github.com/libbpf/libbpf/issues/451
2022-02-07 09:37:49 -08:00
Andrii Nakryiko
cdef8257a8 ci: bump LLVM_VER to 15
Another bump of latest clang version to be used during BPF selftests build.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-02-04 15:28:06 -08:00
Andrii Nakryiko
fd181bc349 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b3dddab2ff10853aa3ef70483415d07fee3034ba
Checkpoint bpf-next commit: 227a0713b319e7a8605312dee1c97c97a719a9fc
Baseline bpf commit:        e2bcbd7769ee8f05e1b3d10848aace98973844e4
Checkpoint bpf commit:      77b1b8b43ec3c060ecf7e926a92b0f8772171046

Alexei Starovoitov (2):
  libbpf: Open code low level bpf commands.
  libbpf: Open code raw_tp_open and link_create commands.

Andrii Nakryiko (2):
  libbpf: Stop using deprecated bpf_map__is_offload_neutral()
  libbpf: Deprecate forgotten btf__get_map_kv_tids()

Dave Marchevsky (1):
  libbpf: Deprecate btf_ext rec_size APIs

Delyan Kratunov (2):
  libbpf: Deprecate bpf_prog_test_run_xattr and bpf_prog_test_run
  libbpf: Deprecate priv/set_priv storage

Jakub Sitnicki (1):
  selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads

Lorenzo Bianconi (1):
  libbpf: Deprecate xdp_cpumap, xdp_devmap and classifier sec
    definitions

 include/uapi/linux/bpf.h |  3 +-
 src/bpf.h                |  4 ++-
 src/btf.h                |  7 ++--
 src/libbpf.c             | 19 ++++++++---
 src/libbpf.h             |  7 +++-
 src/skel_internal.h      | 70 ++++++++++++++++++++++++++++++++++++++--
 6 files changed, 99 insertions(+), 11 deletions(-)

--
2.30.2
2022-02-04 10:27:02 -08:00
Andrii Nakryiko
47673bd255 libbpf: Deprecate forgotten btf__get_map_kv_tids()
btf__get_map_kv_tids() is in the same group of APIs as
btf_ext__reloc_func_info()/btf_ext__reloc_line_info() which were only
used by BCC. It was missed to be marked as deprecated in [0]. Fixing
that to complete [1].

  [0] https://patchwork.kernel.org/project/netdevbpf/patch/20220201014610.3522985-1-davemarchevsky@fb.com/
  [1] Closes: https://github.com/libbpf/libbpf/issues/277

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220203225017.1795946-1-andrii@kernel.org
2022-02-04 10:27:02 -08:00
Delyan Kratunov
20442822c0 libbpf: Deprecate priv/set_priv storage
Arbitrary storage via bpf_*__set_priv/__priv is being deprecated
without a replacement ([1]). perf uses this capability, but most of
that is going away with the removal of prologue generation ([2]).
perf is already suppressing deprecation warnings, so the remaining
cleanup will happen separately.

  [1]: Closes: https://github.com/libbpf/libbpf/issues/294
  [2]: https://lore.kernel.org/bpf/20220123221932.537060-1-jolsa@kernel.org/

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220203180032.1921580-1-delyank@fb.com
2022-02-04 10:27:02 -08:00
Andrii Nakryiko
a9cd83ae25 libbpf: Stop using deprecated bpf_map__is_offload_neutral()
Open-code bpf_map__is_offload_neutral() logic in one place in
to-be-deprecated bpf_prog_load_xattr2.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220202225916.3313522-2-andrii@kernel.org
2022-02-04 10:27:02 -08:00
Delyan Kratunov
352c13cdee libbpf: Deprecate bpf_prog_test_run_xattr and bpf_prog_test_run
Deprecate non-extendable bpf_prog_test_run{,_xattr} in favor of
OPTS-based bpf_prog_test_run_opts ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/286

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220202235423.1097270-5-delyank@fb.com
2022-02-04 10:27:02 -08:00
Alexei Starovoitov
a7a3a8811c libbpf: Open code raw_tp_open and link_create commands.
Open code raw_tracepoint_open and link_create used by light skeleton
to be able to avoid full libbpf eventually.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220131220528.98088-4-alexei.starovoitov@gmail.com
2022-02-04 10:27:02 -08:00
Alexei Starovoitov
bd402dccaf libbpf: Open code low level bpf commands.
Open code low level bpf commands used by light skeleton to
be able to avoid full libbpf eventually.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220131220528.98088-3-alexei.starovoitov@gmail.com
2022-02-04 10:27:02 -08:00
Lorenzo Bianconi
c245b0eeaf libbpf: Deprecate xdp_cpumap, xdp_devmap and classifier sec definitions
Deprecate xdp_cpumap, xdp_devmap and classifier sec definitions.
Introduce xdp/devmap and xdp/cpumap definitions according to the
standard for SEC("") in libbpf:
- prog_type.prog_flags/attach_place

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/5c7bd9426b3ce6a31d9a4b1f97eb299e1467fc52.1643727185.git.lorenzo@kernel.org
2022-02-04 10:27:02 -08:00
Dave Marchevsky
74dcd1bf6a libbpf: Deprecate btf_ext rec_size APIs
btf_ext__{func,line}_info_rec_size functions are used in conjunction
with already-deprecated btf_ext__reloc_{func,line}_info functions. Since
struct btf_ext is opaque to the user it was necessary to expose rec_size
getters in the past.

btf_ext__reloc_{func,line}_info were deprecated in commit 8505e8709b5ee
("libbpf: Implement generalized .BTF.ext func/line info adjustment")
as they're not compatible with support for multiple programs per
section. It was decided[0] that users of these APIs should implement their
own .btf.ext parsing to access this data, in which case the rec_size
getters are unnecessary. So deprecate them from libbpf 0.7.0 onwards.

  [0] Closes: https://github.com/libbpf/libbpf/issues/277

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220201014610.3522985-1-davemarchevsky@fb.com
2022-02-04 10:27:02 -08:00
Jakub Sitnicki
9f276b240b selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads
Add coverage to the verifier tests and tests for reading bpf_sock fields to
ensure that 32-bit, 16-bit, and 8-bit loads from dst_port field are allowed
only at intended offsets and produce expected values.

While 16-bit and 8-bit access to dst_port field is straight-forward, 32-bit
wide loads need be allowed and produce a zero-padded 16-bit value for
backward compatibility.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/r/20220130115518.213259-3-jakub@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-02-04 10:27:02 -08:00
Andrii Nakryiko
9a64065733 sync: move bpf-next checkpoint to include selftests fixes
There are no relevant libbpf commits, but new checkpoint commit contains
important BPF selftests commits fixing CI failures from kernel repo
side.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-31 23:37:33 -08:00
Evgeny Vereshchagin
ae220adbb2 ci: no longer remove elfutils while building the fuzzer
Without it coverage reports can't be built
```
[2022-01-31 00:05:36,094 DEBUG] Generating file view html index file as: "/out/report/linux/file_view_index.html".
Traceback (most recent call last):
  File "/opt/code_coverage/coverage_utils.py", line 829, in <module>
    sys.exit(Main())
  File "/opt/code_coverage/coverage_utils.py", line 823, in Main
    return _CmdPostProcess(args)
  File "/opt/code_coverage/coverage_utils.py", line 780, in _CmdPostProcess
    processor.PrepareHtmlReport()
  File "/opt/code_coverage/coverage_utils.py", line 577, in PrepareHtmlReport
    self.GenerateFileViewHtmlIndexFile(per_file_coverage_summary,
  File "/opt/code_coverage/coverage_utils.py", line 450, in GenerateFileViewHtmlIndexFile
    self.GetCoverageHtmlReportPathForFile(file_path),
  File "/opt/code_coverage/coverage_utils.py", line 422, in GetCoverageHtmlReportPathForFile
    assert os.path.isfile(
AssertionError: "/tmp/tmp.UYax4l19Gh/lib/system.h" is not a file.
```

It's a follow-up to 393a058d06

Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>
2022-01-31 15:45:11 -08:00
Andrii Nakryiko
fec0813359 sync: regenerate vmlinux.h to include TASK_COMM_LEN constant
TASK_COMM_LEN is now part of vmlinux.h on latest kernel. Regenerate
vmlinux.h to have it on 5.5 and 4.9 kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
1e702e8ffe sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   820e6e227c4053b6b631ae65ef1f65d560cb392b
Checkpoint bpf-next commit: c446fdacb10dcb3b9a9ed3b91d91e72d71d94b03
Baseline bpf commit:        baa59504c1cd0cca7d41954a45ee0b3dc78e41a0
Checkpoint bpf commit:      e2bcbd7769ee8f05e1b3d10848aace98973844e4

Andrii Nakryiko (3):
  libbpf: hide and discourage inconsistently named getters
  libbpf: deprecate bpf_map__resize()
  libbpf: deprecate bpf_program__is_<type>() and
    bpf_program__set_<type>() APIs

Christy Lee (2):
  libbpf: Mark bpf_object__open_buffer() API deprecated
  libbpf: Mark bpf_object__open_xattr() deprecated

Kajol Jain (1):
  perf: Add new macros for mem_hops field

Kenny Yu (2):
  bpf: Add bpf_copy_from_user_task() helper
  libbpf: Add "iter.s" section for sleepable bpf iterator programs

Kenta Tada (1):
  libbpf: Fix the incorrect register read for syscalls on x86_64

Lorenzo Bianconi (4):
  bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf
    program
  bpf: introduce bpf_xdp_get_buff_len helper
  libbpf: Add SEC name for xdp frags programs
  net: xdp: introduce bpf_xdp_pointer utility routine

 include/uapi/linux/bpf.h        | 41 +++++++++++++++++++++++++++++++++
 include/uapi/linux/perf_event.h |  5 +++-
 src/bpf_tracing.h               | 34 +++++++++++++++++++++++++++
 src/btf.h                       |  5 +---
 src/libbpf.c                    | 32 +++++++++++++++++--------
 src/libbpf.h                    | 34 ++++++++++++++++++++++++---
 src/libbpf.map                  |  2 ++
 src/libbpf_internal.h           |  3 +++
 src/libbpf_legacy.h             | 17 ++++++++++++++
 9 files changed, 155 insertions(+), 18 deletions(-)

--
2.30.2
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
bad4fa116c sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
b2a7b16287 libbpf: deprecate bpf_program__is_<type>() and bpf_program__set_<type>() APIs
Not sure why these APIs were added in the first place instead of
a completely generic (and not requiring constantly adding new APIs with
each new BPF program type) bpf_program__type() and
bpf_program__set_type() APIs. But as it is right now, there are 13 such
specialized is_type/set_type APIs, while latest kernel is already at 30+
BPF program types.

Instead of completing the set of APIs and keep chasing kernel's
bpf_prog_type enum, deprecate existing subset and recommend generic
bpf_program__type() and bpf_program__set_type() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124194254.2051434-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
638311dcb8 libbpf: deprecate bpf_map__resize()
Deprecated bpf_map__resize() in favor of bpf_map__set_max_entries()
setter. In addition to having a surprising name (users often don't
realize that they need to use bpf_map__resize()), the name also implies
some magic way of resizing BPF map after it is created, which is clearly
not the case.

Another minor annoyance is that bpf_map__resize() disallows 0 value for
max_entries, which in some cases is totally acceptable (e.g., like for
BPF perf buf case to let libbpf auto-create one buffer per each
available CPU core).

  [0] Closes: https://github.com/libbpf/libbpf/issues/304

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124194254.2051434-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
e72ac682ae libbpf: hide and discourage inconsistently named getters
Move a bunch of "getters" into libbpf_legacy.h to keep them there in
libbpf 1.0. See [0] for discussion of "Discouraged APIs". These getters
don't add any maintenance burden and are simple alias, but they are
inconsistent in naming. So keep them in libbpf_legacy.h instead of
libbpf.h to "hide" them in favor of preferred getters ([1]). Also add two
missing getters: bpf_program__type() and bpf_program__expected_attach_type().

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#handling-deprecation-of-apis-and-functionality
  [1] Closes: https://github.com/libbpf/libbpf/issues/307

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124194254.2051434-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Kenta Tada
d2818a8f2c libbpf: Fix the incorrect register read for syscalls on x86_64
Currently, rcx is read as the fourth parameter of syscall on x86_64.
But x86_64 Linux System Call convention uses r10 actually.
This commit adds the wrapper for users who want to access to
syscall params to analyze the user space.

Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220124141622.4378-3-Kenta.Tada@sony.com
2022-01-25 23:37:04 -08:00
Christy Lee
c0c3f46ca6 libbpf: Mark bpf_object__open_xattr() deprecated
Mark bpf_object__open_xattr() as deprecated, use
bpf_object__open_file() instead.

  [0] Closes: https://github.com/libbpf/libbpf/issues/287

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220125010917.679975-1-christylee@fb.com
2022-01-25 23:37:04 -08:00
Christy Lee
1af0f62fac libbpf: Mark bpf_object__open_buffer() API deprecated
Deprecate bpf_object__open_buffer() API in favor of the unified
opts-based bpf_object__open_mem() API.

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220125005923.418339-2-christylee@fb.com
2022-01-25 23:37:04 -08:00
Kenny Yu
04aef6ce9b libbpf: Add "iter.s" section for sleepable bpf iterator programs
This adds a new bpf section "iter.s" to allow bpf iterator programs to
be sleepable.

Signed-off-by: Kenny Yu <kennyyu@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124185403.468466-4-kennyyu@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Kenny Yu
a9491bb920 bpf: Add bpf_copy_from_user_task() helper
This adds a helper for bpf programs to read the memory of other
tasks.

As an example use case at Meta, we are using a bpf task iterator program
and this new helper to print C++ async stack traces for all threads of
a given process.

Signed-off-by: Kenny Yu <kennyyu@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124185403.468466-3-kennyyu@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Lorenzo Bianconi
fb0b8a7cea net: xdp: introduce bpf_xdp_pointer utility routine
Similar to skb_header_pointer, introduce bpf_xdp_pointer utility routine
to return a pointer to a given position in the xdp_buff if the requested
area (offset + len) is contained in a contiguous memory area otherwise it
will be copied in a bounce buffer provided by the caller.
Similar to the tc counterpart, introduce the two following xdp helpers:
- bpf_xdp_load_bytes
- bpf_xdp_store_bytes

Reviewed-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/ab285c1efdd5b7a9d361348b1e7d3ef49f6382b3.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Lorenzo Bianconi
4ae89237d4 libbpf: Add SEC name for xdp frags programs
Introduce support for the following SEC entries for XDP frags
property:
- SEC("xdp.frags")
- SEC("xdp.frags/devmap")
- SEC("xdp.frags/cpumap")

Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/af23b6e4841c171ad1af01917839b77847a4bc27.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Lorenzo Bianconi
9af0e19376 bpf: introduce bpf_xdp_get_buff_len helper
Introduce bpf_xdp_get_buff_len helper in order to return the xdp buffer
total size (linear and paged area)

Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/aac9ac3504c84026cf66a3c71b7c5ae89bc991be.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Lorenzo Bianconi
57df70e180 bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
Introduce BPF_F_XDP_HAS_FRAGS and the related field in bpf_prog_aux
in order to notify the driver the loaded program support xdp frags.

Acked-by: Toke Hoiland-Jorgensen <toke@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/db2e8075b7032a356003f407d1b0deb99adaa0ed.1642758637.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-25 23:37:04 -08:00
Kajol Jain
da06e3efcc perf: Add new macros for mem_hops field
Add new macros for mem_hops field which can be used to
represent remote-node, socket and board level details.

Currently the code had macro for HOPS_0, which corresponds
to data coming from another core but same node.
Add new macros for HOPS_1 to HOPS_3 to represent
remote-node, socket and board level data.

For ex: Encodings for mem_hops fields with L2 cache:

L2			- local L2
L2 | REMOTE | HOPS_0	- remote core, same node L2
L2 | REMOTE | HOPS_1	- remote node, same socket L2
L2 | REMOTE | HOPS_2	- remote socket, same board L2
L2 | REMOTE | HOPS_3	- remote board L2

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211206091749.87585-2-kjain@linux.ibm.com
2022-01-25 23:37:04 -08:00
Andrii Nakryiko
b4b6e4dc20 sync: start syncing perf_event.h UAPI header as well
This header is necessary for libbpf-sys to generate perf-related Rust
bindings. It's more convenient to have it available locally with libbpf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-25 23:37:04 -08:00
Ilya Leoshkevich
0ee425cdd7 ci: add Ubuntu instructions for s390x-self-hosted-builder
There is a new Ubuntu-based builder, and setup steps for it are
slightly different from what we had on the old RHEL-based one.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-01-25 17:21:08 -08:00
Ilya Leoshkevich
f1085fe3c3 ci: add s390x-self-hosted-builder's user to kvm group
RHEL's podman sets /dev/kvm permissions to 0666, while Ubuntu's docker
sets them to 0660. Therefore, in order to use KVM from a container,
the user within must belong to the kvm group.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2022-01-25 17:21:08 -08:00
Evgeny Vereshchagin
393a058d06 tests: move the fuzzer upstream
It should make it easier to start using CFLite or something like that
to fuzz libbpf without getting pointless CVEs :-) More importantly,
now it's possible to build the fuzzer by just cloning the repository,
installing clang and running `./scripts/build-fuzzers.h`:
```
git clone https://github.com/libbpf/libbpf
./scripts/build-fuzzers.h
unzip -d CORPUS fuzz/bpf-object-fuzzer_seed_corpus.zip
./out/bpf-object-fuzzer CORPUS
```

It should make it easier (for me at least) to report some
elfutils bugs because they are much easier to reproduce manually
now.
2022-01-24 15:37:36 -08:00
Andrii Nakryiko
3febb8a165 ci: update s390x blacklist
Add bpf_mod_race and bpf_nf selftests to blacklist for s390x.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
5daafdccf9 ci: regenerate and check in latest vmlinux.h
Add a bunch of new kernel types that are needed for successful selftest
compilation.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
78e816a15d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e80f2a0d194605553315de68284fc41969f81f62
Checkpoint bpf-next commit: 820e6e227c4053b6b631ae65ef1f65d560cb392b
Baseline bpf commit:        343e53754b21ae45530623222aa079fecd3cf942
Checkpoint bpf commit:      baa59504c1cd0cca7d41954a45ee0b3dc78e41a0

Andrii Nakryiko (2):
  libbpf: deprecate legacy BPF map definitions
  libbpf: streamline low-level XDP APIs

Kui-Feng Lee (1):
  libbpf: Improve btf__add_btf() with an additional hashmap for strings.

Toke Høiland-Jørgensen (1):
  libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation
    errors

Usama Arif (1):
  uapi/bpf: Add missing description and returns for helper documentation

YiFei Zhu (1):
  bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return
    value

 include/uapi/linux/bpf.h |  33 +++++++++++
 src/bpf_helpers.h        |   2 +-
 src/btf.c                |  31 ++++++++++-
 src/btf.h                |  22 +++++++-
 src/libbpf.c             |   8 +++
 src/libbpf.h             |  29 ++++++++++
 src/libbpf.map           |   4 ++
 src/libbpf_legacy.h      |   5 ++
 src/netlink.c            | 117 ++++++++++++++++++++++++++++-----------
 9 files changed, 215 insertions(+), 36 deletions(-)

--
2.30.2
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
d2ea0e2d03 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
8fbe7eec3a libbpf: streamline low-level XDP APIs
Introduce 4 new netlink-based XDP APIs for attaching, detaching, and
querying XDP programs:
  - bpf_xdp_attach;
  - bpf_xdp_detach;
  - bpf_xdp_query;
  - bpf_xdp_query_id.

These APIs replace bpf_set_link_xdp_fd, bpf_set_link_xdp_fd_opts,
bpf_get_link_xdp_id, and bpf_get_link_xdp_info APIs ([0]). The latter
don't follow a consistent naming pattern and some of them use
non-extensible approaches (e.g., struct xdp_link_info which can't be
modified without breaking libbpf ABI).

The approach I took with these low-level XDP APIs is similar to what we
did with low-level TC APIs. There is a nice duality of bpf_tc_attach vs
bpf_xdp_attach, and so on. I left bpf_xdp_attach() to support detaching
when -1 is specified for prog_fd for generality and convenience, but
bpf_xdp_detach() is preferred due to clearer naming and associated
semantics. Both bpf_xdp_attach() and bpf_xdp_detach() accept the same
opts struct allowing to specify expected old_prog_fd.

While doing the refactoring, I noticed that old APIs require users to
specify opts with old_fd == -1 to declare "don't care about already
attached XDP prog fd" condition. Otherwise, FD 0 is assumed, which is
essentially never an intended behavior. So I made this behavior
consistent with other kernel and libbpf APIs, in which zero FD means "no
FD". This seems to be more in line with the latest thinking in BPF land
and should cause less user confusion, hopefully.

For querying, I left two APIs, both more generic bpf_xdp_query()
allowing to query multiple IDs and attach mode, but also
a specialization of it, bpf_xdp_query_id(), which returns only requested
prog_id. Uses of prog_id returning bpf_get_link_xdp_id() were so
prevalent across selftests and samples, that it seemed a very common use
case and using bpf_xdp_query() for doing it felt very cumbersome with
a highly branches if/else chain based on flags and attach mode.

Old APIs are scheduled for deprecation in libbpf 0.8 release.

  [0] Closes: https://github.com/libbpf/libbpf/issues/309

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20220120061422.2710637-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
d788cd57b5 libbpf: deprecate legacy BPF map definitions
Enact deprecation of legacy BPF map definition in SEC("maps") ([0]). For
the definitions themselves introduce LIBBPF_STRICT_MAP_DEFINITIONS flag
for libbpf strict mode. If it is set, error out on any struct
bpf_map_def-based map definition. If not set, libbpf will print out
a warning for each legacy BPF map to raise awareness that it goes away.

For any use of BPF_ANNOTATE_KV_PAIR() macro providing a legacy way to
associate BTF key/value type information with legacy BPF map definition,
warn through libbpf's pr_warn() error message (but don't fail BPF object
open).

BPF-side struct bpf_map_def is marked as deprecated. User-space struct
bpf_map_def has to be used internally in libbpf, so it is left
untouched. It should be enough for bpf_map__def() to be marked
deprecated to raise awareness that it goes away.

bpftool is an interesting case that utilizes libbpf to open BPF ELF
object to generate skeleton. As such, even though bpftool itself uses
full on strict libbpf mode (LIBBPF_STRICT_ALL), it has to relax it a bit
for BPF map definition handling to minimize unnecessary disruptions. So
opt-out of LIBBPF_STRICT_MAP_DEFINITIONS for bpftool. User's code that
will later use generated skeleton will make its own decision whether to
enforce LIBBPF_STRICT_MAP_DEFINITIONS or not.

There are few tests in selftests/bpf that are consciously using legacy
BPF map definitions to test libbpf functionality. For those, temporary
opt out of LIBBPF_STRICT_MAP_DEFINITIONS mode for the duration of those
tests.

  [0] Closes: https://github.com/libbpf/libbpf/issues/272

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220120060529.1890907-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 16:54:32 -08:00
YiFei Zhu
ab022c8eb4 bpf: Add cgroup helpers bpf_{get,set}_retval to get/set syscall return value
The helpers continue to use int for retval because all the hooks
are int-returning rather than long-returning. The return value of
bpf_set_retval is int for future-proofing, in case in the future
there may be errors trying to set the retval.

After the previous patch, if a program rejects a syscall by
returning 0, an -EPERM will be generated no matter if the retval
is already set to -err. This patch change it being forced only if
retval is not -err. This is because we want to support, for
example, invoking bpf_set_retval(-EINVAL) and return 0, and have
the syscall return value be -EINVAL not -EPERM.

For BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY, the prior behavior is
that, if the return value is NET_XMIT_DROP, the packet is silently
dropped. We preserve this behavior for backward compatibility
reasons, so even if an errno is set, the errno does not return to
caller. However, setting a non-err to retval cannot propagate so
this is not allowed and we return a -EFAULT in that case.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/b4013fd5d16bed0b01977c1fafdeae12e1de61fb.1639619851.git.zhuyifei@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-01-21 16:54:32 -08:00
Kui-Feng Lee
75c7c722f5 libbpf: Improve btf__add_btf() with an additional hashmap for strings.
Add a hashmap to map the string offsets from a source btf to the
string offsets from a target btf to reduce overheads.

btf__add_btf() calls btf__add_str() to add strings from a source to a
target btf.  It causes many string comparisons, and it is a major
hotspot when adding a big btf.  btf__add_str() uses strcmp() to check
if a hash entry is the right one.  The extra hashmap here compares
offsets of strings, that are much cheaper.  It remembers the results
of btf__add_str() for later uses to reduce the cost.

We are parallelizing BTF encoding for pahole by creating separated btf
instances for worker threads.  These per-thread btf instances will be
added to the btf instance of the main thread by calling btf__add_str()
to deduplicate and write out.  With this patch and -j4, the running
time of pahole drops to about 6.0s from 6.6s.

The following lines are the summary of 'perf stat' w/o the change.

       6.668126396 seconds time elapsed

      13.451054000 seconds user
       0.715520000 seconds sys

The following lines are the summary w/ the change.

       5.986973919 seconds time elapsed

      12.939903000 seconds user
       0.724152000 seconds sys

V4 fixes a bug of error checking against the pointer returned by
hashmap__new().

[v3] https://lore.kernel.org/bpf/20220118232053.2113139-1-kuifeng@fb.com/
[v2] https://lore.kernel.org/bpf/20220114193713.461349-1-kuifeng@fb.com/

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220119180214.255634-1-kuifeng@fb.com
2022-01-21 16:54:32 -08:00
Usama Arif
0f99d97a9c uapi/bpf: Add missing description and returns for helper documentation
Both description and returns section will become mandatory
for helpers and syscalls in a later commit to generate man pages.

This commit also adds in the documentation that BPF_PROG_RUN is
an alias for BPF_PROG_TEST_RUN for anyone searching for the
syscall in the generated man pages.

Signed-off-by: Usama Arif <usama.arif@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220119114442.1452088-1-usama.arif@bytedance.com
2022-01-21 16:54:32 -08:00
Toke Høiland-Jørgensen
8404d1396c libbpf: Define BTF_KIND_* constants in btf.h to avoid compilation errors
The btf.h header included with libbpf contains inline helper functions to
check for various BTF kinds. These helpers directly reference the
BTF_KIND_* constants defined in the kernel header, and because the header
file is included in user applications, this happens in the user application
compile units.

This presents a problem if a user application is compiled on a system with
older kernel headers because the constants are not available. To avoid
this, add #defines of the constants directly in btf.h before using them.

Since the kernel header moved to an enum for BTF_KIND_*, the #defines can
shadow the enum values without any errors, so we only need #ifndef guards
for the constants that predates the conversion to enum. We group these so
there's only one guard for groups of values that were added together.

  [0] Closes: https://github.com/libbpf/libbpf/issues/436

Fixes: 223f903e9c83 ("bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG")
Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/bpf/20220118141327.34231-1-toke@redhat.com
2022-01-21 16:54:32 -08:00
Andrii Nakryiko
be89b28f96 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   44bab87d8ca6f0544a9f8fc97bdf33aa5b3c899e
Checkpoint bpf-next commit: e80f2a0d194605553315de68284fc41969f81f62
Baseline bpf commit:        d6d86830705f173fca6087a3e67ceaf68db80523
Checkpoint bpf commit:      343e53754b21ae45530623222aa079fecd3cf942

Christy Lee (2):
  libbpf: Rename bpf_prog_attach_xattr() to bpf_prog_attach_opts()
  libbpf: Deprecate bpf_map__def() API

Coco Li (1):
  gro: add ability to control gro max packet size

Mauricio Vásquez (1):
  libbpf: Use IS_ERR_OR_NULL() in hashmap__free()

Yafang Shao (1):
  libbpf: Fix possible NULL pointer dereference when destroying skeleton

 include/uapi/linux/if_link.h | 1 +
 src/bpf.c                    | 9 +++++++--
 src/bpf.h                    | 4 ++++
 src/hashmap.c                | 3 +--
 src/libbpf.c                 | 3 +++
 src/libbpf.h                 | 3 ++-
 src/libbpf.map               | 1 +
 7 files changed, 19 insertions(+), 5 deletions(-)

--
2.30.2
2022-01-14 22:08:26 -08:00
Christy Lee
7b8e97bffc libbpf: Deprecate bpf_map__def() API
All fields accessed via bpf_map_def can now be accessed via
appropirate getters and setters. Mark bpf_map__def() API as deprecated.

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220108004218.355761-6-christylee@fb.com
2022-01-14 22:08:26 -08:00
Yafang Shao
5b6dfd7f6b libbpf: Fix possible NULL pointer dereference when destroying skeleton
When I checked the code in skeleton header file generated with my own
bpf prog, I found there may be possible NULL pointer dereference when
destroying skeleton. Then I checked the in-tree bpf progs, finding that is
a common issue. Let's take the generated samples/bpf/xdp_redirect_cpu.skel.h
for example. Below is the generated code in
xdp_redirect_cpu__create_skeleton():

	xdp_redirect_cpu__create_skeleton
		struct bpf_object_skeleton *s;
		s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s));
		if (!s)
			goto error;
		...
	error:
		bpf_object__destroy_skeleton(s);
		return  -ENOMEM;

After goto error, the NULL 's' will be deferenced in
bpf_object__destroy_skeleton().

We can simply fix this issue by just adding a NULL check in
bpf_object__destroy_skeleton().

Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220108134739.32541-1-laoar.shao@gmail.com
2022-01-14 22:08:26 -08:00
Christy Lee
e0de05d1b1 libbpf: Rename bpf_prog_attach_xattr() to bpf_prog_attach_opts()
All xattr APIs are being dropped, let's converge to the convention used in
high-level APIs and rename bpf_prog_attach_xattr to bpf_prog_attach_opts.

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220107184604.3668544-2-christylee@fb.com
2022-01-14 22:08:26 -08:00
Mauricio Vásquez
d5daf275c7 libbpf: Use IS_ERR_OR_NULL() in hashmap__free()
hashmap__new() uses ERR_PTR() to return an error so it's better to
use IS_ERR_OR_NULL() in order to check the pointer before calling
free(). This will prevent freeing an invalid pointer if somebody calls
hashmap__free() with the result of a failed hashmap__new() call.

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220107152620.192327-1-mauricio@kinvolk.io
2022-01-14 22:08:26 -08:00
Coco Li
cde5b418dd gro: add ability to control gro max packet size
Eric Dumazet suggested to allow users to modify max GRO packet size.

We have seen GRO being disabled by users of appliances (such as
wifi access points) because of claimed bufferbloat issues,
or some work arounds in sch_cake, to split GRO/GSO packets.

Instead of disabling GRO completely, one can chose to limit
the maximum packet size of GRO packets, depending on their
latency constraints.

This patch adds a per device gro_max_size attribute
that can be changed with ip link command.

ip link set dev eth0 gro_max_size 16000

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-14 22:08:26 -08:00
Adam Jensen
060c8a99c4 include: Include linux/stddef.h
This fixes the build in environments such as Alpine Linux.
See [0] for discussion.

  [0] https://github.com/libbpf/libbpf/pull/41

Signed-off-by: Adam Jensen
2022-01-14 12:21:53 -08:00
Kumar Kartikeya Dwivedi
22411acc4b ci: Add userfaultfd kernel config
Add necessary kernel config values to run BPF mod race test in
conntrack-bpf series.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
2022-01-11 12:40:59 -08:00
Andrii Nakryiko
e99f34e144 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   ecf45e60a62dfeb65658abac02f0bdb45b786911
Checkpoint bpf-next commit: 44bab87d8ca6f0544a9f8fc97bdf33aa5b3c899e
Baseline bpf commit:        819d11507f6637731947836e6308f5966d64cf9d
Checkpoint bpf commit:      d6d86830705f173fca6087a3e67ceaf68db80523

Andrii Nakryiko (3):
  libbpf: Normalize PT_REGS_xxx() macro definitions
  libbpf: Use 100-character limit to make bpf_tracing.h easier to read
  libbpf: Improve LINUX_VERSION_CODE detection

Christy Lee (3):
  libbpf: Deprecate bpf_perf_event_read_simple() API
  libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
  libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API

Grant Seltzer (1):
  libbpf: Add documentation for bpf_map batch operations

Qiang Wang (2):
  libbpf: Use probe_name for legacy kprobe
  libbpf: Support repeated legacy kprobes on same function

 src/bpf.c             |   8 +-
 src/bpf.h             | 115 ++++++++++-
 src/bpf_tracing.h     | 431 +++++++++++++++++-------------------------
 src/libbpf.c          |  56 ++++--
 src/libbpf.h          |   5 +-
 src/libbpf_internal.h |   2 +
 src/libbpf_probes.c   |  16 --
 7 files changed, 342 insertions(+), 291 deletions(-)

--
2.30.2
2022-01-06 16:20:54 -08:00
Grant Seltzer
4449d71509 libbpf: Add documentation for bpf_map batch operations
This adds documention for:

- bpf_map_delete_batch()
- bpf_map_lookup_batch()
- bpf_map_lookup_and_delete_batch()
- bpf_map_update_batch()

This also updates the public API for the `keys` parameter
of `bpf_map_delete_batch()`, and both the
`keys` and `values` parameters of `bpf_map_update_batch()`
to be constants.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220106201304.112675-1-grantseltzer@gmail.com
2022-01-06 16:20:54 -08:00
Christy Lee
12932191c6 libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
API created with simplistic assumptions about BPF map definitions.
It hasn’t worked for a while, deprecate it in preparation for
libbpf 1.0.

  [0] Closes: https://github.com/libbpf/libbpf/issues/302

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105003120.2222673-1-christylee@fb.com
2022-01-06 16:20:54 -08:00
Christy Lee
8440112546 libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
Deprecate bpf_map__is_offload_neutral(). It’s most probably broken
already. PERF_EVENT_ARRAY isn’t the only map that’s not suitable
for hardware offloading. Applications can directly check map type
instead.

  [0] Closes: https://github.com/libbpf/libbpf/issues/306

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220105000601.2090044-1-christylee@fb.com
2022-01-06 16:20:54 -08:00
Qiang Wang
b0c3d7133f libbpf: Support repeated legacy kprobes on same function
If repeated legacy kprobes on same function in one process,
libbpf will register using the same probe name and got -EBUSY
error. So append index to the probe name format to fix this
problem.

Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211227130713.66933-2-wangqiang.wq.frank@bytedance.com
2022-01-06 16:20:54 -08:00
Qiang Wang
c2f2c26cb2 libbpf: Use probe_name for legacy kprobe
Fix a bug in commit 46ed5fc33db9, which wrongly used the
func_name instead of probe_name to register legacy kprobe.

Fixes: 46ed5fc33db9 ("libbpf: Refactor and simplify legacy kprobe code")
Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Qiang Wang <wangqiang.wq.frank@bytedance.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211227130713.66933-1-wangqiang.wq.frank@bytedance.com
2022-01-06 16:20:54 -08:00
Christy Lee
2e52e09bc2 libbpf: Deprecate bpf_perf_event_read_simple() API
With perf_buffer__poll() and perf_buffer__consume() APIs available,
there is no reason to expose bpf_perf_event_read_simple() API to
users. If users need custom perf buffer, they could re-implement
the function.

Mark bpf_perf_event_read_simple() and move the logic to a new
static function so it can still be called by other functions in the
same file.

  [0] Closes: https://github.com/libbpf/libbpf/issues/310

Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211229204156.13569-1-christylee@fb.com
2022-01-06 16:20:54 -08:00
Andrii Nakryiko
0171976dc5 libbpf: Improve LINUX_VERSION_CODE detection
Ubuntu reports incorrect kernel version through uname(), which on older
kernels leads to kprobe BPF programs failing to load due to the version
check mismatch.

Accommodate Ubuntu's quirks with LINUX_VERSION_CODE by using
Ubuntu-specific /proc/version_code to fetch major/minor/patch versions
to form LINUX_VERSION_CODE.

While at it, consolide libbpf's kernel version detection code between
libbpf.c and libbpf_probes.c.

  [0] Closes: https://github.com/libbpf/libbpf/issues/421

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211222231003.2334940-1-andrii@kernel.org
2022-01-06 16:20:54 -08:00
Andrii Nakryiko
3f592a59d7 libbpf: Use 100-character limit to make bpf_tracing.h easier to read
Improve bpf_tracing.h's macro definition readability by keeping them
single-line and better aligned. This makes it easier to follow all those
variadic patterns.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211222213924.1869758-2-andrii@kernel.org
2022-01-06 16:20:54 -08:00
Andrii Nakryiko
0557ad0a9c libbpf: Normalize PT_REGS_xxx() macro definitions
Refactor PT_REGS macros definitions in  bpf_tracing.h to avoid excessive
duplication. We currently have classic PT_REGS_xxx() and CO-RE-enabled
PT_REGS_xxx_CORE(). We are about to add also _SYSCALL variants, which
would require excessive copying of all the per-architecture definitions.

Instead, separate architecture-specific field/register names from the
final macro that utilize them. That way for upcoming _SYSCALL variants
we'll be able to just define x86_64 exception and otherwise have one
common set of _SYSCALL macro definitions common for all architectures.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20211222213924.1869758-1-andrii@kernel.org
2022-01-06 16:20:54 -08:00
Kumar Kartikeya Dwivedi
7c382f0df9 ci: Add conntrack kernel config
Add necessary kernel config values to test BPF kfunc functionality for
netfilter's conntrack subsystem.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
2022-01-05 12:58:11 -08:00
Chris Tarazi
ceba6a788a travis-ci/rootfs: Fix mount(8) invocation for Arch Linux
Given that the rootfs for Arch Linux uses the busybox variant of
mount(8), the `-l` doesn't exist on that binary and gives the following
error msg with version v1.34.1 of busybox when invoking
mkrootfs_arch.sh:

```
...
[    0.781471] random: fast init done
starting pid 72, tty '': '/etc/init.d/rcS'
+ for path in /etc/rcS.d/S*
+ '[' -x /etc/rcS.d/S10-mount ']'
+ /etc/rcS.d/S10-mount
+ /bin/mount proc /proc -t proc
++ /bin/mount -l -t devtmpfs
/bin/mount: unrecognized option: l
...
```

This prevented me from generating a rootfs. This is fixed by removing
the `-l`, as plainly invoking `mount -t devtmpfs` returns the same
output with `mount -l ...` on the non-busybox variant (on Arch Linux, it
comes from the `utils-linux` package). After this change, I was able to
run `./tools/testing/selftests/bpf/vmtest.sh -i` (from the kernel src;
with modification to the script to pick up my locally-generated .zstd of
the new rootfs) and it worked.

Signed-off-by: Chris Tarazi <tarazichris@gmail.com>
2022-01-05 12:57:27 -08:00
grantseltzer
bf7aacea49 Fix comparison operator in API documentation 2022-01-04 21:17:07 -08:00
Andrii Nakryiko
af2da673d8 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   e967a20a8fabc6442a78e2e2059e63a4bb6aed08
Checkpoint bpf-next commit: ecf45e60a62dfeb65658abac02f0bdb45b786911
Baseline bpf commit:        819d11507f6637731947836e6308f5966d64cf9d
Checkpoint bpf commit:      819d11507f6637731947836e6308f5966d64cf9d

Jiri Olsa (1):
  libbpf: Do not use btf_dump__new() macro in C++ mode

 src/btf.h | 6 ++++++
 1 file changed, 6 insertions(+)

--
2.30.2
2021-12-23 20:00:21 -08:00
Jiri Olsa
1321a8bb49 libbpf: Do not use btf_dump__new() macro in C++ mode
As reported in here [0], C++ compilers don't support
__builtin_types_compatible_p(), so at least don't screw up compilation
for them and let C++ users pick btf_dump__new vs
btf_dump__new_deprecated explicitly.

  [0] https://github.com/libbpf/libbpf/issues/283#issuecomment-986100727

Fixes: 6084f5dc928f ("libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211223131736.483956-1-jolsa@kernel.org
2021-12-23 14:17:35 +01:00
grantseltzer
287d0d097b Add documentation for error checking in API
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2021-12-23 19:59:03 -08:00
Yucong Sun
9fab7c81ec ci: Add a step to patch kernel with temporary fixes
Apply a custom set of patches against bpf-next kernel tree before
building vmlinux image.
2021-12-20 20:31:42 -08:00
Andrii Nakryiko
96268bf0c2 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   a34efe503bc55c5732e328e5191ad549eb899f31
Checkpoint bpf-next commit: e967a20a8fabc6442a78e2e2059e63a4bb6aed08
Baseline bpf commit:        f7abc4c8df8c7930d0b9c56d9abee9a1fca635e9
Checkpoint bpf commit:      819d11507f6637731947836e6308f5966d64cf9d

Andrii Nakryiko (2):
  libbpf: Avoid reading past ELF data section end when copying license
  libbpf: Rework feature-probing APIs

 src/libbpf.c        |   5 +-
 src/libbpf.h        |  52 +++++++++-
 src/libbpf.map      |   3 +
 src/libbpf_probes.c | 235 ++++++++++++++++++++++++++++++++++----------
 4 files changed, 240 insertions(+), 55 deletions(-)

--
2.30.2
2021-12-17 17:13:53 -08:00
Andrii Nakryiko
168cf9b8ae libbpf: Rework feature-probing APIs
Create three extensible alternatives to inconsistently named
feature-probing APIs:

  - libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type();
  - libbpf_probe_bpf_map_type() instead of bpf_probe_map_type();
  - libbpf_probe_bpf_helper() instead of bpf_probe_helper().

Set up return values such that libbpf can report errors (e.g., if some
combination of input arguments isn't possible to validate, etc), in
addition to whether the feature is supported (return value 1) or not
supported (return value 0).

Also schedule deprecation of those three APIs. Also schedule deprecation
of bpf_probe_large_insn_limit().

Also fix all the existing detection logic for various program and map
types that never worked:

  - BPF_PROG_TYPE_LIRC_MODE2;
  - BPF_PROG_TYPE_TRACING;
  - BPF_PROG_TYPE_LSM;
  - BPF_PROG_TYPE_EXT;
  - BPF_PROG_TYPE_SYSCALL;
  - BPF_PROG_TYPE_STRUCT_OPS;
  - BPF_MAP_TYPE_STRUCT_OPS;
  - BPF_MAP_TYPE_BLOOM_FILTER.

Above prog/map types needed special setups and detection logic to work.
Subsequent patch adds selftests that will make sure that all the
detection logic keeps working for all current and future program and map
types, avoiding otherwise inevitable bit rot.

  [0] Closes: https://github.com/libbpf/libbpf/issues/312

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Julia Kartseva <hex@fb.com>
Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org
2021-12-17 17:13:53 -08:00
Andrii Nakryiko
8e706ddc6c libbpf: Avoid reading past ELF data section end when copying license
Fix possible read beyond ELF "license" data section if the license
string is not properly zero-terminated. Use the fact that libbpf_strlcpy
never accesses the (N-1)st byte of the source string because it's
replaced with '\0' anyways.

If this happens, it's a violation of contract between libbpf and a user,
but not handling this more robustly upsets CIFuzz, so given the fix is
trivial, let's fix the potential issue.

Fixes: 9fc205b413b3 ("libbpf: Add sane strncpy alternative and use it internally")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211214232054.3458774-1-andrii@kernel.org
2021-12-17 17:13:53 -08:00
Andrii Nakryiko
dc49f2d07b ci: add LIRC kernel config
Add necessary kernel config values to make BPF_PROG_TYPE_LIRC_MODE2
programs work.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-16 16:44:52 -08:00
Andrii Nakryiko
19656636a9 vmtest: blacklist bpf_loop and get_func_args_test for s390x
bpf_loop is using arch-specific sys_nanosleep attach function.
get_func_args_test relies on BPF trampoline.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
61acde2308 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   229fae38d0fc0d6ff58d57cbeb1432da55e58d4f
Checkpoint bpf-next commit: a34efe503bc55c5732e328e5191ad549eb899f31
Baseline bpf commit:        0be2516f865f5a876837184a8385163ff64a5889
Checkpoint bpf commit:      f7abc4c8df8c7930d0b9c56d9abee9a1fca635e9

Alexei Starovoitov (1):
  libbpf: Fix gen_loader assumption on number of programs.

Andrii Nakryiko (4):
  libbpf: Don't validate TYPE_ID relo's original imm value
  libbpf: Fix potential uninit memory read
  libbpf: Add sane strncpy alternative and use it internally
  libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF

Grant Seltzer (1):
  libbpf: Add doc comments for bpf_program__(un)pin()

Hangbin Liu (1):
  Bonding: add arp_missed_max option

Hou Tao (1):
  bpf: Add bpf_strncmp helper

Jiri Olsa (1):
  bpf: Add get_func_[arg|ret|arg_cnt] helpers

Kui-Feng Lee (1):
  libbpf: Mark bpf_object__find_program_by_title API deprecated.

 include/uapi/linux/bpf.h     | 39 +++++++++++++++++
 include/uapi/linux/if_link.h |  1 +
 src/bpf.c                    | 85 +++++++++++++++++++++++++++++++++++-
 src/bpf.h                    |  2 +
 src/btf_dump.c               |  4 +-
 src/gen_loader.c             | 12 +++--
 src/libbpf.c                 | 55 +++++------------------
 src/libbpf.h                 | 25 +++++++++++
 src/libbpf.map               |  1 +
 src/libbpf_internal.h        | 58 ++++++++++++++++++++++++
 src/libbpf_legacy.h          | 12 ++++-
 src/relo_core.c              | 20 ++++++---
 src/xsk.c                    |  9 ++--
 13 files changed, 260 insertions(+), 63 deletions(-)

--
2.30.2
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
266e897ad2 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-12-14 17:06:30 -08:00
Kui-Feng Lee
7152ecf163 libbpf: Mark bpf_object__find_program_by_title API deprecated.
Deprecate this API since v0.7.  All callers should move to
bpf_object__find_program_by_name if possible, otherwise use
bpf_object__for_each_program to find a program out from a given
section.

[0] Closes: https://github.com/libbpf/libbpf/issues/292

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211214035931.1148209-5-kuifeng@fb.com
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
216eaa760e libbpf: Auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF
The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
one of the first extremely frustrating gotchas that all new BPF users go
through and in some cases have to learn it a very hard way.

Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
dropped the dependency on memlock and uses memcg-based memory accounting
instead. Unfortunately, detecting memcg-based BPF memory accounting is
far from trivial (as can be evidenced by this patch), so in practice
most BPF applications still do unconditional RLIMIT_MEMLOCK increase.

As we move towards libbpf 1.0, it would be good to allow users to forget
about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
automatically. This patch paves the way forward in this matter. Libbpf
will do feature detection of memcg-based accounting, and if detected,
will do nothing. But if the kernel is too old, just like BCC, libbpf
will automatically increase RLIMIT_MEMLOCK on behalf of user
application ([0]).

As this is technically a breaking change, during the transition period
applications have to opt into libbpf 1.0 mode by setting
LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
libbpf_set_strict_mode().

Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
called before the first bpf_prog_load(), bpf_btf_load(), or
bpf_object__load() call, otherwise it has no effect and will return
-EBUSY.

  [0] Closes: https://github.com/libbpf/libbpf/issues/369

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214195904.1785155-2-andrii@kernel.org
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
a4e725f8f5 libbpf: Add sane strncpy alternative and use it internally
strncpy() has a notoriously error-prone semantics which makes GCC
complain about it a lot (and quite often completely completely falsely
at that). Instead of pleasing GCC all the time (-Wno-stringop-truncation
is unfortunately only supported by GCC, so it's a bit too messy to just
enable it in Makefile), add libbpf-internal libbpf_strlcpy() helper
which follows what FreeBSD's strlcpy() does and what most people would
expect from strncpy(): copies up to N-1 first bytes from source string
into destination string and ensures zero-termination afterwards.

Replace all the relevant uses of strncpy/strncat/memcpy in libbpf with
libbpf_strlcpy().

This also fixes the issue reported by Emmanuel Deloget in xsk.c where
memcpy() could access source string beyond its end.

Fixes: 2f6324a3937f8 (libbpf: Support shared umems between queues and devices)
Reported-by: Emmanuel Deloget <emmanuel.deloget@eho.link>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211211004043.2374068-1-andrii@kernel.org
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
df5689f1c8 libbpf: Fix potential uninit memory read
In case of BPF_CORE_TYPE_ID_LOCAL we fill out target result explicitly.
But targ_res itself isn't initialized in such a case, and subsequent
call to bpf_core_patch_insn() might read uninitialized field (like
fail_memsz_adjust in this case). So ensure that targ_res is
zero-initialized for BPF_CORE_TYPE_ID_LOCAL case.

This was reported by Coverity static analyzer.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211214010032.3843804-1-andrii@kernel.org
2021-12-14 17:06:30 -08:00
Grant Seltzer
6894f573d2 libbpf: Add doc comments for bpf_program__(un)pin()
This adds doc comments for the two bpf_program pinning functions,
bpf_program__pin() and bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211209232222.541733-1-grantseltzer@gmail.com
2021-12-14 17:06:30 -08:00
Jiri Olsa
2b0d408764 bpf: Add get_func_[arg|ret|arg_cnt] helpers
Adding following helpers for tracing programs:

Get n-th argument of the traced function:
  long bpf_get_func_arg(void *ctx, u32 n, u64 *value)

Get return value of the traced function:
  long bpf_get_func_ret(void *ctx, u64 *value)

Get arguments count of the traced function:
  long bpf_get_func_arg_cnt(void *ctx)

The trampoline now stores number of arguments on ctx-8
address, so it's easy to verify argument index and find
return value argument's position.

Moving function ip address on the trampoline stack behind
the number of functions arguments, so it's now stored on
ctx-16 address if it's needed.

All helpers above are inlined by verifier.

Also bit unrelated small change - using newly added function
bpf_prog_has_trampoline in check_get_func_ip.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211208193245.172141-5-jolsa@kernel.org
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
ac20634cdc libbpf: Don't validate TYPE_ID relo's original imm value
During linking, type IDs in the resulting linked BPF object file can
change, and so ldimm64 instructions corresponding to
BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL CO-RE relos can get
their imm value out of sync with actual CO-RE relocation information
that's updated by BPF linker properly during linking process.

We could teach BPF linker to adjust such instructions, but it feels
a bit too much for linker to re-implement good chunk of
bpf_core_patch_insns logic just for this. This is a redundant safety
check for TYPE_ID relocations, as the real validation is in matching
CO-RE specs, so if that works fine, it's very unlikely that there is
something wrong with the instruction itself.

So, instead, teach libbpf (and kernel) to ignore insn->imm for
BPF_CORE_TYPE_ID_TARGET and BPF_CORE_TYPE_ID_LOCAL relos.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211213010706.100231-1-andrii@kernel.org
2021-12-14 17:06:30 -08:00
Hou Tao
04804b4710 bpf: Add bpf_strncmp helper
The helper compares two strings: one string is a null-terminated
read-only string, and another string has const max storage size
but doesn't need to be null-terminated. It can be used to compare
file name in tracing or LSM program.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211210141652.877186-2-houtao1@huawei.com
2021-12-14 17:06:30 -08:00
Alexei Starovoitov
16bb788578 libbpf: Fix gen_loader assumption on number of programs.
libbpf's obj->nr_programs includes static and global functions. That number
could be higher than the actual number of bpf programs going be loaded by
gen_loader. Passing larger nr_programs to bpf_gen__init() doesn't hurt. Those
exra stack slots will stay as zero. bpf_gen__finish() needs to check that
actual number of progs that gen_loader saw is less than or equal to
obj->nr_programs.

Fixes: ba05fd36b851 ("libbpf: Perform map fd cleanup for gen_loader in case of error")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-12-14 17:06:30 -08:00
Hangbin Liu
5eb804a2db Bonding: add arp_missed_max option
Currently, we use hard code number to verify if we are in the
arp_interval timeslice. But some user may want to reduce/extend
the verify timeslice. With the similar team option 'missed_max'
the uers could change that number based on their own environment.

Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 17:06:30 -08:00
Andrii Nakryiko
bcf58fc7a5 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   43174f0d4597325cb91f1f1f55263eb6e6101036
Checkpoint bpf-next commit: 229fae38d0fc0d6ff58d57cbeb1432da55e58d4f
Baseline bpf commit:        c0d95d3380ee099d735e08618c0d599e72f6c8b0
Checkpoint bpf commit:      0be2516f865f5a876837184a8385163ff64a5889

Alexei Starovoitov (8):
  libbpf: Replace btf__type_by_id() with btf_type_by_id().
  bpf: Prepare relo_core.c for kernel duty.
  bpf: Define enum bpf_core_relo_kind as uapi.
  bpf: Pass a set of bpf_core_relo-s to prog_load command.
  libbpf: Use CO-RE in the kernel in light skeleton.
  libbpf: Support init of inner maps in light skeleton.
  libbpf: Clean gen_loader's attach kind.
  libbpf: Reduce bpf_core_apply_relo_insn() stack usage.

Andrii Nakryiko (12):
  libbpf: Cleanup struct bpf_core_cand.
  libbpf: Use __u32 fields in bpf_map_create_opts
  libbpf: Add API to get/set log_level at per-program level
  libbpf: Deprecate bpf_prog_load_xattr() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Allow passing preallocated log_buf when loading BTF into
    kernel
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Improve logging around BPF program loading
  libbpf: Preserve kernel error code and remove kprobe prog type
    guessing
  libbpf: Add per-program log buffer setter and getter
  libbpf: Deprecate bpf_object__load_xattr()

Eric Dumazet (1):
  tools: sync uapi/linux/if_link.h header

Grant Seltzer (1):
  libbpf: Add doc comments in libbpf.h

Joanne Koong (1):
  bpf: Add bpf_loop helper

Kumar Kartikeya Dwivedi (2):
  libbpf: Avoid double stores for success/failure case of ksym
    relocations
  libbpf: Avoid reload of imm for weak, unresolved, repeating ksym

Mehrdad Arshad Rad (1):
  libbpf: Remove duplicate assignments

Shuyi Cheng (1):
  libbpf: Add "bool skipped" to struct bpf_map

Vincent Minet (1):
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition

huangxuesen (1):
  libbpf: Fix trivial typo

 include/uapi/linux/bpf.h     | 103 +++++++++-
 include/uapi/linux/if_link.h | 293 +++++++++++++++++++++++----
 src/bpf.c                    |  88 ++++++---
 src/bpf.h                    |  30 ++-
 src/bpf_gen_internal.h       |   4 +
 src/btf.c                    |  82 +++++---
 src/gen_loader.c             | 114 ++++++++---
 src/libbpf.c                 | 371 ++++++++++++++++++++++++-----------
 src/libbpf.h                 | 109 +++++++++-
 src/libbpf.map               |   9 +
 src/libbpf_common.h          |   5 +
 src/libbpf_internal.h        |   3 +-
 src/libbpf_probes.c          |   2 +-
 src/libbpf_version.h         |   2 +-
 src/relo_core.c              | 231 ++++++++++++----------
 src/relo_core.h              | 103 +++-------
 16 files changed, 1138 insertions(+), 411 deletions(-)

--
2.30.2
2021-12-10 16:17:33 -08:00
Andrii Nakryiko
1f83414ea4 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-12-10 14:03:13 -08:00
Eric Dumazet
a0ddf21c92 tools: sync uapi/linux/if_link.h header
This file has not been updated for a while.

Sync it before BIG TCP patch series.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211122184810.769159-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 14:03:13 -08:00
Shuyi Cheng
bb14c6f5b5 libbpf: Add "bool skipped" to struct bpf_map
Fix error: "failed to pin map: Bad file descriptor, path:
/sys/fs/bpf/_rodata_str1_1."

In the old kernel, the global data map will not be created, see [0]. So
we should skip the pinning of the global data map to avoid
bpf_object__pin_maps returning error. Therefore, when the map is not
created, we mark “map->skipped" as true and then check during relocation
and during pinning.

Fixes: 16e0c35c6f7a ("libbpf: Load global data maps lazily on legacy kernels")
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-10 14:03:13 -08:00
Vincent Minet
c7dedfe23f libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
The btf__dedup_deprecated name was misspelled in the definition of the
compat symbol for btf__dedup. This leads it to be missing from the
shared library.

This fixes it.

Fixes: 957d350a8b94 ("libbpf: Turn btf_dedup_opts into OPTS-based struct")
Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211210063112.80047-1-vincent@vincent-minet.net
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
f13e766fa4 libbpf: Deprecate bpf_object__load_xattr()
Deprecate non-extensible bpf_object__load_xattr() in v0.8 ([0]).

With log_level control through bpf_object_open_opts or
bpf_program__set_log_level(), we are finally at the point where
bpf_object__load_xattr() doesn't provide any functionality that can't be
accessed through other (better) ways. The other feature,
target_btf_path, is also controllable through bpf_object_open_opts.

  [0] Closes: https://github.com/libbpf/libbpf/issues/289

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-9-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
90910812b5 libbpf: Add per-program log buffer setter and getter
Allow to set user-provided log buffer on a per-program basis ([0]). This
gives great deal of flexibility in terms of which programs are loaded
with logging enabled and where corresponding logs go.

Log buffer set with bpf_program__set_log_buf() overrides kernel_log_buf
and kernel_log_size settings set at bpf_object open time through
bpf_object_open_opts, if any.

Adjust bpf_object_load_prog_instance() logic to not perform own log buf
allocation and load retry if custom log buffer is provided by the user.

  [0] Closes: https://github.com/libbpf/libbpf/issues/418

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-8-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
eb9d74e7ad libbpf: Preserve kernel error code and remove kprobe prog type guessing
Instead of rewriting error code returned by the kernel of prog load with
libbpf-sepcific variants pass through the original error.

There is now also no need to have a backup generic -LIBBPF_ERRNO__LOAD
fallback error as bpf_prog_load() guarantees that errno will be properly
set no matter what.

Also drop a completely outdated and pretty useless BPF_PROG_TYPE_KPROBE
guess logic. It's not necessary and neither it's helpful in modern BPF
applications.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-7-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
d9b3fae391 libbpf: Improve logging around BPF program loading
Add missing "prog '%s': " prefixes in few places and use consistently
markers for beginning and end of program load logs. Here's an example of
log output:

libbpf: prog 'handler': BPF program load failed: Permission denied
libbpf: -- BEGIN PROG LOAD LOG ---
arg#0 reference type('UNKNOWN ') size cannot be determined: -22
; out1 = in1;
0: (18) r1 = 0xffffc9000cdcc000
2: (61) r1 = *(u32 *)(r1 +0)

...

81: (63) *(u32 *)(r4 +0) = r5
 R1_w=map_value(id=0,off=16,ks=4,vs=20,imm=0) R4=map_value(id=0,off=400,ks=4,vs=16,imm=0)
invalid access to map value, value_size=16 off=400 size=4
R4 min value is outside of the allowed memory range
processed 63 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
 -- END PROG LOAD LOG --
libbpf: failed to load program 'handler'
libbpf: failed to load object 'test_skeleton'

The entire verifier log, including BEGIN and END markers are now always
youtput during a single print callback call. This should make it much
easier to post-process or parse it, if necessary. It's not an explicit
API guarantee, but it can be reasonably expected to stay like that.

Also __bpf_object__open is renamed to bpf_object_open() as it's always
an adventure to find the exact function that implements bpf_object's
open phase, so drop the double underscored and use internal libbpf
naming convention.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-6-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
dc1df24314 libbpf: Allow passing user log setting through bpf_object_open_opts
Allow users to provide their own custom log_buf, log_size, and log_level
at bpf_object level through bpf_object_open_opts. This log_buf will be
used during BTF loading. Subsequent patch will use same log_buf during
BPF program loading, unless overriden at per-bpf_program level.

When such custom log_buf is provided, libbpf won't be attempting
retrying loading of BTF to try to provide its own log buffer to capture
kernel's error log output. User is responsible to provide big enough
buffer, otherwise they run a risk of getting -ENOSPC error from the
bpf() syscall.

See also comments in bpf_object_open_opts regarding log_level and
log_buf interactions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-5-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
0504b7ff22 libbpf: Allow passing preallocated log_buf when loading BTF into kernel
Add libbpf-internal btf_load_into_kernel() that allows to pass
preallocated log_buf and custom log_level to be passed into kernel
during BPF_BTF_LOAD call. When custom log_buf is provided,
btf_load_into_kernel() won't attempt an retry with automatically
allocated internal temporary buffer to capture BTF validation log.

It's important to note the relation between log_buf and log_level, which
slightly deviates from stricter kernel logic. From kernel's POV, if
log_buf is specified, log_level has to be > 0, and vice versa. While
kernel has good reasons to request such "sanity, this, in practice, is
a bit unconvenient and restrictive for libbpf's high-level bpf_object APIs.

So libbpf will allow to set non-NULL log_buf and log_level == 0. This is
fine and means to attempt to load BTF without logging requested, but if
it failes, retry the load with custom log_buf and log_level 1. Similar
logic will be implemented for program loading. In practice this means
that users can provide custom log buffer just in case error happens, but
not really request slower verbose logging all the time. This is also
consistent with libbpf behavior when custom log_buf is not set: libbpf
first tries to load everything with log_level=0, and only if error
happens allocates internal log buffer and retries with log_level=1.

Also, while at it, make BTF validation log more obvious and follow the log
pattern libbpf is using for dumping BPF verifier log during
BPF_PROG_LOAD. BTF loading resulting in an error will look like this:

libbpf: BTF loading error: -22
libbpf: -- BEGIN BTF LOAD LOG ---
magic: 0xeb9f
version: 1
flags: 0x0
hdr_len: 24
type_off: 0
type_len: 1040
str_off: 1040
str_len: 2063598257
btf_total_size: 1753
Total section length too long
-- END BTF LOAD LOG --
libbpf: Error loading .BTF into kernel: -22. BTF is optional, ignoring.

This makes it much easier to find relevant parts in libbpf log output.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-4-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
3c93f7ddb2 libbpf: Add OPTS-based bpf_btf_load() API
Similar to previous bpf_prog_load() and bpf_map_create() APIs, add
bpf_btf_load() API which is taking optional OPTS struct. Schedule
bpf_load_btf() for deprecation in v0.8 ([0]).

This makes naming consistent with BPF_BTF_LOAD command, sets up an API
for extensibility in the future, moves options parameters (log-related
fields) into optional options, and also allows to pass log_level
directly.

It also removes log buffer auto-allocation logic from low-level API
(consistent with bpf_prog_load() behavior), but preserves a special
treatment of log_level == 0 with non-NULL log_buf, which matches
low-level bpf_prog_load() and high-level libbpf APIs for BTF and program
loading behaviors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/419

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-3-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
896a3ae0d0 libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
To unify libbpf APIs behavior w.r.t. log_buf and log_level, fix
bpf_prog_load() to follow the same logic as bpf_btf_load() and
high-level bpf_object__load() API will follow in the subsequent patches:
  - if log_level is 0 and non-NULL log_buf is provided by a user, attempt
    load operation initially with no log_buf and log_level set;
  - if successful, we are done, return new FD;
  - on error, retry the load operation with log_level bumped to 1 and
    log_buf set; this way verbose logging will be requested only when we
    are sure that there is a failure, but will be fast in the
    common/expected success case.

Of course, user can still specify log_level > 0 from the very beginning
to force log collection.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211209193840.1248570-2-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Grant Seltzer
728b1721e5 libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:

- bpf_object__open_file()
- bpf_object__open_mem()
- bpf_program__attach_uprobe()
- bpf_program__attach_uprobe_opts()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211206203709.332530-1-grantseltzer@gmail.com
2021-12-10 14:03:13 -08:00
huangxuesen
04941813a5 libbpf: Fix trivial typo
Fix typo in comment from 'bpf_skeleton_map' to 'bpf_map_skeleton'
and from 'bpf_skeleton_prog' to 'bpf_prog_skeleton'.

Signed-off-by: huangxuesen <huangxuesen@kuaishou.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1638755236-3851199-1-git-send-email-hxseverything@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
c3c540b402 libbpf: Reduce bpf_core_apply_relo_insn() stack usage.
Reduce bpf_core_apply_relo_insn() stack usage and bump
BPF_CORE_SPEC_MAX_LEN limit back to 64.

Fixes: 29db4bea1d10 ("bpf: Prepare relo_core.c for kernel duty.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211203182836.16646-1-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
b633ace366 libbpf: Deprecate bpf_prog_load_xattr() API
bpf_prog_load_xattr() is high-level API that's named as a low-level
BPF_PROG_LOAD wrapper APIs, but it actually operates on struct
bpf_object. It's badly and confusingly misnamed as it will load all the
progs insige bpf_object, returning prog_fd of the very first BPF
program. It also has a bunch of ad-hoc things like log_level override,
map_ifindex auto-setting, etc. All this can be expressed more explicitly
and cleanly through existing libbpf APIs. This patch marks
bpf_prog_load_xattr() for deprecation in libbpf v0.8 ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/308

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-10-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
d761220e33 libbpf: Add API to get/set log_level at per-program level
Add bpf_program__set_log_level() and bpf_program__log_level() to fetch
and adjust log_level sent during BPF_PROG_LOAD command. This allows to
selectively request more or less verbose output in BPF verifier log.

Also bump libbpf version to 0.7 and make these APIs the first in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-3-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
ac1c007607 libbpf: Use __u32 fields in bpf_map_create_opts
Corresponding Linux UAPI struct uses __u32, not int, so keep it
consistent.

Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211201232824.3166325-2-andrii@kernel.org
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
01b2b45e8d libbpf: Clean gen_loader's attach kind.
The gen_loader has to clear attach_kind otherwise the programs
without attach_btf_id will fail load if they follow programs
with attach_btf_id.

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-12-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
a7935b996f libbpf: Support init of inner maps in light skeleton.
Add ability to initialize inner maps in light skeleton.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-11-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
5f887b332c libbpf: Use CO-RE in the kernel in light skeleton.
Without lskel the CO-RE relocations are processed by libbpf before any other
work is done. Instead, when lskel is needed, remember relocation as RELO_CORE
kind. Then when loader prog is generated for a given bpf program pass CO-RE
relos of that program to gen loader via bpf_gen__record_relo_core(). The gen
loader will remember them as-is and pass it later as-is into the kernel.

The normal libbpf flow is to process CO-RE early before call relos happen. In
case of gen_loader the core relos have to be added to other relos to be copied
together when bpf static function is appended in different places to other main
bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for
normal relos and for RELO_CORE kind too. When that is done each struct
reloc_desc has good relos for specific main prog.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-10-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
20e7ed521a libbpf: Cleanup struct bpf_core_cand.
Remove two redundant fields from struct bpf_core_cand.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-8-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
d785a21c71 bpf: Pass a set of bpf_core_relo-s to prog_load command.
struct bpf_core_relo is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
Add an ability to pass a set of 'struct bpf_core_relo' to prog_load command
and let the kernel perform CO-RE relocations.

Note the struct bpf_line_info and struct bpf_func_info have the same
layout when passed from LLVM to libbpf and from libbpf to the kernel
except "insn_off" fields means "byte offset" when LLVM generates it.
Then libbpf converts it to "insn index" to pass to the kernel.
The struct bpf_core_relo's "insn_off" field is always "byte offset".

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-6-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
8e43882e53 bpf: Define enum bpf_core_relo_kind as uapi.
enum bpf_core_relo_kind is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure.
Also rename them with BPF_CORE_ prefix to distinguish from conflicting names in
bpf_core_read.h. The enums bpf_field_info_kind, bpf_type_id_kind,
bpf_type_info_kind, bpf_enum_value_kind are passing different values from bpf
program into llvm.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-5-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
2be7e6a830 bpf: Prepare relo_core.c for kernel duty.
Make relo_core.c to be compiled for the kernel and for user space libbpf.

Note the patch is reducing BPF_CORE_SPEC_MAX_LEN from 64 to 32.
This is the maximum number of nested structs and arrays.
For example:
 struct sample {
     int a;
     struct {
         int b[10];
     };
 };

 struct sample *s = ...;
 int *y = &s->b[5];
This field access is encoded as "0:1:0:5" and spec len is 4.

The follow up patch might bump it back to 64.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-4-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Alexei Starovoitov
b9bd1f8682 libbpf: Replace btf__type_by_id() with btf_type_by_id().
To prepare relo_core.c to be compiled in the kernel and the user space
replace btf__type_by_id with btf_type_by_id.

In libbpf btf__type_by_id and btf_type_by_id have different behavior.

bpf_core_apply_relo_insn() needs behavior of uapi btf__type_by_id
vs internal btf_type_by_id, but type_id range check is already done
in bpf_core_apply_relo(), so it's safe to replace it everywhere.
The kernel btf_type_by_id() does the check anyway. It doesn't hurt.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211201181040.23337-2-alexei.starovoitov@gmail.com
2021-12-10 14:03:13 -08:00
Kumar Kartikeya Dwivedi
c4da8092cc libbpf: Avoid reload of imm for weak, unresolved, repeating ksym
Alexei pointed out that we can use BPF_REG_0 which already contains imm
from move_blob2blob computation. Note that we now compare the second
insn's imm, but this should not matter, since both will be zeroed out
for the error case for the insn populated earlier.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122235733.634914-4-memxor@gmail.com
2021-12-10 14:03:13 -08:00
Kumar Kartikeya Dwivedi
b2b45a3131 libbpf: Avoid double stores for success/failure case of ksym relocations
Instead, jump directly to success case stores in case ret >= 0, else do
the default 0 value store and jump over the success case. This is better
in terms of readability. Readjust the code for kfunc relocation as well
to follow a similar pattern, also leads to easier to follow code now.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122235733.634914-3-memxor@gmail.com
2021-12-10 14:03:13 -08:00
Joanne Koong
73c8768db7 bpf: Add bpf_loop helper
This patch adds the kernel-side and API changes for a new helper
function, bpf_loop:

long bpf_loop(u32 nr_loops, void *callback_fn, void *callback_ctx,
u64 flags);

where long (*callback_fn)(u32 index, void *ctx);

bpf_loop invokes the "callback_fn" **nr_loops** times or until the
callback_fn returns 1. The callback_fn can only return 0 or 1, and
this is enforced by the verifier. The callback_fn index is zero-indexed.

A few things to please note:
~ The "u64 flags" parameter is currently unused but is included in
case a future use case for it arises.
~ In the kernel-side implementation of bpf_loop (kernel/bpf/bpf_iter.c),
bpf_callback_t is used as the callback function cast.
~ A program can have nested bpf_loop calls but the program must
still adhere to the verifier constraint of its stack depth (the stack depth
cannot exceed MAX_BPF_STACK))
~ Recursive callback_fns do not pass the verifier, due to the call stack
for these being too deep.
~ The next patch will include the tests and benchmark

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211130030622.4131246-2-joannekoong@fb.com
2021-12-10 14:03:13 -08:00
Mehrdad Arshad Rad
bafda72319 libbpf: Remove duplicate assignments
There is a same action when load_attr.attach_btf_id is initialized.

Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211128193337.10628-1-arshad.rad@gmail.com
2021-12-10 14:03:13 -08:00
Andrii Nakryiko
33ec2ca026 sync: improve patch application process by using patch command
git apply -3 doesn't always leave conflicted files in the working
directory. Use patch --merge instead, it seems to work better in more
complicated situations.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-09 22:43:20 -08:00
Yucong Sun
7e89be4022 Migrate vmtest to modular actions in libbpf/ci 2021-12-09 14:09:04 -08:00
thiagoftsm
e61e089911 Merge branch 'libbpf:master' into master 2021-12-05 19:50:33 +00:00
Ilya Leoshkevich
93e89b3474 ci: upgrade s390x runner to v2.285.0
This is needed for composite actions with conditional steps.

Fixes #416.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-12-03 10:42:54 -08:00
thiagoftsm
b9d46530c3 Merge branch 'libbpf:master' into master 2021-12-02 19:07:38 +00:00
Quentin Monnet
4884bf3dbd ci: fix test on /exitstatus existence and size
The condition for the test is incorrect, we want to add a default exit
status if the file is empty. But [[ -s file ]] returns true if the file
exists and has a size greater than zero. Let's reverse the condition.

Fixes: 385b2d1738 ("ci: change VM's /exitstatus format to prepare it for several results")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
2021-12-01 15:46:22 -08:00
Andrii Nakryiko
690d0531f9 ci: whitelist legacy_printk tests on 4.9 and 5.5
legacy_printk selftests is specially designed to be runnable on old
kernels and validate libbpf's bpf_trace_printk-related macros. Run them
in CI.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-11-30 16:39:31 -08:00
Quentin Monnet
7cda69caeb ci: add folding markers to avoid getting output out of sections
Move commands to existing sections, or add markers for new sections when
no surrounding existing section is relevant. This is to avoid having
logs outside of subsection for the “vmtest” step of the GitHub action.
2021-11-30 14:39:02 -08:00
Quentin Monnet
1f7db672e4 ci: carry on after selftest failure and report test group results
Two changes come with this patch:

- Test groups report their exit status individually for the final
  summary, by appending it to /exitstatus in the VM. “Test groups” are
  test_maps, test_verifier, test_progs, and test_progs-no_alu32.

- For these separate reports to make sense, allow the CI action to carry
  on even after one of the groups fails, by adding "&& true" to the
  commands in order to neutralise the effect of the "set -e".
2021-11-30 14:39:02 -08:00
Quentin Monnet
385b2d1738 ci: change VM's /exitstatus format to prepare it for several results
We recently introduced a summary for the results of the different groups
of tests for the CI, displayed after the machine is shut down. There are
currently two groups, "bpftool" and "vm_tests". We want to split the
latter into different subgroups. For that, we will make each group of
tests that runs in the VM print its exit status to the /exitstatus file.

In preparation for this, let's update the format of this /exitstatus
file. Instead of containing just an integer, it now contains a line with
a group name, a colon, and the integer result. This is easy enough to
parse on the other end.

We also drop the associative array, and iterate on /exitstatus instead
to produce the summary: this way, the order of the checks is preserved.
2021-11-30 14:39:02 -08:00
Quentin Monnet
7f11cd48d6 ci: create helpers for formatting errors and notices
Create helpers for formatting errors and notices for GitHub actions,
instead of directly printing the double colons and attributes.
2021-11-30 14:39:02 -08:00
Andrii Nakryiko
4374bad784 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7
Checkpoint bpf-next commit: 43174f0d4597325cb91f1f1f55263eb6e6101036
Baseline bpf commit:        c0d95d3380ee099d735e08618c0d599e72f6c8b0
Checkpoint bpf commit:      c0d95d3380ee099d735e08618c0d599e72f6c8b0

Alan Maguire (1):
  libbpf: Silence uninitialized warning/error in btf_dump_dump_type_data

Hengqi Chen (1):
  libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY

Tiezhu Yang (1):
  bpf, mips: Fix build errors about __NR_bpf undeclared

 src/bpf.c           |   6 ++
 src/btf_dump.c      |   2 +-
 src/libbpf.c        | 154 ++++++++++++++++++++++++++++++++++----------
 src/skel_internal.h |  10 +++
 4 files changed, 138 insertions(+), 34 deletions(-)

--
2.30.2
2021-11-29 11:20:00 -08:00
Alan Maguire
55b057565f libbpf: Silence uninitialized warning/error in btf_dump_dump_type_data
When compiling libbpf with gcc 4.8.5, we see:

  CC       staticobjs/btf_dump.o
btf_dump.c: In function ‘btf_dump_dump_type_data.isra.24’:
btf_dump.c:2296:5: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  if (err < 0)
     ^
cc1: all warnings being treated as errors
make: *** [staticobjs/btf_dump.o] Error 1

While gcc 4.8.5 is too old to build the upstream kernel, it's possible it
could be used to build standalone libbpf which suffers from the same problem.
Silence the error by initializing 'err' to 0.  The warning/error seems to be
a false positive since err is set early in the function.  Regardless we
shouldn't prevent libbpf from building for this.

Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1638180040-8037-1-git-send-email-alan.maguire@oracle.com
2021-11-29 11:20:00 -08:00
Hengqi Chen
472c0726e8 libbpf: Support static initialization of BPF_MAP_TYPE_PROG_ARRAY
Support static initialization of BPF_MAP_TYPE_PROG_ARRAY with a
syntax similar to map-in-map initialization ([0]):

    SEC("socket")
    int tailcall_1(void *ctx)
    {
        return 0;
    }

    struct {
        __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
        __uint(max_entries, 2);
        __uint(key_size, sizeof(__u32));
        __array(values, int (void *));
    } prog_array_init SEC(".maps") = {
        .values = {
            [1] = (void *)&tailcall_1,
        },
    };

Here's the relevant part of libbpf debug log showing what's
going on with prog-array initialization:

libbpf: sec '.relsocket': collecting relocation for section(3) 'socket'
libbpf: sec '.relsocket': relo #0: insn #2 against 'prog_array_init'
libbpf: prog 'entry': found map 0 (prog_array_init, sec 4, off 0) for insn #0
libbpf: .maps relo #0: for 3 value 0 rel->r_offset 32 name 53 ('tailcall_1')
libbpf: .maps relo #0: map 'prog_array_init' slot [1] points to prog 'tailcall_1'
libbpf: map 'prog_array_init': created successfully, fd=5
libbpf: map 'prog_array_init': slot [1] set to prog 'tailcall_1' fd=6

  [0] Closes: https://github.com/libbpf/libbpf/issues/354

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211128141633.502339-2-hengqi.chen@gmail.com
2021-11-29 11:20:00 -08:00
Tiezhu Yang
c86cb27d5b bpf, mips: Fix build errors about __NR_bpf undeclared
Add the __NR_bpf definitions to fix the following build errors for mips:

  $ cd tools/bpf/bpftool
  $ make
  [...]
  bpf.c:54:4: error: #error __NR_bpf not defined. libbpf does not support your arch.
   #  error __NR_bpf not defined. libbpf does not support your arch.
      ^~~~~
  bpf.c: In function ‘sys_bpf’:
  bpf.c:66:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’?
    return syscall(__NR_bpf, cmd, attr, size);
                   ^~~~~~~~
                   __NR_brk
  [...]
  In file included from gen_loader.c:15:0:
  skel_internal.h: In function ‘skel_sys_bpf’:
  skel_internal.h:53:17: error: ‘__NR_bpf’ undeclared (first use in this function); did you mean ‘__NR_brk’?
    return syscall(__NR_bpf, cmd, attr, size);
                   ^~~~~~~~
                   __NR_brk

We can see the following generated definitions:

  $ grep -r "#define __NR_bpf" arch/mips
  arch/mips/include/generated/uapi/asm/unistd_o32.h:#define __NR_bpf (__NR_Linux + 355)
  arch/mips/include/generated/uapi/asm/unistd_n64.h:#define __NR_bpf (__NR_Linux + 315)
  arch/mips/include/generated/uapi/asm/unistd_n32.h:#define __NR_bpf (__NR_Linux + 319)

The __NR_Linux is defined in arch/mips/include/uapi/asm/unistd.h:

  $ grep -r "#define __NR_Linux" arch/mips
  arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux	4000
  arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux	5000
  arch/mips/include/uapi/asm/unistd.h:#define __NR_Linux	6000

That is to say, __NR_bpf is:

  4000 + 355 = 4355 for mips o32,
  6000 + 319 = 6319 for mips n32,
  5000 + 315 = 5315 for mips n64.

So use the GCC pre-defined macro _ABIO32, _ABIN32 and _ABI64 [1] to define
the corresponding __NR_bpf.

This patch is similar with commit bad1926dd2f6 ("bpf, s390: fix build for
libbpf and selftest suite").

  [1] https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/mips/mips.h#l549

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1637804167-8323-1-git-send-email-yangtiezhu@loongson.cn
2021-11-29 11:20:00 -08:00
Andrii Nakryiko
3ef05a585e sync: try harder when git am -3 fails
`git am -3` will give up frequently even in cases when patch can be
auto-merged with:

```
Applying: libbpf: Unify low-level map creation APIs w/ new bpf_map_create()
error: sha1 information is lacking or useless (src/libbpf.c).
error: could not build fake ancestor
Patch failed at 0001 libbpf: Unify low-level map creation APIs w/ new bpf_map_create()
```

But `git apply -3` in the same situation will succeed with three-way merge just
fine:

```
error: patch failed: src/bpf_gen_internal.h:51
Falling back to three-way merge...
Applied patch to 'src/bpf_gen_internal.h' cleanly.
```

So if git am fails, try git apply and if that succeeds, automatically
`git am --continue`. If not, fallback to user actions.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
493bfa8a59 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   fa721d4f0b91f525339996f4faef7bb072d70162
Checkpoint bpf-next commit: 8f6f41f39348f25db843f2fcb2f1c166b4bfa2d7
Baseline bpf commit:        c0d95d3380ee099d735e08618c0d599e72f6c8b0
Checkpoint bpf commit:      c0d95d3380ee099d735e08618c0d599e72f6c8b0

Andrii Nakryiko (8):
  libbpf: Load global data maps lazily on legacy kernels
  libbpf: Unify low-level map creation APIs w/ new bpf_map_create()
  libbpf: Use bpf_map_create() consistently internally
  libbpf: Prevent deprecation warnings in xsk.c
  libbpf: Fix potential misaligned memory access in btf_ext__new()
  libbpf: Don't call libc APIs with NULL pointers
  libbpf: Fix glob_syms memory leak in bpf_linker
  libbpf: Fix using invalidated memory in bpf_linker

 src/bpf.c              | 140 +++++++++++++++++------------------------
 src/bpf.h              |  33 +++++++++-
 src/bpf_gen_internal.h |   5 +-
 src/btf.c              |  10 +--
 src/btf.h              |   2 +-
 src/gen_loader.c       |  46 +++++---------
 src/libbpf.c           | 107 +++++++++++++++++--------------
 src/libbpf.map         |   1 +
 src/libbpf_internal.h  |  21 -------
 src/libbpf_probes.c    |  30 ++++-----
 src/linker.c           |   6 +-
 src/skel_internal.h    |   3 +-
 src/xsk.c              |  18 +++---
 13 files changed, 204 insertions(+), 218 deletions(-)

--
2.30.2
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
9f006f1ed6 libbpf: Fix using invalidated memory in bpf_linker
add_dst_sec() can invalidate bpf_linker's section index making
dst_symtab pointer pointing into unallocated memory. Reinitialize
dst_symtab pointer on each iteration to make sure it's always valid.

Fixes: faf6ed321cf6 ("libbpf: Add BPF static linker APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-7-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
5fc0d66cad libbpf: Fix glob_syms memory leak in bpf_linker
glob_syms array wasn't freed on bpf_link__free(). Fix that.

Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-6-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
37c3e92657 libbpf: Don't call libc APIs with NULL pointers
Sanitizer complains about qsort(), bsearch(), and memcpy() being called
with NULL pointer. This can only happen when the associated number of
elements is zero, so no harm should be done. But still prevent this from
happening to keep sanitizer runs clean from extra noise.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-5-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
25eb5c4e02 libbpf: Fix potential misaligned memory access in btf_ext__new()
Perform a memory copy before we do the sanity checks of btf_ext_hdr.
This prevents misaligned memory access if raw btf_ext data is not 4-byte
aligned ([0]).

While at it, also add missing const qualifier.

  [0] Closes: https://github.com/libbpf/libbpf/issues/391

Fixes: 2993e0515bb4 ("tools/bpf: add support to read .BTF.ext sections")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124002325.1737739-3-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
07e4e0cb04 libbpf: Prevent deprecation warnings in xsk.c
xsk.c is using own APIs that are marked for deprecation internally.
Given xsk.c and xsk.h will be gone in libbpf 1.0, there is no reason to
do public vs internal function split just to avoid deprecation warnings.
So just add a pragma to silence deprecation warnings (until the code is
removed completely).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124193233.3115996-4-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
316b60fa89 libbpf: Use bpf_map_create() consistently internally
Remove all the remaining uses of to-be-deprecated bpf_create_map*() APIs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124193233.3115996-3-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
6cfb97c561 libbpf: Unify low-level map creation APIs w/ new bpf_map_create()
Mark the entire zoo of low-level map creation APIs for deprecation in
libbpf 0.7 ([0]) and introduce a new bpf_map_create() API that is
OPTS-based (and thus future-proof) and matches the BPF_MAP_CREATE
command name.

While at it, ensure that gen_loader sends map_extra field. Also remove
now unneeded btf_key_type_id/btf_value_type_id logic that libbpf is
doing anyways.

  [0] Closes: https://github.com/libbpf/libbpf/issues/282

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211124193233.3115996-2-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
5c31bcf220 libbpf: Load global data maps lazily on legacy kernels
Load global data maps lazily, if kernel is too old to support global
data. Make sure that programs are still correct by detecting if any of
the to-be-loaded programs have relocation against any of such maps.

This allows to solve the issue ([0]) with bpf_printk() and Clang
generating unnecessary and unreferenced .rodata.strX.Y sections, but it
also goes further along the CO-RE lines, allowing to have a BPF object
in which some code can work on very old kernels and relies only on BPF
maps explicitly, while other BPF programs might enjoy global variable
support. If such programs are correctly set to not load at runtime on
old kernels, bpf_object will load and function correctly now.

  [0] https://lore.kernel.org/bpf/CAK-59YFPU3qO+_pXWOH+c1LSA=8WA1yabJZfREjOEXNHAqgXNg@mail.gmail.com/

Fixes: aed659170a31 ("libbpf: Support multiple .rodata.* and .data.* BPF maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211123200105.387855-1-andrii@kernel.org
2021-11-26 13:51:29 -08:00
Andrii Nakryiko
5b4dbd8141 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   d41bc48bfab2076f7db88d079a3a3203dd9c4a54
Checkpoint bpf-next commit: fa721d4f0b91f525339996f4faef7bb072d70162
Baseline bpf commit:        099f896f498a2b26d84f4ddae039b2c542c18b48
Checkpoint bpf commit:      c0d95d3380ee099d735e08618c0d599e72f6c8b0

Andrii Nakryiko (2):
  libbpf: Add runtime APIs to query libbpf version
  libbpf: Accommodate DWARF/compiler bug with duplicated structs

Dave Tucker (1):
  bpf, docs: Fix ordering of bpf documentation

Florent Revest (1):
  libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags

 docs/index.rst |  4 ++--
 src/btf.c      | 45 +++++++++++++++++++++++++++++++++++++++++----
 src/libbpf.c   | 23 +++++++++++++++++++++--
 src/libbpf.h   |  6 +++++-
 src/libbpf.map |  5 ++++-
 5 files changed, 73 insertions(+), 10 deletions(-)

--
2.30.2
2021-11-23 23:04:18 -08:00
Florent Revest
14e12f4290 libbpf: Change bpf_program__set_extra_flags to bpf_program__set_flags
bpf_program__set_extra_flags has just been introduced so we can still
change it without breaking users.

This new interface is a bit more flexible (for example if someone wants
to clear a flag).

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211119180035.1396139-1-revest@chromium.org
2021-11-23 23:04:18 -08:00
Andrii Nakryiko
60ce9af668 libbpf: Accommodate DWARF/compiler bug with duplicated structs
According to [0], compilers sometimes might produce duplicate DWARF
definitions for exactly the same struct/union within the same
compilation unit (CU). We've had similar issues with identical arrays
and handled them with a similar workaround in 6b6e6b1d09aa ("libbpf:
Accomodate DWARF/compiler bug with duplicated identical arrays"). Do the
same for struct/union by ensuring that two structs/unions are exactly
the same, down to the integer values of field referenced type IDs.

Solving this more generically (allowing referenced types to be
equivalent, but using different type IDs, all within a single CU)
requires a huge complexity increase to handle many-to-many mappings
between canonidal and candidate type graphs. Before we invest in that,
let's see if this approach handles all the instances of this issue in
practice. Thankfully it's pretty rare, it seems.

  [0] https://lore.kernel.org/bpf/YXr2NFlJTAhHdZqq@krava/

Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211117194114.347675-1-andrii@kernel.org
2021-11-23 23:04:18 -08:00
Andrii Nakryiko
d29571725a libbpf: Add runtime APIs to query libbpf version
Libbpf provided LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros to
check libbpf version at compilation time. This doesn't cover all the
needs, though, because version of libbpf that application is compiled
against doesn't necessarily match the version of libbpf at runtime,
especially if libbpf is used as a shared library.

Add libbpf_major_version() and libbpf_minor_version() returning major
and minor versions, respectively, as integers. Also add a convenience
libbpf_version_string() for various tooling using libbpf to print out
libbpf version in a human-readable form. Currently it will return
"v0.6", but in the future it can contains some extra information, so the
format itself is not part of a stable API and shouldn't be relied upon.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20211118174054.2699477-1-andrii@kernel.org
2021-11-23 23:04:18 -08:00
Dave Tucker
842c5b7bff bpf, docs: Fix ordering of bpf documentation
This commit fixes the display of the BPF documentation in the sidebar
when rendered as HTML.

Before this patch, the sidebar would render as follows for some
sections:

| BPF Documentation
  |- BPF Type Format (BTF)
    |- BPF Type Format (BTF)

This was due to creating a heading in index.rst followed by
a sphinx toctree, where the file referenced carries the same
title as the section heading.

To fix this I applied a pattern that has been established in other
subfolders of Documentation:

1. Re-wrote index.rst to have a single toctree
2. Split the sections out in to their own files

Additionally maps.rst and programs.rst make use of a glob pattern to
include map_* or prog_* rst files in their toctree, meaning future map
or program type documentation will be automatically included.

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1a1eed800e7b9dc13b458de113a489641519b0cc.1636749493.git.dave@dtucker.co.uk
2021-11-23 23:04:18 -08:00
Quentin Monnet
9109d6a4b4 ci: create summary for tests and account for bpftool checks result
The bpftool checks work as expected when the CI runs, except that they
do not set any error status code for the script on error, which means
that the failures are lost among the logs and not reported in a clear
way to the reviewers.

This commit aims at fixing the issue. We could simply exit with a
non-zero error status when the bpftool checks, but that would prevent
the other tests from running. Instead, we propose to store the result of
the bpftool checks in a bash array. This array can later be reused to
print a summary of the different groups of tests, at the end of the CI
run, to help the reviewers understand where the failure happened without
having to manually unfold all the sections on the GitHub interface.

Currently, there are only two groups: the bpftool checks and the "VM
tests". The latter may later be split into test_maps, test_progs,
test_progs-no_alu32, etc. by teaching each of them to append their exit
status code to the "exitstatus" file.

Fixes: 88649fe655 ("ci: run script to test bpftool types/options sync")
2021-11-18 11:19:36 -08:00
Quentin Monnet
eab19ffead ci: pass shutdown fold description to fold command
For displaying a coloured title for the shutdown section in the logs,
instead of having the colour control codes directly written in run.sh,
we can pass the section title as an argument to "travis_fold()" and have
it format and print it for us.

This is cleaner, and slightly more in-line with what we do in the CI
files of the vmtest repository.
2021-11-18 11:19:36 -08:00
Andrii Nakryiko
94a49850c5 Makefile: enforce gnu89 standard
libbpf conforms to kernel style and uses the same -std=gnu89 standard
for compilation. So enforce it on Github projection as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-11-16 13:17:03 -08:00
Andrii Nakryiko
d71409b508 Makefile: don't hide relevant parts of file path
File path has shared vs static path, it's useful to see. So preserve it
in pretty output.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-11-16 13:17:03 -08:00
Andrii Nakryiko
f0ecdeed3a sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9faaffbe85edcbdc54096f7f87baa3bc4842a7e2
Checkpoint bpf-next commit: d41bc48bfab2076f7db88d079a3a3203dd9c4a54
Baseline bpf commit:        5833291ab6de9c3e2374336b51c814e515e8f3a5
Checkpoint bpf commit:      099f896f498a2b26d84f4ddae039b2c542c18b48

Kumar Kartikeya Dwivedi (1):
  libbpf: Perform map fd cleanup for gen_loader in case of error

Tiezhu Yang (1):
  bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33

Yonghong Song (1):
  libbpf: Fix a couple of missed btf_type_tag handling in btf.c

 include/uapi/linux/bpf.h |  2 +-
 src/bpf_gen_internal.h   |  4 ++--
 src/btf.c                |  2 ++
 src/gen_loader.c         | 47 +++++++++++++++++++++++++---------------
 src/libbpf.c             |  4 ++--
 5 files changed, 37 insertions(+), 22 deletions(-)

--
2.30.2
2021-11-16 13:16:07 -08:00
Andrii Nakryiko
d924fa62ee sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-11-16 13:16:07 -08:00
Kumar Kartikeya Dwivedi
0f5a62b2d8 libbpf: Perform map fd cleanup for gen_loader in case of error
Alexei reported a fd leak issue in gen loader (when invoked from
bpftool) [0]. When adding ksym support, map fd allocation was moved from
stack to loader map, however I missed closing these fds (relevant when
cleanup label is jumped to on error). For the success case, the
allocated fd is returned in loader ctx, hence this problem is not
noticed.

Make three changes, first MAX_USED_MAPS in MAX_FD_ARRAY_SZ instead of
MAX_USED_PROGS, the braino was not a problem until now for this case as
we didn't try to close map fds (otherwise use of it would have tried
closing 32 additional fds in ksym btf fd range). Then, do a cleanup for
all nr_maps fds in cleanup label code, so that in case of error all
temporary map fds from bpf_gen__map_create are closed.

Then, adjust the cleanup label to only generate code for the required
number of program and map fds.  To trim code for remaining program
fds, lay out prog_fd array in stack in the end, so that we can
directly skip the remaining instances.  Still stack size remains same,
since changing that would require changes in a lot of places
(including adjustment of stack_off macro), so nr_progs_sz variable is
only used to track required number of iterations (and jump over
cleanup size calculated from that), stack offset calculation remains
unaffected.

The difference for test_ksyms_module.o is as follows:
libbpf: //prog cleanup iterations: before = 34, after = 5
libbpf: //maps cleanup iterations: before = 64, after = 2

Also, move allocation of gen->fd_array offset to bpf_gen__init. Since
offset can now be 0, and we already continue even if add_data returns 0
in case of failure, we do not need to distinguish between 0 offset and
failure case 0, as we rely on bpf_gen__finish to check errors. We can
also skip check for gen->fd_array in add_*_fd functions, since
bpf_gen__init will take care of it.

  [0]: https://lore.kernel.org/bpf/CAADnVQJ6jSitKSNKyxOrUzwY2qDRX0sPkJ=VLGHuCLVJ=qOt9g@mail.gmail.com

Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211112232022.899074-1-memxor@gmail.com
2021-11-16 13:16:07 -08:00
Tiezhu Yang
5ca49d2b32 bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33
In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b63
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit is 33 now, this is
reasonable.

After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can
see there exists failed testcase.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb25
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.

The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and
limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. For the other implementations, no function changes,
it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT
can reflect the actual max tail call count, the related tail call testcases
in test_bpf module and selftests can work well for the interpreter and the
JIT.

Here are the test results on x86_64:

 # uname -m
 x86_64
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
 # rmmod test_bpf
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
 # rmmod test_bpf
 # ./test_progs -t tailcalls
 #142 tailcalls:OK
 Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
2021-11-16 13:16:07 -08:00
Yonghong Song
219c8e11e0 libbpf: Fix a couple of missed btf_type_tag handling in btf.c
Commit 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG") added the
BTF_KIND_TYPE_TAG support. But to test vmlinux build with ...

  #define __user __attribute__((btf_type_tag("user")))

... I needed to sync libbpf repo and manually copy libbpf sources to
pahole. To simplify process, I used BTF_KIND_RESTRICT to simulate the
BTF_KIND_TYPE_TAG with vmlinux build as "restrict" modifier is barely
used in kernel.

But this approach missed one case in dedup with structures where
BTF_KIND_RESTRICT is handled and BTF_KIND_TYPE_TAG is not handled in
btf_dedup_is_equiv(), and this will result in a pahole dedup failure.
This patch fixed this issue and a selftest is added in the subsequent
patch to test this scenario.

The other missed handling is in btf__resolve_size(). Currently the compiler
always emit like PTR->TYPE_TAG->... so in practice we don't hit the missing
BTF_KIND_TYPE_TAG handling issue with compiler generated code. But lets
add case BTF_KIND_TYPE_TAG in the switch statement to be future proof.

Fixes: 2dc1e488e5cd ("libbpf: Support BTF_KIND_TYPE_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211115163937.3922235-1-yhs@fb.com
2021-11-16 13:16:07 -08:00
Ilya Leoshkevich
140b902274 ci: add s390x vmtest
Run it on the self-hosted builder with tag "z15".

Also add the infrastructure code for the self-hosted builder.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
1987a34fc9 vmtest: use libguestfs for disk image manipulations
Running vmtest inside a container removes the ability to use certain
root powers, among other things - mounting arbitrary images. Use
libguestfs in order to avoid having to mount anything.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
3b1714aa92 vmtest: add s390x blacklist
A lot of tests in test_progs fail due to the missing trampoline
implementation on s390x (and a handful for other reasons). Yet, a lot
more pass, so disabling test_progs altogether is too heavy-handed.

So add a mechanism for arch-specific blacklists (as discussed in [1])
and introduce a s390x blacklist, that simply reflects the status quo.

[1] https://github.com/libbpf/libbpf/pull/204#discussion_r601768628

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
554054d876 vmtest: tweak qemu invocation for s390x
We need a different binary and console. Also use a fixed number of
cores in order to avoid OOM in case a builder has too many of them.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
26e196d449 vmtest: add s390x image
Generated by simply running mkrootfs_debian.sh.

Also use $ARCH as an image name prefix.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
3fac0b3d08 vmtest: add s390x config
Select the current config based on $ARCH value and thus rename
the existing config to config-latest.$ARCH.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
ac4a0fa400 vmtest: add debootstrap-based mkrootfs script
The existing mkrootfs.sh is based on Arch Linux, which supports only
x86_64.

Add mkrootfs_debian.sh: a debootstrap-based script. Debian was chosen,
because it supports more architectures than other mainstream distros.
Move init setup to mkrootfs_tweak.sh, rename the existing Arch script
to mkrootfs_arch.sh.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
6ad73f5083 vmtest: do not install lld
s390x LLVM does not have it, and it is not needed for the libbpf CI.
So drop it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Ilya Leoshkevich
8a52e49575 vmtest: use python3-docutils instead of python-docutils
There is no python-docutils on Debian Bullseye, but python3-docutils
exists everywhere. Since Python 2 is EOL anyway, use the Python 3
version.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-11-15 22:39:49 -08:00
Andrii Nakryiko
5b9d079c7f sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   b8b5cb55f5d3f03cc1479a3768d68173a10359ad
Checkpoint bpf-next commit: 9faaffbe85edcbdc54096f7f87baa3bc4842a7e2
Baseline bpf commit:        47b3708c6088a60e7dc3b809dbb0d4c46590b32f
Checkpoint bpf commit:      5833291ab6de9c3e2374336b51c814e515e8f3a5

Andrii Nakryiko (11):
  libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS
  libbpf: Pass number of prog load attempts explicitly
  libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load()
  libbpf: Remove internal use of deprecated bpf_prog_load() variants
  libbpf: Stop using to-be-deprecated APIs
  libbpf: Remove deprecation attribute from struct bpf_prog_prep_result
  libbpf: Free up resources used by inner map definition
  libbpf: Add ability to get/set per-program load flags
  libbpf: Turn btf_dedup_opts into OPTS-based struct
  libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof
  libbpf: Make perf_buffer__new() use OPTS-based interface

Mark Pashmfouroush (1):
  bpf: Add ingress_ifindex to bpf_sk_lookup

Song Liu (1):
  bpf: Introduce helper bpf_find_vma

Yonghong Song (2):
  bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
  libbpf: Support BTF_KIND_TYPE_TAG

 include/uapi/linux/bpf.h |  21 +++
 include/uapi/linux/btf.h |   3 +-
 src/bpf.c                | 166 +++++++++++++---------
 src/bpf.h                |  74 +++++++++-
 src/bpf_gen_internal.h   |   8 +-
 src/btf.c                |  69 ++++++---
 src/btf.h                |  80 +++++++++--
 src/btf_dump.c           |  40 ++++--
 src/gen_loader.c         |  30 ++--
 src/libbpf.c             | 297 +++++++++++++++++++++++----------------
 src/libbpf.h             |  95 ++++++++++---
 src/libbpf.map           |  13 ++
 src/libbpf_common.h      |  14 +-
 src/libbpf_internal.h    |  33 +----
 src/libbpf_legacy.h      |   1 +
 src/libbpf_probes.c      |  20 ++-
 src/linker.c             |   4 +-
 src/xsk.c                |  34 ++---
 18 files changed, 670 insertions(+), 332 deletions(-)

--
2.30.2
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
bc66d28b68 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-11-12 23:46:09 -08:00
Yonghong Song
98181e0546 libbpf: Support BTF_KIND_TYPE_TAG
Add libbpf support for BTF_KIND_TYPE_TAG.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012614.1505315-1-yhs@fb.com
2021-11-12 23:46:09 -08:00
Yonghong Song
d932a1a46b bpf: Support BTF_KIND_TYPE_TAG for btf_type_tag attributes
LLVM patches ([1] for clang, [2] and [3] for BPF backend)
added support for btf_type_tag attributes. This patch
added support for the kernel.

The main motivation for btf_type_tag is to bring kernel
annotations __user, __rcu etc. to btf. With such information
available in btf, bpf verifier can detect mis-usages
and reject the program. For example, for __user tagged pointer,
developers can then use proper helper like bpf_probe_read_user()
etc. to read the data.

BTF_KIND_TYPE_TAG may also useful for other tracing
facility where instead of to require user to specify
kernel/user address type, the kernel can detect it
by itself with btf.

  [1] https://reviews.llvm.org/D111199
  [2] https://reviews.llvm.org/D113222
  [3] https://reviews.llvm.org/D113496

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211112012609.1505032-1-yhs@fb.com
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
011a01594c libbpf: Make perf_buffer__new() use OPTS-based interface
Add new variants of perf_buffer__new() and perf_buffer__new_raw() that
use OPTS-based options for future extensibility ([0]). Given all the
currently used API names are best fits, re-use them and use
___libbpf_override() approach and symbol versioning to preserve ABI and
source code compatibility. struct perf_buffer_opts and struct
perf_buffer_raw_opts are kept as well, but they are restructured such
that they are OPTS-based when used with new APIs. For struct
perf_buffer_raw_opts we keep few fields intact, so we have to also
preserve the memory location of them both when used as OPTS and for
legacy API variants. This is achieved with anonymous padding for OPTS
"incarnation" of the struct.  These pads can be eventually used for new
options.

  [0] Closes: https://github.com/libbpf/libbpf/issues/311

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-6-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
b9a88a4533 libbpf: Ensure btf_dump__new() and btf_dump_opts are future-proof
Change btf_dump__new() and corresponding struct btf_dump_ops structure
to be extensible by using OPTS "framework" ([0]). Given we don't change
the names, we use a similar approach as with bpf_prog_load(), but this
time we ended up with two APIs with the same name and same number of
arguments, so overloading based on number of arguments with
___libbpf_override() doesn't work.

Instead, use "overloading" based on types. In this particular case,
print callback has to be specified, so we detect which argument is
a callback. If it's 4th (last) argument, old implementation of API is
used by user code. If not, it must be 2nd, and thus new implementation
is selected. The rest is handled by the same symbol versioning approach.

btf_ext argument is dropped as it was never used and isn't necessary
either. If in the future we'll need btf_ext, that will be added into
OPTS-based struct btf_dump_opts.

struct btf_dump_opts is reused for both old API and new APIs. ctx field
is marked deprecated in v0.7+ and it's put at the same memory location
as OPTS's sz field. Any user of new-style btf_dump__new() will have to
set sz field and doesn't/shouldn't use ctx, as ctx is now passed along
the callback as mandatory input argument, following the other APIs in
libbpf that accept callbacks consistently.

Again, this is quite ugly in implementation, but is done in the name of
backwards compatibility and uniform and extensible future APIs (at the
same time, sigh). And it will be gone in libbpf 1.0.

  [0] Closes: https://github.com/libbpf/libbpf/issues/283

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-5-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
969018545d libbpf: Turn btf_dedup_opts into OPTS-based struct
btf__dedup() and struct btf_dedup_opts were added before we figured out
OPTS mechanism. As such, btf_dedup_opts is non-extensible without
breaking an ABI and potentially crashing user application.

Unfortunately, btf__dedup() and btf_dedup_opts are short and succinct
names that would be great to preserve and use going forward. So we use
___libbpf_override() macro approach, used previously for bpf_prog_load()
API, to define a new btf__dedup() variant that accepts only struct btf *
and struct btf_dedup_opts * arguments, and rename the old btf__dedup()
implementation into btf__dedup_deprecated(). This keeps both source and
binary compatibility with old and new applications.

The biggest problem was struct btf_dedup_opts, which wasn't OPTS-based,
and as such doesn't have `size_t sz;` as a first field. But btf__dedup()
is a pretty rarely used API and I believe that the only currently known
users (besides selftests) are libbpf's own bpf_linker and pahole.
Neither use case actually uses options and just passes NULL. So instead
of doing extra hacks, just rewrite struct btf_dedup_opts into OPTS-based
one, move btf_ext argument into those opts (only bpf_linker needs to
dedup btf_ext, so it's not a typical thing to specify), and drop never
used `dont_resolve_fwds` option (it was never used anywhere, AFAIK, it
makes BTF dedup much less useful and efficient).

Just in case, for old implementation, btf__dedup_deprecated(), detect
non-NULL options and error out with helpful message, to help users
migrate, if there are any user playing with btf__dedup().

The last remaining piece is dedup_table_size, which is another
anachronism from very early days of BTF dedup. Since then it has been
reduced to the only valid value, 1, to request forced hash collisions.
This is only used during testing. So instead introduce a bool flag to
force collisions explicitly.

This patch also adapts selftests to new btf__dedup() and btf_dedup_opts
use to avoid selftests breakage.

  [0] Closes: https://github.com/libbpf/libbpf/issues/281

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111053624.190580-4-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
0e80b7dc3f libbpf: Add ability to get/set per-program load flags
Add bpf_program__flags() API to retrieve prog_flags that will be (or
were) supplied to BPF_PROG_LOAD command.

Also add bpf_program__set_extra_flags() API to allow to set *extra*
flags, in addition to those determined by program's SEC() definition.
Such flags are logically OR'ed with libbpf-derived flags.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211111051758.92283-2-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Mark Pashmfouroush
932800b20b bpf: Add ingress_ifindex to bpf_sk_lookup
It may be helpful to have access to the ifindex during bpf socket
lookup. An example may be to scope certain socket lookup logic to
specific interfaces, i.e. an interface may be made exempt from custom
lookup code.

Add the ifindex of the arriving connection to the bpf_sk_lookup API.

Signed-off-by: Mark Pashmfouroush <markpash@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211110111016.5670-2-markpash@cloudflare.com
2021-11-12 23:46:09 -08:00
Song Liu
cfc69268e5 bpf: Introduce helper bpf_find_vma
In some profiler use cases, it is necessary to map an address to the
backing file, e.g., a shared library. bpf_find_vma helper provides a
flexible way to achieve this. bpf_find_vma maps an address of a task to
the vma (vm_area_struct) for this address, and feed the vma to an callback
BPF function. The callback function is necessary here, as we need to
ensure mmap_sem is unlocked.

It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem
safely when irqs are disable, we use the same mechanism as stackmap with
build_id. Specifically, when irqs are disabled, the unlocked is postponed
in an irq_work. Refactor stackmap.c so that the irq_work is shared among
bpf_find_vma and stackmap helpers.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211105232330.1936330-2-songliubraving@fb.com
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
9b2bbdefd5 libbpf: Free up resources used by inner map definition
It's not enough to just free(map->inner_map), as inner_map itself can
have extra memory allocated, like map name.

Fixes: 646f02ffdd49 ("libbpf: Add BTF-defined map-in-map support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Hengqi Chen <hengqi.chen@gmail.com>
Link: https://lore.kernel.org/bpf/20211107165521.9240-3-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
c7236a5342 libbpf: Remove deprecation attribute from struct bpf_prog_prep_result
This deprecation annotation has no effect because for struct deprecation
attribute has to be declared after struct definition. But instead of
moving it to the end of struct definition, remove it. When deprecation
will go in effect at libbpf v0.7, this deprecation attribute will cause
libbpf's own source code compilation to trigger deprecation warnings,
which is unavoidable because libbpf still has to support that API.

So keep deprecation of APIs, but don't mark structs used in API as
deprecated.

Fixes: e21d585cb3db ("libbpf: Deprecate multi-instance bpf_program APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-8-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
a611094604 libbpf: Stop using to-be-deprecated APIs
Remove all the internal uses of libbpf APIs that are slated to be
deprecated in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-6-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
9b422137af libbpf: Remove internal use of deprecated bpf_prog_load() variants
Remove all the internal uses of bpf_load_program_xattr(), which is
slated for deprecation in v0.7.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-5-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
65cdd0c73d libbpf: Unify low-level BPF_PROG_LOAD APIs into bpf_prog_load()
Add a new unified OPTS-based low-level API for program loading,
bpf_prog_load() ([0]).  bpf_prog_load() accepts few "mandatory"
parameters as input arguments (program type, name, license,
instructions) and all the other optional (as in not required to specify
for all types of BPF programs) fields into struct bpf_prog_load_opts.

This makes all the other non-extensible APIs variant for BPF_PROG_LOAD
obsolete and they are slated for deprecation in libbpf v0.7:
  - bpf_load_program();
  - bpf_load_program_xattr();
  - bpf_verify_program().

Implementation-wise, internal helper libbpf__bpf_prog_load is refactored
to become a public bpf_prog_load() API. struct bpf_prog_load_params used
internally is replaced by public struct bpf_prog_load_opts.

Unfortunately, while conceptually all this is pretty straightforward,
the biggest complication comes from the already existing bpf_prog_load()
*high-level* API, which has nothing to do with BPF_PROG_LOAD command.

We try really hard to have a new API named bpf_prog_load(), though,
because it maps naturally to BPF_PROG_LOAD command.

For that, we rename old bpf_prog_load() into bpf_prog_load_deprecated()
and mark it as COMPAT_VERSION() for shared library users compiled
against old version of libbpf. Statically linked users and shared lib
users compiled against new version of libbpf headers will get "rerouted"
to bpf_prog_deprecated() through a macro helper that decides whether to
use new or old bpf_prog_load() based on number of input arguments (see
___libbpf_overload in libbpf_common.h).

To test that existing
bpf_prog_load()-using code compiles and works as expected, I've compiled
and ran selftests as is. I had to remove (locally) selftest/bpf/Makefile
-Dbpf_prog_load=bpf_prog_test_load hack because it was conflicting with
the macro-based overload approach. I don't expect anyone else to do
something like this in practice, though. This is testing-specific way to
replace bpf_prog_load() calls with special testing variant of it, which
adds extra prog_flags value. After testing I kept this selftests hack,
but ensured that we use a new bpf_prog_load_deprecated name for this.

This patch also marks bpf_prog_load() and bpf_prog_load_xattr() as deprecated.
bpf_object interface has to be used for working with struct bpf_program.
Libbpf doesn't support loading just a bpf_program.

The silver lining is that when we get to libbpf 1.0 all these
complication will be gone and we'll have one clean bpf_prog_load()
low-level API with no backwards compatibility hackery surrounding it.

  [0] Closes: https://github.com/libbpf/libbpf/issues/284

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-4-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
6b2db898cc libbpf: Pass number of prog load attempts explicitly
Allow to control number of BPF_PROG_LOAD attempts from outside the
sys_bpf_prog_load() helper.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-3-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Andrii Nakryiko
ea6c242fc6 libbpf: Rename DECLARE_LIBBPF_OPTS into LIBBPF_OPTS
It's confusing that libbpf-provided helper macro doesn't start with
LIBBPF. Also "declare" vs "define" is confusing terminology, I can never
remember and always have to look up previous examples.

Bypass both issues by renaming DECLARE_LIBBPF_OPTS into a short and
clean LIBBPF_OPTS. To avoid breaking existing code, provide:

  #define DECLARE_LIBBPF_OPTS LIBBPF_OPTS

in libbpf_legacy.h. We can decide later if we ever want to remove it or
we'll keep it forever because it doesn't add any maintainability burden.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20211103220845.2676888-2-andrii@kernel.org
2021-11-12 23:46:09 -08:00
Thiago Marques
86175df408 Merge remote-tracking branch 'upstream/master' 2021-11-11 19:17:34 +00:00
Andrii Nakryiko
26e768783c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   8388092b2551f7ae34dad800ce828779f7c948c9
Checkpoint bpf-next commit: b8b5cb55f5d3f03cc1479a3768d68173a10359ad
Baseline bpf commit:        c08455dec5acf4668f5d1eb099f7fedb29f2de5f
Checkpoint bpf commit:      47b3708c6088a60e7dc3b809dbb0d4c46590b32f

Andrii Nakryiko (7):
  libbpf: Detect corrupted ELF symbols section
  libbpf: Improve sanity checking during BTF fix up
  libbpf: Validate that .BTF and .BTF.ext sections contain data
  libbpf: Fix section counting logic
  libbpf: Improve ELF relo sanitization
  libbpf: Deprecate bpf_program__load() API
  libbpf: Fix non-C89 loop variable declaration in gen_loader.c

Mehrdad Arshad Rad (1):
  libbpf: Fix lookup_and_delete_elem_flags error reporting

 src/bpf.c        |  4 ++-
 src/gen_loader.c |  3 +-
 src/libbpf.c     | 79 +++++++++++++++++++++++++++++++-----------------
 src/libbpf.h     |  4 +--
 4 files changed, 59 insertions(+), 31 deletions(-)

--
2.30.2
2021-11-06 19:33:03 -07:00
Mehrdad Arshad Rad
88209a3c44 libbpf: Fix lookup_and_delete_elem_flags error reporting
Fix bpf_map_lookup_and_delete_elem_flags() to pass the return code through
libbpf_err_errno() as we do similarly in bpf_map_lookup_and_delete_elem().

Fixes: f12b65432728 ("libbpf: Streamline error reporting for low-level APIs")
Signed-off-by: Mehrdad Arshad Rad <arshad.rad@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211104171354.11072-1-arshad.rad@gmail.com
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
2ab2615926 libbpf: Fix non-C89 loop variable declaration in gen_loader.c
Fix the `int i` declaration inside the for statement. This is non-C89
compliant. See [0] for user report breaking BCC build.

  [0] https://github.com/libbpf/libbpf/issues/403

Fixes: 18f4fccbf314 ("libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20211105191055.3324874-1-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
c03b183a6e libbpf: Deprecate bpf_program__load() API
Mark bpf_program__load() as deprecated ([0]) since v0.6. Also rename few
internal program loading bpf_object helper functions to have more
consistent naming.

  [0] Closes: https://github.com/libbpf/libbpf/issues/301

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211103051449.1884903-1-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
36cc591ac8 libbpf: Improve ELF relo sanitization
Add few sanity checks for relocations to prevent div-by-zero and
out-of-bounds array accesses in libbpf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-6-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
3acf7c289a libbpf: Fix section counting logic
e_shnum does include section #0 and as such is exactly the number of ELF
sections that we need to allocate memory for to use section indices as
array indices. Fix the off-by-one error.

This is purely accounting fix, previously we were overallocating one
too many array items. But no correctness errors otherwise.

Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-5-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
a383b3e200 libbpf: Validate that .BTF and .BTF.ext sections contain data
.BTF and .BTF.ext ELF sections should have SHT_PROGBITS type and contain
data. If they are not, ELF is invalid or corrupted, so bail out.
Otherwise this can lead to data->d_buf being NULL and SIGSEGV later on.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-4-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
2f52e2afc0 libbpf: Improve sanity checking during BTF fix up
If BTF is corrupted DATASEC's variable type ID might be incorrect.
Prevent this easy to detect situation with extra NULL check.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-3-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
738277b773 libbpf: Detect corrupted ELF symbols section
Prevent divide-by-zero if ELF is corrupted and has zero sh_entsize.
Reported by oss-fuzz project.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211103173213.1376990-2-andrii@kernel.org
2021-11-06 19:33:03 -07:00
Andrii Nakryiko
16dfb4ffe4 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   36e70b9b06bf14a0fac87315f0e73d6e17e80aad
Checkpoint bpf-next commit: 8388092b2551f7ae34dad800ce828779f7c948c9
Baseline bpf commit:        72f898ca0ab85fde6facf78b14d9f67a4a7b32d1
Checkpoint bpf commit:      c08455dec5acf4668f5d1eb099f7fedb29f2de5f

Dave Marchevsky (1):
  libbpf: Deprecate bpf_program__get_prog_info_linear

Joanne Koong (1):
  bpf: Add alignment padding for "map_extra" + consolidate holes

Magnus Karlsson (1):
  libbpf: Deprecate AF_XDP support

 include/uapi/linux/bpf.h |  1 +
 src/libbpf.h             |  3 ++
 src/xsk.h                | 90 +++++++++++++++++++++++-----------------
 3 files changed, 56 insertions(+), 38 deletions(-)

--
2.30.2
2021-11-03 16:00:04 -07:00
Dave Marchevsky
826770613d libbpf: Deprecate bpf_program__get_prog_info_linear
As part of the road to libbpf 1.0, and discussed in libbpf issue tracker
[0], bpf_program__get_prog_info_linear and its associated structs and
helper functions should be deprecated. The functionality is too specific
to the needs of 'perf', and there's little/no out-of-tree usage to
preclude introduction of a more general helper in the future.

  [0] Closes: https://github.com/libbpf/libbpf/issues/313

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211101224357.2651181-5-davemarchevsky@fb.com
2021-11-03 16:00:04 -07:00
Magnus Karlsson
277846bc6c libbpf: Deprecate AF_XDP support
Deprecate AF_XDP support in libbpf ([0]). This has been moved to
libxdp as it is a better fit for that library. The AF_XDP support only
uses the public libbpf functions and can therefore just use libbpf as
a library from libxdp. The libxdp APIs are exactly the same so it
should just be linking with libxdp instead of libbpf for the AF_XDP
functionality. If not, please submit a bug report. Linking with both
libraries is supported but make sure you link in the correct order so
that the new functions in libxdp are used instead of the deprecated
ones in libbpf.

Libxdp can be found at https://github.com/xdp-project/xdp-tools.

  [0] Closes: https://github.com/libbpf/libbpf/issues/270

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20211029090111.4733-1-magnus.karlsson@gmail.com
2021-11-03 16:00:04 -07:00
Joanne Koong
1e97e84c86 bpf: Add alignment padding for "map_extra" + consolidate holes
This patch makes 2 changes regarding alignment padding
for the "map_extra" field.

1) In the kernel header, "map_extra" and "btf_value_type_id"
are rearranged to consolidate the hole.

Before:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u32		map_flags;	/*    40     4	*/

        /* XXX 4 bytes hole, try to pack */

        u64		map_extra;	/*    48     8	*/
        int		spin_lock_off;	/*    56     4	*/
        int		timer_off;	/*    60     4	*/
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32		id;		/*    64     4	*/
        int		numa_node;	/*    68     4	*/
	...
        bool		frozen;		/*   117     1	*/

        /* XXX 10 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
	struct mutex	freeze_mutex;	/*   216   144 	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt; 	/*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 2, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 10 */

} __attribute__((__aligned__(64)));

After:
struct bpf_map {
	...
        u32		max_entries;	/*    36     4	*/
        u64		map_extra;	/*    40     8 	*/
        u32		map_flags;	/*    48     4	*/
        int		spin_lock_off;	/*    52     4	*/
        int		timer_off;	/*    56     4	*/
        u32		id;		/*    60     4	*/

        /* --- cacheline 1 boundary (64 bytes) --- */
        int		numa_node;	/*    64     4	*/
	...
	bool		frozen		/*   113     1  */

        /* XXX 14 bytes hole, try to pack */

        /* --- cacheline 2 boundary (128 bytes) --- */
	...
        struct work_struct	work;	/*   144    72	*/

        /* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
        struct mutex	freeze_mutex;	/*   216   144	*/

        /* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
        u64		writecnt;       /*   360     8	*/

    /* size: 384, cachelines: 6, members: 26 */
    /* sum members: 354, holes: 1, sum holes: 14 */
    /* padding: 16 */
    /* forced alignments: 2, forced holes: 1, sum forced holes: 14 */

} __attribute__((__aligned__(64)));

2) Add alignment padding to the bpf_map_info struct
More details can be found in commit 36f9814a494a ("bpf: fix uapi hole
for 32 bit compat applications")

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com
2021-11-03 16:00:04 -07:00
Andrii Nakryiko
17d7f04e7c include: add BPF_ALU32_IMM macro implementation
BPF_ALU32_IMM is now used in gen_loader.c. Add its definition to
include/linux/filter.h header.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
c4f9ee9fbb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c825f5fee19caf301d9821cd79abaa734322de26
Checkpoint bpf-next commit: 36e70b9b06bf14a0fac87315f0e73d6e17e80aad
Baseline bpf commit:        04f8ef5643bcd8bcde25dfdebef998aea480b2ba
Checkpoint bpf commit:      72f898ca0ab85fde6facf78b14d9f67a4a7b32d1

Andrii Nakryiko (4):
  libbpf: Fix off-by-one bug in bpf_core_apply_relo()
  libbpf: Add ability to fetch bpf_program's underlying instructions
  libbpf: Deprecate multi-instance bpf_program APIs
  libbpf: Deprecate ambiguously-named bpf_program__size() API

Björn Töpel (1):
  riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h

Ilya Leoshkevich (2):
  libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
  libbpf: Use __BYTE_ORDER__

Joanne Koong (2):
  bpf: Add bloom filter map implementation
  libbpf: Add "map_extra" as a per-map-type extra flag

Joe Burton (1):
  libbpf: Deprecate bpf_objects_list

Kumar Kartikeya Dwivedi (5):
  bpf: Add bpf_kallsyms_lookup_name helper
  libbpf: Add typeless ksym support to gen_loader
  libbpf: Add weak ksym support to gen_loader
  libbpf: Ensure that BPF syscall fds are never 0, 1, or 2
  libbpf: Use O_CLOEXEC uniformly when opening fds

 include/uapi/linux/bpf.h |  25 ++++++++
 src/bpf.c                |  62 ++++++++++++++----
 src/bpf_core_read.h      |   2 +-
 src/bpf_gen_internal.h   |  14 ++--
 src/bpf_tracing.h        |  32 ++++++++++
 src/btf.c                |   6 +-
 src/btf_dump.c           |   8 +--
 src/gen_loader.c         | 135 ++++++++++++++++++++++++++++++++++-----
 src/libbpf.c             | 105 +++++++++++++++++++++---------
 src/libbpf.h             |  49 ++++++++++++--
 src/libbpf.map           |   4 ++
 src/libbpf_internal.h    |  49 +++++++++++++-
 src/libbpf_legacy.h      |   6 ++
 src/libbpf_probes.c      |   2 +-
 src/linker.c             |  16 ++---
 src/relo_core.c          |   2 +-
 src/xsk.c                |   6 +-
 17 files changed, 432 insertions(+), 91 deletions(-)

--
2.30.2
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
6fd2ee5486 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-11-01 15:10:25 -07:00
Björn Töpel
7beaa2ef90 riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h.

Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-4-bjorn@kernel.org
2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi
a0195b3078 libbpf: Use O_CLOEXEC uniformly when opening fds
There are some instances where we don't use O_CLOEXEC when opening an
fd, fix these up. Otherwise, it is possible that a parallel fork causes
these fds to leak into a child process on execve.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-6-memxor@gmail.com
2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi
bedab00b50 libbpf: Ensure that BPF syscall fds are never 0, 1, or 2
Add a simple wrapper for passing an fd and getting a new one >= 3 if it
is one of 0, 1, or 2. There are two primary reasons to make this change:
First, libbpf relies on the assumption a certain BPF fd is never 0 (e.g.
most recently noticed in [0]). Second, Alexei pointed out in [1] that
some environments reset stdin, stdout, and stderr if they notice an
invalid fd at these numbers. To protect against both these cases, switch
all internal BPF syscall wrappers in libbpf to always return an fd >= 3.
We only need to modify the syscall wrappers and not other code that
assumes a valid fd by doing >= 0, to avoid pointless churn, and because
it is still a valid assumption. The cost paid is two additional syscalls
if fd is in range [0, 2].

  [0]: e31eec77e4ab ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")
  [1]: https://lore.kernel.org/bpf/CAADnVQKVKY8o_3aU8Gzke443+uHa-eGoM0h7W4srChMXU1S4Bg@mail.gmail.com

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-5-memxor@gmail.com
2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi
c95bf6714d libbpf: Add weak ksym support to gen_loader
This extends existing ksym relocation code to also support relocating
weak ksyms. Care needs to be taken to zero out the src_reg (currently
BPF_PSEUOD_BTF_ID, always set for gen_loader by bpf_object__relocate_data)
when the BTF ID lookup fails at runtime.  This is not a problem for
libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only
proceeds in case of failure if ext->is_weak, leading to src_reg
remaining as 0 for weak unresolved ksym).

A pattern similar to emit_relo_kfunc_btf is followed of first storing
the default values and then jumping over actual stores in case of an
error. For src_reg adjustment, we also need to perform it when copying
the populated instruction, so depending on if copied insn[0].imm is 0 or
not, we decide to jump over the adjustment.

We cannot reach that point unless the ksym was weak and resolved and
zeroed out, as the emit_check_err will cause us to jump to cleanup
label, so we do not need to recheck whether the ksym is weak before
doing the adjustment after copying BTF ID and BTF FD.

This is consistent with how libbpf relocates weak ksym. Logging
statements are added to show the relocation result and aid debugging.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com
2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi
8e697cf9fd libbpf: Add typeless ksym support to gen_loader
This uses the bpf_kallsyms_lookup_name helper added in previous patches
to relocate typeless ksyms. The return value ENOENT can be ignored, and
the value written to 'res' can be directly stored to the insn, as it is
overwritten to 0 on lookup failure. For repeating symbols, we can simply
copy the previously populated bpf_insn.

Also, we need to take care to not close fds for typeless ksym_desc, so
reuse the 'off' member's space to add a marker for typeless ksym and use
that to skip them in cleanup_relos.

We add a emit_ksym_relo_log helper that avoids duplicating common
logging instructions between typeless and weak ksym (for future commit).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com
2021-11-01 15:10:25 -07:00
Kumar Kartikeya Dwivedi
1dd20d7144 bpf: Add bpf_kallsyms_lookup_name helper
This helper allows us to get the address of a kernel symbol from inside
a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can
relocate typeless ksym vars.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com
2021-11-01 15:10:25 -07:00
Joanne Koong
28c8a2c179 libbpf: Add "map_extra" as a per-map-type extra flag
This patch adds the libbpf infrastructure for supporting a
per-map-type "map_extra" field, whose definition will be
idiosyncratic depending on map type.

For example, for the bloom filter map, the lower 4 bits of
map_extra is used to denote the number of hash functions.

Please note that until libbpf 1.0 is here, the
"bpf_create_map_params" struct is used as a temporary
means for propagating the map_extra field to the kernel.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com
2021-11-01 15:10:25 -07:00
Joanne Koong
11f873fd5b bpf: Add bloom filter map implementation
This patch adds the kernel-side changes for the implementation of
a bpf bloom filter map.

The bloom filter map supports peek (determining whether an element
is present in the map) and push (adding an element to the map)
operations.These operations are exposed to userspace applications
through the already existing syscalls in the following way:

BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_UPDATE_ELEM -> push

The bloom filter map does not have keys, only values. In light of
this, the bloom filter map's API matches that of queue stack maps:
user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
APIs to query or add an element to the bloom filter map. When the
bloom filter map is created, it must be created with a key_size of 0.

For updates, the user will pass in the element to add to the map
as the value, with a NULL key. For lookups, the user will pass in the
element to query in the map as the value, with a NULL key. In the
verifier layer, this requires us to modify the argument type of
a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE;
as well, in the syscall layer, we need to copy over the user value
so that in bpf_map_peek_elem, we know which specific value to query.

A few things to please take note of:
 * If there are any concurrent lookups + updates, the user is
responsible for synchronizing this to ensure no false negative lookups
occur.
 * The number of hashes to use for the bloom filter is configurable from
userspace. If no number is specified, the default used will be 5 hash
functions. The benchmarks later in this patchset can help compare the
performance of using different number of hashes on different entry
sizes. In general, using more hashes decreases both the false positive
rate and the speed of a lookup.
 * Deleting an element in the bloom filter map is not supported.
 * The bloom filter map may be used as an inner map.
 * The "max_entries" size that is specified at map creation time is used
to approximate a reasonable bitmap size for the bloom filter, and is not
otherwise strictly enforced. If the user wishes to insert more entries
into the bloom filter than "max_entries", they may do so but they should
be aware that this may lead to a higher false positive rate.

Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
2021-11-01 15:10:25 -07:00
Joe Burton
50041f432d libbpf: Deprecate bpf_objects_list
Add a flag to `enum libbpf_strict_mode' to disable the global
`bpf_objects_list', preventing race conditions when concurrent threads
call bpf_object__open() or bpf_object__close().

bpf_object__next() will return NULL if this option is set.

Callers may achieve the same workflow by tracking bpf_objects in
application code.

  [0] Closes: https://github.com/libbpf/libbpf/issues/293

Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com
2021-11-01 15:10:25 -07:00
Ilya Leoshkevich
87a9622982 libbpf: Use __BYTE_ORDER__
Use the compiler-defined __BYTE_ORDER__ instead of the libc-defined
__BYTE_ORDER for consistency.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-3-iii@linux.ibm.com
2021-11-01 15:10:25 -07:00
Ilya Leoshkevich
5b732fc1d8 libbpf: Fix endianness detection in BPF_CORE_READ_BITFIELD_PROBED()
__BYTE_ORDER is supposed to be defined by a libc, and __BYTE_ORDER__ -
by a compiler. bpf_core_read.h checks __BYTE_ORDER == __LITTLE_ENDIAN,
which is true if neither are defined, leading to incorrect behavior on
big-endian hosts if libc headers are not included, which is often the
case.

Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026010831.748682-2-iii@linux.ibm.com
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
ffc5139acd libbpf: Deprecate ambiguously-named bpf_program__size() API
The name of the API doesn't convey clearly that this size is in number
of bytes (there needed to be a separate comment to make this clear in
libbpf.h). Further, measuring the size of BPF program in bytes is not
exactly the best fit, because BPF programs always consist of 8-byte
instructions. As such, bpf_program__insn_cnt() is a better alternative
in pretty much any imaginable case.

So schedule bpf_program__size() deprecation starting from v0.7 and it
will be removed in libbpf 1.0.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-5-andrii@kernel.org
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
cfbdceb99d libbpf: Deprecate multi-instance bpf_program APIs
Schedule deprecation of a set of APIs that are related to multi-instance
bpf_programs:
  - bpf_program__set_prep() ([0]);
  - bpf_program__{set,unset}_instance() ([1]);
  - bpf_program__nth_fd().

These APIs are obscure, very niche, and don't seem to be used much in
practice. bpf_program__set_prep() is pretty useless for anything but the
simplest BPF programs, as it doesn't allow to adjust BPF program load
attributes, among other things. In short, it already bitrotted and will
bitrot some more if not removed.

With bpf_program__insns() API, which gives access to post-processed BPF
program instructions of any given entry-point BPF program, it's now
possible to do whatever necessary adjustments were possible with
set_prep() API before, but also more. Given any such use case is
automatically an advanced use case, requiring users to stick to
low-level bpf_prog_load() APIs and managing their own prog FDs is
reasonable.

  [0] Closes: https://github.com/libbpf/libbpf/issues/299
  [1] Closes: https://github.com/libbpf/libbpf/issues/300

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-4-andrii@kernel.org
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
9871f15dd6 libbpf: Add ability to fetch bpf_program's underlying instructions
Add APIs providing read-only access to bpf_program BPF instructions ([0]).
This is useful for diagnostics purposes, but it also allows a cleaner
support for cloning BPF programs after libbpf did all the FD resolution
and CO-RE relocations, subprog instructions appending, etc. Currently,
cloning BPF program is possible only through hijacking a half-broken
bpf_program__set_prep() API, which doesn't really work well for anything
but most primitive programs. For instance, set_prep() API doesn't allow
adjusting BPF program load parameters which are necessary for loading
fentry/fexit BPF programs (the case where BPF program cloning is
a necessity if doing some sort of mass-attachment functionality).

Given bpf_program__set_prep() API is set to be deprecated, having
a cleaner alternative is a must. libbpf internally already keeps track
of linear array of struct bpf_insn, so it's not hard to expose it. The
only gotcha is that libbpf previously freed instructions array during
bpf_object load time, which would make this API much less useful overall,
because in between bpf_object__open() and bpf_object__load() a lot of
changes to instructions are done by libbpf.

So this patch makes libbpf hold onto prog->insns array even after BPF
program loading. I think this is a small price for added functionality
and improved introspection of BPF program code.

See retsnoop PR ([1]) for how it can be used in practice and code
savings compared to relying on bpf_program__set_prep().

  [0] Closes: https://github.com/libbpf/libbpf/issues/298
  [1] https://github.com/anakryiko/retsnoop/pull/1

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-3-andrii@kernel.org
2021-11-01 15:10:25 -07:00
Andrii Nakryiko
93c109c9ee libbpf: Fix off-by-one bug in bpf_core_apply_relo()
Fix instruction index validity check which has off-by-one error.

Fixes: 3ee4f5335511 ("libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211025224531.1088894-2-andrii@kernel.org
2021-11-01 15:10:25 -07:00
Quentin Monnet
eaea2bce02 sync: remove redundant test on $BPF_BRANCH
The sync-kernel.sh script has two consecutive tests for $BPF_BRANCH
being provided by the user (and so the second one can currently never
fail). Looking at the error message displayed in each case, we want to
keep the second one. Let's remove the first check.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
2021-10-26 14:39:07 -07:00
Quentin Monnet
f05791d8cf sync: fix comment for commit_signature() (subject instead of hash)
The commit_signature() function does not use the hash of the commit,
which typically differs between the kernel repo and the mirrored
version, but the subject for this commit. Fix the comment accordingly.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
2021-10-25 13:29:16 -07:00
Andrii Nakryiko
2bb8f041b0 README: add links to BPF CO-RE reference guide
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-10-24 16:19:38 -07:00
Evgeny Vereshchagin
50ae3acfe9 [ci] turn on CIFuzz
https://google.github.io/oss-fuzz/getting-started/continuous-integration/

Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>
2021-10-22 18:41:23 -07:00
Andrii Nakryiko
07ba0eeb8e sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   5319255b8df9271474bc9027cabf82253934f28d
Checkpoint bpf-next commit: c825f5fee19caf301d9821cd79abaa734322de26
Baseline bpf commit:        8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a
Checkpoint bpf commit:      04f8ef5643bcd8bcde25dfdebef998aea480b2ba

Andrii Nakryiko (9):
  libbpf: Deprecate btf__finalize_data() and move it into libbpf.c
  libbpf: Extract ELF processing state into separate struct
  libbpf: Use Elf64-specific types explicitly for dealing with ELF
  libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps
  libbpf: Support multiple .rodata.* and .data.* BPF maps
  libbpf: Simplify look up by name of internal maps
  libbpf: Fix the use of aligned attribute
  libbpf: Fix overflow in BTF sanity checks
  libbpf: Fix BTF header parsing checks

Dave Marchevsky (2):
  libbpf: Migrate internal use of bpf_program__get_prog_info_linear
  bpf: Add verified_insns to bpf_prog_info and fdinfo

Hengqi Chen (2):
  bpf: Add bpf_skc_to_unix_sock() helper
  libbpf: Add btf__type_cnt() and btf__raw_data() APIs

Ilya Leoshkevich (3):
  libbpf: Fix dumping big-endian bitfields
  libbpf: Fix dumping non-aligned __int128
  libbpf: Fix ptr_is_aligned() usages

Mauricio Vásquez (1):
  libbpf: Fix memory leak in btf__dedup()

Stanislav Fomichev (1):
  libbpf: Use func name when pinning programs with
    LIBBPF_STRICT_SEC_NAME

Yonghong Song (1):
  bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG

 include/uapi/linux/bpf.h |   8 +
 include/uapi/linux/btf.h |   8 +-
 src/btf.c                | 187 +++-----
 src/btf.h                |  17 +-
 src/btf_dump.c           |  56 ++-
 src/libbpf.c             | 984 ++++++++++++++++++++++++---------------
 src/libbpf.map           |   4 +-
 src/libbpf_internal.h    |  12 +-
 src/libbpf_legacy.h      |   3 +
 src/linker.c             |  29 +-
 10 files changed, 744 insertions(+), 564 deletions(-)

--
2.30.2
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
b15d479ef7 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
d374094d8c libbpf: Fix BTF header parsing checks
Original code assumed fixed and correct BTF header length. That's not
always the case, though, so fix this bug with a proper additional check.
And use actual header length instead of sizeof(struct btf_header) in
sanity checks.

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211023003157.726961-2-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
19d260d144 libbpf: Fix overflow in BTF sanity checks
btf_header's str_off+str_len or type_off+type_len can overflow as they
are u32s. This will lead to bypassing the sanity checks during BTF
parsing, resulting in crashes afterwards. Fix by using 64-bit signed
integers for comparison.

Fixes: d8123624506c ("libbpf: Fix BTF data layout checks and allow empty BTF")
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211023003157.726961-1-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Stanislav Fomichev
f1558d7a23 libbpf: Use func name when pinning programs with LIBBPF_STRICT_SEC_NAME
We can't use section name anymore because they are not unique
and pinning objects with multiple programs with the same
progtype/secname will fail.

  [0] Closes: https://github.com/libbpf/libbpf/issues/273

Fixes: 33a2c75c55e2 ("libbpf: add internal pin_name")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20211021214814.1236114-2-sdf@google.com
2021-10-22 18:38:55 -07:00
Hengqi Chen
596c9a2d77 libbpf: Add btf__type_cnt() and btf__raw_data() APIs
Add btf__type_cnt() and btf__raw_data() APIs and deprecate
btf__get_nr_type() and btf__get_raw_data() since the old APIs
don't follow the libbpf naming convention for getters which
omit 'get' in the name (see [0]). btf__raw_data() is just an
alias to the existing btf__get_raw_data(). btf__type_cnt()
now returns the number of all types of the BTF object
including 'void'.

  [0] Closes: https://github.com/libbpf/libbpf/issues/279

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022130623.1548429-2-hengqi.chen@gmail.com
2021-10-22 18:38:55 -07:00
Mauricio Vásquez
eb10610a3b libbpf: Fix memory leak in btf__dedup()
Free btf_dedup if btf_ensure_modifiable() returns error.

Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022202035.48868-1-mauricio@kinvolk.io
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
d7982f3948 libbpf: Fix the use of aligned attribute
Building libbpf sources out of kernel tree (in Github repo) we run into
compilation error due to unknown __aligned attribute. It must be coming
from some kernel header, which is not available to Github sources. Use
explicit __attribute__((aligned(16))) instead.

Fixes: 961632d54163 ("libbpf: Fix dumping non-aligned __int128")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211022192502.2975553-1-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
76b4bf4295 libbpf: Simplify look up by name of internal maps
Map name that's assigned to internal maps (.rodata, .data, .bss, etc)
consist of a small prefix of bpf_object's name and ELF section name as
a suffix. This makes it hard for users to "guess" the name to use for
looking up by name with bpf_object__find_map_by_name() API.

One proposal was to drop object name prefix from the map name and just
use ".rodata", ".data", etc, names. One downside called out was that
when multiple BPF applications are active on the host, it will be hard
to distinguish between multiple instances of .rodata and know which BPF
object (app) they belong to. Having few first characters, while quite
limiting, still can give a bit of a clue, in general.

Note, though, that btf_value_type_id for such global data maps (ARRAY)
points to DATASEC type, which encodes full ELF name, so tools like
bpftool can take advantage of this fact to "recover" full original name
of the map. This is also the reason why for custom .data.* and .rodata.*
maps libbpf uses only their ELF names and doesn't prepend object name at
all.

Another downside of such approach is that it is not backwards compatible
and, among direct use of bpf_object__find_map_by_name() API, will break
any BPF skeleton generated using bpftool that was compiled with older
libbpf version.

Instead of causing all this pain, libbpf will still generate map name
using a combination of object name and ELF section name, but it will
allow looking such maps up by their natural names, which correspond to
their respective ELF section names. This means non-truncated ELF section
names longer than 15 characters are going to be expected and supported.

With such set up, we get the best of both worlds: leave small bits of
a clue about BPF application that instantiated such maps, as well as
making it easy for user apps to lookup such maps at runtime. In this
sense it closes corresponding libbpf 1.0 issue ([0]).

BPF skeletons will continue using full names for lookups.

  [0] Closes: https://github.com/libbpf/libbpf/issues/275

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
5bf62459b1 libbpf: Support multiple .rodata.* and .data.* BPF maps
Add support for having multiple .rodata and .data data sections ([0]).
.rodata/.data are supported like the usual, but now also
.rodata.<whatever> and .data.<whatever> are also supported. Each such
section will get its own backing BPF_MAP_TYPE_ARRAY, just like
.rodata and .data.

Multiple .bss maps are not supported, as the whole '.bss' name is
confusing and might be deprecated soon, as well as user would need to
specify custom ELF section with SEC() attribute anyway, so might as well
stick to just .data.* and .rodata.* convention.

User-visible map name for such new maps is going to be just their ELF
section names.

  [0] https://github.com/libbpf/libbpf/issues/274

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
421213a052 libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps
Remove internal libbpf assumption that there can be only one .rodata,
.data, and .bss map per BPF object. To achieve that, extend and
generalize the scheme that was used for keeping track of relocation ELF
sections. Now each ELF section has a temporary extra index that keeps
track of logical type of ELF section (relocations, data, read-only data,
BSS). Switch relocation to this scheme, as well as .rodata/.data/.bss
handling.

We don't yet allow multiple .rodata, .data, and .bss sections, but no
libbpf internal code makes an assumption that there can be only one of
each and thus they can be explicitly referenced by a single index. Next
patches will actually allow multiple .rodata and .data sections.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
4cafbf7527 libbpf: Use Elf64-specific types explicitly for dealing with ELF
Minimize the usage of class-agnostic gelf_xxx() APIs from libelf. These
APIs require copying ELF data structures into local GElf_xxx structs and
have a more cumbersome API. BPF ELF file is defined to be always 64-bit
ELF object, even when intended to be run on 32-bit host architectures,
so there is no need to do class-agnostic conversions everywhere. BPF
static linker implementation within libbpf has been using Elf64-specific
types since initial implementation.

Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more
succinct direct access to ELF symbol and relocation records within ELF
data itself and switch all the GElf_xxx usage into Elf64_xxx
equivalents. The only remaining place within libbpf.c that's still using
gelf API is gelf_getclass(), as there doesn't seem to be a direct way to
get underlying ELF bitness.

No functional changes intended.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
f687443178 libbpf: Extract ELF processing state into separate struct
Name currently anonymous internal struct that keeps ELF-related state
for bpf_object. Just a bit of clean up, no functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Andrii Nakryiko
38fb8cfc0c libbpf: Deprecate btf__finalize_data() and move it into libbpf.c
There isn't a good use case where anyone but libbpf itself needs to call
btf__finalize_data(). It was implemented for internal use and it's not
clear why it was made into public API in the first place. To function, it
requires active ELF data, which is stored inside bpf_object for the
duration of opening phase only. But the only BTF that needs bpf_object's
ELF is that bpf_object's BTF itself, which libbpf fixes up automatically
during bpf_object__open() operation anyways. There is no need for any
additional fix up and no reasonable scenario where it's useful and
appropriate.

Thus, btf__finalize_data() is just an API atavism and is better removed.
So this patch marks it as deprecated immediately (v0.6+) and moves the
code from btf.c into libbpf.c where it's used in the context of
bpf_object opening phase. Such code co-location allows to make code
structure more straightforward and remove bpf_object__section_size() and
bpf_object__variable_offset() internal helpers from libbpf_internal.h,
making them static. Their naming is also adjusted to not create
a wrong illusion that they are some sort of method of bpf_object. They
are internal helpers and are called appropriately.

This is part of libbpf 1.0 effort ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/276

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org
2021-10-22 18:38:55 -07:00
Dave Marchevsky
45115706b6 bpf: Add verified_insns to bpf_prog_info and fdinfo
This stat is currently printed in the verifier log and not stored
anywhere. To ease consumption of this data, add a field to bpf_prog_aux
so it can be exposed via BPF_OBJ_GET_INFO_BY_FD and fdinfo.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211020074818.1017682-2-davemarchevsky@fb.com
2021-10-22 18:38:55 -07:00
Ilya Leoshkevich
bde69b0ee0 libbpf: Fix ptr_is_aligned() usages
Currently ptr_is_aligned() takes size, and not alignment, as a
parameter, which may be overly pessimistic e.g. for __i128 on s390,
which must be only 8-byte aligned. Fix by using btf__align_of().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211021104658.624944-2-iii@linux.ibm.com
2021-10-22 18:38:55 -07:00
Hengqi Chen
19c6144c09 bpf: Add bpf_skc_to_unix_sock() helper
The helper is used in tracing programs to cast a socket
pointer to a unix_sock pointer.
The return value could be NULL if the casting is illegal.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211021134752.1223426-2-hengqi.chen@gmail.com
2021-10-22 18:38:55 -07:00
Ilya Leoshkevich
760c39208c libbpf: Fix dumping non-aligned __int128
Non-aligned integers are dumped as bitfields, which is supported for at
most 64-bit integers. Fix by using the same trick as
btf_dump_float_data(): copy non-aligned values to the local buffer.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-4-iii@linux.ibm.com
2021-10-22 18:38:55 -07:00
Ilya Leoshkevich
fa93001e85 libbpf: Fix dumping big-endian bitfields
On big-endian arches not only bytes, but also bits are numbered in
reverse order (see e.g. S/390 ELF ABI Supplement, but this is also true
for other big-endian arches as well).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211013160902.428340-3-iii@linux.ibm.com
2021-10-22 18:38:55 -07:00
Dave Marchevsky
50028712c4 libbpf: Migrate internal use of bpf_program__get_prog_info_linear
In preparation for bpf_program__get_prog_info_linear deprecation, move
the single use in libbpf to call bpf_obj_get_info_by_fd directly.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211011082031.4148337-2-davemarchevsky@fb.com
2021-10-22 18:38:55 -07:00
Yonghong Song
0f3ba10651 bpf: Rename BTF_KIND_TAG to BTF_KIND_DECL_TAG
Patch set [1] introduced BTF_KIND_TAG to allow tagging
declarations for struct/union, struct/union field, var, func
and func arguments and these tags will be encoded into
dwarf. They are also encoded to btf by llvm for the bpf target.

After BTF_KIND_TAG is introduced, we intended to use it
for kernel __user attributes. But kernel __user is actually
a type attribute. Upstream and internal discussion showed
it is not a good idea to mix declaration attribute and
type attribute. So we proposed to introduce btf_type_tag
as a type attribute and existing btf_tag renamed to
btf_decl_tag ([2]).

This patch renamed BTF_KIND_TAG to BTF_KIND_DECL_TAG and some
other declarations with *_tag to *_decl_tag to make it clear
the tag is for declaration. In the future, BTF_KIND_TYPE_TAG
might be introduced per [3].

 [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
 [2] https://reviews.llvm.org/D111588
 [3] https://reviews.llvm.org/D111199

Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG")
Fixes: 5b84bd10363e ("libbpf: Add support for BTF_KIND_TAG")
Fixes: 5c07f2fec003 ("bpftool: Add support for BTF_KIND_TAG")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com
2021-10-22 18:38:55 -07:00
Evgeny Vereshchagin
ebf17ac628 README: add OSS-Fuzz badge
Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>
2021-10-22 09:20:25 -07:00
Evgeny Vereshchagin
06390e2371 [coverity] skip forks
to stop GitHub from sending notifications about failed attemtps
to run the Coverity workflow without `COVERITY_SCAN_TOKEN` like
https://github.com/evverx/libbpf/actions/runs/1364759947

Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>
2021-10-21 16:13:30 -07:00
Thiago Marques
720324afab Merge remote-tracking branch 'upstream/master' 2021-10-19 16:58:24 +00:00
Andrii Nakryiko
92c1e61a60 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   0e545dbaa2797133f57bf8387e8f74cd245cedea
Checkpoint bpf-next commit: 5319255b8df9271474bc9027cabf82253934f28d
Baseline bpf commit:        d0c6416bd7091647f6041599f396bfa19ae30368
Checkpoint bpf commit:      8d6c414cd2fb74aa6812e9bfec6178f8246c4f3a

Hou Tao (1):
  libbpf: Support detecting and attaching of writable tracepoint program

 src/libbpf.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

--
2.30.2
2021-10-11 12:30:36 -07:00
Hou Tao
133e3603ec libbpf: Support detecting and attaching of writable tracepoint program
Program on writable tracepoint is BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
but its attachment is the same as BPF_PROG_TYPE_RAW_TRACEPOINT.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211004094857.30868-3-hotforest@gmail.com
2021-10-11 12:30:36 -07:00
Andrii Nakryiko
f9f6e92458 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   38261f369fb905552ebdd3feb9699c0788fd3371
Checkpoint bpf-next commit: 0e545dbaa2797133f57bf8387e8f74cd245cedea
Baseline bpf commit:        571fa247ab411f3233eeaaf837c6e646a513b9f8
Checkpoint bpf commit:      d0c6416bd7091647f6041599f396bfa19ae30368

Alexei Starovoitov (1):
  libbpf: Make gen_loader data aligned.

Andrii Nakryiko (2):
  libbpf: Add API that copies all BTF types from one BTF object to
    another
  libbpf: Fix memory leak in strset

Grant Seltzer (1):
  libbpf: Add API documentation convention guidelines

Hengqi Chen (3):
  libbpf: Support uniform BTF-defined key/value specification across all
    BPF maps
  libbpf: Deprecate bpf_object__unload() API since v0.6
  libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7

Kumar Kartikeya Dwivedi (5):
  libbpf: Fix skel_internal.h to set errno on loader retval < 0
  libbpf: Support kernel module function calls
  libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
  libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
  libbpf: Fix segfault in light skeleton for objects without BTF

Toke Høiland-Jørgensen (1):
  libbpf: Properly ignore STT_SECTION symbols in legacy map definitions

 docs/libbpf_naming_convention.rst |  40 ++++
 src/bpf.c                         |   1 +
 src/bpf_gen_internal.h            |  16 +-
 src/btf.c                         | 132 ++++++++++++-
 src/btf.h                         |  22 +++
 src/gen_loader.c                  | 317 +++++++++++++++++++++++++-----
 src/libbpf.c                      | 165 ++++++++++++----
 src/libbpf.h                      |  36 ++--
 src/libbpf.map                    |   5 +
 src/libbpf_internal.h             |   3 +
 src/skel_internal.h               |   6 +-
 src/strset.c                      |   1 +
 12 files changed, 633 insertions(+), 111 deletions(-)

--
2.30.2
2021-10-06 14:40:18 -07:00
Andrii Nakryiko
b05bace770 libbpf: Fix memory leak in strset
Free struct strset itself, not just its internal parts.

Fixes: 90d76d3ececc ("libbpf: Extract internal set-of-strings datastructure APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211001185910.86492-1-andrii@kernel.org
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
7022527a7b libbpf: Fix segfault in light skeleton for objects without BTF
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf, however for
the sequence below clang adds no .BTF section (hence no BTF).

Reproducer:

  $ touch a.bpf.c
  $ clang -O2 -g -target bpf -c a.bpf.c
  $ bpftool gen skeleton -L a.bpf.o
  /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
  /* THIS FILE IS AUTOGENERATED! */

  struct a_bpf {
	struct bpf_loader_ctx ctx;
  Segmentation fault (core dumped)

The same occurs for files compiled without BTF info, i.e. without
clang's -g flag.

Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210930061634.1840768-1-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
d3169df794 libbpf: Deprecate bpf_{map,program}__{prev,next} APIs since v0.7
Deprecate bpf_{map,program}__{prev,next} APIs. Replace them with
a new set of APIs named bpf_object__{prev,next}_{program,map} which
follow the libbpf API naming convention ([0]). No functionality changes.

  [0] Closes: https://github.com/libbpf/libbpf/issues/296

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211003165844.4054931-2-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
d665ca0bb0 libbpf: Deprecate bpf_object__unload() API since v0.6
BPF objects are not reloadable after unload. Users are expected to use
bpf_object__close() to unload and free up resources in one operation.
No need to expose bpf_object__unload() as a public API, deprecate it
([0]).  Add bpf_object__unload() as an alias to internal
bpf_object_unload() and replace all bpf_object__unload() uses to avoid
compilation errors.

  [0] Closes: https://github.com/libbpf/libbpf/issues/290

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211002161000.3854559-1-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Grant Seltzer
744fd961c7 libbpf: Add API documentation convention guidelines
This adds a section to the documentation for libbpf
naming convention which describes how to document
API features in libbpf, specifically the format of
which API doc comments need to conform to.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211004215644.497327-1-grantseltzer@gmail.com
2021-10-06 14:40:18 -07:00
Andrii Nakryiko
13ebb60ab6 libbpf: Add API that copies all BTF types from one BTF object to another
Add a bulk copying api, btf__add_btf(), that speeds up and simplifies
appending entire contents of one BTF object to another one, taking care
of copying BTF type data, adjusting resulting BTF type IDs according to
their new locations in the destination BTF object, as well as copying
and deduplicating all the referenced strings and updating all the string
offsets in new BTF types as appropriate.

This API is intended to be used from tools that are generating and
otherwise manipulating BTFs generically, such as pahole. In pahole's
case, this API is useful for speeding up parallelized BTF encoding, as
it allows pahole to offload all the intricacies of BTF type copying to
libbpf and handle the parallelization aspects of the process.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
3298393748 libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations
This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
relocations, with support for weak kfunc relocations. The general idea
is to move map_fds to loader map, and also use the data for storing
kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
kept together.

For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
chances of being <= INT16_MAX than treating data map as a sparse array
and adding fd as needed.

When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
array model, so that as long as it does remain <= INT16_MAX, we pass an
index relative to the start of fd_array.

We store all ksyms in an array where we try to avoid calling the
bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
already stored. This also speeds up the loading process compared to
emitting calls in all cases, in later tests.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
0141d9dded libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0
Preserve these calls as it allows verifier to succeed in loading the
program if they are determined to be unreachable after dead code
elimination during program load. If not, the verifier will fail at
runtime. This is done for ext->is_weak symbols similar to the case for
variable ksyms.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
7dde8f8f8d libbpf: Support kernel module function calls
This patch adds libbpf support for kernel module function call support.
The fd_array parameter is used during BPF program load to pass module
BTFs referenced by the program. insn->off is set to index into this
array, but starts from 1, because insn->off as 0 is reserved for
btf_vmlinux.

We try to use existing insn->off for a module, since the kernel limits
the maximum distinct module BTFs for kfuncs to 256, and also because
index must never exceed the maximum allowed value that can fit in
insn->off (INT16_MAX). In the future, if kernel interprets signed offset
as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.

Also introduce a btf__find_by_name_kind_own helper to start searching
from module BTF's start id when we know that the BTF ID is not present
in vmlinux BTF (in find_ksym_btf_id).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Hengqi Chen
962f241379 libbpf: Support uniform BTF-defined key/value specification across all BPF maps
A bunch of BPF maps do not support specifying BTF types for key and value.
This is non-uniform and inconvenient[0]. Currently, libbpf uses a retry
logic which removes BTF type IDs when BPF map creation failed. Instead
of retrying, this commit recognizes those specialized maps and removes
BTF type IDs when creating BPF map.

  [0] Closes: https://github.com/libbpf/libbpf/issues/355

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210930161456.3444544-2-hengqi.chen@gmail.com
2021-10-06 14:40:18 -07:00
Kumar Kartikeya Dwivedi
8c2d905ff4 libbpf: Fix skel_internal.h to set errno on loader retval < 0
When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in skel pointer being NULL, while ASSERT_OK_PTR returning 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.

In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.

Fixes: 67234743736a (libbpf: Generate loader program out of BPF ELF file.)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Toke Høiland-Jørgensen
531be4d879 libbpf: Properly ignore STT_SECTION symbols in legacy map definitions
The previous patch to ignore STT_SECTION symbols only added the ignore
condition in one of them. This fails if there's more than one map
definition in the 'maps' section, because the subsequent modulus check will
fail, resulting in error messages like:

libbpf: elf: unable to determine legacy map definition size in ./xdpdump_xdp.o

Fix this by also ignoring STT_SECTION in the first loop.

Fixes: c3e8c44a9063 ("libbpf: Ignore STT_SECTION symbols in 'maps' section")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210929213837.832449-1-toke@redhat.com
2021-10-06 14:40:18 -07:00
Alexei Starovoitov
f49c65d216 libbpf: Make gen_loader data aligned.
Align gen_loader data to 8 byte boundary to make sure union bpf_attr,
bpf_insns and other structs are aligned.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-9-memxor@gmail.com
2021-10-06 14:40:18 -07:00
Sergei Iudin
b0feb9b4d5 Remove caching of pahole test result
Initial idea was to run it hourly, but when it runs hourly it produce a
lot of useless noise and we had to switch it do daily. When we run it
daily caching make much less sense and only make debugging more complex.
2021-10-04 11:20:32 -07:00
Andrii Nakryiko
e671a47bc2 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   69cd823956ba8ce266a901170b1060db8073bddd
Checkpoint bpf-next commit: 38261f369fb905552ebdd3feb9699c0788fd3371
Baseline bpf commit:        bc23f724481759d0fac61dfb5ce979af2190bbe0
Checkpoint bpf commit:      571fa247ab411f3233eeaaf837c6e646a513b9f8

Andrii Nakryiko (15):
  libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
  libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
  libbpf: Allow skipping attach_func_name in
    bpf_program__set_attach_target()
  libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
  libbpf: Constify all high-level program attach APIs
  libbpf: Fix memory leak in legacy kprobe attach logic
  libbpf: Refactor and simplify legacy kprobe code
  libbpf: Add legacy uprobe attaching support
  libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
  libbpf: Refactor internal sec_def handling to enable pluggability
  libbpf: Reduce reliance of attach_fns on sec_def internals
  libbpf: Refactor ELF section handler definitions
  libbpf: Complete SEC() table unification for
    BPF_APROG_SEC/BPF_EAPROG_SEC
  libbpf: Add opt-in strict BPF program section name handling logic
  selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup")
    use

Dave Marchevsky (4):
  bpf: Add bpf_trace_vprintk helper
  libbpf: Modify bpf_printk to choose helper based on arg count
  libbpf: Use static const fmt string in __bpf_printk
  bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf
    comments

Grant Seltzer (1):
  libbpf: Add doc comments in libbpf.h

Kumar Kartikeya Dwivedi (1):
  libbpf: Fix segfault in static linker for objects without BTF

Matteo Croce (1):
  bpf: Update bpf_get_smp_processor_id() documentation

Toke Høiland-Jørgensen (1):
  libbpf: Ignore STT_SECTION symbols in 'maps' section

 include/uapi/linux/bpf.h |  18 +-
 src/bpf_helpers.h        |  51 ++-
 src/libbpf.c             | 882 +++++++++++++++++++++++----------------
 src/libbpf.h             | 106 +++--
 src/libbpf_common.h      |   5 +
 src/libbpf_internal.h    |   7 +
 src/libbpf_legacy.h      |   9 +
 src/linker.c             |   8 +-
 8 files changed, 679 insertions(+), 407 deletions(-)

--
2.30.2
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
151e3cb314 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-09-29 12:05:46 -07:00
Kumar Kartikeya Dwivedi
2cfeea135c libbpf: Fix segfault in static linker for objects without BTF
When a BPF object is compiled without BTF info (without -g),
trying to link such objects using bpftool causes a SIGSEGV due to
btf__get_nr_types accessing obj->btf which is NULL. Fix this by
checking for the NULL pointer, and return error.

Reproducer:
$ cat a.bpf.c
extern int foo(void);
int bar(void) { return foo(); }
$ cat b.bpf.c
int foo(void) { return 0; }
$ clang -O2 -target bpf -c a.bpf.c
$ clang -O2 -target bpf -c b.bpf.c
$ bpftool gen obj out a.bpf.o b.bpf.o
Segmentation fault (core dumped)

After fix:
$ bpftool gen obj out a.bpf.o b.bpf.o
libbpf: failed to find BTF info for object 'a.bpf.o'
Error: failed to link 'a.bpf.o': Unknown error -22 (-22)

Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables)
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3fe8ee2edb selftests/bpf: Switch sk_lookup selftests to strict SEC("sk_lookup") use
Update "sk_lookup/" definition to be a stand-alone type specifier,
with backwards-compatible prefix match logic in non-libbpf-1.0 mode.

Currently in selftests all the "sk_lookup/<whatever>" uses just use
<whatever> for duplicated unique name encoding, which is redundant as
BPF program's name (C function name) uniquely and descriptively
identifies the intended use for such BPF programs.

With libbpf's SEC_DEF("sk_lookup") definition updated, switch existing
sk_lookup programs to use "unqualified" SEC("sk_lookup") section names,
with no random text after it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-11-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
823881b4f6 libbpf: Add opt-in strict BPF program section name handling logic
Implement strict ELF section name handling for BPF programs. It utilizes
`libbpf_set_strict_mode()` framework and adds new flag: LIBBPF_STRICT_SEC_NAME.

If this flag is set, libbpf will enforce exact section name matching for
a lot of program types that previously allowed just partial prefix
match. E.g., if previously SEC("xdp_whatever_i_want") was allowed, now
in strict mode only SEC("xdp") will be accepted, which makes SEC("")
definitions cleaner and more structured. SEC() now won't be used as yet
another way to uniquely encode BPF program identifier (for that
C function name is better and is guaranteed to be unique within
bpf_object). Now SEC() is strictly BPF program type and, depending on
program type, extra load/attach parameter specification.

Libbpf completely supports multiple BPF programs in the same ELF
section, so multiple BPF programs of the same type/specification easily
co-exist together within the same bpf_object scope.

Additionally, a new (for now internal) convention is introduced: section
name that can be a stand-alone exact BPF program type specificator, but
also could have extra parameters after '/' delimiter. An example of such
section is "struct_ops", which can be specified by itself, but also
allows to specify the intended operation to be attached to, e.g.,
"struct_ops/dctcp_init". Note, that "struct_ops_some_op" is not allowed.
Such section definition is specified as "struct_ops+".

This change is part of libbpf 1.0 effort ([0], [1]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/271
  [1] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-10-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
20c1aa11b1 libbpf: Complete SEC() table unification for BPF_APROG_SEC/BPF_EAPROG_SEC
Complete SEC() table refactoring towards unified form by rewriting
BPF_APROG_SEC and BPF_EAPROG_SEC definitions with
SEC_DEF(SEC_ATTACHABLE_OPT) (for optional expected_attach_type) and
SEC_DEF(SEC_ATTACHABLE) (mandatory expected_attach_type), respectively.
Drop BPF_APROG_SEC, BPF_EAPROG_SEC, and BPF_PROG_SEC_IMPL macros after
that, leaving SEC_DEF() macro as the only one used.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-9-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
62ea715f71 libbpf: Refactor ELF section handler definitions
Refactor ELF section handler definitions table to use a set of flags and
unified SEC_DEF() macro. This allows for more succinct and table-like
set of definitions, and allows to more easily extend the logic without
adding more verbosity (this is utilized in later patches in the series).

This approach is also making libbpf-internal program pre-load callback
not rely on bpf_sec_def definition, which demonstrates that future
pluggable ELF section handlers will be able to achieve similar level of
integration without libbpf having to expose extra types and APIs.

For starters, update SEC_DEF() definitions and make them more succinct.
Also convert BPF_PROG_SEC() and BPF_APROG_COMPAT() definitions to
a common SEC_DEF() use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-8-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3c5c62097e libbpf: Reduce reliance of attach_fns on sec_def internals
Move closer to not relying on bpf_sec_def internals that won't be part
of public API, when pluggable SEC() handlers will be allowed. Drop
pre-calculated prefix length, and in various helpers don't rely on this
prefix length availability. Also minimize reliance on knowing
bpf_sec_def's prefix for few places where section prefix shortcuts are
supported (e.g., tp vs tracepoint, raw_tp vs raw_tracepoint).

Given checking some string for having a given string-constant prefix is
such a common operation and so annoying to be done with pure C code, add
a small macro helper, str_has_pfx(), and reuse it throughout libbpf.c
where prefix comparison is performed. With __builtin_constant_p() it's
possible to have a convenient helper that checks some string for having
a given prefix, where prefix is either string literal (or compile-time
known string due to compiler optimization) or just a runtime string
pointer, which is quite convenient and saves a lot of typing and string
literal duplication.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-7-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
fba5f02401 libbpf: Refactor internal sec_def handling to enable pluggability
Refactor internals of libbpf to allow adding custom SEC() handling logic
easily from outside of libbpf. To that effect, each SEC()-handling
registration sets mandatory program type/expected attach type for
a given prefix and can provide three callbacks called at different
points of BPF program lifetime:

  - init callback for right after bpf_program is initialized and
  prog_type/expected_attach_type is set. This happens during
  bpf_object__open() step, close to the very end of constructing
  bpf_object, so all the libbpf APIs for querying and updating
  bpf_program properties should be available;

  - pre-load callback is called right before BPF_PROG_LOAD command is
  called in the kernel. This callbacks has ability to set both
  bpf_program properties, as well as program load attributes, overriding
  and augmenting the standard libbpf handling of them;

  - optional auto-attach callback, which makes a given SEC() handler
  support auto-attachment of a BPF program through bpf_program__attach()
  API and/or BPF skeletons <skel>__attach() method.

Each callbacks gets a `long cookie` parameter passed in, which is
specified during SEC() handling. This can be used by callbacks to lookup
whatever additional information is necessary.

This is not yet completely ready to be exposed to the outside world,
mainly due to non-public nature of struct bpf_prog_load_params. Instead
of making it part of public API, we'll wait until the planned low-level
libbpf API improvements for BPF_PROG_LOAD and other typical bpf()
syscall APIs, at which point we'll have a public, probably OPTS-based,
way to fully specify BPF program load parameters, which will be used as
an interface for custom pre-load callbacks.

But this change itself is already a good first step to unify the BPF
program hanling logic even within the libbpf itself. As one example, all
the extra per-program type handling (sleepable bit, attach_btf_id
resolution, unsetting optional expected attach type) is now more obvious
and is gathered in one place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-6-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
bb833f8129 libbpf: Add "tc" SEC_DEF which is a better name for "classifier"
As argued in [0], add "tc" ELF section definition for SCHED_CLS BPF
program type. "classifier" is a misleading terminology and should be
migrated away from.

  [0] https://lore.kernel.org/bpf/270e27b1-e5be-5b1c-b343-51bd644d0747@iogearbox.net/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210928161946.2512801-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Toke Høiland-Jørgensen
cb12e83136 libbpf: Ignore STT_SECTION symbols in 'maps' section
When parsing legacy map definitions, libbpf would error out when
encountering an STT_SECTION symbol. This becomes a problem because some
versions of binutils will produce SECTION symbols for every section when
processing an ELF file, so BPF files run through 'strip' will end up with
such symbols, making libbpf refuse to load them.

There's not really any reason why erroring out is strictly necessary, so
change libbpf to just ignore SECTION symbols when parsing the ELF.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
09b528e847 libbpf: Add legacy uprobe attaching support
Similarly to recently added legacy kprobe attach interface support
through tracefs, support attaching uprobes using the legacy interface if
host kernel doesn't support newer FD-based interface.

For uprobes event name consists of "libbpf_" prefix, PID, sanitized
binary path and offset within that binary. Structuraly the code is
aligned with kprobe logic refactoring in previous patch. struct
bpf_link_perf is re-used and all the same legacy_probe_name and
legacy_is_retprobe fields are used to ensure proper cleanup on
bpf_link__destroy().

Users should be aware, though, that on old kernels which don't support
FD-based interface for kprobe/uprobe attachment, if the application
crashes before bpf_link__destroy() is called, uprobe legacy
events will be left in tracefs. This is the same limitation as with
legacy kprobe interfaces.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-5-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
9ecc0dcc19 libbpf: Refactor and simplify legacy kprobe code
Refactor legacy kprobe handling code to follow the same logic as uprobe
legacy logic added in the next patchs:
  - add append_to_file() helper that makes it simpler to work with
    tracefs file-based interface for creating and deleting probes;
  - move out probe/event name generation outside of the code that
    adds/removes it, which simplifies bookkeeping significantly;
  - change the probe name format to start with "libbpf_" prefix and
    include offset within kernel function;
  - switch 'unsigned long' to 'size_t' for specifying kprobe offsets,
    which is consistent with how uprobes define that, simplifies
    printf()-ing internally, and also avoids unnecessary complications on
    architectures where sizeof(long) != sizeof(void *).

This patch also implicitly fixes the problem with invalid open() error
handling present in poke_kprobe_events(), which (the function) this
patch removes.

Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-4-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
246b61780b libbpf: Fix memory leak in legacy kprobe attach logic
In some error scenarios legacy_probe string won't be free()'d. Fix this.
This was reported by Coverity static analysis.

Fixes: ca304b40c20d ("libbpf: Introduce legacy kprobe events support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210921210036.1545557-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Grant Seltzer
cdf14ff22b libbpf: Add doc comments in libbpf.h
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

These doc comments are for:
- bpf_object__find_map_by_name()
- bpf_map__fd()
- bpf_map__is_internal()
- libbpf_get_error()
- libbpf_num_possible_cpus()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210918031457.36204-1-grantseltzer@gmail.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
e1966114cc bpf: Clarify data_len param in bpf_snprintf and bpf_seq_printf comments
Since the data_len in these two functions is a byte len of the preceding
u64 *data array, it must always be a multiple of 8. If this isn't the
case both helpers error out, so let's make the requirement explicit so
users don't need to infer it.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-10-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
989d7189cd libbpf: Use static const fmt string in __bpf_printk
The __bpf_printk convenience macro was using a 'char' fmt string holder
as it predates support for globals in libbpf. Move to more efficient
'static const char', but provide a fallback to the old way via
BPF_NO_GLOBAL_DATA so users on old kernels can still use the macro.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-6-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
18922504c3 libbpf: Modify bpf_printk to choose helper based on arg count
Instead of being a thin wrapper which calls into bpf_trace_printk,
libbpf's bpf_printk convenience macro now chooses between
bpf_trace_printk and bpf_trace_vprintk. If the arg count (excluding
format string) is >3, use bpf_trace_vprintk, otherwise use the older
helper.

The motivation behind this added complexity - instead of migrating
entirely to bpf_trace_vprintk - is to maintain good developer experience
for users compiling against new libbpf but running on older kernels.
Users who are passing <=3 args to bpf_printk will see no change in their
bytecode.

__bpf_vprintk functions similarly to BPF_SEQ_PRINTF and BPF_SNPRINTF
macros elsewhere in the file - it allows use of bpf_trace_vprintk
without manual conversion of varargs to u64 array. Previous
implementation of bpf_printk macro is moved to __bpf_printk for use by
the new implementation.

This does change behavior of bpf_printk calls with >3 args in the "new
libbpf, old kernels" scenario. Before this patch, attempting to use 4
args to bpf_printk results in a compile-time error. After this patch,
using bpf_printk with 4 args results in a trace_vprintk helper call
being emitted and a load-time failure on older kernels.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-5-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Dave Marchevsky
029273039d bpf: Add bpf_trace_vprintk helper
This helper is meant to be "bpf_trace_printk, but with proper vararg
support". Follow bpf_snprintf's example and take a u64 pseudo-vararg
array. Write to /sys/kernel/debug/tracing/trace_pipe using the same
mechanism as bpf_trace_printk. The functionality of this helper was
requested in the libbpf issue tracker [0].

[0] Closes: https://github.com/libbpf/libbpf/issues/315

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210917182911.2426606-4-davemarchevsky@fb.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
b177fac2e2 libbpf: Constify all high-level program attach APIs
Attach APIs shouldn't need to modify bpf_program/bpf_map structs, so
change all struct bpf_program and struct bpf_map pointers to const
pointers. This is completely backwards compatible with no functional
change.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-8-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
3e7c04669e libbpf: Schedule open_opts.attach_prog_fd deprecation since v0.7
bpf_object_open_opts.attach_prog_fd makes a pretty strong assumption
that bpf_object contains either only single freplace BPF program or all
of BPF programs in BPF object are freplaces intended to replace
different subprograms of the same target BPF program. This seems both
a bit confusing, too assuming, and limiting.

We've had bpf_program__set_attach_target() API which allows more
fine-grained control over this, on a per-program level. As such, mark
open_opts.attach_prog_fd as deprecated starting from v0.7, so that we
have one more universal way of setting freplace targets. With previous
change to allow NULL attach_func_name argument, and especially combined
with BPF skeleton, arguable bpf_program__set_attach_target() is a more
convenient and explicit API as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-7-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
599b999f5d libbpf: Allow skipping attach_func_name in bpf_program__set_attach_target()
Allow to use bpf_program__set_attach_target to only set target attach
program FD, while letting libbpf to use target attach function name from
SEC() definition. This might be useful for some scenarios where
bpf_object contains multiple related freplace BPF programs intended to
replace different sub-programs in target BPF program. In such case all
programs will have the same attach_prog_fd, but different
attach_func_name. It's convenient to specify such target function names
declaratively in SEC() definitions, but attach_prog_fd is a dynamic
runtime setting.

To simplify such scenario, allow bpf_program__set_attach_target() to
delay BTF ID resolution till the BPF program load time by providing NULL
attach_func_name. In that case the behavior will be similar to using
bpf_object_open_opts.attach_prog_fd (which is marked deprecated since
v0.7), but has the benefit of allowing more control by user in what is
attached to what. Such setup allows having BPF programs attached to
different target attach_prog_fd with target functions still declaratively
recorded in BPF source code in SEC() definitions.

Selftests changes in the next patch should make this more obvious.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-5-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
e856a7d560 libbpf: Deprecated bpf_object_open_opts.relaxed_core_relocs
It's relevant and hasn't been doing anything for a long while now.
Deprecated it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-4-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
0a76ce1395 libbpf: Use pre-setup sec_def in libbpf_find_attach_btf_id()
Don't perform another search for sec_def inside
libbpf_find_attach_btf_id(), as each recognized bpf_program already has
prog->sec_def set.

Also remove unnecessary NULL check for prog->sec_name, as it can never
be NULL.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210916015836.1248906-2-andrii@kernel.org
2021-09-29 12:05:46 -07:00
Matteo Croce
1eba3a6470 bpf: Update bpf_get_smp_processor_id() documentation
BPF programs run with migration disabled regardless of preemption, as
they are protected by migrate_disable(). Update the uapi documentation
accordingly.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210914235400.59427-1-mcroce@linux.microsoft.com
2021-09-29 12:05:46 -07:00
Andrii Nakryiko
7d365e49f3 ci: use -a/-d with exact string match for black/white-listing
Doing substring matches allows accidental new tests to be enabled,
when they are not supposed to be. E.g., whitelisting "xdp" allows new
"xdpwall" test on 5.5.0, which wasn't supposed to happen.

Cc: Yucong Sun <fallentree@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-29 11:15:38 -07:00
Andrii Nakryiko
980777cc16 vmtest: temporarily blacklist spammy selftests
There is a problem in bpf-next tree which causes get_stack_raw_tp and
few other selftests to produce tons of kernel warnings, timing out and
failing CI test runs. Blacklist until bpf tree, which has a fix, is
merged into bpf-next.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
24dbcb3a30 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   47bb27a20d6ea22cd092c1fc2bb4fcecac374838
Checkpoint bpf-next commit: 69cd823956ba8ce266a901170b1060db8073bddd
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      bc23f724481759d0fac61dfb5ce979af2190bbe0

Andrii Nakryiko (4):
  libbpf: Make libbpf_version.h non-auto-generated
  libbpf: Ensure BPF prog types are set before relocations
  libbpf: Simplify BPF program auto-attach code
  libbpf: Minimize explicit iterator of section definition array

Grant Seltzer (1):
  libbpf: Add sphinx code documentation comments

Quentin Monnet (1):
  libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API
    deprecations

Rafael David Tinoco (1):
  libbpf: Introduce legacy kprobe events support

Rocco Yue (1):
  ipv6: add IFLA_INET6_RA_MTU to expose mtu value

Song Liu (1):
  bpf: Introduce helper bpf_get_branch_snapshot

Vadim Fedorenko (1):
  bpf: Add hardware timestamp field to __sk_buff

Yonghong Song (4):
  btf: Change BTF_KIND_* macros to enums
  bpf: Support for new btf kind BTF_KIND_TAG
  libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
  libbpf: Add support for BTF_KIND_TAG

 include/uapi/linux/bpf.h     |  24 +++
 include/uapi/linux/btf.h     |  55 ++++--
 include/uapi/linux/if_link.h |   1 +
 src/btf.c                    |  84 +++++++-
 src/btf.h                    |  87 +++++++++
 src/btf_dump.c               |   3 +
 src/libbpf.c                 | 359 ++++++++++++++++++++++++-----------
 src/libbpf.map               |   5 +
 src/libbpf_common.h          |  19 ++
 src/libbpf_internal.h        |   2 +
 src/libbpf_version.h         |   9 +
 11 files changed, 505 insertions(+), 143 deletions(-)
 create mode 100644 src/libbpf_version.h

--
2.30.2
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
42da89eb16 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-09-17 14:03:39 -07:00
Grant Seltzer
627cbb395b libbpf: Add sphinx code documentation comments
This adds comments above five functions in btf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210915021951.117186-1-grantseltzer@gmail.com
2021-09-17 14:03:39 -07:00
Yonghong Song
7e7f59d658 libbpf: Add support for BTF_KIND_TAG
Add BTF_KIND_TAG support for parsing and dedup.
Also added sanitization for BTF_KIND_TAG. If BTF_KIND_TAG is not
supported in the kernel, sanitize it to INTs.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223025.246687-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
966ba8918d libbpf: Rename btf_{hash,equal}_int to btf_{hash,equal}_int_tag
This patch renames functions btf_{hash,equal}_int() to
btf_{hash,equal}_int_tag() so they can be reused for
BTF_KIND_TAG support. There is no functionality change for
this patch.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223020.245829-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
fad7357469 bpf: Support for new btf kind BTF_KIND_TAG
LLVM14 added support for a new C attribute ([1])
  __attribute__((btf_tag("arbitrary_str")))
This attribute will be emitted to dwarf ([2]) and pahole
will convert it to BTF. Or for bpf target, this
attribute will be emitted to BTF directly ([3], [4]).
The attribute is intended to provide additional
information for
  - struct/union type or struct/union member
  - static/global variables
  - static/global function or function parameter.

For linux kernel, the btf_tag can be applied
in various places to specify user pointer,
function pre- or post- condition, function
allow/deny in certain context, etc. Such information
will be encoded in vmlinux BTF and can be used
by verifier.

The btf_tag can also be applied to bpf programs
to help global verifiable functions, e.g.,
specifying preconditions, etc.

This patch added basic parsing and checking support
in kernel for new BTF_KIND_TAG kind.

 [1] https://reviews.llvm.org/D106614
 [2] https://reviews.llvm.org/D106621
 [3] https://reviews.llvm.org/D106622
 [4] https://reviews.llvm.org/D109560

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223015.245546-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Yonghong Song
a6188fc5b4 btf: Change BTF_KIND_* macros to enums
Change BTF_KIND_* macros to enums so they are encoded in dwarf and
appear in vmlinux.h. This will make it easier for bpf programs
to use these constants without macro definitions.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210914223009.245307-1-yhs@fb.com
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
d328ba7768 libbpf: Minimize explicit iterator of section definition array
Remove almost all the code that explicitly iterated BPF program section
definitions in favor of using find_sec_def(). The only remaining user of
section_defs is libbpf_get_type_names that has to iterate all of them to
construct its result.

Having one internal API entry point for section definitions will
simplify further refactorings around libbpf's program section
definitions parsing.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-5-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
27d14b6e3b libbpf: Simplify BPF program auto-attach code
Remove the need to explicitly pass bpf_sec_def for auto-attachable BPF
programs, as it is already recorded at bpf_object__open() time for all
recognized type of BPF programs. This further reduces number of explicit
calls to find_sec_def(), simplifying further refactorings.

No functional changes are done by this patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-4-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
f3c744997f libbpf: Ensure BPF prog types are set before relocations
Refactor bpf_object__open() sequencing to perform BPF program type
detection based on SEC() definitions before we get to relocations
collection. This allows to have more information about BPF program by
the time we get to, say, struct_ops relocation gathering. This,
subsequently, simplifies struct_ops logic and removes the need to
perform extra find_sec_def() resolution.

With this patch libbpf will require all struct_ops BPF programs to be
marked with SEC("struct_ops") or SEC("struct_ops/xxx") annotations.
Real-world applications are already doing that through something like
selftests's BPF_STRUCT_OPS() macro. This change streamlines libbpf's
internal handling of SEC() definitions and is in the sprit of
upcoming libbpf-1.0 section strictness changes ([0]).

  [0] https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#stricter-and-more-uniform-bpf-program-section-name-sec-handling

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210914014733.2768-3-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Rafael David Tinoco
749b3942a0 libbpf: Introduce legacy kprobe events support
Allow kprobe tracepoint events creation through legacy interface, as the
kprobe dynamic PMUs support, used by default, was only created in v4.17.

Store legacy kprobe name in struct bpf_perf_link, instead of creating
a new "subclass" off of bpf_perf_link. This is ok as it's just two new
fields, which are also going to be reused for legacy uprobe support in
follow up patches.

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210912064844.3181742-1-rafaeldtinoco@gmail.com
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
8ade99a6f8 libbpf: Make libbpf_version.h non-auto-generated
Turn previously auto-generated libbpf_version.h header into a normal
header file. This prevents various tricky Makefile integration issues,
simplifies the overall build process, but also allows to further extend
it with some more versioning-related APIs in the future.

To prevent accidental out-of-sync versions as defined by libbpf.map and
libbpf_version.h, Makefile checks their consistency at build time.

Simultaneously with this change bump libbpf.map to v0.6.

Also undo adding libbpf's output directory into include path for
kernel/bpf/preload, bpftool, and resolve_btfids, which is not necessary
because libbpf_version.h is just a normal header like any other.

Fixes: 0b46b7550560 ("libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210913222309.3220849-1-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Song Liu
0c0f4a57da bpf: Introduce helper bpf_get_branch_snapshot
Introduce bpf_get_branch_snapshot(), which allows tracing pogram to get
branch trace from hardware (e.g. Intel LBR). To use the feature, the
user need to create perf_event with proper branch_record filtering
on each cpu, and then calls bpf_get_branch_snapshot in the bpf function.
On Intel CPUs, VLBR event (raw event 0x1b00) can be use for this.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210910183352.3151445-3-songliubraving@fb.com
2021-09-17 14:03:39 -07:00
Vadim Fedorenko
03f31a6aed bpf: Add hardware timestamp field to __sk_buff
BPF programs may want to know hardware timestamps if NIC supports
such timestamping.

Expose this data as hwtstamp field of __sk_buff the same way as
gso_segs/gso_size. This field could be accessed from the same
programs as tstamp field, but it's read-only field. Explicit test
to deny access to padding data is added to bpf_skb_is_valid_access.

Also update BPF_PROG_TEST_RUN tests of the feature.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210909220409.8804-2-vfedorenko@novek.ru
2021-09-17 14:03:39 -07:00
Quentin Monnet
f8983f7fb0 libbpf: Add LIBBPF_DEPRECATED_SINCE macro for scheduling API deprecations
Introduce a macro LIBBPF_DEPRECATED_SINCE(major, minor, message) to prepare
the deprecation of two API functions. This macro marks functions as deprecated
when libbpf's version reaches the values passed as an argument.

As part of this change libbpf_version.h header is added with recorded major
(LIBBPF_MAJOR_VERSION) and minor (LIBBPF_MINOR_VERSION) libbpf version macros.
They are now part of libbpf public API and can be relied upon by user code.
libbpf_version.h is installed system-wide along other libbpf public headers.

Due to this new build-time auto-generated header, in-kernel applications
relying on libbpf (resolve_btfids, bpftool, bpf_preload) are updated to
include libbpf's output directory as part of a list of include search paths.
Better fix would be to use libbpf's make_install target to install public API
headers, but that clean up is left out as a future improvement. The build
changes were tested by building kernel (with KBUILD_OUTPUT and O= specified
explicitly), bpftool, libbpf, selftests/bpf, and resolve_btfids builds. No
problems were detected.

Note that because of the constraints of the C preprocessor we have to write
a few lines of macro magic for each version used to prepare deprecation (0.6
for now).

Also, use LIBBPF_DEPRECATED_SINCE() to schedule deprecation of
btf__get_from_id() and btf__load(), which are replaced by
btf__load_from_kernel_by_id() and btf__load_into_kernel(), respectively,
starting from future libbpf v0.6. This is part of libbpf 1.0 effort ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/278

Co-developed-by: Quentin Monnet <quentin@isovalent.com>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210908213226.1871016-1-andrii@kernel.org
2021-09-17 14:03:39 -07:00
Rocco Yue
514bb47ac5 ipv6: add IFLA_INET6_RA_MTU to expose mtu value
The kernel provides a "/proc/sys/net/ipv6/conf/<iface>/mtu"
file, which can temporarily record the mtu value of the last
received RA message when the RA mtu value is lower than the
interface mtu, but this proc has following limitations:

(1) when the interface mtu (/sys/class/net/<iface>/mtu) is
updeated, mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) will
be updated to the value of interface mtu;
(2) mtu6 (/proc/sys/net/ipv6/conf/<iface>/mtu) only affect
ipv6 connection, and not affect ipv4.

Therefore, when the mtu option is carried in the RA message,
there will be a problem that the user sometimes cannot obtain
RA mtu value correctly by reading mtu6.

After this patch set, if a RA message carries the mtu option,
you can send a netlink msg which nlmsg_type is RTM_GETLINK,
and then by parsing the attribute of IFLA_INET6_RA_MTU to
get the mtu value carried in the RA message received on the
inet6 device. In addition, you can also get a link notification
when ra_mtu is updated so it doesn't have to poll.

In this way, if the MTU values that the device receives from
the network in the PCO IPv4 and the RA IPv6 procedures are
different, the user can obtain the correct ipv6 ra_mtu value
and compare the value of ra_mtu and ipv4 mtu, then the device
can use the lower MTU value for both IPv4 and IPv6.

Signed-off-by: Rocco Yue <rocco.yue@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210827150412.9267-1-rocco.yue@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
05d95ef6fa makefile: add libbpf_version.h to the list of installed headers
libbpf_version.h is a new header that is installed along other public
libbpf API headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
Andrii Nakryiko
eda1ebe520 ci: use test_progs -l option to generate the list of whitelisted tests
Use this list of enabled tests as a whitelist, so that we don't have to
keep updating BLACKLIST-5.5.0 anymore. I'll keep BLACKLIST-5.5.0 for
now, because it serves as a nice historic log of which tests depend on
which kernels.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-09-17 14:03:39 -07:00
grantseltzer
7c26fe30f3 Add enum to be displayed in documentation
Signed-off-by: grantseltzer <grantseltzer@gmail.com>
2021-09-16 11:39:22 -07:00
Andrii Nakryiko
5579664205 libbpf: Fix build with latest gcc/binutils with LTO
After updating to binutils 2.35, the build began to fail with an
assembler error. A bug was opened on the Red Hat Bugzilla a few days
later for the same issue.

Work around the problem by using the new `symver` attribute (introduced
in GCC 10) as needed instead of assembler directives.

This addresses Red Hat ([0]) and OpenSUSE ([1]) bug reports, as well as libbpf
issue ([2]).

  [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1863059
  [1]: https://bugzilla.opensuse.org/show_bug.cgi?id=1188749
  [2]: Closes: https://github.com/libbpf/libbpf/issues/338

Co-developed-by: Patrick McCarty <patrick.mccarty@intel.com>
Co-developed-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210907221023.2660953-1-andrii@kernel.org
2021-09-08 16:01:51 -07:00
Matt Smith
860b201cd0 libbpf: Change bpf_object_skeleton data field to const pointer
This change was necessary to enforce the implied contract
that bpf_object_skeleton->data should not be mutated.  The data
will be cast to `void *` during assignment to handle the case
where a user is compiling with older libbpf headers to avoid
a compiler warning of `const void *` data being cast to `void *`

Signed-off-by: Matt Smith <alastorze@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210901194439.3853238-2-alastorze@fb.com
2021-09-08 16:01:51 -07:00
Toke Høiland-Jørgensen
50e13993a1 libbpf: Don't crash on object files with no symbol tables
If libbpf encounters an ELF file that has been stripped of its symbol
table, it will crash in bpf_object__add_programs() when trying to
dereference the obj->efile.symbols pointer.

Fix this by erroring out of bpf_object__elf_collect() if it is not able
able to find the symbol table.

v2:
  - Move check into bpf_object__elf_collect() and add nice error message

Fixes: 6245947c1b3c ("libbpf: Allow gaps in BPF program sections to support overriden weak functions")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210901114812.204720-1-toke@redhat.com
2021-09-08 16:01:51 -07:00
Andrii Nakryiko
72fd44da53 ci: blacklist few new selftests for 5.5
Blacklist netns_cookie, sockopt_qos_to_cc, task_pt_regs for 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
d6e9681b0d sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   d20b41115ad53293201cc07ee429a38740cb056b
Checkpoint bpf-next commit: 47bb27a20d6ea22cd092c1fc2bb4fcecac374838
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Daniel Xu (1):
  bpf: Add bpf_task_pt_regs() helper

Dave Marchevsky (1):
  bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum

 include/uapi/linux/bpf.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

--
2.30.2
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
517762deca sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-08-31 14:21:44 -07:00
Daniel Xu
1670e6100b bpf: Add bpf_task_pt_regs() helper
The motivation behind this helper is to access userspace pt_regs in a
kprobe handler.

uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace
pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a
kprobe handler. The final case (kernelspace pt_regs in uprobe) is
pretty rare (usermode helper) so I think that can be solved later if
necessary.

More concretely, this helper is useful in doing BPF-based DWARF stack
unwinding. Currently the kernel can only do framepointer based stack
unwinds for userspace code. This is because the DWARF state machines are
too fragile to be computed in kernelspace [0]. The idea behind
DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace
stack (while in prog context) and send it up to userspace for unwinding
(probably with libunwind) [1]. This would effectively enable profiling
applications with -fomit-frame-pointer using kprobes and uprobes.

[0]: https://lkml.org/lkml/2012/2/10/356
[1]: https://github.com/danobi/bpf-dwarf-walk

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/e2718ced2d51ef4268590ab8562962438ab82815.1629772842.git.dxu@dxuuu.xyz
2021-08-31 14:21:44 -07:00
Dave Marchevsky
ed529685db bpf: Migrate cgroup_bpf to internal cgroup_bpf_attach_type enum
Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf
attach types and a function to map bpf_attach_type values to the new
enum. Inspired by netns_bpf_attach_type.

Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever
possible.  Functionality is unchanged as attach_type_to_prog_type
switches in bpf/syscall.c were preventing non-cgroup programs from
making use of the invalid cgroup_bpf array slots.

As a result struct cgroup_bpf uses 504 fewer bytes relative to when its
arrays were sized using MAX_BPF_ATTACH_TYPE.

bpf_cgroup_storage is notably not migrated as struct
bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type
member which is not meant to be opaque. Similarly, bpf_cgroup_link
continues to report its bpf_attach_type member to userspace via fdinfo
and bpf_link_info.

To ease disambiguation, bpf_attach_type variables are renamed from
'type' to 'atype' when changed to cgroup_bpf_attach_type.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com
2021-08-31 14:21:44 -07:00
Andrii Nakryiko
c92a5d043e ci: don't hang on kernel crashes in qemu
Backport of kernel-patches PR ([0]), done by @thefallentree.

  [0] https://github.com/kernel-patches/vmtest/pull/29

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-30 12:52:51 -07:00
Thiago Marques
aea40f7179 Merge remote-tracking branch 'upstream/master' 2021-08-20 01:33:45 +00:00
grantseltzer
8bdc267e7b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3c3bd542ffbb2ac09631313ede46ae66660ae550
Checkpoint bpf-next commit: d20b41115ad53293201cc07ee429a38740cb056b
Baseline bpf commit:        3776f3517ed94d40ff0e3851d7ce2ce17b63099f
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Grant Seltzer (1):
  libbpf: Rename libbpf documentation index file

 docs/{libbpf.rst => index.rst}    | 8 ++++++++
 docs/libbpf_naming_convention.rst | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)
 rename docs/{libbpf.rst => index.rst} (75%)

--
2.31.1
2021-08-18 15:22:43 -07:00
Grant Seltzer
d0c398be4f libbpf: Rename libbpf documentation index file
This patch renames a documentation libbpf.rst to index.rst. In order
for readthedocs.org to pick this file up and properly build the
documentation site.

It also changes the title type of the ABI subsection in the
naming convention doc. This is so that readthedocs.org doesn't treat this
section as a separate document.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210818151313.49992-1-grantseltzer@gmail.com
2021-08-18 15:22:43 -07:00
grantseltzer
7d9cc837ef Fix path to Doxygen source code input
Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
2021-08-18 12:28:09 -07:00
Andrii Nakryiko
a3c0cc19d4 ci: blacklist new selftests on 5.5
Blacklist bpf_cookie, perf_link, and xdp_bonding selftests on 5.5 kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
7c6d34a2c9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   372642ea83ff1c71a5d567a704c912359eb59776
Checkpoint bpf-next commit: 3c3bd542ffbb2ac09631313ede46ae66660ae550
Baseline bpf commit:        7c4a22339e7ce7b6ed473a8e682da622c3a774ee
Checkpoint bpf commit:      3776f3517ed94d40ff0e3851d7ce2ce17b63099f

Andrii Nakryiko (8):
  bpf: Implement minimal BPF perf link
  bpf: Allow to specify user-provided bpf_cookie for BPF perf links
  bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value
  libbpf: Remove unused bpf_link's destroy operation, but add dealloc
  libbpf: Use BPF perf link when supported by kernel
  libbpf: Add bpf_cookie support to bpf_link_create() API
  libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach
    APIs
  libbpf: Add uprobe ref counter offset support for USDT semaphores

Hangbin Liu (1):
  bonding: add new option lacp_active

Hao Luo (1):
  libbpf: Support weak typed ksyms.

Randy Dunlap (1):
  libbpf, doc: Eliminate warnings in libbpf_naming_convention

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

grantseltzer (1):
  bpf: Reconfigure libbpf docs to remove unversioned API

 docs/libbpf_api.rst               |  27 ----
 docs/libbpf_naming_convention.rst |   4 +-
 include/uapi/linux/bpf.h          |  25 ++++
 include/uapi/linux/if_link.h      |   1 +
 src/bpf.c                         |  32 ++++-
 src/bpf.h                         |   8 +-
 src/libbpf.c                      | 229 +++++++++++++++++++++++-------
 src/libbpf.h                      |  75 ++++++++--
 src/libbpf.map                    |   3 +
 src/libbpf_internal.h             |  32 +++--
 src/libbpf_probes.c               |   4 +-
 11 files changed, 332 insertions(+), 108 deletions(-)
 delete mode 100644 docs/libbpf_api.rst

--
2.30.2
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
a69c52bb11 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
6d67d53143 libbpf: Add uprobe ref counter offset support for USDT semaphores
When attaching to uprobes through perf subsystem, it's possible to specify
offset of a so-called USDT semaphore, which is just a reference counted u16,
used by kernel to keep track of how many tracers are attached to a given
location. Support for this feature was added in [0], so just wire this through
uprobe_opts. This is important to enable implementing USDT attachment and
tracing through libbpf's bpf_program__attach_uprobe_opts() API.

  [0] a6ca88b241d5 ("trace_uprobe: support reference counter in fd-based uprobe")

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-16-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
91259bc676 libbpf: Add bpf_cookie to perf_event, kprobe, uprobe, and tp attach APIs
Wire through bpf_cookie for all attach APIs that use perf_event_open under the
hood:
  - for kprobes, extend existing bpf_kprobe_opts with bpf_cookie field;
  - for perf_event, uprobe, and tracepoint APIs, add their _opts variants and
    pass bpf_cookie through opts.

For kernel that don't support BPF_LINK_CREATE for perf_events, and thus
bpf_cookie is not supported either, return error and log warning for user.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-12-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
a3f8c5a306 libbpf: Add bpf_cookie support to bpf_link_create() API
Add ability to specify bpf_cookie value when creating BPF perf link with
bpf_link_create() low-level API.

Given BPF_LINK_CREATE command is growing and keeps getting new fields that are
specific to the type of BPF_LINK, extend libbpf side of bpf_link_create() API
and corresponding OPTS struct to accomodate such changes. Add extra checks to
prevent using incompatible/unexpected combinations of fields.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-11-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
d23679b415 libbpf: Use BPF perf link when supported by kernel
Detect kernel support for BPF perf link and prefer it when attaching to
perf_event, tracepoint, kprobe/uprobe. Underlying perf_event FD will be kept
open until BPF link is destroyed, at which point both perf_event FD and BPF
link FD will be closed.

This preserves current behavior in which perf_event FD is open for the
duration of bpf_link's lifetime and user is able to "disconnect" bpf_link from
underlying FD (with bpf_link__disconnect()), so that bpf_link__destroy()
doesn't close underlying perf_event FD.When BPF perf link is used, disconnect
will keep both perf_event and bpf_link FDs open, so it will be up to
(advanced) user to close them. This approach is demonstrated in bpf_cookie.c
selftests, added in this patch set.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-10-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
9923f25600 libbpf: Remove unused bpf_link's destroy operation, but add dealloc
bpf_link->destroy() isn't used by any code, so remove it. Instead, add ability
to override deallocation procedure, with default doing plain free(link). This
is necessary for cases when we want to "subclass" struct bpf_link to keep
extra information, as is the case in the next patch adding struct
bpf_link_perf.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210815070609.987780-9-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
40160ed4d4 bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value
Add new BPF helper, bpf_get_attach_cookie(), which can be used by BPF programs
to get access to a user-provided bpf_cookie value, specified during BPF
program attachment (BPF link creation) time.

Naming is hard, though. With the concept being named "BPF cookie", I've
considered calling the helper:
  - bpf_get_cookie() -- seems too unspecific and easily mistaken with socket
    cookie;
  - bpf_get_bpf_cookie() -- too much tautology;
  - bpf_get_link_cookie() -- would be ok, but while we create a BPF link to
    attach BPF program to BPF hook, it's still an "attachment" and the
    bpf_cookie is associated with BPF program attachment to a hook, not a BPF
    link itself. Technically, we could support bpf_cookie with old-style
    cgroup programs.So I ultimately rejected it in favor of
    bpf_get_attach_cookie().

Currently all perf_event-backed BPF program types support
bpf_get_attach_cookie() helper. Follow-up patches will add support for
fentry/fexit programs as well.

While at it, mark bpf_tracing_func_proto() as static to make it obvious that
it's only used from within the kernel/trace/bpf_trace.c.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210815070609.987780-7-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
7b22fc4cdb bpf: Allow to specify user-provided bpf_cookie for BPF perf links
Add ability for users to specify custom u64 value (bpf_cookie) when creating
BPF link for perf_event-backed BPF programs (kprobe/uprobe, perf_event,
tracepoints).

This is useful for cases when the same BPF program is used for attaching and
processing invocation of different tracepoints/kprobes/uprobes in a generic
fashion, but such that each invocation is distinguished from each other (e.g.,
BPF program can look up additional information associated with a specific
kernel function without having to rely on function IP lookups). This enables
new use cases to be implemented simply and efficiently that previously were
possible only through code generation (and thus multiple instances of almost
identical BPF program) or compilation at runtime (BCC-style) on target hosts
(even more expensive resource-wise). For uprobes it is not even possible in
some cases to know function IP before hand (e.g., when attaching to shared
library without PID filtering, in which case base load address is not known
for a library).

This is done by storing u64 bpf_cookie in struct bpf_prog_array_item,
corresponding to each attached and run BPF program. Given cgroup BPF programs
already use two 8-byte pointers for their needs and cgroup BPF programs don't
have (yet?) support for bpf_cookie, reuse that space through union of
cgroup_storage and new bpf_cookie field.

Make it available to kprobe/tracepoint BPF programs through bpf_trace_run_ctx.
This is set by BPF_PROG_RUN_ARRAY, used by kprobe/uprobe/tracepoint BPF
program execution code, which luckily is now also split from
BPF_PROG_RUN_ARRAY_CG. This run context will be utilized by a new BPF helper
giving access to this user-provided cookie value from inside a BPF program.
Generic perf_event BPF programs will access this value from perf_event itself
through passed in BPF program context.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20210815070609.987780-6-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
152882e17a bpf: Implement minimal BPF perf link
Introduce a new type of BPF link - BPF perf link. This brings perf_event-based
BPF program attachments (perf_event, tracepoints, kprobes, and uprobes) into
the common BPF link infrastructure, allowing to list all active perf_event
based attachments, auto-detaching BPF program from perf_event when link's FD
is closed, get generic BPF link fdinfo/get_info functionality.

BPF_LINK_CREATE command expects perf_event's FD as target_fd. No extra flags
are currently supported.

Force-detaching and atomic BPF program updates are not yet implemented, but
with perf_event-based BPF links we now have common framework for this without
the need to extend ioctl()-based perf_event interface.

One interesting consideration is a new value for bpf_attach_type, which
BPF_LINK_CREATE command expects. Generally, it's either 1-to-1 mapping from
bpf_attach_type to bpf_prog_type, or many-to-1 mapping from a subset of
bpf_attach_types to one bpf_prog_type (e.g., see BPF_PROG_TYPE_SK_SKB or
BPF_PROG_TYPE_CGROUP_SOCK). In this case, though, we have three different
program types (KPROBE, TRACEPOINT, PERF_EVENT) using the same perf_event-based
mechanism, so it's many bpf_prog_types to one bpf_attach_type. I chose to
define a single BPF_PERF_EVENT attach type for all of them and adjust
link_create()'s logic for checking correspondence between attach type and
program type.

The alternative would be to define three new attach types (e.g., BPF_KPROBE,
BPF_TRACEPOINT, and BPF_PERF_EVENT), but that seemed like unnecessary overkill
and BPF_KPROBE will cause naming conflicts with BPF_KPROBE() macro, defined by
libbpf. I chose to not do this to avoid unnecessary proliferation of
bpf_attach_type enum values and not have to deal with naming conflicts.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20210815070609.987780-5-andrii@kernel.org
2021-08-17 08:18:57 -07:00
Hao Luo
0e7520949e libbpf: Support weak typed ksyms.
Currently weak typeless ksyms have default value zero, when they don't
exist in the kernel. However, weak typed ksyms are rejected by libbpf
if they can not be resolved. This means that if a bpf object contains
the declaration of a nonexistent weak typed ksym, it will be rejected
even if there is no program that references the symbol.

Nonexistent weak typed ksyms can also default to zero just like
typeless ones. This allows programs that access weak typed ksyms to be
accepted by verifier, if the accesses are guarded. For example,

extern const int bpf_link_fops3 __ksym __weak;

/* then in BPF program */

if (&bpf_link_fops3) {
   /* use bpf_link_fops3 */
}

If actual use of nonexistent typed ksym is not guarded properly,
verifier would see that register is not PTR_TO_BTF_ID and wouldn't
allow to use it for direct memory reads or passing it to BPF helpers.

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210812003819.2439037-1-haoluo@google.com
2021-08-17 08:18:57 -07:00
Randy Dunlap
c3f7daaab5 libbpf, doc: Eliminate warnings in libbpf_naming_convention
Use "code-block: none" instead of "c" for non-C-language code blocks.
Removes these warnings:

  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:111: WARNING: Could not lex literal_block as "c". Highlighting skipped.
  lnx-514-rc4/Documentation/bpf/libbpf/libbpf_naming_convention.rst:124: WARNING: Could not lex literal_block as "c". Highlighting skipped.

Fixes: f42cfb469f9b ("bpf: Add documentation for libbpf including API autogen")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210802015037.787-1-rdunlap@infradead.org
2021-08-17 08:18:57 -07:00
Hangbin Liu
1a1e7a0612 bonding: add new option lacp_active
Add an option lacp_active, which is similar with team's runner.active.
This option specifies whether to send LACPDU frames periodically. If set
on, the LACPDU frames are sent along with the configured lacp_rate
setting. If set off, the LACPDU frames acts as "speak when spoken to".

Note, the LACPDU state frames still will be sent when init or unbind port.

v2: remove module parameter

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-17 08:18:57 -07:00
Andrii Nakryiko
827963ffb3 sync: fix up docs sync path mapping
Kernel docs from Documentation/bpf/libbpf go straight to docs/ under libbpf.
Also ignore libbpf-only parts of docs subdir.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-16 22:37:57 -07:00
Andrii Nakryiko
4ab24e7d62 docs: initial set of libbpf docs
Add libbpf-related .rst files before they started being synced automatically.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-16 22:37:57 -07:00
grantseltzer
b2a63c974d docs: reconfigure libbpf documentation syncing
This adds documentation files, including ones for autogenerating API
documentation based on code comments in the source code that's pulled
in via the mirror.

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
2021-08-16 22:30:23 -07:00
Quentin Monnet
88649fe655 ci: run script to test bpftool types/options sync
When new eBPF program, map, or attach types are added to the kernel,
bpftool needs to be updated in order to support the related features.
These updates should add the new types to the code itself, but also to
the help messages, documentation, and bash completion. Given that it is
easy to omit one of those, a script has been created to attempt to
validate that all parts have been consistently updated.

This new script for bpftool is hosted in the kernel repository, amongst
the BPF selftests. But it is not called from the Makefile, and not run
along with the other selftests. If it was, all patches updating the BPF
UAPI would require the relevant changes in bpftool at the same time, _in
the same patches_, which is not desirable.

To ensure that bpftool's parts remain in sync, let's run this script
from the CI. This patch adds a new section to the run.sh script, focused
on bpftool, and calling the new test_bpftool_synctypes.py.
2021-08-16 15:16:08 -07:00
Sergei Iudin
1778e0b1bd Make CI tests compatible with vanilla kernel tree
This is required to migrate kernel-patches CI to use this code
instead of fork
2021-08-11 16:06:23 -07:00
Andrii Nakryiko
64f027efda ci: restore all temporary disabled tests
Upstream bpf-next should be good, so no temporary blocked tests should remain.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:27 -07:00
Yucong Sun
6bf8babb33 Add a test step to produce a minimal binary using libbpf.
This patch adds a test step to link a minimal program to libbpf library produced,
making sure that the library works.
2021-08-09 13:54:19 -07:00
Rafael David Tinoco
70ad3e8314 makefile: fix missing object for static compilation
Makefile needs relo_core object added to objects list to avoid static
linking errors when doing static compilation:

/bin/ld: .../libbpf.a(libbpf.o): in function `bpf_core_apply_relo':
.../libbpf/src/libbpf.c:5134: undefined reference to `bpf_core_apply_relo_insn'

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
dbdd8f3b34 ci: make CI build log less verbose
Only keep stderr output in case of errors for kernel and selftests builds.
Having a multi-thousand-line output isn't useful and slows down Github
Actions' log view UI.

Also quiet down wget's "progress bar" output. While at the same time see some
totals from tar, just for the fun of it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
52e96052a2 ci: blacklist newly migrated netcnt selftest
Seems like netcnt uses some map operations not supported by 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-09 13:54:19 -07:00
Andrii Nakryiko
41db5534d8 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   807b8f0e24e6004984094e1bcbbd2b297011a085
Checkpoint bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      a02215ce72a37a19a690803b23b091186ee4f7b2

Alexei Starovoitov (4):
  libbpf: Cleanup the layering between CORE and bpf_program.
  libbpf: Split bpf_core_apply_relo() into bpf_program independent
    helper.
  libbpf: Move CO-RE types into relo_core.h.
  libbpf: Split CO-RE logic into relo_core.c.

Daniel Xu (1):
  libbpf: Do not close un-owned FD 0 on errors

Evgeniy Litvinenko (1):
  libbpf: Add bpf_map__pin_path function

Hengqi Chen (1):
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf

Jason Wang (1):
  libbpf: Fix comment typo

Jiri Olsa (3):
  libbpf: Fix func leak in attach_kprobe
  libbpf: Allow decimal offset for kprobes
  libbpf: Export bpf_program__attach_kprobe_opts function

Martynas Pumputis (1):
  libbpf: Fix race when pinning maps in parallel

Quentin Monnet (4):
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

 src/btf.c             |   50 +-
 src/btf.h             |   12 +-
 src/libbpf.c          | 1419 +++--------------------------------------
 src/libbpf.h          |   16 +
 src/libbpf.map        |    7 +
 src/libbpf_internal.h |   81 +--
 src/libbpf_probes.c   |    4 +-
 src/relo_core.c       | 1295 +++++++++++++++++++++++++++++++++++++
 src/relo_core.h       |  100 +++
 9 files changed, 1561 insertions(+), 1423 deletions(-)
 create mode 100644 src/relo_core.c
 create mode 100644 src/relo_core.h

--
2.30.2
2021-08-09 13:54:14 -07:00
Andrii Nakryiko
54a7bc87d5 ci: restore all temporary disabled tests
Upstream bpf-next should be good, so no temporary blocked tests should remain.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-06 20:09:33 -07:00
Yucong Sun
9979463ccf Remove shared linking for now 2021-08-06 15:04:04 -07:00
Yucong Sun
b91ca01922 Add a test step to produce a minimal binary using libbpf.
This patch adds a test step to link a minimal program to libbpf library produced,
making sure that the library works.
2021-08-06 15:04:04 -07:00
Rafael David Tinoco
8ded7c6db0 makefile: fix missing object for static compilation
Makefile needs relo_core object added to objects list to avoid static
linking errors when doing static compilation:

/bin/ld: .../libbpf.a(libbpf.o): in function `bpf_core_apply_relo':
.../libbpf/src/libbpf.c:5134: undefined reference to `bpf_core_apply_relo_insn'

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@gmail.com>
2021-08-06 13:50:20 -07:00
Andrii Nakryiko
7df4ea0f0d ci: make CI build log less verbose
Only keep stderr output in case of errors for kernel and selftests builds.
Having a multi-thousand-line output isn't useful and slows down Github
Actions' log view UI.

Also quiet down wget's "progress bar" output. While at the same time see some
totals from tar, just for the fun of it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-04 23:54:46 -07:00
Daniel Xu
02efadd0b0 libbpf: Do not close un-owned FD 0 on errors
Before this patch, btf_new() was liable to close an arbitrary FD 0 if
BTF parsing failed. This was because:

* btf->fd was initialized to 0 through the calloc()
* btf__free() (in the `done` label) closed any FDs >= 0
* btf->fd is left at 0 if parsing fails

This issue was discovered on a system using libbpf v0.3 (without
BTF_KIND_FLOAT support) but with a kernel that had BTF_KIND_FLOAT types
in BTF. Thus, parsing fails.

While this patch technically doesn't fix any issues b/c upstream libbpf
has BTF_KIND_FLOAT support, it'll help prevent issues in the future if
more BTF types are added. It also allow the fix to be backported to
older libbpf's.

Fixes: 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/5969bb991adedb03c6ae93e051fd2a00d293cf25.1627513670.git.dxu@dxuuu.xyz
2021-08-04 18:27:12 -07:00
Andrii Nakryiko
02333ba360 ci: blacklist newly migrated netcnt selftest
Seems like netcnt uses some map operations not supported by 5.5.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-04 18:27:12 -07:00
Robin Gögge
2805c2a4ca libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT
This patch fixes the probe for BPF_PROG_TYPE_CGROUP_SOCKOPT,
so the probe reports accurate results when used by e.g.
bpftool.

Fixes: 4cdbfb59c44a ("libbpf: support sockopt hooks")
Signed-off-by: Robin Gögge <r.goegge@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20210728225825.2357586-1-r.goegge@gmail.com
2021-08-04 18:27:12 -07:00
Andrii Nakryiko
6921017d25 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   807b8f0e24e6004984094e1bcbbd2b297011a085
Checkpoint bpf-next commit: 372642ea83ff1c71a5d567a704c912359eb59776
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      a02215ce72a37a19a690803b23b091186ee4f7b2

Alexei Starovoitov (4):
  libbpf: Cleanup the layering between CORE and bpf_program.
  libbpf: Split bpf_core_apply_relo() into bpf_program independent
    helper.
  libbpf: Move CO-RE types into relo_core.h.
  libbpf: Split CO-RE logic into relo_core.c.

Daniel Xu (1):
  libbpf: Do not close un-owned FD 0 on errors

Evgeniy Litvinenko (1):
  libbpf: Add bpf_map__pin_path function

Hengqi Chen (1):
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf

Jason Wang (1):
  libbpf: Fix comment typo

Jiri Olsa (3):
  libbpf: Fix func leak in attach_kprobe
  libbpf: Allow decimal offset for kprobes
  libbpf: Export bpf_program__attach_kprobe_opts function

Martynas Pumputis (1):
  libbpf: Fix race when pinning maps in parallel

Quentin Monnet (4):
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()

Robin Gögge (1):
  libbpf: Fix probe for BPF_PROG_TYPE_CGROUP_SOCKOPT

 src/btf.c             |   50 +-
 src/btf.h             |   12 +-
 src/libbpf.c          | 1419 +++--------------------------------------
 src/libbpf.h          |   16 +
 src/libbpf.map        |    7 +
 src/libbpf_internal.h |   81 +--
 src/libbpf_probes.c   |    4 +-
 src/relo_core.c       | 1295 +++++++++++++++++++++++++++++++++++++
 src/relo_core.h       |  100 +++
 9 files changed, 1561 insertions(+), 1423 deletions(-)
 create mode 100644 src/relo_core.c
 create mode 100644 src/relo_core.h

--
2.30.2
2021-08-04 18:27:12 -07:00
Hengqi Chen
e65d128903 libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
Add two new APIs: btf__load_vmlinux_btf and btf__load_module_btf.
btf__load_vmlinux_btf is just an alias to the existing API named
libbpf_find_kernel_btf, rename to be more precisely and consistent
with existing BTF APIs. btf__load_module_btf can be used to load
module BTF, add it for completeness. These two APIs are useful for
implementing tracing tools and introspection tools. This is part
of the effort towards libbpf 1.0 ([0]).

  [0] Closes: https://github.com/libbpf/libbpf/issues/280

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210730114012.494408-1-hengqi.chen@gmail.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
512b472d97 libbpf: Add split BTF support for btf__load_from_kernel_by_id()
Add a new API function btf__load_from_kernel_by_id_split(), which takes
a pointer to a base BTF object in order to support split BTF objects
when retrieving BTF information from the kernel.

Reference: https://github.com/libbpf/libbpf/issues/314

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-8-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
73788dd22f libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
Rename function btf__get_from_id() as btf__load_from_kernel_by_id() to
better indicate what the function does. Change the new function so that,
instead of requiring a pointer to the pointer to update and returning
with an error code, it takes a single argument (the id of the BTF
object) and returns the corresponding pointer. This is more in line with
the existing constructors.

The other tools calling the (soon-to-be) deprecated btf__get_from_id()
function will be updated in a future commit.

References:

- https://github.com/libbpf/libbpf/issues/278
- https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-4-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
a180eb551e libbpf: Rename btf__load() as btf__load_into_kernel()
As part of the effort to move towards a v1.0 for libbpf, rename
btf__load() function, used to "upload" BTF information into the kernel,
as btf__load_into_kernel(). This new name better reflects what the
function does.

References:

- https://github.com/libbpf/libbpf/issues/278
- https://github.com/libbpf/libbpf/wiki/Libbpf:-the-road-to-v1.0#btfh-apis

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-3-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Quentin Monnet
9d2b7e471b libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
Variable "err" is initialised to -EINVAL so that this error code is
returned when something goes wrong in libbpf_find_prog_btf_id().
However, a recent change in the function made use of the variable in
such a way that it is set to 0 if retrieving linear information on the
program is successful, and this 0 value remains if we error out on
failures at later stages.

Let's fix this by setting err to -EINVAL later in the function.

Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-2-quentin@isovalent.com
2021-08-04 18:27:12 -07:00
Martynas Pumputis
3a0fc666ef libbpf: Fix race when pinning maps in parallel
When loading in parallel multiple programs which use the same to-be
pinned map, it is possible that two instances of the loader will call
bpf_object__create_maps() at the same time. If the map doesn't exist
when both instances call bpf_object__reuse_map(), then one of the
instances will fail with EEXIST when calling bpf_map__pin().

Fix the race by retrying reusing a map if bpf_map__pin() returns
EEXIST. The fix is similar to the one in iproute2: e4c4685fd6e4 ("bpf:
Fix race condition with map pinning").

Before retrying the pinning, we don't do any special cleaning of an
internal map state. The closer code inspection revealed that it's not
required:

    - bpf_object__create_map(): map->inner_map is destroyed after a
      successful call, map->fd is closed if pinning fails.
    - bpf_object__populate_internal_map(): created map elements is
      destroyed upon close(map->fd).
    - init_map_slots(): slots are freed after their initialization.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210726152001.34845-1-m@lambda.lt
2021-08-04 18:27:12 -07:00
Jason Wang
7c25b1d569 libbpf: Fix comment typo
Remove the repeated word 'the' in line 48.

Signed-off-by: Jason Wang <wangborong@cdjrlc.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210727115928.74600-1-wangborong@cdjrlc.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
d41e821ccf libbpf: Split CO-RE logic into relo_core.c.
Move CO-RE logic into separate file.
The internal interface between libbpf and CO-RE is through
bpf_core_apply_relo_insn() function and few structs defined in relo_core.h.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-5-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
2fe57e40ac libbpf: Move CO-RE types into relo_core.h.
In order to make a clean split of CO-RE logic move its types
into independent header file.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-4-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
f81dbd3475 libbpf: Split bpf_core_apply_relo() into bpf_program independent helper.
bpf_core_apply_relo() doesn't need to know bpf_program internals
and hashmap details.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-3-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Alexei Starovoitov
035fd6aca0 libbpf: Cleanup the layering between CORE and bpf_program.
CO-RE processing functions don't need to know 'struct bpf_program' details.
Cleanup the layering to eventually be able to move CO-RE logic into a separate file.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210721000822.40958-2-alexei.starovoitov@gmail.com
2021-08-04 18:27:12 -07:00
Evgeniy Litvinenko
e44c8486c6 libbpf: Add bpf_map__pin_path function
Add bpf_map__pin_path, so that the inconsistently named
bpf_map__get_pin_path can be deprecated later. This is part of the
effort towards libbpf v1.0: https://github.com/libbpf/libbpf/issues/307

Also, add a selftest for the new function.

Signed-off-by: Evgeniy Litvinenko <evgeniyl@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210723221511.803683-1-evgeniyl@fb.com
2021-08-04 18:27:12 -07:00
Jiri Olsa
14f5433b2e libbpf: Export bpf_program__attach_kprobe_opts function
Export bpf_program__attach_kprobe_opts as a public API.

Rename bpf_program_attach_kprobe_opts to bpf_kprobe_opts and turn it into OPTS
struct.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-4-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
d7a2de020b libbpf: Allow decimal offset for kprobes
Allow to specify decimal offset in SEC macro, like:
  SEC("kprobe/bpf_fentry_test7+5")

Add selftest for that.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-3-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Jiri Olsa
bb92e7ab4d libbpf: Fix func leak in attach_kprobe
Add missing free() for func pointer in attach_kprobe function.

Fixes: a2488b5f483f ("libbpf: Allow specification of "kprobe/function+offset"")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20210721215810.889975-2-jolsa@kernel.org
2021-08-04 18:27:12 -07:00
Yucong Sun
3f22535d56 Fix text grouping issue on github actions
github action grouping is broken because we were outputing "::endgroup" where
it needs "::endgroup::". This patch also added some addtional grouping around
contianer setup phase, making output easier to read.
2021-08-04 12:18:41 -07:00
Andrii Nakryiko
f8ab8bde8e ci: bump nightly Clang/LLVM version to 14
CI started to fail with missing clang-13 warning. Bump the version to 14.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-08-03 12:44:00 -07:00
Sergei Iudin
506a544834 ci: improve CI UX a little by setting names a hiding debug 2021-07-29 17:35:23 -07:00
Andrii Nakryiko
ec2c78c034 README: remove Travis CI build badge
We stop using Travis CI from now on.
2021-07-29 14:22:49 -07:00
Andrii Nakryiko
030ff87857 ci: fix log folding in Github Actions
Backport kernel-patches fix for the same issue ([0] and [1]).

  [0] 17c596fe8b
  [1] a38de685c7

Cc: Sergei Iudin <siudin@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-27 19:26:06 -07:00
Andrii Nakryiko
0db006d28e ci: add cancel-in-progress behavior for main test CI workflow
Make sure that only the latest enqueued workflow is running for any given PR
and/or branch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-22 23:22:42 -07:00
Andrii Nakryiko
6e6f18ac5d ci: add Github Actions status badge
Add badge to show the Github Actions test.yml workflow status.
2021-07-21 11:20:34 -07:00
Andrii Nakryiko
deca7932c3 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   08f71a1e39a1f07a464ac782d9b612d6a74c7015
Checkpoint bpf-next commit: 807b8f0e24e6004984094e1bcbbd2b297011a085
Baseline bpf commit:        d6371c76e20d7d3f61b05fd67b596af4d14a8886
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (2):
  libbpf: Avoid use of __int128 in typed dump display
  libbpf: Propagate errors when retrieving enum value for typed data
    display

 src/btf_dump.c | 103 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 68 insertions(+), 35 deletions(-)

--
2.30.2
2021-07-21 11:16:01 -07:00
Alan Maguire
ebcae72279 libbpf: Propagate errors when retrieving enum value for typed data display
When retrieving the enum value associated with typed data during
"is data zero?" checking in btf_dump_type_data_check_zero(), the
return value of btf_dump_get_enum_value() is not passed to the caller
if the function returns a non-zero (error) value.  Currently, 0
is returned if the function returns an error.  We should instead
propagate the error to the caller.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-4-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Alan Maguire
64362b8896 libbpf: Avoid use of __int128 in typed dump display
__int128 is not supported for some 32-bit platforms (arm and i386).
__int128 was used in carrying out computations on bitfields which
aid display, but the same calculations could be done with __u64
with the small effect of not supporting 128-bit bitfields.

With these changes, a big-endian issue with casting 128-bit integers
to 64-bit for enum bitfields is solved also, as we now use 64-bit
integers for bitfield calculations.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626770993-11073-2-git-send-email-alan.maguire@oracle.com
2021-07-21 11:16:01 -07:00
Michal Suchanek
df01b246df README: State the source origin more prominently.
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Michal Suchanek
6eb5e25905 Makefile: Default LIBSUBDIR to lib64 on 64bit architectures.
commit a82a66e ("Extend build and add install rules to Makefile") adds
special handling for LIBSUBDIR on x86_64. Expand this to all
architectures with 64 in name which suggests a 32bit variant exists, and
s390x which is 64bit extension of s390.

Fixes: #337
Fixes: a82a66e ("Extend build and add install rules to Makefile")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
2021-07-20 14:45:49 -07:00
Andrii Nakryiko
a603965dad sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   068dfc655b666b54e08fc3d7108b309d7f906d34
Checkpoint bpf-next commit: 08f71a1e39a1f07a464ac782d9b612d6a74c7015
Baseline bpf commit:        a6c39de76d709f30982d4b80a9b9537e1d388858
Checkpoint bpf commit:      d6371c76e20d7d3f61b05fd67b596af4d14a8886

Alan Maguire (3):
  libbpf: Clarify/fix unaligned data issues for btf typed dump
  libbpf: Fix compilation errors on ppc64le for btf dump typed data
  libbpf: Btf typed dump does not need to allocate dump data

Martynas Pumputis (1):
  libbpf: Fix removal of inner map in bpf_object__create_map

 src/btf_dump.c | 41 ++++++++++++++++++++++++++++++-----------
 src/libbpf.c   | 10 ++++------
 2 files changed, 34 insertions(+), 17 deletions(-)

--
2.30.2
2021-07-19 17:45:10 -07:00
Martynas Pumputis
f61c3b318b libbpf: Fix removal of inner map in bpf_object__create_map
If creating an outer map of a BTF-defined map-in-map fails (via
bpf_object__create_map()), then the previously created its inner map
won't be destroyed.

Fix this by ensuring that the destroy routines are not bypassed in the
case of a failure.

Fixes: 646f02ffdd49c ("libbpf: Add BTF-defined map-in-map support")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210719173838.423148-2-m@lambda.lt
2021-07-19 17:45:10 -07:00
Alan Maguire
8235032464 libbpf: Btf typed dump does not need to allocate dump data
By using the stack for this small structure, we avoid the need
for freeing memory in error paths.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-4-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
dc2c53b7f6 libbpf: Fix compilation errors on ppc64le for btf dump typed data
__s64 can be defined as either long or long long, depending on the
architecture. On ppc64le it's defined as long, giving this error:

 In file included from btf_dump.c:22:
btf_dump.c: In function 'btf_dump_type_data_check_overflow':
libbpf_internal.h:111:22: error: format '%lld' expects argument of
type 'long long int', but argument 3 has type '__s64' {aka 'long int'}
[-Werror=format=]
  111 |  libbpf_print(level, "libbpf: " fmt, ##__VA_ARGS__); \
      |                      ^~~~~~~~~~
libbpf_internal.h:114:27: note: in expansion of macro '__pr'
  114 | #define pr_warn(fmt, ...) __pr(LIBBPF_WARN, fmt, ##__VA_ARGS__)
      |                           ^~~~
btf_dump.c:1992:3: note: in expansion of macro 'pr_warn'
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |   ^~~~~~~
btf_dump.c:1992:32: note: format string is defined here
 1992 |   pr_warn("unexpected size [%lld] for id [%u]\n",
      |                             ~~~^
      |                                |
      |                                long long int
      |                             %ld

Cast to size_t and use %zu instead.

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-3-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Alan Maguire
fb3809e940 libbpf: Clarify/fix unaligned data issues for btf typed dump
If data is packed, data structures can store it outside of usual
boundaries.  For example a 4-byte int can be stored on a unaligned
boundary in a case like this:

struct s {
	char f1;
	int f2;
} __attribute((packed));

...the int is stored at an offset of one byte.  Some platforms have
problems dereferencing data that is not aligned with its size, and
code exists to handle most cases of this for BTF typed data display.
However pointer display was missed, and a simple function to test if
"ptr_is_aligned(data, data_sz)" would help clarify this code.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626475617-25984-2-git-send-email-alan.maguire@oracle.com
2021-07-19 17:45:10 -07:00
Fejes Ferenc
74d3571880 Update README.md 2021-07-19 16:43:05 -07:00
Fejes Ferenc
be570b29c1 Update README.md
Manjaro is a popular and friendly Arch based distro. Recently they also enabled the BTF support: https://forum.manjaro.org/t/co-re-support-in-kernel/46134/19

I can confirm that:
[user@pc ~]$ uname -a
Linux pc 5.12.16-1-MANJARO #1 SMP PREEMPT Sun Jul 11 13:23:34 UTC 2021 x86_64 GNU/Linux
[user@pc ~]$ ls -la /sys/kernel/btf/vmlinux
-r--r--r-- 1 root root 4226769 jul   17 15.27 /sys/kernel/btf/vmlinux
2021-07-19 16:43:05 -07:00
Sergei Iudin
9aa71e1040 Run apt-get update as a first step for GH actions
otherwise container may contain stall repo metadata cached
2021-07-19 14:57:35 -07:00
Andrii Nakryiko
b3ffd258fc vmtest: blacklist 5.5 selftests
Add few new selftests to blacklist. They can't succeed on 5.5.
Also temporarily remove btf_dump for 4.9 due to newly added data dumping
subtests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
4447ac82d4 ci: temporary work-around to get green CI builds back
Temporary disable tc_bpf tests that seem to have regressed.

Temporary and artificially bump pahole version from 1.21 to 1.22 to get
per-CPU BTF data built.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8fa229c455 ci: disable -Wstringop-truncation for GCC10 configurations as well
We used to have it disabled for GCC8, but now GCC10 is false-report same
warnings, so disable stringop-truncation warnigs for GCC10 as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
8a670b7422 vmtest: regenerate latest vmlinux.h
This is necessary to make runqslower compile with task->__state field on old
kernels, for which we don't have an actual vmlinux.h.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-19 11:36:37 -07:00
Andrii Nakryiko
21f90f61b0 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f42cfb469f9b4a1c002a03cce3d9329376800a6f
Checkpoint bpf-next commit: 068dfc655b666b54e08fc3d7108b309d7f906d34
Baseline bpf commit:        61e8aeda9398925f8c6fc290585bdd9727d154c4
Checkpoint bpf commit:      a6c39de76d709f30982d4b80a9b9537e1d388858

Alan Maguire (2):
  libbpf: Allow specification of "kprobe/function+offset"
  libbpf: BTF dumper support for typed data

Alexei Starovoitov (2):
  bpf: Sync tools/include/uapi/linux/bpf.h
  bpf: Introduce bpf timers.

Jiri Olsa (3):
  bpf: Add bpf_get_func_ip helper for tracing programs
  bpf: Add bpf_get_func_ip helper for kprobe programs
  libbpf: Add bpf_program__attach_kprobe_opts function

Jonathan Edwards (1):
  libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading

Kumar Kartikeya Dwivedi (2):
  libbpf: Add request buffer type for netlink messages
  libbpf: Switch to void * casting in netlink helpers

Kuniyuki Iwashima (1):
  bpf: Fix a typo of reuseport map in bpf.h.

Martynas Pumputis (1):
  libbpf: Fix reuse of pinned map on older kernel

Shuyi Cheng (2):
  libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
  libbpf: Fix the possible memory leak on error

Toke Høiland-Jørgensen (1):
  libbpf: Restore errno return for functions that were already returning
    it

 include/uapi/linux/bpf.h |  85 +++-
 src/btf.h                |  19 +
 src/btf_dump.c           | 819 ++++++++++++++++++++++++++++++++++++++-
 src/libbpf.c             | 146 ++++++-
 src/libbpf.h             |   9 +-
 src/libbpf.map           |   1 +
 src/netlink.c            | 115 +++---
 src/nlattr.c             |   2 +-
 src/nlattr.h             |  38 +-
 9 files changed, 1117 insertions(+), 117 deletions(-)

--
2.30.2
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
c8b1d14b03 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-07-16 17:05:44 -07:00
Alan Maguire
c0b2ceba1d libbpf: BTF dumper support for typed data
Add a BTF dumper for typed data, so that the user can dump a typed
version of the data provided.

The API is

int btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
                             void *data, size_t data_sz,
                             const struct btf_dump_type_data_opts *opts);

...where the id is the BTF id of the data pointed to by the "void *"
argument; for example the BTF id of "struct sk_buff" for a
"struct skb *" data pointer.  Options supported are

 - a starting indent level (indent_lvl)
 - a user-specified indent string which will be printed once per
   indent level; if NULL, tab is chosen but any string <= 32 chars
   can be provided.
 - a set of boolean options to control dump display, similar to those
   used for BPF helper bpf_snprintf_btf().  Options are
        - compact : omit newlines and other indentation
        - skip_names: omit member names
        - emit_zeroes: show zero-value members

Default output format is identical to that dumped by bpf_snprintf_btf(),
for example a "struct sk_buff" representation would look like this:

struct sk_buff){
	(union){
		(struct){
			.next = (struct sk_buff *)0xffffffffffffffff,
			.prev = (struct sk_buff *)0xffffffffffffffff,
		(union){
			.dev = (struct net_device *)0xffffffffffffffff,
			.dev_scratch = (long unsigned int)18446744073709551615,
		},
	},
...

If the data structure is larger than the *data_sz*
number of bytes that are available in *data*, as much
of the data as possible will be dumped and -E2BIG will
be returned.  This is useful as tracers will sometimes
not be able to capture all of the data associated with
a type; for example a "struct task_struct" is ~16k.
Being able to specify that only a subset is available is
important for such cases.  On success, the amount of data
dumped is returned.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626362126-27775-2-git-send-email-alan.maguire@oracle.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
bd25fc7df1 libbpf: Fix the possible memory leak on error
If the strdup() fails then we need to call bpf_object__close(obj) to
avoid a resource leak.

Fixes: 166750bc1dd2 ("libbpf: Support libbpf-provided extern variables")
Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-3-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Shuyi Cheng
4920031c88 libbpf: Introduce 'btf_custom_path' to 'bpf_obj_open_opts'
btf_custom_path allows developers to load custom BTF which libbpf will
subsequently use for CO-RE relocation instead of vmlinux BTF.

Having btf_custom_path in bpf_object_open_opts one can directly use the
skeleton's <objname>_bpf__open_opts() API to pass in the btf_custom_path
parameter, as opposed to using bpf_object__load_xattr() which is slated to be
deprecated ([0]).

This work continues previous work started by another developer ([1]).

  [0] https://lore.kernel.org/bpf/CAEf4BzbJZLjNoiK8_VfeVg_Vrg=9iYFv+po-38SMe=UzwDKJ=Q@mail.gmail.com/#t
  [1] https://yhbt.net/lore/all/CAEf4Bzbgw49w2PtowsrzKQNcxD4fZRE6AKByX-5-dMo-+oWHHA@mail.gmail.com/

Signed-off-by: Shuyi Cheng <chengshuyi@linux.alibaba.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1626180159-112996-2-git-send-email-chengshuyi@linux.alibaba.com
2021-07-16 17:05:44 -07:00
Alan Maguire
8fa50e86c1 libbpf: Allow specification of "kprobe/function+offset"
kprobes can be placed on most instructions in a function, not
just entry, and ftrace and bpftrace support the function+offset
notification for probe placement.  Adding parsing of func_name
into func+offset to bpf_program__attach_kprobe() allows the
user to specify

SEC("kprobe/bpf_fentry_test5+0x6")

...for example, and the offset can be passed to perf_event_open_probe()
to support kprobe attachment.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-8-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
330a158982 libbpf: Add bpf_program__attach_kprobe_opts function
Adding bpf_program__attach_kprobe_opts that does the same
as bpf_program__attach_kprobe, but takes opts argument.

Currently opts struct holds just retprobe bool, but we will
add new field in following patch.

The function is not exported, so there's no need to add
size to the struct bpf_program_attach_kprobe_opts for now.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-7-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
a524ae0bbf bpf: Add bpf_get_func_ip helper for kprobe programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_KPROBE programs,
so it's now possible to call bpf_get_func_ip from both kprobe and
kretprobe programs.

Taking the caller's address from 'struct kprobe::addr', which is
defined for both kprobe and kretprobe.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-5-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Jiri Olsa
97e2a9c9a1 bpf: Add bpf_get_func_ip helper for tracing programs
Adding bpf_get_func_ip helper for BPF_PROG_TYPE_TRACING programs,
specifically for all trampoline attach types.

The trampoline's caller IP address is stored in (ctx - 8) address.
so there's no reason to actually call the helper, but rather fixup
the call instruction and return [ctx - 8] value directly.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210714094400.396467-4-jolsa@kernel.org
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
bef77595ca bpf: Introduce bpf timers.
Introduce 'struct bpf_timer { __u64 :64; __u64 :64; };' that can be embedded
in hash/array/lru maps as a regular field and helpers to operate on it:

// Initialize the timer.
// First 4 bits of 'flags' specify clockid.
// Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
long bpf_timer_init(struct bpf_timer *timer, struct bpf_map *map, int flags);

// Configure the timer to call 'callback_fn' static function.
long bpf_timer_set_callback(struct bpf_timer *timer, void *callback_fn);

// Arm the timer to expire 'nsec' nanoseconds from the current time.
long bpf_timer_start(struct bpf_timer *timer, u64 nsec, u64 flags);

// Cancel the timer and wait for callback_fn to finish if it was running.
long bpf_timer_cancel(struct bpf_timer *timer);

Here is how BPF program might look like:
struct map_elem {
    int counter;
    struct bpf_timer timer;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1000);
    __type(key, int);
    __type(value, struct map_elem);
} hmap SEC(".maps");

static int timer_cb(void *map, int *key, struct map_elem *val);
/* val points to particular map element that contains bpf_timer. */

SEC("fentry/bpf_fentry_test1")
int BPF_PROG(test1, int a)
{
    struct map_elem *val;
    int key = 0;

    val = bpf_map_lookup_elem(&hmap, &key);
    if (val) {
        bpf_timer_init(&val->timer, &hmap, CLOCK_REALTIME);
        bpf_timer_set_callback(&val->timer, timer_cb);
        bpf_timer_start(&val->timer, 1000 /* call timer_cb2 in 1 usec */, 0);
    }
}

This patch adds helper implementations that rely on hrtimers
to call bpf functions as timers expire.
The following patches add necessary safety checks.

Only programs with CAP_BPF are allowed to use bpf_timer.

The amount of timers used by the program is constrained by
the memcg recorded at map creation time.

The bpf_timer_init() helper needs explicit 'map' argument because inner maps
are dynamic and not known at load time. While the bpf_timer_set_callback() is
receiving hidden 'aux->prog' argument supplied by the verifier.

The prog pointer is needed to do refcnting of bpf program to make sure that
program doesn't get freed while the timer is armed. This approach relies on
"user refcnt" scheme used in prog_array that stores bpf programs for
bpf_tail_call. The bpf_timer_set_callback() will increment the prog refcnt which is
paired with bpf_timer_cancel() that will drop the prog refcnt. The
ops->map_release_uref is responsible for cancelling the timers and dropping
prog refcnt when user space reference to a map reaches zero.
This uref approach is done to make sure that Ctrl-C of user space process will
not leave timers running forever unless the user space explicitly pinned a map
that contained timers in bpffs.

bpf_timer_init() and bpf_timer_set_callback() will return -EPERM if map doesn't
have user references (is not held by open file descriptor from user space and
not pinned in bpffs).

The bpf_map_delete_elem() and bpf_map_update_elem() operations cancel
and free the timer if given map element had it allocated.
"bpftool map update" command can be used to cancel timers.

The 'struct bpf_timer' is explicitly __attribute__((aligned(8))) because
'__u64 :64' has 1 byte alignment of 8 byte padding.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210715005417.78572-4-alexei.starovoitov@gmail.com
2021-07-16 17:05:44 -07:00
Kuniyuki Iwashima
6f7839f477 bpf: Fix a typo of reuseport map in bpf.h.
Fix s/BPF_MAP_TYPE_REUSEPORT_ARRAY/BPF_MAP_TYPE_REUSEPORT_SOCKARRAY/ typo
in bpf.h.

Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210714124317.67526-1-kuniyu@amazon.co.jp
2021-07-16 17:05:44 -07:00
Alexei Starovoitov
90aba5e582 bpf: Sync tools/include/uapi/linux/bpf.h
Commit 47316f4a3053 missed updating tools/.../bpf.h.
Sync it.

Fixes: 47316f4a3053 ("bpf: Support input xdp_md context in BPF_PROG_TEST_RUN")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-07-16 17:05:44 -07:00
Martynas Pumputis
4dc3aeb072 libbpf: Fix reuse of pinned map on older kernel
When loading a BPF program with a pinned map, the loader checks whether
the pinned map can be reused, i.e. their properties match. To derive
such of the pinned map, the loader invokes BPF_OBJ_GET_INFO_BY_FD and
then does the comparison.

Unfortunately, on < 4.12 kernels the BPF_OBJ_GET_INFO_BY_FD is not
available, so loading the program fails with the following error:

	libbpf: failed to get map info for map FD 5: Invalid argument
	libbpf: couldn't reuse pinned map at
		'/sys/fs/bpf/tc/globals/cilium_call_policy': parameter
		mismatch"
	libbpf: map 'cilium_call_policy': error reusing pinned map
	libbpf: map 'cilium_call_policy': failed to create:
		Invalid argument(-22)
	libbpf: failed to load object 'bpf_overlay.o'

To fix this, fallback to derivation of the map properties via
/proc/$PID/fdinfo/$MAP_FD if BPF_OBJ_GET_INFO_BY_FD fails with EINVAL,
which can be used as an indicator that the kernel doesn't support
the latter.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210712125552.58705-1-m@lambda.lt
2021-07-16 17:05:44 -07:00
Toke Høiland-Jørgensen
4ce0551ee5 libbpf: Restore errno return for functions that were already returning it
The update to streamline libbpf error reporting intended to change all
functions to return the errno as a negative return value if
LIBBPF_STRICT_DIRECT_ERRS is set. However, if the flag is *not* set, the
return value changes for the two functions that were already returning a
negative errno unconditionally: bpf_link__unpin() and perf_buffer__poll().

This is a user-visible API change that breaks applications; so let's revert
these two functions back to unconditionally returning a negative errno
value.

Fixes: e9fc3ce99b34 ("libbpf: Streamline error reporting for high-level APIs")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210706122355.236082-1-toke@redhat.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
f8411901c4 libbpf: Switch to void * casting in netlink helpers
Netlink helpers I added in 8bbb77b7c7a2 ("libbpf: Add various netlink
helpers") used char * casts everywhere, and there were a few more that
existed from before.

Convert all of them to void * cast, as it is treated equivalently by
clang/gcc for the purposes of pointer arithmetic and to follow the
convention elsewhere in the kernel/libbpf.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-2-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Kumar Kartikeya Dwivedi
9ff2b76693 libbpf: Add request buffer type for netlink messages
Coverity complains about OOB writes to nlmsghdr. There is no OOB as we
write to the trailing buffer, but static analyzers and compilers may
rightfully be confused as the nlmsghdr pointer has subobject provenance
(and hence subobject bounds).

Fix this by using an explicit request structure containing the nlmsghdr,
struct tcmsg/ifinfomsg, and attribute buffer.

Also switch nh_tail (renamed to req_tail) to cast req * to char * so
that it can be understood as arithmetic on pointer to the representation
array (hence having same bound as request structure), which should
further appease analyzers.

As a bonus, callers don't have to pass sizeof(req) all the time now, as
size is implicitly obtained using the pointer. While at it, also reduce
the size of attribute buffer to 128 bytes (132 for ifinfomsg using
functions due to the padding).

Summary of problem:

  Even though C standard allows interconvertibility of pointer to first
  member and pointer to struct, for the purposes of alias analysis it
  would still consider the first as having pointer value "pointer to T"
  where T is type of first member hence having subobject bounds,
  allowing analyzers within reason to complain when object is accessed
  beyond the size of pointed to object.

  The only exception to this rule may be when a char * is formed to a
  member subobject. It is not possible for the compiler to be able to
  tell the intent of the programmer that it is a pointer to member
  object or the underlying representation array of the containing
  object, so such diagnosis is suppressed.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210619041454.417577-1-memxor@gmail.com
2021-07-16 17:05:44 -07:00
Jonathan Edwards
df023f5cfc libbpf: Add extra BPF_PROG_TYPE check to bpf_object__probe_loading
eBPF has been backported for RHEL 7 w/ kernel 3.10-940+ [0]. However only
the following program types are supported [1]:

  BPF_PROG_TYPE_KPROBE
  BPF_PROG_TYPE_TRACEPOINT
  BPF_PROG_TYPE_PERF_EVENT

For libbpf this causes an EINVAL return during the bpf_object__probe_loading
call which only checks to see if programs of type BPF_PROG_TYPE_SOCKET_FILTER
can load.

The following will try BPF_PROG_TYPE_TRACEPOINT as a fallback attempt before
erroring out. BPF_PROG_TYPE_KPROBE was not a good candidate because on some
kernels it requires knowledge of the LINUX_VERSION_CODE.

  [0] https://www.redhat.com/en/blog/introduction-ebpf-red-hat-enterprise-linux-7
  [1] https://access.redhat.com/articles/3550581

Signed-off-by: Jonathan Edwards <jonathan.edwards@165gc.onmicrosoft.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210619151007.GA6963@165gc.onmicrosoft.com
2021-07-16 17:05:44 -07:00
Andrii Nakryiko
ae62c159ec include: initial sync of pkt_cls.h and pkt_sched.h
Add pkt_cls.h and pkt_sched.h to include/uapi/linux.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-16 14:22:07 -07:00
Yonghong Song
8bf016110e sync uapi headers linux/pkt_cls.h and linux/pkt_sched.h
Let us sync linux/{pkt_cls.h,pkt_sched.h} to libbpf repo.
Otherwise, on ubuntu 16.04, system headers will be picked up
and this will result in compilation error like:
  .../netlink.c:416:23: error: ‘TC_H_CLSACT’ undeclared (first use in this function)
     *parent = TC_H_MAKE(TC_H_CLSACT,
                         ^
  .../netlink.c:418:9: error: ‘TC_H_MIN_INGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
           ^
  .../netlink.c:418:28: error: ‘TC_H_MIN_EGRESS’ undeclared (first use in this function)
           TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
                              ^
  .../netlink.c: In function ‘__get_tc_info’:
  .../netlink.c:522:11: error: ‘TCA_BPF_ID’ undeclared (first use in this function)
    if (!tbb[TCA_BPF_ID])
             ^

Signed-off-by: Yonghong Song <yhs@fb.com>
2021-07-12 14:01:21 -07:00
Yucong Sun
d3e4039a0a create ondemand vmtest workflow 2021-07-09 14:09:51 -07:00
Jussi Mäki
dd34504b43 vmtest: Set CONFIG_BONDING=y in latest.config
This is preparation for the XDP bonding patch set [1] to avoid having to mangle the kernel configuration from vmtest.sh.
 
[1]: https://lore.kernel.org/bpf/202106221509.kwNvAAZg-lkp@intel.com/T/#m4635dc0003944f38a54059b11147ab46abeffa13

Signed-off-by: Jussi Maki <joamaki@gmail.com>
2021-07-08 15:18:32 -07:00
Andrii Nakryiko
bec2ae0c6e sync: update rewritten bpf-next SHA
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-07-06 13:31:12 -07:00
Andrii Nakryiko
1d6106cf45 ci: blacklist few new tests on 5.5
tc_redirect and migrate_reuseport use new functionality not present on 5.5

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-18 13:05:10 -07:00
Andrii Nakryiko
95e51c1dbe ci: disable fail-fast for Github Actions tests
Make sure we run all of the tests even if some of them fail. This allows to
test all of them independently, especially kernel LATEST slow test.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-18 13:05:10 -07:00
Andrii Nakryiko
db132757c9 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   cf68fa431d5da7ef0b5ea142dd603611696cbd44
Checkpoint bpf-next commit: f540a7d2c37f9ae0867de0a14bf06cf50b63d65e
Baseline bpf commit:        11fc79fc9f2e395aa39fa5baccae62767c5d8280
Checkpoint bpf commit:      61e8aeda9398925f8c6fc290585bdd9727d154c4

Kumar Kartikeya Dwivedi (2):
  libbpf: Remove unneeded check for flags during tc detach
  libbpf: Set NLM_F_EXCL when creating qdisc

Kuniyuki Iwashima (3):
  bpf: Support BPF_FUNC_get_socket_cookie() for
    BPF_PROG_TYPE_SK_REUSEPORT.
  bpf: Support socket migration by eBPF.
  libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT.

Lorenz Bauer (1):
  libbpf: Fail compilation if target arch is missing

Wang Hai (1):
  libbpf: Simplify the return expression of bpf_object__init_maps
    function

grantseltzer (1):
  Add documentation for libbpf including API autogen

 include/uapi/linux/bpf.h |  16 ++++
 src/README.rst           | 168 ---------------------------------------
 src/bpf_tracing.h        |  46 ++++++++++-
 src/libbpf.c             |   9 ++-
 src/netlink.c            |   4 +-
 5 files changed, 64 insertions(+), 179 deletions(-)
 delete mode 100644 src/README.rst

--
2.30.2
2021-06-18 13:05:10 -07:00
grantseltzer
41cddf18f4 Add documentation for libbpf including API autogen
This patch is meant to start the initiative to document libbpf.
It includes .rst files which are text documentation describing building,
API naming convention, as well as an index to generated API documentation.

In this approach the generated API documentation is enabled by the kernels
existing kernel documentation system which uses sphinx. The resulting docs
would then be synced to kernel.org/doc

You can test this by running `make htmldocs` and serving the html in
Documentation/output. Since libbpf does not yet have comments in kernel
doc format, see kernel.org/doc/html/latest/doc-guide/kernel-doc.html for
an example so you can test this.

The advantage of this approach is to use the existing sphinx
infrastructure that the kernel has, and have libbpf docs in
the same place as everything else.

The current plan is to have the libbpf mirror sync the generated docs
and version them based on the libbpf releases which are cut on github.

This patch includes the addition of libbpf_api.rst which pulls comment
documentation from header files in libbpf under tools/lib/bpf/. The comment
docs would be of the standard kernel doc format.

Signed-off-by: grantseltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210618140459.9887-2-grantseltzer@gmail.com
2021-06-18 13:05:10 -07:00
Lorenz Bauer
f883bbf3f4 libbpf: Fail compilation if target arch is missing
bpf2go is the Go equivalent of libbpf skeleton. The convention is that
the compiled BPF is checked into the repository to facilitate distributing
BPF as part of Go packages. To make this portable, bpf2go by default
generates both bpfel and bpfeb variants of the C.

Using bpf_tracing.h is inherently non-portable since the fields of
struct pt_regs differ between platforms, so CO-RE can't help us here.
The only way of working around this is to compile for each target
platform independently. bpf2go can't do this by default since there
are too many platforms.

Define the various PT_... macros when no target can be determined and
turn them into compilation failures. This works because bpf2go always
compiles for bpf targets, so the compiler fallback doesn't kick in.
Conditionally define __BPF_MISSING_TARGET so that we can inject a
more appropriate error message at build time. The user can then
choose which platform to target explicitly.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210616083635.11434-1-lmb@cloudflare.com
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
db8982bcaa libbpf: Set expected_attach_type for BPF_PROG_TYPE_SK_REUSEPORT.
This commit introduces a new section (sk_reuseport/migrate) and sets
expected_attach_type to two each section in BPF_PROG_TYPE_SK_REUSEPORT
program.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210612123224.12525-11-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
d1571ab5ce bpf: Support socket migration by eBPF.
This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT
to check if the attached eBPF program is capable of migrating sockets. When
the eBPF program is attached, we run it for socket migration if the
expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or
net.ipv4.tcp_migrate_req is enabled.

Currently, the expected_attach_type is not enforced for the
BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the
earlier idea in the commit aac3fc320d94 ("bpf: Post-hooks for sys_bind") to
fix up the zero expected_attach_type in bpf_prog_load_fixup_attach_type().

Moreover, this patch adds a new field (migrating_sk) to sk_reuseport_md to
select a new listener based on the child socket. migrating_sk varies
depending on if it is migrating a request in the accept queue or during
3WHS.

  - accept_queue : sock (ESTABLISHED/SYN_RECV)
  - 3WHS         : request_sock (NEW_SYN_RECV)

In the eBPF program, we can select a new listener by
BPF_FUNC_sk_select_reuseport(). Also, we can cancel migration by returning
SK_DROP. This feature is useful when listeners have different settings at
the socket API level or when we want to free resources as soon as possible.

  - SK_PASS with selected_sk, select it as a new listener
  - SK_PASS with selected_sk NULL, fallbacks to the random selection
  - SK_DROP, cancel the migration.

There is a noteworthy point. We select a listening socket in three places,
but we do not have struct skb at closing a listener or retransmitting a
SYN+ACK. On the other hand, some helper functions do not expect skb is NULL
(e.g. skb_header_pointer() in BPF_FUNC_skb_load_bytes(), skb_tail_pointer()
in BPF_FUNC_skb_load_bytes_relative()). So we allocate an empty skb
temporarily before running the eBPF program.

Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/netdev/20201123003828.xjpjdtk4ygl6tg6h@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201203042402.6cskdlit5f3mw4ru@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/netdev/20201209030903.hhow5r53l6fmozjn@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20210612123224.12525-10-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kuniyuki Iwashima
03b0787342 bpf: Support BPF_FUNC_get_socket_cookie() for BPF_PROG_TYPE_SK_REUSEPORT.
We will call sock_reuseport.prog for socket migration in the next commit,
so the eBPF program has to know which listener is closing to select a new
listener.

We can currently get a unique ID of each listener in the userspace by
calling bpf_map_lookup_elem() for BPF_MAP_TYPE_REUSEPORT_SOCKARRAY map.

This patch makes the pointer of sk available in sk_reuseport_md so that we
can get the ID by BPF_FUNC_get_socket_cookie() in the eBPF program.

Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/netdev/20201119001154.kapwihc2plp4f7zc@kafai-mbp.dhcp.thefacebook.com/
Link: https://lore.kernel.org/bpf/20210612123224.12525-9-kuniyu@amazon.co.jp
2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi
a1bd8104a9 libbpf: Set NLM_F_EXCL when creating qdisc
This got lost during the refactoring across versions. We always use
NLM_F_EXCL when creating some TC object, so reflect what the function
says and set the flag.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210612023502.1283837-3-memxor@gmail.com
2021-06-18 13:05:10 -07:00
Kumar Kartikeya Dwivedi
ccead28901 libbpf: Remove unneeded check for flags during tc detach
Coverity complained about this being unreachable code. It is right
because we already enforce flags to be unset, so a check validating
the flag value is redundant.

Fixes: 715c5ce454a6 ("libbpf: Add low level TC-BPF management API")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210612023502.1283837-2-memxor@gmail.com
2021-06-18 13:05:10 -07:00
Wang Hai
0b59d75ecd libbpf: Simplify the return expression of bpf_object__init_maps function
There is no need for special treatment of the 'ret == 0' case.
This patch simplifies the return expression.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210609115651.3392580-1-wanghai38@huawei.com
2021-06-18 13:05:10 -07:00
Sergei Iudin
a5ee05d505 Run pahole staging once a day 2021-06-17 17:49:56 -07:00
Sergei Iudin
42ebbbce7d test pahole 2021-06-17 13:38:13 -07:00
Sergei Iudin
26497b9a88 Add coverity workflow 2021-06-15 16:13:44 -07:00
Sergei Iudin
5d5af3f07e Migrate libbpf ci to GH actions
changes to docker command require to run it in non-interactive mode
2021-06-15 14:13:57 -07:00
Andrii Nakryiko
899c45baa2 travis-ci: extend 5.5 blacklist
Blacklist few more recent selftests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
95008d47dd Makefile: sync Makefile with upstream
Add gen_loader.o to list of built object files. Complete the list of installed
headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
13acc0af00 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   f18ba26da88a89db9b50cb4ff47fadb159f2810b
Checkpoint bpf-next commit: cf68fa431d5da7ef0b5ea142dd603611696cbd44
Baseline bpf commit:        d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4
Checkpoint bpf commit:      11fc79fc9f2e395aa39fa5baccae62767c5d8280

Alexei Starovoitov (12):
  bpf: Introduce bpf_sys_bpf() helper and program type.
  libbpf: Support for syscall program type
  bpf: Introduce fd_idx
  bpf: Add bpf_btf_find_by_name_kind() helper.
  bpf: Add bpf_sys_close() helper.
  libbpf: Change the order of data and text relocations.
  libbpf: Add bpf_object pointer to kernel_supports().
  libbpf: Preliminary support for fd_idx
  libbpf: Generate loader program out of BPF ELF file.
  libbpf: Cleanup temp FDs when intermediate sys_bpf fails.
  libbpf: Introduce bpf_map__initial_value().
  bpf: Add cmd alias BPF_PROG_RUN

Andrii Nakryiko (4):
  libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0
    behaviors
  libbpf: Streamline error reporting for low-level APIs
  libbpf: Streamline error reporting for high-level APIs
  libbpf: Move few APIs from 0.4 to 0.5 version

Denis Salopek (2):
  bpf: Add lookup_and_delete_elem support to hashtab
  bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags

Florent Revest (1):
  libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h

Hangbin Liu (1):
  xdp: Extend xdp_redirect_map with broadcast support

Kev Jackson (1):
  libbpf: Fixes incorrect rx_ring_setup_done

Michal Suchanek (1):
  libbpf: Fix pr_warn type warnings on 32bit

Stanislav Fomichev (1):
  libbpf: Skip bpf_object__probe_loading for light skeleton

 include/uapi/linux/bpf.h |  66 ++-
 src/bpf.c                | 179 +++++---
 src/bpf.h                |   2 +
 src/bpf_gen_internal.h   |  41 ++
 src/bpf_helpers.h        |  66 +++
 src/bpf_prog_linfo.c     |  18 +-
 src/bpf_tracing.h        |  62 +--
 src/btf.c                | 302 ++++++-------
 src/btf_dump.c           |  14 +-
 src/gen_loader.c         | 729 +++++++++++++++++++++++++++++++
 src/libbpf.c             | 909 +++++++++++++++++++++++++--------------
 src/libbpf.h             |  14 +
 src/libbpf.map           |   8 +
 src/libbpf_errno.c       |   7 +-
 src/libbpf_internal.h    |  55 +++
 src/libbpf_legacy.h      |  59 +++
 src/linker.c             |  22 +-
 src/netlink.c            |  81 ++--
 src/ringbuf.c            |  26 +-
 src/skel_internal.h      | 123 ++++++
 src/xsk.c                |   2 +-
 21 files changed, 2135 insertions(+), 650 deletions(-)
 create mode 100644 src/bpf_gen_internal.h
 create mode 100644 src/gen_loader.c
 create mode 100644 src/libbpf_legacy.h
 create mode 100644 src/skel_internal.h

--
2.30.2
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
1b9138e452 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-06-08 13:04:27 -07:00
Kev Jackson
2da7f66d3f libbpf: Fixes incorrect rx_ring_setup_done
When calling xsk_socket__create_shared(), the logic at line 1097 marks a
boolean flag true within the xsk_umem structure to track setup progress
in order to support multiple calls to the function.  However, instead of
marking umem->tx_ring_setup_done, the code incorrectly sets
umem->rx_ring_setup_done.  This leads to improper behaviour when
creating and destroying xsk and umem structures.

Multiple calls to this function is documented as supported.

Fixes: ca7a83e2487a ("libbpf: Only create rx and tx XDP rings when necessary")
Signed-off-by: Kev Jackson <foamdino@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev
2021-06-08 13:04:27 -07:00
Michal Suchanek
9d5ac4931d libbpf: Fix pr_warn type warnings on 32bit
The printed value is ptrdiff_t and is formatted wiht %ld. This works on
64bit but produces a warning on 32bit. Fix the format specifier to %td.

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210604112448.32297-1-msuchanek@suse.de
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
5bfbb36440 libbpf: Move few APIs from 0.4 to 0.5 version
Official libbpf 0.4 release doesn't include three APIs that were tentatively
put into 0.4 section. Fix libbpf.map and move these three APIs:

  - bpf_map__initial_value;
  - bpf_map_lookup_and_delete_elem_flags;
  - bpf_object__gen_loader.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210603004026.2698513-2-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Florent Revest
343f63e245 libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h
These macros are convenient wrappers around the bpf_seq_printf and
bpf_snprintf helpers. They are currently provided by bpf_tracing.h which
targets low level tracing primitives. bpf_helpers.h is a better fit.

The __bpf_narg and __bpf_apply are needed in both files and provided
twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h

Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210526164643.2881368-1-revest@chromium.org
2021-06-08 13:04:27 -07:00
Hangbin Liu
0dccb885a3 xdp: Extend xdp_redirect_map with broadcast support
This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to
extend xdp_redirect_map for broadcast support.

With BPF_F_BROADCAST the packet will be broadcasted to all the interfaces
in the map. with BPF_F_EXCLUDE_INGRESS the ingress interface will be
excluded when do broadcasting.

When getting the devices in dev hash map via dev_map_hash_get_next_key(),
there is a possibility that we fall back to the first key when a device
was removed. This will duplicate packets on some interfaces. So just walk
the whole buckets to avoid this issue. For dev array map, we also walk the
whole map to find valid interfaces.

Function bpf_clear_redirect_map() was removed in
commit ee75aef23afe ("bpf, xdp: Restructure redirect actions").
Add it back as we need to use ri->map again.

With test topology:
  +-------------------+             +-------------------+
  | Host A (i40e 10G) |  ---------- | eno1(i40e 10G)    |
  +-------------------+             |                   |
                                    |   Host B          |
  +-------------------+             |                   |
  | Host C (i40e 10G) |  ---------- | eno2(i40e 10G)    |
  +-------------------+             |                   |
                                    |          +------+ |
                                    | veth0 -- | Peer | |
                                    | veth1 -- |      | |
                                    | veth2 -- |  NS  | |
                                    |          +------+ |
                                    +-------------------+

On Host A:
 # pktgen/pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -s 64

On Host B(Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, 128G Memory):
Use xdp_redirect_map and xdp_redirect_map_multi in samples/bpf for testing.
All the veth peers in the NS have a XDP_DROP program loaded. The
forward_map max_entries in xdp_redirect_map_multi is modify to 4.

Testing the performance impact on the regular xdp_redirect path with and
without patch (to check impact of additional check for broadcast mode):

5.12 rc4         | redirect_map        i40e->i40e      |    2.0M |  9.7M
5.12 rc4         | redirect_map        i40e->veth      |    1.7M | 11.8M
5.12 rc4 + patch | redirect_map        i40e->i40e      |    2.0M |  9.6M
5.12 rc4 + patch | redirect_map        i40e->veth      |    1.7M | 11.7M

Testing the performance when cloning packets with the redirect_map_multi
test, using a redirect map size of 4, filled with 1-3 devices:

5.12 rc4 + patch | redirect_map multi  i40e->veth (x1) |    1.7M | 11.4M
5.12 rc4 + patch | redirect_map multi  i40e->veth (x2) |    1.1M |  4.3M
5.12 rc4 + patch | redirect_map multi  i40e->veth (x3) |    0.8M |  2.6M

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20210519090747.1655268-3-liuhangbin@gmail.com
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
8e3a63ea48 libbpf: Streamline error reporting for high-level APIs
Implement changes to error reporting for high-level libbpf APIs to make them
less surprising and less error-prone to users:
  - in all the cases when error happens, errno is set to an appropriate error
    value;
  - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and
    error code is communicated through errno; this applies both to APIs that
    already returned NULL before (so now they communicate more detailed error
    codes), as well as for many APIs that used ERR_PTR() macro and encoded
    error numbers as fake pointers.
  - in legacy (default) mode, those APIs that were returning ERR_PTR(err),
    continue doing so, but still set errno.

With these changes, errno can be always used to extract actual error,
regardless of legacy or libbpf 1.0 modes. This is utilized internally in
libbpf in places where libbpf uses it's own high-level APIs.
libbpf_get_error() is adapted to handle both cases completely transparently to
end-users (and is used by libbpf consistently as well).

More context, justification, and discussion can be found in "Libbpf: the road
to v1.0" document ([0]).

  [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-5-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
7c7ba067fc libbpf: Streamline error reporting for low-level APIs
Ensure that low-level APIs behave uniformly across the libbpf as follows:
  - in case of an error, errno is always set to the correct error code;
  - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to
    libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1;
  - by default, until libbpf 1.0 is released, keep returning -1 directly.

More context, justification, and discussion can be found in "Libbpf: the road
to v1.0" document ([0]).

  [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-4-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Andrii Nakryiko
12eb2666d9 libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviors
Add libbpf_set_strict_mode() API that allows application to simulate libbpf
1.0 breaking changes before libbpf 1.0 is released. This will help users
migrate gradually and with confidence.

For now only ALL or NONE options are available, subsequent patches will add
more flags. This patch is preliminary for selftests/bpf changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210525035935.1461796-2-andrii@kernel.org
2021-06-08 13:04:27 -07:00
Denis Salopek
234dea015b bpf: Extend libbpf with bpf_map_lookup_and_delete_elem_flags
Add bpf_map_lookup_and_delete_elem_flags() libbpf API in order to use
the BPF_F_LOCK flag with the map_lookup_and_delete_elem() function.

Signed-off-by: Denis Salopek <denis.salopek@sartura.hr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/15b05dafe46c7e0750d110f233977372029d1f62.1620763117.git.denis.salopek@sartura.hr
2021-06-08 13:04:27 -07:00
Denis Salopek
c3c2e52201 bpf: Add lookup_and_delete_elem support to hashtab
Extend the existing bpf_map_lookup_and_delete_elem() functionality to
hashtab map types, in addition to stacks and queues.
Create a new hashtab bpf_map_ops function that does lookup and deletion
of the element under the same bucket lock and add the created map_ops to
bpf.h.

Signed-off-by: Denis Salopek <denis.salopek@sartura.hr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/4d18480a3e990ffbf14751ddef0325eed3be2966.1620763117.git.denis.salopek@sartura.hr
2021-06-08 13:04:27 -07:00
Stanislav Fomichev
b79c698300 libbpf: Skip bpf_object__probe_loading for light skeleton
I'm getting the following error when running 'gen skeleton -L' as
regular user:

libbpf: Error in bpf_object__probe_loading():Operation not permitted(1).
Couldn't load trivial BPF program. Make sure your kernel supports BPF
(CONFIG_BPF_SYSCALL=y) and/or that RLIMIT_MEMLOCK is set to big enough
value.

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210521030653.2626513-1-sdf@google.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
546199a723 bpf: Add cmd alias BPF_PROG_RUN
Add BPF_PROG_RUN command as an alias to BPF_RPOG_TEST_RUN to better
indicate the full range of use cases done by the command.

Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210519014032.20908-1-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
b44566c71b libbpf: Introduce bpf_map__initial_value().
Introduce bpf_map__initial_value() to read initial contents
of mmaped data/rodata/bss maps.
Note that bpf_map__set_initial_value() doesn't allow modifying
kconfig map while bpf_map__initial_value() allows reading
its values.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-17-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
594960b3db libbpf: Cleanup temp FDs when intermediate sys_bpf fails.
Fix loader program to close temporary FDs when intermediate
sys_bpf command fails.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-16-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
694a70c522 libbpf: Generate loader program out of BPF ELF file.
The BPF program loading process performed by libbpf is quite complex
and consists of the following steps:
"open" phase:
- parse elf file and remember relocations, sections
- collect externs and ksyms including their btf_ids in prog's BTF
- patch BTF datasec (since llvm couldn't do it)
- init maps (old style map_def, BTF based, global data map, kconfig map)
- collect relocations against progs and maps
"load" phase:
- probe kernel features
- load vmlinux BTF
- resolve externs (kconfig and ksym)
- load program BTF
- init struct_ops
- create maps
- apply CO-RE relocations
- patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
- reposition subprograms and adjust call insns
- sanitize and load progs

During this process libbpf does sys_bpf() calls to load BTF, create maps,
populate maps and finally load programs.
Instead of actually doing the syscalls generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of single map with:
- union bpf_attr(s)
- BTF bytes
- map value bytes
- insns bytes
and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper.
Executing such "loader program" via bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have done which will result
the same maps created and programs loaded as specified in the elf file.
The "loader program" removes libelf and majority of libbpf dependency from
program loading process.

kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.

The order of relocate_data and relocate_calls had to change, so that
bpf_gen__prog_load() can see all relocations for a given program with
correct insn_idx-es.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-15-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
c96f2f1b29 libbpf: Preliminary support for fd_idx
Prep libbpf to use FD_IDX kernel feature when generating loader program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-14-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
ac2095783a libbpf: Add bpf_object pointer to kernel_supports().
Add a pointer to 'struct bpf_object' to kernel_supports() helper.
It will be used in the next patch.
No functional changes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-13-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
fecf2cf6dd libbpf: Change the order of data and text relocations.
In order to be able to generate loader program in the later
patches change the order of data and text relocations.
Also improve the test to include data relos.

If the kernel supports "FD array" the map_fd relocations can be processed
before text relos since generated loader program won't need to manually
patch ld_imm64 insns with map_fd.
But ksym and kfunc relocations can only be processed after all calls
are relocated, since loader program will consist of a sequence
of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
and btf_obj_fd into corresponding ld_imm64 insns. The locations of those
ld_imm64 insns are specified in relocations.
Hence process all data relocations (maps, ksym, kfunc) together after call relos.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-12-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
c1f36fb3e3 bpf: Add bpf_sys_close() helper.
Add bpf_sys_close() helper to be used by the syscall/loader program to close
intermediate FDs and other cleanup.
Note this helper must never be allowed inside fdget/fdput bracketing.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-11-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
6eac86910c bpf: Add bpf_btf_find_by_name_kind() helper.
Add new helper:
long bpf_btf_find_by_name_kind(char *name, int name_sz, u32 kind, int flags)
Description
	Find BTF type with given name and kind in vmlinux BTF or in module's BTFs.
Return
	Returns btf_id and btf_obj_fd in lower and upper 32 bits.

It will be used by loader program to find btf_id to attach the program to
and to find btf_ids of ksyms.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-10-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
64a654f398 bpf: Introduce fd_idx
Typical program loading sequence involves creating bpf maps and applying
map FDs into bpf instructions in various places in the bpf program.
This job is done by libbpf that is using compiler generated ELF relocations
to patch certain instruction after maps are created and BTFs are loaded.
The goal of fd_idx is to allow bpf instructions to stay immutable
after compilation. At load time the libbpf would still create maps as usual,
but it wouldn't need to patch instructions. It would store map_fds into
__u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD).

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-9-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
34eb4fb3f1 libbpf: Support for syscall program type
Trivial support for syscall program type.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-5-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Alexei Starovoitov
007709011e bpf: Introduce bpf_sys_bpf() helper and program type.
Add placeholders for bpf_sys_bpf() helper and new program type.
Make sure to check that expected_attach_type is zero for future extensibility.
Allow tracing helper functions to be used in this program type, since they will
only execute from user context via bpf_prog_test_run.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210514003623.28033-2-alexei.starovoitov@gmail.com
2021-06-08 13:04:27 -07:00
Yonghong Song
db9614b6bd libbpf: Add support for new llvm bpf relocations
LLVM patch https://reviews.llvm.org/D102712
narrowed the scope of existing R_BPF_64_64
and R_BPF_64_32 relocations, and added three
new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32
and R_BPF_64_NODYLD32. The main motivation is
to make relocations linker friendly.

This change, unfortunately, breaks libbpf build,
and we will see errors like below:
  libbpf: ELF relo #0 in section #6 has unexpected type 2 in
     /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o
  Error: failed to link
     '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o':
     Unknown error -22 (-22)
The new relocation R_BPF_64_ABS64 is generated
and libbpf linker sanity check doesn't understand it.
Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name
0000000000000018  0000000700000002 R_BPF_64_ABS64         0000000000000000 nogpltcp_init

Look at the selftests/bpf/bpf_tcp_nogpl.c,
  void BPF_STRUCT_OPS(nogpltcp_init, struct sock *sk)
  {
  }

  SEC(".struct_ops")
  struct tcp_congestion_ops bpf_nogpltcp = {
          .init           = (void *)nogpltcp_init,
          .name           = "bpf_nogpltcp",
  };
The new llvm relocation scheme categorizes 'nogpltcp_init' reference
as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify
ld_imm64 relocation in the new scheme.

Let us fix the linker sanity checking by including
R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to
check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210522162341.3687617-1-yhs@fb.com
2021-05-24 21:24:56 -07:00
Andrii Nakryiko
57375504c6 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   c1cccec9c63637c4c5ee0aa2da2850d983c19e88
Checkpoint bpf-next commit: f18ba26da88a89db9b50cb4ff47fadb159f2810b
Baseline bpf commit:        c9a7c013569d73881199a7a3011c03336f592cc8
Checkpoint bpf commit:      d0c0fe10ce6d87734b65c18dc8f4bcae3f4dbea4

Andrii Nakryiko (1):
  libbpf: Reject static entry-point BPF programs

Kumar Kartikeya Dwivedi (2):
  libbpf: Add various netlink helpers
  libbpf: Add low level TC-BPF management API

 src/libbpf.c   |   5 +
 src/libbpf.h   |  44 ++++
 src/libbpf.map |   5 +
 src/netlink.c  | 568 +++++++++++++++++++++++++++++++++++++++++--------
 src/nlattr.h   |  48 +++++
 5 files changed, 587 insertions(+), 83 deletions(-)

--
2.30.2
2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi
d71ff87a2d libbpf: Add low level TC-BPF management API
This adds functions that wrap the netlink API used for adding, manipulating,
and removing traffic control filters.

The API summary:

A bpf_tc_hook represents a location where a TC-BPF filter can be attached.
This means that creating a hook leads to creation of the backing qdisc,
while destruction either removes all filters attached to a hook, or destroys
qdisc if requested explicitly (as discussed below).

The TC-BPF API functions operate on this bpf_tc_hook to attach, replace,
query, and detach tc filters. All functions return 0 on success, and a
negative error code on failure.

bpf_tc_hook_create - Create a hook
Parameters:
	@hook - Cannot be NULL, ifindex > 0, attach_point must be set to
		proper enum constant. Note that parent must be unset when
		attach_point is one of BPF_TC_INGRESS or BPF_TC_EGRESS. Note
		that as an exception BPF_TC_INGRESS|BPF_TC_EGRESS is also a
		valid value for attach_point.

		Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.

bpf_tc_hook_destroy - Destroy a hook
Parameters:
	@hook - Cannot be NULL. The behaviour depends on value of
		attach_point. If BPF_TC_INGRESS, all filters attached to
		the ingress hook will be detached. If BPF_TC_EGRESS, all
		filters attached to the egress hook will be detached. If
		BPF_TC_INGRESS|BPF_TC_EGRESS, the clsact qdisc will be
		deleted, also detaching all filters. As before, parent must
		be unset for these attach_points, and set for BPF_TC_CUSTOM.

		It is advised that if the qdisc is operated on by many programs,
		then the program at least check that there are no other existing
		filters before deleting the clsact qdisc. An example is shown
		below:

		DECLARE_LIBBPF_OPTS(bpf_tc_hook, .ifindex = if_nametoindex("lo"),
				    .attach_point = BPF_TC_INGRESS);
		/* set opts as NULL, as we're not really interested in
		 * getting any info for a particular filter, but just
	 	 * detecting its presence.
		 */
		r = bpf_tc_query(&hook, NULL);
		if (r == -ENOENT) {
			/* no filters */
			hook.attach_point = BPF_TC_INGRESS|BPF_TC_EGREESS;
			return bpf_tc_hook_destroy(&hook);
		} else {
			/* failed or r == 0, the latter means filters do exist */
			return r;
		}

		Note that there is a small race between checking for no
		filters and deleting the qdisc. This is currently unavoidable.

		Returns -EOPNOTSUPP when hook has attach_point as BPF_TC_CUSTOM.

bpf_tc_attach - Attach a filter to a hook
Parameters:
	@hook - Cannot be NULL. Represents the hook the filter will be
		attached to. Requirements for ifindex and attach_point are
		same as described in bpf_tc_hook_create, but BPF_TC_CUSTOM
		is also supported.  In that case, parent must be set to the
		handle where the filter will be attached (using BPF_TC_PARENT).
		E.g. to set parent to 1:16 like in tc command line, the
		equivalent would be BPF_TC_PARENT(1, 16).

	@opts - Cannot be NULL. The following opts are optional:
		* handle   - The handle of the filter
		* priority - The priority of the filter
			     Must be >= 0 and <= UINT16_MAX
		Note that when left unset, they will be auto-allocated by
		the kernel. The following opts must be set:
		* prog_fd - The fd of the loaded SCHED_CLS prog
		The following opts must be unset:
		* prog_id - The ID of the BPF prog
		The following opts are optional:
		* flags - Currently only BPF_TC_F_REPLACE is allowed. It
			  allows replacing an existing filter instead of
			  failing with -EEXIST.
		The following opts will be filled by bpf_tc_attach on a
		successful attach operation if they are unset:
		* handle   - The handle of the attached filter
		* priority - The priority of the attached filter
		* prog_id  - The ID of the attached SCHED_CLS prog
		This way, the user can know what the auto allocated values
		for optional opts like handle and priority are for the newly
		attached filter, if they were unset.

		Note that some other attributes are set to fixed default
		values listed below (this holds for all bpf_tc_* APIs):
		protocol as ETH_P_ALL, direct action mode, chain index of 0,
		and class ID of 0 (this can be set by writing to the
		skb->tc_classid field from the BPF program).

bpf_tc_detach
Parameters:
	@hook - Cannot be NULL. Represents the hook the filter will be
		detached from. Requirements are same as described above
		in bpf_tc_attach.

	@opts - Cannot be NULL. The following opts must be set:
		* handle, priority
		The following opts must be unset:
		* prog_fd, prog_id, flags

bpf_tc_query
Parameters:
	@hook - Cannot be NULL. Represents the hook where the filter lookup will
		be performed. Requirements are same as described above in
		bpf_tc_attach().

	@opts - Cannot be NULL. The following opts must be set:
		* handle, priority
		The following opts must be unset:
		* prog_fd, prog_id, flags
		The following fields will be filled by bpf_tc_query upon a
		successful lookup:
		* prog_id

Some usage examples (using BPF skeleton infrastructure):

BPF program (test_tc_bpf.c):

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>

	SEC("classifier")
	int cls(struct __sk_buff *skb)
	{
		return 0;
	}

Userspace loader:

	struct test_tc_bpf *skel = NULL;
	int fd, r;

	skel = test_tc_bpf__open_and_load();
	if (!skel)
		return -ENOMEM;

	fd = bpf_program__fd(skel->progs.cls);

	DECLARE_LIBBPF_OPTS(bpf_tc_hook, hook, .ifindex =
			    if_nametoindex("lo"), .attach_point =
			    BPF_TC_INGRESS);
	/* Create clsact qdisc */
	r = bpf_tc_hook_create(&hook);
	if (r < 0)
		goto end;

	DECLARE_LIBBPF_OPTS(bpf_tc_opts, opts, .prog_fd = fd);
	r = bpf_tc_attach(&hook, &opts);
	if (r < 0)
		goto end;
	/* Print the auto allocated handle and priority */
	printf("Handle=%u", opts.handle);
	printf("Priority=%u", opts.priority);

	opts.prog_fd = opts.prog_id = 0;
	bpf_tc_detach(&hook, &opts);
end:
	test_tc_bpf__destroy(skel);

This is equivalent to doing the following using tc command line:
  # tc qdisc add dev lo clsact
  # tc filter add dev lo ingress bpf obj foo.o sec classifier da
  # tc filter del dev lo ingress handle <h> prio <p> bpf
... where the handle and priority can be found using:
  # tc filter show dev lo ingress

Another example replacing a filter (extending prior example):

	/* We can also choose both (or one), let's try replacing an
	 * existing filter.
	 */
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, replace_opts, .handle =
			    opts.handle, .priority = opts.priority,
			    .prog_fd = fd);
	r = bpf_tc_attach(&hook, &replace_opts);
	if (r == -EEXIST) {
		/* Expected, now use BPF_TC_F_REPLACE to replace it */
		replace_opts.flags = BPF_TC_F_REPLACE;
		return bpf_tc_attach(&hook, &replace_opts);
	} else if (r < 0) {
		return r;
	}
	/* There must be no existing filter with these
	 * attributes, so cleanup and return an error.
	 */
	replace_opts.prog_fd = replace_opts.prog_id = 0;
	bpf_tc_detach(&hook, &replace_opts);
	return -1;

To obtain info of a particular filter:

	/* Find info for filter with handle 1 and priority 50 */
	DECLARE_LIBBPF_OPTS(bpf_tc_opts, info_opts, .handle = 1,
			    .priority = 50);
	r = bpf_tc_query(&hook, &info_opts);
	if (r == -ENOENT)
		printf("Filter not found");
	else if (r < 0)
		return r;
	printf("Prog ID: %u", info_opts.prog_id);
	return 0;

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> # libbpf API design
[ Daniel: also did major patch cleanup ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210512103451.989420-3-memxor@gmail.com
2021-05-17 17:12:06 -07:00
Kumar Kartikeya Dwivedi
01515b8f05 libbpf: Add various netlink helpers
This change introduces a few helpers to wrap open coded attribute
preparation in netlink.c. It also adds a libbpf_netlink_send_recv() that
is useful to wrap send + recv handling in a generic way. Subsequent patch
will also use this function for sending and receiving a netlink response.
The libbpf_nl_get_link() helper has been removed instead, moving socket
creation into the newly named libbpf_netlink_send_recv().

Every nested attribute's closure must happen using the helper
nlattr_end_nested(), which sets its length properly. NLA_F_NESTED is
enforced using nlattr_begin_nested() helper. Other simple attributes
can be added directly.

The maxsz parameter corresponds to the size of the request structure
which is being filled in, so for instance with req being:

  struct {
	struct nlmsghdr nh;
	struct tcmsg t;
	char buf[4096];
  } req;

Then, maxsz should be sizeof(req).

This change also converts the open coded attribute preparation with these
helpers. Note that the only failure the internal call to nlattr_add()
could result in the nested helper would be -EMSGSIZE, hence that is what
we return to our caller.

The libbpf_netlink_send_recv() call takes care of opening the socket,
sending the netlink message, receiving the response, potentially invoking
callbacks, and return errors if any, and then finally close the socket.
This allows users to avoid identical socket setup code in different places.
The only user of libbpf_nl_get_link() has been converted to make use of it.
__bpf_set_link_xdp_fd_replace() has also been refactored to use it.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
[ Daniel: major patch cleanup ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210512103451.989420-2-memxor@gmail.com
2021-05-17 17:12:06 -07:00
Andrii Nakryiko
6028cec50c libbpf: Reject static entry-point BPF programs
Detect use of static entry-point BPF programs (those with SEC() markings) and
emit error message. This is similar to
c1cccec9c636 ("libbpf: Reject static maps") but for BPF programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210514195534.1440970-1-andrii@kernel.org
2021-05-17 17:12:06 -07:00
Andrii Nakryiko
68695d0173 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   9d31d2338950293ec19d9b095fbaa9030899dcb4
Checkpoint bpf-next commit: c1cccec9c63637c4c5ee0aa2da2850d983c19e88
Baseline bpf commit:        9683e5775c75097c46bd24e65411b16ac6c6cbb3
Checkpoint bpf commit:      c9a7c013569d73881199a7a3011c03336f592cc8

Andrii Nakryiko (4):
  libbpf: Add per-file linker opts
  libbpf: Fix ELF symbol visibility update logic
  libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions
  libbpf: Reject static maps

Arnaldo Carvalho de Melo (1):
  libbpf: Provide GELF_ST_VISIBILITY() define for older libelf

 src/libbpf.c          | 35 +++++++++++++++++++++++++----------
 src/libbpf.h          | 10 +++++++++-
 src/libbpf_internal.h |  5 +++++
 src/linker.c          | 18 +++++++++++++-----
 4 files changed, 52 insertions(+), 16 deletions(-)

--
2.30.2
2021-05-17 14:36:22 -07:00
Arnaldo Carvalho de Melo
72cdd6ed42 libbpf: Provide GELF_ST_VISIBILITY() define for older libelf
Where that macro isn't available.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
b5bfbab488 libbpf: Reject static maps
Static maps never really worked with libbpf, because all such maps were always
silently resolved to the very first map. Detect static maps (both legacy and
BTF-defined) and report user-friendly error.

Tested locally by switching few maps (legacy and BTF-defined) in selftests to
static ones and verifying that now libbpf rejects them loudly.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210513233643.194711-2-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
1cf1c245d1 libbpf: Treat STV_INTERNAL same as STV_HIDDEN for functions
Do the same global -> static BTF update for global functions with STV_INTERNAL
visibility to turn on static BPF verification mode.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-7-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
076dd5dadb libbpf: Fix ELF symbol visibility update logic
Fix silly bug in updating ELF symbol's visibility.

Fixes: a46349227cd8 ("libbpf: Add linker extern resolution support for functions and global variables")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-6-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
6f585ab88f libbpf: Add per-file linker opts
For better future extensibility add per-file linker options. Currently
the set of available options is empty. This changes bpf_linker__add_file()
API, but it's not a breaking change as bpf_linker APIs hasn't been released
yet.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210507054119.270888-3-andrii@kernel.org
2021-05-17 14:36:22 -07:00
Andrii Nakryiko
c5389a965b sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   87bd9e602e39585c5280556a2b6a6363bb334257
Checkpoint bpf-next commit: 9d31d2338950293ec19d9b095fbaa9030899dcb4
Baseline bpf commit:        b02265429681c9c827c45978a61a9f00be5ea9aa
Checkpoint bpf commit:      9683e5775c75097c46bd24e65411b16ac6c6cbb3

Andrii Nakryiko (2):
  libbpf: Support BTF_KIND_FLOAT during type compatibility checks in
    CO-RE
  selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro

Brendan Jackman (1):
  libbpf: Fix signed overflow in ringbuf_process_ring

Ian Rogers (1):
  libbpf: Add NULL check to add_dummy_ksym_var

 src/bpf_core_read.h | 16 ++++++++++++----
 src/libbpf.c        |  9 +++++++--
 src/ringbuf.c       | 30 +++++++++++++++++++++---------
 3 files changed, 40 insertions(+), 15 deletions(-)

--
2.30.2
2021-05-05 16:39:05 -07:00
Ian Rogers
a58b8ca93e libbpf: Add NULL check to add_dummy_ksym_var
Avoids a segv if btf isn't present. Seen on the call path
__bpf_object__open calling bpf_object__collect_externs.

Fixes: 5bd022ec01f0 (libbpf: Support extern kernel function)
Suggested-by: Stanislav Fomichev <sdf@google.com>
Suggested-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com
2021-05-05 16:39:05 -07:00
Brendan Jackman
1691c37b39 libbpf: Fix signed overflow in ringbuf_process_ring
One of our benchmarks running in (Google-internal) CI pushes data
through the ringbuf faster htan than userspace is able to consume
it. In this case it seems we're actually able to get >INT_MAX entries
in a single ring_buffer__consume() call. ASAN detected that cnt
overflows in this case.

Fix by using 64-bit counter internally and then capping the result to
INT_MAX before converting to the int return type. Do the same for
the ring_buffer__poll().

Fixes: bf99c936f947 (libbpf: Add BPF ring buffer support)
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
242842b34c selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable
bitfields. Missing breaks in a switch caused 8-byte reads always. This can
confuse libbpf because it does strict checks that memory load size corresponds
to the original size of the field, which in this case quite often would be
wrong.

After fixing that, we run into another problem, which quite subtle, so worth
documenting here. The issue is in Clang optimization and CO-RE relocation
interactions. Without that asm volatile construct (also known as
barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and
will apply BYTE_OFFSET 4 times for each switch case arm. This will result in
the same error from libbpf about mismatch of memory load size and original
field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *),
*(u32 *), and *(u64 *) memory loads, three of which will fail. Using
barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to
calculate p, after which value of p is used without relocation in each of
switch case arms, doing appropiately-sized memory load.

Here's the list of relevant relocations and pieces of generated BPF code
before and after this patch for test_core_reloc_bitfields_direct selftests.

BEFORE
=====
 #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32
 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32

     157:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     159:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     160:       b7 02 00 00 04 00 00 00 r2 = 4
; BYTE_SIZE relocation here                 ^^^
     161:       66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63>
     162:       16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65>
     163:       16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66>
     164:       05 00 12 00 00 00 00 00 goto +18 <LBB0_69>

0000000000000528 <LBB0_66>:
     165:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     167:       69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     168:       05 00 0e 00 00 00 00 00 goto +14 <LBB0_69>

0000000000000548 <LBB0_63>:
     169:       16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67>
     170:       16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68>
     171:       05 00 0b 00 00 00 00 00 goto +11 <LBB0_69>

0000000000000560 <LBB0_68>:
     172:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     174:       79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     175:       05 00 07 00 00 00 00 00 goto +7 <LBB0_69>

0000000000000580 <LBB0_65>:
     176:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     178:       71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8)
; BYTE_OFFSET relo here w/ WRONG size        ^^^^^^^^^^^^^^^^
     179:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

00000000000005a0 <LBB0_67>:
     180:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
     182:       61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8)
; BYTE_OFFSET relo here w/ RIGHT size        ^^^^^^^^^^^^^^^^

00000000000005b8 <LBB0_69>:
     183:       67 01 00 00 20 00 00 00 r1 <<= 32
     184:       b7 02 00 00 00 00 00 00 r2 = 0
     185:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     186:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     187:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000005e0 <LBB0_71>:
     188:       77 01 00 00 20 00 00 00 r1 >>= 32

AFTER
=====

 #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32
 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32

     129:       18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll
     131:       7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1
     132:       b7 01 00 00 08 00 00 00 r1 = 8
; BYTE_OFFSET relo here                     ^^^
; no size check for non-memory dereferencing instructions
     133:       0f 12 00 00 00 00 00 00 r2 += r1
     134:       b7 03 00 00 04 00 00 00 r3 = 4
; BYTE_SIZE relocation here                 ^^^
     135:       66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63>
     136:       16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65>
     137:       16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66>
     138:       05 00 0a 00 00 00 00 00 goto +10 <LBB0_69>

0000000000000458 <LBB0_66>:
     139:       69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     140:       05 00 08 00 00 00 00 00 goto +8 <LBB0_69>

0000000000000468 <LBB0_63>:
     141:       16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67>
     142:       16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68>
     143:       05 00 05 00 00 00 00 00 goto +5 <LBB0_69>

0000000000000480 <LBB0_68>:
     144:       79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     145:       05 00 03 00 00 00 00 00 goto +3 <LBB0_69>

0000000000000490 <LBB0_65>:
     146:       71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^
     147:       05 00 01 00 00 00 00 00 goto +1 <LBB0_69>

00000000000004a0 <LBB0_67>:
     148:       61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0)
; NO CO-RE relocation here                   ^^^^^^^^^^^^^^^^

00000000000004a8 <LBB0_69>:
     149:       67 01 00 00 20 00 00 00 r1 <<= 32
     150:       b7 02 00 00 00 00 00 00 r2 = 0
     151:       16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71>
     152:       c7 01 00 00 20 00 00 00 r1 s>>= 32
     153:       05 00 01 00 00 00 00 00 goto +1 <LBB0_72>

00000000000004d0 <LBB0_71>:
     154:       77 01 00 00 20 00 00 00 r1 >>= 323

Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
6b14cfa56e libbpf: Support BTF_KIND_FLOAT during type compatibility checks in CO-RE
Add BTF_KIND_FLOAT support when doing CO-RE field type compatibility check.
Without this, relocations against float/double fields will fail.

Also adjust one error message to emit instruction index instead of less
convenient instruction byte offset.

Fixes: 22541a9eeb0d ("libbpf: Add BTF_KIND_FLOAT support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Lorenz Bauer <lmb@cloudflare.com>
Link: https://lore.kernel.org/bpf/20210426192949.416837-3-andrii@kernel.org
2021-05-05 16:39:05 -07:00
Andrii Nakryiko
b8b1faa3d4 travis: fix libelf-dev build-dep issues by using aptitude instead of apt-get
Use aptitude to actually see what's wrong with the dependencies. And it
actually magically resolves whatever minor version conflicts there are.

The big surprise came from the apparent difference in build-dep command
behavior. Aptitude's build-dep doesn't seem to install the libpfelf-dev
package itself. Adding explicit `aptitude install libelf-dev` after build-dep
solves the issue for now.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-05-05 13:57:07 -07:00
Andrii Nakryiko
9e123fa5d2 vmtests: fix libc6 dependency and remove explicit libelf-dev install
Force libc6 dependency version.

Drop explicit libelf-dev install command, as it should be pre-installed by
Travis CI already, according to .travis.yaml.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-30 09:39:10 -07:00
Andrii Nakryiko
4ccc1f0b9f vmtest: blacklist 2 tests on 5.5
Blacklist linked_vars, which expects typed ksym support. Blacklist snprintf,
which expectes new bpf_snprintf() helper.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
b9278634aa sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   1e1032b0c4afaed7739a6681ff6b4cb120b82994
Checkpoint bpf-next commit: 87bd9e602e39585c5280556a2b6a6363bb334257
Baseline bpf commit:        a14d273ba15968495896a38b7b3399dba66d0270
Checkpoint bpf commit:      b02265429681c9c827c45978a61a9f00be5ea9aa

Alexei Starovoitov (1):
  libbpf: Remove unused field.

Andrii Nakryiko (11):
  libbpf: Add bpf_map__inner_map API
  libbpf: Suppress compiler warning when using SEC() macro with externs
  libbpf: Mark BPF subprogs with hidden visibility as static for BPF
    verifier
  libbpf: Allow gaps in BPF program sections to support overriden weak
    functions
  libbpf: Refactor BTF map definition parsing
  libbpf: Factor out symtab and relos sanity checks
  libbpf: Make few internal helpers available outside of libbpf.c
  libbpf: Extend sanity checking ELF symbols with externs validation
  libbpf: Tighten BTF type ID rewriting with error checking
  libbpf: Add linker extern resolution support for functions and global
    variables
  libbpf: Support extern resolution for BTF-defined maps in .maps
    section

Ciara Loftus (1):
  libbpf: Fix potential NULL pointer dereference

Daniel Borkmann (1):
  bpf: Sync bpf headers in tooling infrastucture

Florent Revest (3):
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro

Pedro Tammela (1):
  libbpf: Clarify flags in ringbuf helpers

Toke Høiland-Jørgensen (1):
  bpf: Return target info when a tracing bpf_link is queried

 include/uapi/linux/bpf.h |   83 ++-
 src/bpf_helpers.h        |   19 +-
 src/bpf_tracing.h        |   58 +-
 src/btf.c                |    5 -
 src/libbpf.c             |  396 +++++++-----
 src/libbpf.h             |    1 +
 src/libbpf.map           |    1 +
 src/libbpf_internal.h    |   45 ++
 src/linker.c             | 1270 ++++++++++++++++++++++++++++++++------
 src/xsk.c                |    3 +
 10 files changed, 1512 insertions(+), 369 deletions(-)

--
2.30.2
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
af47e6c199 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
b2c06aec99 libbpf: Support extern resolution for BTF-defined maps in .maps section
Add extra logic to handle map externs (only BTF-defined maps are supported for
linking). Re-use the map parsing logic used during bpf_object__open(). Map
externs are currently restricted to always match complete map definition. So
all the specified attributes will be compared (down to pining, map_flags,
numa_node, etc). In the future this restriction might be relaxed with no
backwards compatibility issues. If any attribute is mismatched between extern
and actual map definition, linker will report an error, pointing out which one
mismatches.

The original intent was to allow for extern to specify attributes that matters
(to user) to enforce. E.g., if you specify just key information and omit
value, then any value fits. Similarly, it should have been possible to enforce
map_flags, pinning, and any other possible map attribute. Unfortunately, that
means that multiple externs can be only partially overlapping with each other,
which means linker would need to combine their type definitions to end up with
the most restrictive and fullest map definition. This requires an extra amount
of BTF manipulation which at this time was deemed unnecessary and would
require further extending generic BTF writer APIs. So that is left for future
follow ups, if there will be demand for that. But the idea seems intresting
and useful, so I want to document it here.

Weak definitions are also supported, but are pretty strict as well, just
like externs: all weak map definitions have to match exactly. In the follow up
patches this most probably will be relaxed, with __weak map definitions being
able to differ between each other (with non-weak definition always winning, of
course).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-13-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
29e4840915 libbpf: Add linker extern resolution support for functions and global variables
Add BPF static linker logic to resolve extern variables and functions across
multiple linked together BPF object files.

For that, linker maintains a separate list of struct glob_sym structures,
which keeps track of few pieces of metadata (is it extern or resolved global,
is it a weak symbol, which ELF section it belongs to, etc) and ties together
BTF type info and ELF symbol information and keeps them in sync.

With adding support for extern variables/funcs, it's now possible for some
sections to contain both extern and non-extern definitions. This means that
some sections may start out as ephemeral (if only externs are present and thus
there is not corresponding ELF section), but will be "upgraded" to actual ELF
section as symbols are resolved or new non-extern definitions are appended.

Additional care is taken to not duplicate extern entries in sections like
.kconfig and .ksyms.

Given libbpf requires BTF type to always be present for .kconfig/.ksym
externs, linker extends this requirement to all the externs, even those that
are supposed to be resolved during static linking and which won't be visible
to libbpf. With BTF information always present, static linker will check not
just ELF symbol matches, but entire BTF type signature match as well. That
logic is stricter that BPF CO-RE checks. It probably should be re-used by
.ksym resolution logic in libbpf as well, but that's left for follow up
patches.

To make it unnecessary to rewrite ELF symbols and minimize BTF type
rewriting/removal, ELF symbols that correspond to externs initially will be
updated in place once they are resolved. Similarly for BTF type info, VAR/FUNC
and var_secinfo's (sec_vars in struct bpf_linker) are staying stable, but
types they point to might get replaced when extern is resolved. This might
leave some left-over types (even though we try to minimize this for common
cases of having extern funcs with not argument names vs concrete function with
names properly specified). That can be addresses later with a generic BTF
garbage collection. That's left for a follow up as well.

Given BTF type appending phase is separate from ELF symbol
appending/resolution, special struct glob_sym->underlying_btf_id variable is
used to communicate resolution and rewrite decisions. 0 means
underlying_btf_id needs to be appended (it's not yet in final linker->btf), <0
values are used for temporary storage of source BTF type ID (not yet
rewritten), so -glob_sym->underlying_btf_id is BTF type id in obj-btf. But by
the end of linker_append_btf() phase, that underlying_btf_id will be remapped
and will always be > 0. This is the uglies part of the whole process, but
keeps the other parts much simpler due to stability of sec_var and VAR/FUNC
types, as well as ELF symbol, so please keep that in mind while reviewing.

BTF-defined maps require some extra custom logic and is addressed separate in
the next patch, so that to keep this one smaller and easier to review.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-12-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
7078c5eae4 libbpf: Tighten BTF type ID rewriting with error checking
It should never fail, but if it does, it's better to know about this rather
than end up with nonsensical type IDs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-11-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
692ae888bc libbpf: Extend sanity checking ELF symbols with externs validation
Add logic to validate extern symbols, plus some other minor extra checks, like
ELF symbol #0 validation, general symbol visibility and binding validations.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-10-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
24b5d82967 libbpf: Make few internal helpers available outside of libbpf.c
Make skip_mods_and_typedefs(), btf_kind_str(), and btf_func_linkage() helpers
available outside of libbpf.c, to be used by static linker code.

Also do few cleanups (error code fixes, comment clean up, etc) that don't
deserve their own commit.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-9-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
4dcf439178 libbpf: Factor out symtab and relos sanity checks
Factor out logic for sanity checking SHT_SYMTAB and SHT_REL sections into
separate sections. They are already quite extensive and are suffering from too
deep indentation. Subsequent changes will extend SYMTAB sanity checking
further, so it's better to factor each into a separate function.

No functional changes are intended.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-8-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
d9b9d4a43a libbpf: Refactor BTF map definition parsing
Refactor BTF-defined maps parsing logic to allow it to be nicely reused by BPF
static linker. Further, at least for BPF static linker, it's important to know
which attributes of a BPF map were defined explicitly, so provide a bit set
for each known portion of BTF map definition. This allows BPF static linker to
do a simple check when dealing with extern map declarations.

The same capabilities allow to distinguish attributes explicitly set to zero
(e.g., __uint(max_entries, 0)) vs the case of not specifying it at all (no
max_entries attribute at all). Libbpf is currently not utilizing that, but it
could be useful for backwards compatibility reasons later.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-7-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
7b106ea4b1 libbpf: Allow gaps in BPF program sections to support overriden weak functions
Currently libbpf is very strict about parsing BPF program instruction
sections. No gaps are allowed between sequential BPF programs within a given
ELF section. Libbpf enforced that by keeping track of the next section offset
that should start a new BPF (sub)program and cross-checks that by searching
for a corresponding STT_FUNC ELF symbol.

But this is too restrictive once we allow to have weak BPF programs and link
together two or more BPF object files. In such case, some weak BPF programs
might be "overridden" by either non-weak BPF program with the same name and
signature, or even by another weak BPF program that just happened to be linked
first. That, in turn, leaves BPF instructions of the "lost" BPF (sub)program
intact, but there is no corresponding ELF symbol, because no one is going to
be referencing it.

Libbpf already correctly handles such cases in the sense that it won't append
such dead code to actual BPF programs loaded into kernel. So the only change
that needs to be done is to relax the logic of parsing BPF instruction
sections. Instead of assuming next BPF (sub)program section offset, iterate
available STT_FUNC ELF symbols to discover all available BPF subprograms and
programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-6-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
3319982d34 libbpf: Mark BPF subprogs with hidden visibility as static for BPF verifier
Define __hidden helper macro in bpf_helpers.h, which is a short-hand for
__attribute__((visibility("hidden"))). Add libbpf support to mark BPF
subprograms marked with __hidden as static in BTF information to enforce BPF
verifier's static function validation algorithm, which takes more information
(caller's context) into account during a subprogram validation.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-5-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
2e430712f5 libbpf: Suppress compiler warning when using SEC() macro with externs
When used on externs SEC() macro will trigger compilation warning about
inapplicable `__attribute__((used))`. That's expected for extern declarations,
so suppress it with the corresponding _Pragma.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210423181348.1801389-4-andrii@kernel.org
2021-04-26 16:30:18 -07:00
Florent Revest
120a21852b libbpf: Introduce a BPF_SNPRINTF helper macro
Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-6-revest@chromium.org
2021-04-26 16:30:18 -07:00
Florent Revest
f4da689d90 libbpf: Initialize the bpf_seq_printf parameters array field by field
When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org
2021-04-26 16:30:18 -07:00
Florent Revest
30755d3a1c bpf: Add a bpf_snprintf helper
The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. Thankfully, ARG_PTR_TO_CONST_STR is guaranteed to point to
a zero-terminated read-only map so we don't need a format string length
arg.

Because the format-string is known at verification time, we also do
a first pass of format string validation in the verifier logic. This
makes debugging easier.

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210419155243.1632274-4-revest@chromium.org
2021-04-26 16:30:18 -07:00
Alexei Starovoitov
dda0dd6a87 libbpf: Remove unused field.
relo->processed is set, but not used. Remove it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210415141817.53136-1-alexei.starovoitov@gmail.com
2021-04-26 16:30:18 -07:00
Toke Høiland-Jørgensen
c21f91bd35 bpf: Return target info when a tracing bpf_link is queried
There is currently no way to discover the target of a tracing program
attachment after the fact. Add this information to bpf_link_info and return
it when querying the bpf_link fd.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210413091607.58945-1-toke@redhat.com
2021-04-26 16:30:18 -07:00
Pedro Tammela
552dec12dc libbpf: Clarify flags in ringbuf helpers
In 'bpf_ringbuf_reserve()' we require the flag to '0' at the moment.

For 'bpf_ringbuf_{discard,submit,output}' a flag of '0' might send a
notification to the process if needed.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210412192434.944343-1-pctammela@mojatatu.com
2021-04-26 16:30:18 -07:00
Daniel Borkmann
678e8c8e49 bpf: Sync bpf headers in tooling infrastucture
Synchronize tools/include/uapi/linux/bpf.h which was missing changes
from various commits:

  - f3c45326ee71 ("bpf: Document PROG_TEST_RUN limitations")
  - e5e35e754c28 ("bpf: BPF-helper for MTU checking add length input")

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
f6de59dc3e libbpf: Add bpf_map__inner_map API
The API gives access to inner map for map in map types (array or
hash of map). It will be used to dynamically set max_entries in it.

Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210408061310.95877-7-yauheni.kaliuta@redhat.com
2021-04-26 16:30:18 -07:00
Ciara Loftus
823648416c libbpf: Fix potential NULL pointer dereference
Wait until after the UMEM is checked for null to dereference it.

Fixes: 43f1bc1efff1 ("libbpf: Restore umem state after socket create failure")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210408052009.7844-1-ciara.loftus@intel.com
2021-04-26 16:30:18 -07:00
Andrii Nakryiko
02dbcbea28 travis-ci: use default docker from Focal
Stop trying to update Docker version. Focal seems to have recent enough
Docker. This fixes Debian builds.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-26 08:30:13 -07:00
Ilya Leoshkevich
a7502f2707 vmtest: fix error reporting
S50-run-tests uses -e, which means that it immediately exits on test
failures without writing /exitcode. Fix by temporarily turning -e off.

Another issue is that $? in S50-run-tests is not quoted, which causes
the random value from the host to be taken (in practice always 0), so
fix that as well.

Finally, this fix has a positive side effect - QEMU no longer hangs
when tests fail. This is because rcS (generated by mkrootfs.sh) also
uses -e and immediately exits, if one of the scripts that it calls
fails, without calling S99-poweroff.

Example output after the fix:

Summary: 53/184 PASSED, 5 SKIPPED, 1 FAILED
+ exitstatus=1
+ set -e
+ echo 1
+ chmod 644 /exitstatus
+ for path in /etc/rcS.d/S*
+ '[' -x /etc/rcS.d/S99-poweroff ']'
+ /etc/rcS.d/S99-poweroff
travis_fold:start:shutdown
Shutdown

starting pid 232, tty '': '/sbin/swapoff -a'
starting pid 233, tty '': '/bin/umount -a -r'
[   45.909033] EXT4-fs (vda): re-mounted. Opts: (null)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[   48.932007] ACPI: Preparing to enter system sleep state S5
[   48.932785] reboot: Power down

Tests exit status: 1
travis_fold🔚shutdown

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:57:01 -07:00
Andrii Nakryiko
bab780e6f9 travis-ci: update to Ubuntu Focal
Update to Ubuntu Focal 20.04.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
915f3abe94 travis-ci: prohibit uninitialized variables in managers/
The scripts in this directory rely on certain environment variables, so
fail if they are not set in order to improve the debugging experience.
The vmtest/ scripts already do it.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
0c248143d4 travis-ci: disable GCC's -Wstringop-truncation noisy error on Ubuntu
This is the same as commit 4d86cae4f0 ("ci: disable GCC's
-Wstringop-truncation noisy error"), but for Ubuntu. Without this,
there are false positives in bpf_object__new() on Ubuntu 20.04:
this function calls strncpy() with the correct bounds, but still
triggers the warning.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Ilya Leoshkevich
9f0e42b512 vmtest: blacklist stacktrace_build_id selftest
It requires v5.9+ kernel when the test code is built with a newer
toolchain. The support was added by commit b33164f2bd1c ("bpf:
Iterate through all PT_NOTE sections when looking for build id").

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-04-06 21:18:31 -07:00
Andrii Nakryiko
174d0b7b49 vmtest: blacklist kfunc_call selftests
kfunc_call requires 5.13+ kernel.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
95f83b8b0c sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   155f556d64b1a48710f01305e14bb860734ed1e3
Checkpoint bpf-next commit: 1e1032b0c4afaed7739a6681ff6b4cb120b82994
Baseline bpf commit:        002322402dafd846c424ffa9240a937f49b48c42
Checkpoint bpf commit:      a14d273ba15968495896a38b7b3399dba66d0270

Andrii Nakryiko (2):
  libbpf: Preserve empty DATASEC BTFs during static linking
  libbpf: Fix memory leak when emitting final btf_ext

Ciara Loftus (3):
  libbpf: Ensure umem pointer is non-NULL before dereferencing
  libbpf: Restore umem state after socket create failure
  libbpf: Only create rx and tx XDP rings when necessary

Cong Wang (1):
  sock_map: Introduce BPF_SK_SKB_VERDICT

Hengqi Chen (1):
  libbpf: Fix KERNEL_VERSION macro

Maciej Fijalkowski (1):
  libbpf: xsk: Use bpf_link

Martin KaFai Lau (6):
  bpf: Support bpf program calling kernel function
  libbpf: Refactor bpf_object__resolve_ksyms_btf_id
  libbpf: Refactor codes for finding btf id of a kernel symbol
  libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
  libbpf: Record extern sym relocation first
  libbpf: Support extern kernel function

Pedro Tammela (1):
  libbpf: Fix bail out from 'ringbuf_process_ring()' on error

Yang Yingliang (1):
  libbpf: Remove redundant semi-colon

 include/uapi/linux/bpf.h |   5 +
 src/bpf_helpers.h        |   2 +-
 src/libbpf.c             | 389 +++++++++++++++++++++++++++++----------
 src/linker.c             |  39 +++-
 src/ringbuf.c            |   2 +-
 src/xsk.c                | 315 ++++++++++++++++++++++++-------
 6 files changed, 574 insertions(+), 178 deletions(-)

--
2.30.2
2021-04-05 11:30:05 -07:00
Ciara Loftus
416343d95c libbpf: Only create rx and tx XDP rings when necessary
Prior to this commit xsk_socket__create(_shared) always attempted to create
the rx and tx rings for the socket. However this causes an issue when the
socket being setup is that which shares the fd with the UMEM. If a
previous call to this function failed with this socket after the rings were
set up, a subsequent call would always fail because the rings are not torn
down after the first call and when we try to set them up again we encounter
an error because they already exist. Solve this by remembering whether the
rings were set up by introducing new bools to struct xsk_umem which
represent the ring setup status and using them to determine whether or
not to set up the rings.

Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331061218.1647-4-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Ciara Loftus
4f932c1ee9 libbpf: Restore umem state after socket create failure
If the call to xsk_socket__create fails, the user may want to retry the
socket creation using the same umem. Ensure that the umem is in the
same state on exit if the call fails by:
1. ensuring the umem _save pointers are unmodified.
2. not unmapping the set of umem rings that were set up with the umem
during xsk_umem__create, since those maps existed before the call to
xsk_socket__create and should remain in tact even in the event of
failure.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210331061218.1647-3-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Ciara Loftus
37e838f959 libbpf: Ensure umem pointer is non-NULL before dereferencing
Calls to xsk_socket__create dereference the umem to access the
fill_save and comp_save pointers. Make sure the umem is non-NULL
before doing this.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20210331061218.1647-2-ciara.loftus@intel.com
2021-04-05 11:30:05 -07:00
Pedro Tammela
d98e968707 libbpf: Fix bail out from 'ringbuf_process_ring()' on error
The current code bails out with negative and positive returns.
If the callback returns a positive return code, 'ring_buffer__consume()'
and 'ring_buffer__poll()' will return a spurious number of records
consumed, but mostly important will continue the processing loop.

This patch makes positive returns from the callback a no-op.

Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325150115.138750-1-pctammela@mojatatu.com
2021-04-05 11:30:05 -07:00
Hengqi Chen
d4beac571a libbpf: Fix KERNEL_VERSION macro
Add missing ')' for KERNEL_VERSION macro.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210405040119.802188-1-hengqi.chen@gmail.com
2021-04-05 11:30:05 -07:00
Yang Yingliang
bea42d49f8 libbpf: Remove redundant semi-colon
Remove redundant semi-colon in finalize_btf_ext().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210402012634.1965453-1-yangyingliang@huawei.com
2021-04-05 11:30:05 -07:00
Cong Wang
4e8d8d5cd2 sock_map: Introduce BPF_SK_SKB_VERDICT
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is
confusing and more importantly we still want to distinguish them
from user-space. So we can just reuse the stream verdict code but
introduce a new type of eBPF program, skb_verdict. Users are not
allowed to attach stream_verdict and skb_verdict programs to the
same map.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210331023237.41094-10-xiyou.wangcong@gmail.com
2021-04-05 11:30:05 -07:00
Maciej Fijalkowski
8628610c32 libbpf: xsk: Use bpf_link
Currently, if there are multiple xdpsock instances running on a single
interface and in case one of the instances is terminated, the rest of
them are left in an inoperable state due to the fact of unloaded XDP
prog from interface.

Consider the scenario below:

// load xdp prog and xskmap and add entry to xskmap at idx 10
$ sudo ./xdpsock -i ens801f0 -t -q 10

// add entry to xskmap at idx 11
$ sudo ./xdpsock -i ens801f0 -t -q 11

terminate one of the processes and another one is unable to work due to
the fact that the XDP prog was unloaded from interface.

To address that, step away from setting bpf prog in favour of bpf_link.
This means that refcounting of BPF resources will be done automatically
by bpf_link itself.

Provide backward compatibility by checking if underlying system is
bpf_link capable. Do this by looking up/creating bpf_link on loopback
device. If it failed in any way, stick with netlink-based XDP prog.
therwise, use bpf_link-based logic.

When setting up BPF resources during xsk socket creation, check whether
bpf_link for a given ifindex already exists via set of calls to
bpf_link_get_next_id -> bpf_link_get_fd_by_id -> bpf_obj_get_info_by_fd
and comparing the ifindexes from bpf_link and xsk socket.

For case where resources exist but they are not AF_XDP related, bail out
and ask user to remove existing prog and then retry.

Lastly, do a bit of refactoring within __xsk_setup_xdp_prog and pull out
existing code branches based on prog_id value onto separate functions
that are responsible for resource initialization if prog_id was 0 and
for lookup existing resources for non-zero prog_id as that implies that
XDP program is present on the underlying net device. This in turn makes
it easier to follow, especially the teardown part of both branches.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329224316.17793-7-maciej.fijalkowski@intel.com
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
2e51adc9bc libbpf: Fix memory leak when emitting final btf_ext
Free temporary allocated memory used to construct finalized .BTF.ext data.
Found by Coverity static analysis on libbpf's Github repo.

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210327042502.969745-1-andrii@kernel.org
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
1ccb9d99d6 libbpf: Support extern kernel function
This patch is to make libbpf able to handle the following extern
kernel function declaration and do the needed relocations before
loading the bpf program to the kernel.

extern int foo(struct sock *) __attribute__((section(".ksyms")))

In the collect extern phase, needed changes is made to
bpf_object__collect_externs() and find_extern_btf_id() to collect
extern function in ".ksyms" section.  The func in the BTF datasec also
needs to be replaced by an int var.  The idea is similar to the existing
handling in extern var.  In case the BTF may not have a var, a dummy ksym
var is added at the beginning of bpf_object__collect_externs()
if there is func under ksyms datasec.  It will also change the
func linkage from extern to global which the kernel can support.
It also assigns a param name if it does not have one.

In the collect relo phase, it will record the kernel function
call as RELO_EXTERN_FUNC.

bpf_object__resolve_ksym_func_btf_id() is added to find the func
btf_id of the running kernel.

During actual relocation, it will patch the BPF_CALL instruction with
src_reg = BPF_PSEUDO_FUNC_CALL and insn->imm set to the running
kernel func's btf_id.

The required LLVM patch: https://reviews.llvm.org/D93563

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015234.1548923-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
90e052e6dd libbpf: Record extern sym relocation first
This patch records the extern sym relocs first before recording
subprog relocs.  The later patch will have relocs for extern
kernel function call which is also using BPF_JMP | BPF_CALL.
It will be easier to handle the extern symbols first in
the later patch.

is_call_insn() helper is added.  The existing is_ldimm64() helper
is renamed to is_ldimm64_insn() for consistency.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015227.1548623-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
7ef7ed2a5d libbpf: Rename RELO_EXTERN to RELO_EXTERN_VAR
This patch renames RELO_EXTERN to RELO_EXTERN_VAR.
It is to avoid the confusion with a later patch adding
RELO_EXTERN_FUNC.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015221.1547722-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
7036f3356e libbpf: Refactor codes for finding btf id of a kernel symbol
This patch refactors code, that finds kernel btf_id by kind
and symbol name, to a new function find_ksym_btf_id().

It also adds a new helper __btf_kind_str() to return
a string by the numeric kind value.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015214.1547069-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
e5d7cbe15a libbpf: Refactor bpf_object__resolve_ksyms_btf_id
This patch refactors most of the logic from
bpf_object__resolve_ksyms_btf_id() into a new function
bpf_object__resolve_ksym_var_btf_id().
It is to get ready for a later patch adding
bpf_object__resolve_ksym_func_btf_id() which resolves
a kernel function to the running kernel btf_id.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015207.1546749-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Martin KaFai Lau
e35afcb289 bpf: Support bpf program calling kernel function
This patch adds support to BPF verifier to allow bpf program calling
kernel function directly.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation,  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be white listed
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The white listed functions are not bounded to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

This patch is to make the required changes in the bpf verifier.

First change is in btf.c, it adds a case in "btf_check_func_arg_match()".
When the passed in "btf->kernel_btf == true", it means matching the
verifier regs' states with a kernel function.  This will handle the
PTR_TO_BTF_ID reg.  It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET,
and PTR_TO_TCP_SOCK to its kernel's btf_id.

In the later libbpf patch, the insn calling a kernel function will
look like:

insn->code == (BPF_JMP | BPF_CALL)
insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
insn->imm == func_btf_id /* btf_id of the running kernel */

[ For the future calling function-in-kernel-module support, an array
  of module btf_fds can be passed at the load time and insn->off
  can be used to index into this array. ]

At the early stage of verifier, the verifier will collect all kernel
function calls into "struct bpf_kfunc_desc".  Those
descriptors are stored in "prog->aux->kfunc_tab" and will
be available to the JIT.  Since this "add" operation is similar
to the current "add_subprog()" and looking for the same insn->code,
they are done together in the new "add_subprog_and_kfunc()".

In the "do_check()" stage, the new "check_kfunc_call()" is added
to verify the kernel function call instruction:
1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
   A new bpf_verifier_ops "check_kfunc_call" is added to do that.
   The bpf-tcp-cc struct_ops program will implement this function in
   a later patch.
2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
   used as the args of a kernel function.
3. Mark the regs' type, subreg_def, and zext_dst.

At the later do_misc_fixups() stage, the new fixup_kfunc_call()
will replace the insn->imm with the function address (relative
to __bpf_call_base).  If needed, the jit can find the btf_func_model
by calling the new bpf_jit_find_kfunc_model(prog, insn).
With the imm set to the function address, "bpftool prog dump xlated"
will be able to display the kernel function calls the same way as
it displays other bpf helper calls.

gpl_compatible program is required to call kernel function.

This feature currently requires JIT.

The verifier selftests are adjusted because of the changes in
the verbose log in add_subprog_and_kfunc().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
2021-04-05 11:30:05 -07:00
Andrii Nakryiko
a18b72b920 libbpf: Preserve empty DATASEC BTFs during static linking
Ensure that BPF static linker preserves all DATASEC BTF types, even if some of
them might not have any variable information at all. This may happen if the
compiler promotes local initialized variable contents into .rodata section and
there are no global or static functions in the program.

For example,

  $ cat t.c
  struct t { char a; char b; char c; };
  void bar(struct t*);
  void find() {
     struct t tmp = {1, 2, 3};
     bar(&tmp);
  }

  $ clang -target bpf -O2 -g -S t.c
         .long   104                             # BTF_KIND_DATASEC(id = 8)
         .long   251658240                       # 0xf000000
         .long   0

         .ascii  ".rodata"                       # string offset=104

  $ clang -target bpf -O2 -g -c t.c
  $ readelf -S t.o | grep data
     [ 4] .rodata           PROGBITS         0000000000000000  00000090

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org
2021-04-05 11:30:05 -07:00
Adam Jensen
2bd682d23e README: Mention Alpine in list of packaging distros 2021-04-04 17:50:11 -07:00
Andrii Nakryiko
99bc176337 README: update links to more up-to-date CO-RE articles
Update links to point to blog posts that have some new updates and are
generally kept more up-to-date.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-28 13:36:49 -07:00
Andrii Nakryiko
ea5752c641 Makefile: add strset.o and linker.o to build
Fix libbpf build by linking strset.o and linker.o into libbpf.a.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:40:47 -07:00
Andrii Nakryiko
1d2d2d0034 vmtests: blacklist fexit_sleep on 5.5
Blacklist fexit_sleep selftest that relies on bpf_trampoline fix in 5.12.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
582b8fe21b sync: remove libbpf_util.h from the list of headers
libbpf_util.h was removed in 7e8bbe24cb8b ("libbpf: xsk: Move barriers from
libbpf_util.h to xsk.h") upstream, so remove it from the list of installable
headers.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
3ea10e46cb sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   22541a9eeb0d968c133aaebd95fa59da3208e705
Checkpoint bpf-next commit: 155f556d64b1a48710f01305e14bb860734ed1e3
Baseline bpf commit:        6185266c5a853bb0f2a459e3ff594546f277609b
Checkpoint bpf commit:      002322402dafd846c424ffa9240a937f49b48c42

Andrii Nakryiko (11):
  libbpf: Add explicit padding to bpf_xdp_set_link_opts
  libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h
  libbpf: Expose btf_type_by_id() internally
  libbpf: Generalize BTF and BTF.ext type ID and strings iteration
  libbpf: Rename internal memory-management helpers
  libbpf: Extract internal set-of-strings datastructure APIs
  libbpf: Add generic BTF type shallow copy API
  libbpf: Add BPF static linker APIs
  libbpf: Add BPF static linker BTF and BTF.ext support
  libbpf: Skip BTF fixup if object file has no BTF
  libbpf: Constify few bpf_program getters

Björn Töpel (3):
  libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire
  libbpf: xsk: Remove linux/compiler.h header
  libbpf: xsk: Move barriers from libbpf_util.h to xsk.h

Jean-Philippe Brucker (2):
  libbpf: Fix arm64 build
  libbpf: Fix BTF dump of pointer-to-array-of-struct

Joe Stringer (2):
  scripts/bpf: Abstract eBPF API target parameter
  tools: Sync uapi bpf.h header with latest changes

KP Singh (1):
  libbpf: Add explicit padding to btf_dump_emit_type_decl_opts

Kumar Kartikeya Dwivedi (1):
  libbpf: Use SOCK_CLOEXEC when opening the netlink socket

Lorenz Bauer (1):
  bpf: Add PROG_TEST_RUN support for sk_lookup programs

Maciej Fijalkowski (1):
  libbpf: Clear map_info before each bpf_obj_get_info_by_fd

Namhyung Kim (1):
  libbpf: Fix error path in bpf_object__elf_init()

Pedro Tammela (1):
  libbpf: Avoid inline hint definition from 'linux/stddef.h'

Rafael David Tinoco (1):
  libbpf: Add bpf object kern_version attribute setter

Xuesen Huang (1):
  bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH

 include/uapi/linux/bpf.h |  724 +++++++++++++-
 src/bpf_helpers.h        |   21 +-
 src/btf.c                |  714 +++++++-------
 src/btf.h                |    3 +
 src/btf_dump.c           |   10 +-
 src/libbpf.c             |   32 +-
 src/libbpf.h             |   19 +-
 src/libbpf.map           |    6 +
 src/libbpf_internal.h    |   38 +-
 src/libbpf_util.h        |   47 -
 src/linker.c             | 1944 ++++++++++++++++++++++++++++++++++++++
 src/netlink.c            |    2 +-
 src/strset.c             |  176 ++++
 src/strset.h             |   21 +
 src/xsk.c                |    5 +-
 src/xsk.h                |   87 +-
 16 files changed, 3379 insertions(+), 470 deletions(-)
 delete mode 100644 src/libbpf_util.h
 create mode 100644 src/linker.c
 create mode 100644 src/strset.c
 create mode 100644 src/strset.h

--
2.30.2
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
3118d38a2e sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-03-25 23:31:23 -07:00
Rafael David Tinoco
b09a4999d9 libbpf: Add bpf object kern_version attribute setter
Unfortunately some distros don't have their kernel version defined
accurately in <linux/version.h> due to different long term support
reasons.

It is important to have a way to override the bpf kern_version
attribute during runtime: some old kernels might still check for
kern_version attribute during bpf_prog_load().

Signed-off-by: Rafael David Tinoco <rafaeldtinoco@ubuntu.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210323040952.2118241-1-rafaeldtinoco@ubuntu.com
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
53f0e7d8ec libbpf: Constify few bpf_program getters
bpf_program__get_type() and bpf_program__get_expected_attach_type() shouldn't
modify given bpf_program, so mark input parameter as const struct bpf_program.
This eliminates unnecessary compilation warnings or explicit casts in user
programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210324172941.2609884-1-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
d4d3a88b5a libbpf: Skip BTF fixup if object file has no BTF
Skip BTF fixup step when input object file is missing BTF altogether.

Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/bpf/20210319205909.1748642-3-andrii@kernel.org
2021-03-25 23:31:23 -07:00
KP Singh
7b1f3e310b libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
Similar to
https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/

When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g:

  DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts,
    .field_name = var_ident,
    .indent_level = 2,
    .strip_mods = strip_mods,
  );

and compiled in debug mode, the compiler generates code which
leaves the padding uninitialized and triggers errors within libbpf APIs
which require strict zero initialization of OPTS structs.

Adding anonymous padding field fixes the issue.

Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
b156979d19 libbpf: Add BPF static linker BTF and BTF.ext support
Add .BTF and .BTF.ext static linking logic.

When multiple BPF object files are linked together, their respective .BTF and
.BTF.ext sections are merged together. BTF types are not just concatenated,
but also deduplicated. .BTF.ext data is grouped by type (func info, line info,
core_relos) and target section names, and then all the records are
concatenated together, preserving their relative order. All the BTF type ID
references and string offsets are updated as necessary, to take into account
possibly deduplicated strings and types.

BTF DATASEC types are handled specially. Their respective var_secinfos are
accumulated separately in special per-section data and then final DATASEC
types are emitted at the very end during bpf_linker__finalize() operation,
just before emitting final ELF output file.

BTF data can also provide "section annotations" for some extern variables.
Such concept is missing in ELF, but BTF will have DATASEC types for such
special extern datasections (e.g., .kconfig, .ksyms). Such sections are called
"ephemeral" internally. Internally linker will keep metadata for each such
section, collecting variables information, but those sections won't be emitted
into the final ELF file.

Also, given LLVM/Clang during compilation emits BTF DATASECS that are
incomplete, missing section size and variable offsets for static variables,
BPF static linker will initially fix up such DATASECs, using ELF symbols data.
The final DATASECs will preserve section sizes and all variable offsets. This
is handled correctly by libbpf already, so won't cause any new issues. On the
other hand, it's actually a nice property to have a complete BTF data without
runtime adjustments done during bpf_object__open() by libbpf. In that sense,
BPF static linker is also a BTF normalizer.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-8-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
bd81770e10 libbpf: Add BPF static linker APIs
Introduce BPF static linker APIs to libbpf. BPF static linker allows to
perform static linking of multiple BPF object files into a single combined
resulting object file, preserving all the BPF programs, maps, global
variables, etc.

Data sections (.bss, .data, .rodata, .maps, maps, etc) with the same name are
concatenated together. Similarly, code sections are also concatenated. All the
symbols and ELF relocations are also concatenated in their respective ELF
sections and are adjusted accordingly to the new object file layout.

Static variables and functions are handled correctly as well, adjusting BPF
instructions offsets to reflect new variable/function offset within the
combined ELF section. Such relocations are referencing STT_SECTION symbols and
that stays intact.

Data sections in different files can have different alignment requirements, so
that is taken care of as well, adjusting sizes and offsets as necessary to
satisfy both old and new alignment requirements.

DWARF data sections are stripped out, currently. As well as LLLVM_ADDRSIG
section, which is ignored by libbpf in bpf_object__open() anyways. So, in
a way, BPF static linker is an analogue to `llvm-strip -g`, which is a pretty
nice property, especially if resulting .o file is then used to generate BPF
skeleton.

Original string sections are ignored and instead we construct our own set of
unique strings using libbpf-internal `struct strset` API.

To reduce the size of the patch, all the .BTF and .BTF.ext processing was
moved into a separate patch.

The high-level API consists of just 4 functions:
  - bpf_linker__new() creates an instance of BPF static linker. It accepts
    output filename and (currently empty) options struct;
  - bpf_linker__add_file() takes input filename and appends it to the already
    processed ELF data; it can be called multiple times, one for each BPF
    ELF object file that needs to be linked in;
  - bpf_linker__finalize() needs to be called to dump final ELF contents into
    the output file, specified when bpf_linker was created; after
    bpf_linker__finalize() is called, no more bpf_linker__add_file() and
    bpf_linker__finalize() calls are allowed, they will return error;
  - regardless of whether bpf_linker__finalize() was called or not,
    bpf_linker__free() will free up all the used resources.

Currently, BPF static linker doesn't resolve cross-object file references
(extern variables and/or functions). This will be added in the follow up patch
set.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-7-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
4fdc36418d libbpf: Add generic BTF type shallow copy API
Add btf__add_type() API that performs shallow copy of a given BTF type from
the source BTF into the destination BTF. All the information and type IDs are
preserved, but all the strings encountered are added into the destination BTF
and corresponding offsets are rewritten. BTF type IDs are assumed to be
correct or such that will be (somehow) modified afterwards.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-6-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
861ad35ceb libbpf: Extract internal set-of-strings datastructure APIs
Extract BTF logic for maintaining a set of strings data structure, used for
BTF strings section construction in writable mode, into separate re-usable
API. This data structure is going to be used by bpf_linker to maintains ELF
STRTAB section, which has the same layout as BTF strings section.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-5-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
7fc514acf1 libbpf: Rename internal memory-management helpers
Rename btf_add_mem() and btf_ensure_mem() helpers that abstract away details
of dynamically resizable memory to use libbpf_ prefix, as they are not
BTF-specific. No functional changes.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-4-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
74e94c40fe libbpf: Generalize BTF and BTF.ext type ID and strings iteration
Extract and generalize the logic to iterate BTF type ID and string offset
fields within BTF types and .BTF.ext data. Expose this internally in libbpf
for re-use by bpf_linker.

Additionally, complete strings deduplication handling for BTF.ext (e.g., CO-RE
access strings), which was previously missing. There previously was no
case of deduplicating .BTF.ext data, but bpf_linker is going to use it.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-3-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
082a5c6020 libbpf: Expose btf_type_by_id() internally
btf_type_by_id() is internal-only convenience API returning non-const pointer
to struct btf_type. Expose it outside of btf.c for re-use.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210318194036.3521577-2-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
75a2e3bda8 libbpf: provide NULL and KERNEL_VERSION macros in bpf_helpers.h
Given that vmlinux.h is not compatible with headers like stddef.h, NULL poses
an annoying problem: it is defined as #define, so is not captured in BTF, so
is not emitted into vmlinux.h. This leads to users either sticking to explicit
0, or defining their own NULL (as progs/skb_pkt_end.c does).

But it's easy for bpf_helpers.h to provide (conditionally) NULL definition.
Similarly, KERNEL_VERSION is another commonly missed macro that came up
multiple times. So this patch adds both of them, along with offsetof(), that
also is typically defined in stddef.h, just like NULL.

This might cause compilation warning for existing BPF applications defining
their own NULL and/or KERNEL_VERSION already:

  progs/skb_pkt_end.c:7:9: warning: 'NULL' macro redefined [-Wmacro-redefined]
  #define NULL 0
          ^
  /tmp/linux/tools/testing/selftests/bpf/tools/include/vmlinux.h:4:9: note: previous definition is here
  #define NULL ((void *)0)
	  ^

It is trivial to fix, though, so long-term benefits outweight temporary
inconveniences.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20210317200510.1354627-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
c8bfeae778 libbpf: Add explicit padding to bpf_xdp_set_link_opts
Adding such anonymous padding fixes the issue with uninitialized portions of
bpf_xdp_set_link_opts when using LIBBPF_DECLARE_OPTS macro with inline field
initialization:

DECLARE_LIBBPF_OPTS(bpf_xdp_set_link_opts, opts, .old_fd = -1);

When such code is compiled in debug mode, compiler is generating code that
leaves padding bytes uninitialized, which triggers error inside libbpf APIs
that do strict zero initialization checks for OPTS structs.

Adding anonymous padding field fixes the issue.

Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org
2021-03-25 23:31:23 -07:00
Pedro Tammela
a0ad81d9c4 libbpf: Avoid inline hint definition from 'linux/stddef.h'
Linux headers might pull 'linux/stddef.h' which defines
'__always_inline' as the following:

   #ifndef __always_inline
   #define __always_inline inline
   #endif

This becomes an issue if the program picks up the 'linux/stddef.h'
definition as the macro now just hints inline to clang.

This change now enforces the proper definition for BPF programs
regardless of the include order.

Signed-off-by: Pedro Tammela <pctammela@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210314173839.457768-1-pctammela@gmail.com
2021-03-25 23:31:23 -07:00
Björn Töpel
b727e2deca libbpf: xsk: Move barriers from libbpf_util.h to xsk.h
The only user of libbpf_util.h is xsk.h. Move the barriers to xsk.h,
and remove libbpf_util.h. The barriers are used as an implementation
detail, and should not be considered part of the stable API.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-3-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Björn Töpel
27db7104d5 libbpf: xsk: Remove linux/compiler.h header
In commit 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release
libbpf_smp_load_acquire") linux/compiler.h was added as a dependency
to xsk.h, which is the user-facing API. This makes it harder for
userspace application to consume the library. Here the header
inclusion is removed, and instead {READ,WRITE}_ONCE() is added
explicitly.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210310080929.641212-2-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker
c14f7e5dcf libbpf: Fix BTF dump of pointer-to-array-of-struct
The vmlinux.h generated from BTF is invalid when building
drivers/phy/ti/phy-gmii-sel.c with clang:

vmlinux.h:61702:27: error: array type has incomplete element type ‘struct reg_field’
61702 |  const struct reg_field (*regfields)[3];
      |                           ^~~~~~~~~

bpftool generates a forward declaration for this struct regfield, which
compilers aren't happy about. Here's a simplified reproducer:

	struct inner {
		int val;
	};
	struct outer {
		struct inner (*ptr_to_array)[2];
	} A;

After build with clang -> bpftool btf dump c -> clang/gcc:
./def-clang.h:11:23: error: array has incomplete element type 'struct inner'
        struct inner (*ptr_to_array)[2];

Member ptr_to_array of struct outer is a pointer to an array of struct
inner. In the DWARF generated by clang, struct outer appears before
struct inner, so when converting BTF of struct outer into C, bpftool
issues a forward declaration to struct inner. With GCC the DWARF info is
reversed so struct inner gets fully defined.

That forward declaration is not sufficient when compilers handle an
array of the struct, even when it's only used through a pointer. Note
that we can trigger the same issue with an intermediate typedef:

	struct inner {
	        int val;
	};
	typedef struct inner inner2_t[2];
	struct outer {
	        inner2_t *ptr_to_array;
	} A;

Becomes:

	struct inner;
	typedef struct inner inner2_t[2];

And causes:

./def-clang.h:10:30: error: array has incomplete element type 'struct inner'
	typedef struct inner inner2_t[2];

To fix this, clear through_ptr whenever we encounter an intermediate
array, to make the inner struct part of a strong link and force full
declaration.

Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210319112554.794552-2-jean-philippe@linaro.org
2021-03-25 23:31:23 -07:00
Kumar Kartikeya Dwivedi
bbc65156d7 libbpf: Use SOCK_CLOEXEC when opening the netlink socket
Otherwise, there exists a small window between the opening and closing
of the socket fd where it may leak into processes launched by some other
thread.

Fixes: 949abbe88436 ("libbpf: add function to setup XDP")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210317115857.6536-1-memxor@gmail.com
2021-03-25 23:31:23 -07:00
Namhyung Kim
c903b3ab70 libbpf: Fix error path in bpf_object__elf_init()
When it failed to get section names, it should call into
bpf_object__elf_finish() like others.

Fixes: 88a82120282b ("libbpf: Factor out common ELF operations and improve logging")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210317145414.884817-1-namhyung@kernel.org
2021-03-25 23:31:23 -07:00
Jean-Philippe Brucker
186ffbe0b5 libbpf: Fix arm64 build
The macro for libbpf_smp_store_release() doesn't build on arm64, fix it.

Fixes: 291471dd1559 ("libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210308182521.155536-1-jean-philippe@linaro.org
2021-03-25 23:31:23 -07:00
Björn Töpel
1d483b45fc libbpf, xsk: Add libbpf_smp_store_release libbpf_smp_load_acquire
Now that the AF_XDP rings have load-acquire/store-release semantics,
move libbpf to that as well.

The library-internal libbpf_smp_{load_acquire,store_release} are only
valid for 32-bit words on ARM64.

Also, remove the barriers that are no longer in use.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210305094113.413544-3-bjorn.topel@gmail.com
2021-03-25 23:31:23 -07:00
Xuesen Huang
d64f8d3207 bpf: Add bpf_skb_adjust_room flag BPF_F_ADJ_ROOM_ENCAP_L2_ETH
bpf_skb_adjust_room sets the inner_protocol as skb->protocol for packets
encapsulation. But that is not appropriate when pushing Ethernet header.

Add an option to further specify encap L2 type and set the inner_protocol
as ETH_P_TEB.

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Xuesen Huang <huangxuesen@kuaishou.com>
Signed-off-by: Zhiyong Cheng <chengzhiyong@kuaishou.com>
Signed-off-by: Li Wang <wangli09@kuaishou.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/bpf/20210304064046.6232-1-hxseverything@gmail.com
2021-03-25 23:31:23 -07:00
Lorenz Bauer
4f2e1ecbd9 bpf: Add PROG_TEST_RUN support for sk_lookup programs
Allow to pass sk_lookup programs to PROG_TEST_RUN. User space
provides the full bpf_sk_lookup struct as context. Since the
context includes a socket pointer that can't be exposed
to user space we define that PROG_TEST_RUN returns the cookie
of the selected socket or zero in place of the socket pointer.

We don't support testing programs that select a reuseport socket,
since this would mean running another (unrelated) BPF program
from the sk_lookup test handler.

Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com
2021-03-25 23:31:23 -07:00
Joe Stringer
21f523f235 tools: Sync uapi bpf.h header with latest changes
Synchronize the header after all of the recent changes.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210302171947.2268128-16-joe@cilium.io
2021-03-25 23:31:23 -07:00
Joe Stringer
18c0f03e2d scripts/bpf: Abstract eBPF API target parameter
Abstract out the target parameter so that upcoming commits, more than
just the existing "helpers" target can be called to generate specific
portions of docs from the eBPF UAPI headers.

Signed-off-by: Joe Stringer <joe@cilium.io>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20210302171947.2268128-10-joe@cilium.io
2021-03-25 23:31:23 -07:00
Maciej Fijalkowski
fade1c32e6 libbpf: Clear map_info before each bpf_obj_get_info_by_fd
xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a
reference to XSKMAP. BPF prog can include insns that work on various BPF
maps and this is covered by iterating through map_ids.

The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling
needs to be cleared at each iteration, so that it doesn't contain any
outdated fields and that is currently missing in the function of
interest.

To fix that, zero-init map_info via memset before each
bpf_obj_get_info_by_fd call.

Also, since the area of this code is touched, in general strcmp is
considered harmful, so let's convert it to strncmp and provide the
size of the array name for current map_info.

While at it, do s/continue/break/ once we have found the xsks_map to
terminate the search.

Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210303185636.18070-4-maciej.fijalkowski@intel.com
2021-03-25 23:31:23 -07:00
Ilya Leoshkevich
8c2c7e5bcf sync: use bpf_doc.py
In the latest bpf-next bpf_helpers_doc.py has been renamed to
bpf_doc.py.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
2021-03-25 23:31:23 -07:00
Andrii Nakryiko
092a606856 Makefile: fix install flags order
Having -m flag between source and destination breaks install on MacOS, as
reported in [0]. Fix it by moving the flag to the front.

  [0] https://github.com/openwrt/openwrt/pull/3959

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-05 17:48:04 -08:00
Ilya Leoshkevich
986962fade sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   303dcc25b5c782547eb13b9f29426de843dd6f34
Checkpoint bpf-next commit: 22541a9eeb0d968c133aaebd95fa59da3208e705
Baseline bpf commit:        6185266c5a853bb0f2a459e3ff594546f277609b
Checkpoint bpf commit:      22541a9eeb0d968c133aaebd95fa59da3208e705

Ilya Leoshkevich (3):
  bpf: Add BTF_KIND_FLOAT to uapi
  libbpf: Fix whitespace in btf_add_composite() comment
  libbpf: Add BTF_KIND_FLOAT support

 include/uapi/linux/btf.h |  5 ++--
 src/btf.c                | 51 +++++++++++++++++++++++++++++++++++++++-
 src/btf.h                |  6 +++++
 src/btf_dump.c           |  4 ++++
 src/libbpf.c             | 29 ++++++++++++++++++++++-
 src/libbpf.map           |  5 ++++
 src/libbpf_internal.h    |  2 ++
 7 files changed, 98 insertions(+), 4 deletions(-)

--
2.25.1
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
471e7c241d libbpf: Add BTF_KIND_FLOAT support
The logic follows that of BTF_KIND_INT most of the time. Sanitization
replaces BTF_KIND_FLOATs with equally-sized empty BTF_KIND_STRUCTs on
older kernels, for example, the following:

    [4] FLOAT 'float' size=4

becomes the following:

    [4] STRUCT '(anon)' size=4 vlen=0

With dwarves patch [1] and this patch, the older kernels, which were
failing with the floating-point-related errors, will now start working
correctly.

[1] https://github.com/iii-i/dwarves/commit/btf-kind-float-v2

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226202256.116518-4-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
617f781804 libbpf: Fix whitespace in btf_add_composite() comment
Remove trailing space.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-3-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Ilya Leoshkevich
473899d4f7 bpf: Add BTF_KIND_FLOAT to uapi
Add a new kind value and expand the kind bitfield.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210226202256.116518-2-iii@linux.ibm.com
2021-03-05 14:15:26 -08:00
Andrii Nakryiko
7e03685b8d vmtests: blacklist new tests on 5.5 kernel
Extend 5.5 blacklist.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
7065a809fc sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   86ce322d21eb032ed8fdd294d0fb095d2debb430
Checkpoint bpf-next commit: 303dcc25b5c782547eb13b9f29426de843dd6f34
Baseline bpf commit:        78031381ae9c88f4f914d66154f4745122149c58
Checkpoint bpf commit:      6185266c5a853bb0f2a459e3ff594546f277609b

Alexei Starovoitov (1):
  bpf: Count the number of times recursion was prevented

Florent Revest (2):
  bpf: Be less specific about socket cookies guarantees
  bpf: Expose bpf_get_socket_cookie to tracing programs

Hangbin Liu (1):
  bpf: Remove blank line in bpf helper description comment

Jesper Dangaard Brouer (2):
  bpf: bpf_fib_lookup return MTU value as output when looked up
  bpf: Add BPF-helper for MTU checking

Jonas Bonn (1):
  Revert "GTP: add support for flow based tunneling API"

Martin KaFai Lau (1):
  libbpf: Ignore non function pointer member in struct_ops

Stanislav Fomichev (1):
  libbpf: Use AF_LOCAL instead of AF_INET in xsk.c

Yonghong Song (3):
  bpf: Add bpf_for_each_map_elem() helper
  libbpf: Move function is_ldimm64() earlier in libbpf.c
  libbpf: Support subprog address relocation

 include/uapi/linux/bpf.h     | 140 +++++++++++++++++++++++++++++++++--
 include/uapi/linux/if_link.h |   1 -
 src/libbpf.c                 |  98 +++++++++++++++++++-----
 src/xsk.c                    |   2 +-
 4 files changed, 213 insertions(+), 28 deletions(-)

--
2.24.1
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
18b55bc136 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-03-03 08:35:13 -08:00
Yonghong Song
712f6587c9 libbpf: Support subprog address relocation
A new relocation RELO_SUBPROG_ADDR is added to capture
subprog addresses loaded with ld_imm64 insns. Such ld_imm64
insns are marked with BPF_PSEUDO_FUNC and will be passed to
kernel. For bpf_for_each_map_elem() case, kernel will
check that the to-be-used subprog address must be a static
function and replace it with proper actual jited func address.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204930.3885367-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Yonghong Song
60aa32b17a libbpf: Move function is_ldimm64() earlier in libbpf.c
Move function is_ldimm64() close to the beginning of libbpf.c,
so it can be reused by later code and the next patch as well.
There is no functionality change.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204929.3885295-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Yonghong Song
f3612e4117 bpf: Add bpf_for_each_map_elem() helper
The bpf_for_each_map_elem() helper is introduced which
iterates all map elements with a callback function. The
helper signature looks like
  long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags)
and for each map element, the callback_fn will be called. For example,
like hashmap, the callback signature may look like
  long callback_fn(map, key, val, callback_ctx)

There are two known use cases for this. One is from upstream ([1]) where
a for_each_map_elem helper may help implement a timeout mechanism
in a more generic way. Another is from our internal discussion
for a firewall use case where a map contains all the rules. The packet
data can be compared to all these rules to decide allow or deny
the packet.

For array maps, users can already use a bounded loop to traverse
elements. Using this helper can avoid using bounded loop. For other
type of maps (e.g., hash maps) where bounded loop is hard or
impossible to use, this helper provides a convenient way to
operate on all elements.

For callback_fn, besides map and map element, a callback_ctx,
allocated on caller stack, is also passed to the callback
function. This callback_ctx argument can provide additional
input and allow to write to caller stack for output.

If the callback_fn returns 0, the helper will iterate through next
element if available. If the callback_fn returns 1, the helper
will stop iterating and returns to the bpf program. Other return
values are not used for now.

Currently, this helper is only available with jit. It is possible
to make it work with interpreter with so effort but I leave it
as the future work.

[1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com
2021-03-03 08:35:13 -08:00
Hangbin Liu
587d2ab628 bpf: Remove blank line in bpf helper description comment
Commit 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") added an extra
blank line in bpf helper description. This will make bpf_helpers_doc.py stop
building bpf_helper_defs.h immediately after bpf_check_mtu(), which will
affect future added functions.

Fixes: 34b2021cc616 ("bpf: Add BPF-helper for MTU checking")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20210223131457.1378978-1-liuhangbin@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer
6cc16d6401 bpf: Add BPF-helper for MTU checking
This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.

The SKB object is complex and the skb->len value (accessible from
BPF-prog) also include the length of any extra GRO/GSO segments, but
without taking into account that these GRO/GSO segments get added
transport (L4) and network (L3) headers before being transmitted. Thus,
this BPF-helper is created such that the BPF-programmer don't need to
handle these details in the BPF-prog.

The API is designed to help the BPF-programmer, that want to do packet
context size changes, which involves other helpers. These other helpers
usually does a delta size adjustment. This helper also support a delta
size (len_diff), which allow BPF-programmer to reuse arguments needed by
these other helpers, and perform the MTU check prior to doing any actual
size adjustment of the packet context.

It is on purpose, that we allow the len adjustment to become a negative
result, that will pass the MTU check. This might seem weird, but it's not
this helpers responsibility to "catch" wrong len_diff adjustments. Other
helpers will take care of these checks, if BPF-programmer chooses to do
actual size adjustment.

V14:
 - Improve man-page desc of len_diff.

V13:
 - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.

V12:
 - Simplify segment check that calls skb_gso_validate_network_len.
 - Helpers should return long

V9:
- Use dev->hard_header_len (instead of ETH_HLEN)
- Annotate with unlikely req from Daniel
- Fix logic error using skb_gso_validate_network_len from Daniel

V6:
- Took John's advice and dropped BPF_MTU_CHK_RELAX
- Returned MTU is kept at L3-level (like fib_lookup)

V4: Lot of changes
 - ifindex 0 now use current netdev for MTU lookup
 - rename helper from bpf_mtu_check to bpf_check_mtu
 - fix bug for GSO pkt length (as skb->len is total len)
 - remove __bpf_len_adj_positive, simply allow negative len adj

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287790461.790810.3429728639563297353.stgit@firesoul
2021-03-03 08:35:13 -08:00
Jesper Dangaard Brouer
f0753b5259 bpf: bpf_fib_lookup return MTU value as output when looked up
The BPF-helpers for FIB lookup (bpf_xdp_fib_lookup and bpf_skb_fib_lookup)
can perform MTU check and return BPF_FIB_LKUP_RET_FRAG_NEEDED. The BPF-prog
don't know the MTU value that caused this rejection.

If the BPF-prog wants to implement PMTU (Path MTU Discovery) (rfc1191) it
need to know this MTU value for the ICMP packet.

Patch change lookup and result struct bpf_fib_lookup, to contain this MTU
value as output via a union with 'tot_len' as this is the value used for
the MTU lookup.

V5:
 - Fixed uninit value spotted by Dan Carpenter.
 - Name struct output member mtu_result

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287789952.790810.13134700381067698781.stgit@firesoul
2021-03-03 08:35:13 -08:00
Martin KaFai Lau
642655629b libbpf: Ignore non function pointer member in struct_ops
When libbpf initializes the kernel's struct_ops in
"bpf_map__init_kern_struct_ops()", it enforces all
pointer types must be a function pointer and rejects
others.  It turns out to be too strict.  For example,
when directly using "struct tcp_congestion_ops" from vmlinux.h,
it has a "struct module *owner" member and it is set to NULL
in a bpf_tcp_cc.o.

Instead, it only needs to ensure the member is a function
pointer if it has been set (relocated) to a bpf-prog.
This patch moves the "btf_is_func_proto(kern_mtype)" check
after the existing "if (!prog) { continue; }".  The original debug
message in "if (!prog) { continue; }" is also removed since it is
no longer valid.  Beside, there is a later debug message to tell
which function pointer is set.

The "btf_is_func_proto(mtype)" has already been guaranteed
in "bpf_object__collect_st_ops_relos()" which has been run
before "bpf_map__init_kern_struct_ops()".  Thus, this check
is removed.

v2:
- Remove outdated debug message (Andrii)
  Remove because there is a later debug message to tell
  which function pointer is set.
- Following mtype->type is no longer needed. Remove:
  "skip_mods_and_typedefs(btf, mtype->type, &mtype_id)"
- Do "if (!prog)" test before skip_mods_and_typedefs.

Fixes: 590a00888250 ("bpf: libbpf: Add STRUCT_OPS support")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210212021030.266932-1-kafai@fb.com
2021-03-03 08:35:13 -08:00
Stanislav Fomichev
14a61e86f0 libbpf: Use AF_LOCAL instead of AF_INET in xsk.c
We have the environments where usage of AF_INET is prohibited
(cgroup/sock_create returns EPERM for AF_INET). Let's use
AF_LOCAL instead of AF_INET, it should perfectly work with SIOCETHTOOL.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Link: https://lore.kernel.org/bpf/20210209221826.922940-1-sdf@google.com
2021-03-03 08:35:13 -08:00
Florent Revest
d142d4a382 bpf: Expose bpf_get_socket_cookie to tracing programs
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
2021-03-03 08:35:13 -08:00
Florent Revest
99e6a464b8 bpf: Be less specific about socket cookies guarantees
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one"
socket cookies are not guaranteed to be non-decreasing. The
bpf_get_socket_cookie helper descriptions are currently specifying that
cookies are non-decreasing but we don't want users to rely on that.

Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org
2021-03-03 08:35:13 -08:00
Alexei Starovoitov
1015d47c2b bpf: Count the number of times recursion was prevented
Add per-program counter for number of times recursion prevention mechanism
was triggered and expose it via show_fdinfo and bpf_prog_info.
Teach bpftool to print it.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com
2021-03-03 08:35:13 -08:00
Jonas Bonn
06ee116fb1 Revert "GTP: add support for flow based tunneling API"
This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a.

This patch was committed without maintainer approval and despite a number
of unaddressed concerns from review.  There are several issues that
impede the acceptance of this patch and that make a reversion of this
particular instance of these changes the best way forward:

i)  the patch contains several logically separate changes that would be
better served as smaller patches (for review purposes)
ii) functionality like the handling of end markers has been introduced
without further explanation
iii) symmetry between the handling of GTPv0 and GTPv1 has been
unnecessarily broken
iv) the patchset produces 'broken' packets when extension headers are
included
v) there are no available userspace tools to allow for testing this
functionality
vi) there is an unaddressed Coverity report against the patch concering
memory leakage
vii) most importantly, the patch contains a large amount of superfluous
churn that impedes other ongoing work with this driver

This patch will be reworked into a series that aligns with other
ongoing work and facilitates review.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-03-03 08:35:13 -08:00
Andrii Nakryiko
e1a90f3768 travis-ci: switch from GCC8 to GCC10
GCC 8 builds started failing due to missing gcc-8 package. Let's switch to GCC
10 instead.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-22 14:00:53 -08:00
Andrii Nakryiko
f2a926ba46 Revert "vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed"
Clang 13 now has the fix for the original regression, time to get back to
using nightly versions.

This reverts commit adaf538bca.
2021-02-22 11:36:12 -08:00
Matteo Croce
b0b5ec0006 fix typo in license name
LGPL was incorrectly named LPGL

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
2021-02-22 11:35:49 -08:00
Andrii Nakryiko
adaf538bca vmtests: revert to Clang/LLVM 12 until Clang 13 regression is fixed
Clang 13 regressed BPF code generation causing some of BPF selftests to fail.
Until that is mitigated, stick to version 12.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-08 20:37:11 -08:00
Andrii Nakryiko
f35e87ddc4 vmtest: switch to Clang/LLVM 13
Clang 13 became the new nightly version, so switch to it. Also do vmlinux
compilation with a bit more parallelism. And account python-docutils
installation as part of selftests build.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-02-03 14:53:19 -08:00
Matteo Croce
767d82caab install: don't preserve file owner
'cp -p' preserve file ownership, this may leave files owned by the
current in user in /lib .

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
2021-01-26 19:33:48 -08:00
Andrii Nakryiko
a199b85415 vmtest: blacklist atomics selftest for 5.5
5.5 doesn't support atomics.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
649f9dc746 sync: latest libbpf changes from kernel
Syncing latest libbpf commits from kernel repository.
Baseline bpf-next commit:   3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9
Checkpoint bpf-next commit: 86ce322d21eb032ed8fdd294d0fb095d2debb430
Baseline bpf commit:        1a3449c19407a28f7019a887cdf0d6ba2444751a
Checkpoint bpf commit:      78031381ae9c88f4f914d66154f4745122149c58

Andrii Nakryiko (5):
  libbpf: Add user-space variants of BPF_CORE_READ() family of macros
  libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family
  libbpf: Clarify kernel type use with USER variants of CORE reading
    macros
  libbpf: Support kernel module ksym externs
  libbpf: Allow loading empty BTFs

Björn Töpel (1):
  libbpf, xsk: Select AF_XDP BPF program based on kernel version

Brendan Jackman (4):
  bpf: Clarify return value of probe str helpers
  bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
  bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
  bpf: Add instructions for atomic_[cmp]xchg

Ian Rogers (1):
  bpf, libbpf: Avoid unused function warning on bpf_tail_call_static

Jiri Olsa (1):
  libbpf: Use string table index from index table if needed

Pravin B Shelar (1):
  GTP: add support for flow based tunneling API

 include/uapi/linux/bpf.h     |  20 +++--
 include/uapi/linux/if_link.h |   1 +
 src/bpf_core_read.h          | 169 +++++++++++++++++++++++++++--------
 src/bpf_helpers.h            |   2 +-
 src/btf.c                    |  17 ++--
 src/libbpf.c                 |  50 +++++++----
 src/xsk.c                    |  81 ++++++++++++++++-
 7 files changed, 265 insertions(+), 75 deletions(-)

--
2.24.1
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
eb56f8fb12 sync: auto-generate latest BPF helpers
Latest changes to BPF helper definitions.
2021-01-26 19:32:13 -08:00
Björn Töpel
6e01a23cf6 libbpf, xsk: Select AF_XDP BPF program based on kernel version
Add detection for kernel version, and adapt the BPF program based on
kernel support. This way, users will get the best possible performance
from the BPF program.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Marek Majtyka  <alardam@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210122105351.11751-4-bjorn.topel@gmail.com
2021-01-26 19:32:13 -08:00
Jiri Olsa
16d7f413e2 libbpf: Use string table index from index table if needed
For very large ELF objects (with many sections), we could
get special value SHN_XINDEX (65535) for elf object's string
table index - e_shstrndx.

Call elf_getshdrstrndx to get the proper string table index,
instead of reading it directly from ELF header.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210121202203.9346-4-jolsa@kernel.org
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
f037b92465 libbpf: Allow loading empty BTFs
Empty BTFs do come up (e.g., simple kernel modules with no new types and
strings, compared to the vmlinux BTF) and there is nothing technically wrong
with them. So remove unnecessary check preventing loading empty BTFs.

Fixes: d8123624506c ("libbpf: Fix BTF data layout checks and allow empty BTF")
Reported-by: Christopher William Snowhill <chris@kode54.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210110070341.1380086-2-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Pravin B Shelar
4adbb7b2c7 GTP: add support for flow based tunneling API
Following patch add support for flow based tunneling API
to send and recv GTP tunnel packet over tunnel metadata API.
This would allow this device integration with OVS or eBPF using
flow based tunneling APIs.

Signed-off-by: Pravin B Shelar <pbshelar@fb.com>
Link: https://lore.kernel.org/r/20210110070021.26822-1-pbshelar@fb.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26 19:32:13 -08:00
Brendan Jackman
d2b784d370 bpf: Add instructions for atomic_[cmp]xchg
This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCH flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH woulud be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.

There are two significant design decisions made for the CMPXCHG
instruction:

 - To solve the issue that this operation fundamentally has 3
   operands, but we only have two register fields. Therefore the
   operand we compare against (the kernel's API calls it 'old') is
   hard-coded to be R0. x86 has similar design (and A64 doesn't
   have this problem).

   A potential alternative might be to encode the other operand's
   register number in the immediate field.

 - The kernel's atomic_cmpxchg returns the old value, while the C11
   userspace APIs return a boolean indicating the comparison
   result. Which should BPF do? A64 returns the old value. x86 returns
   the old value in the hard-coded register (and also sets a
   flag). That means return-old-value is easier to JIT, so that's
   what we use.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Brendan Jackman
ac86f42e4a bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
instructions, in order to have the previous value of the
atomically-modified memory location loaded into the src register
after an atomic op is carried out.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Brendan Jackman
03fbe22a59 bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.

In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.

This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.

All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Ian Rogers
f15814c93a bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
Add inline to __always_inline making it match the linux/compiler.h.
Adding this avoids an unused function warning on bpf_tail_call_static
when compining with -Wall.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210113223609.3358812-1-irogers@google.com
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
0db7da9a4a libbpf: Support kernel module ksym externs
Add support for searching for ksym externs not just in vmlinux BTF, but across
all module BTFs, similarly to how it's done for CO-RE relocations. Kernels
that expose module BTFs through sysfs are assumed to support new ldimm64
instruction extension with BTF FD provided in insn[1].imm field, so no extra
feature detection is performed.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Link: https://lore.kernel.org/bpf/20210112075520.4103414-7-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Brendan Jackman
0de8b9a906 bpf: Clarify return value of probe str helpers
When the buffer is too small to contain the input string, these helpers
return the length of the buffer, not the length of the original string.
This tries to make the docs totally clear about that, since "the length
of the [copied ]string" could also refer to the length of the input.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
d52e5f5f88 libbpf: Clarify kernel type use with USER variants of CORE reading macros
Add comments clarifying that USER variants of CO-RE reading macro are still
only going to work with kernel types, defined in kernel or kernel module BTF.
This should help preventing invalid use of those macro to read user-defined
types (which doesn't work with CO-RE).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210108194408.3468860-1-andrii@kernel.org
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
a26ae1b254 libbpf: Add non-CO-RE variants of BPF_CORE_READ() macro family
BPF_CORE_READ(), in addition to handling CO-RE relocations, also allows much
nicer way to read data structures with nested pointers. Instead of writing
a sequence of bpf_probe_read() calls to follow links, one can just write
BPF_CORE_READ(a, b, c, d) to effectively do a->b->c->d read. This is a welcome
ability when porting BCC code, which (in most cases) allows exactly the
intuitive a->b->c->d variant.

This patch adds non-CO-RE variants of BPF_CORE_READ() family of macros for
cases where CO-RE is not supported (e.g., old kernels). In such cases, the
property of shortening a sequence of bpf_probe_read()s to a simple
BPF_PROBE_READ(a, b, c, d) invocation is still desirable, especially when
porting BCC code to libbpf. Yet, no CO-RE relocation is going to be emitted.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201218235614.2284956-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-26 19:32:13 -08:00
Andrii Nakryiko
c1d4bbb8c7 libbpf: Add user-space variants of BPF_CORE_READ() family of macros
Add BPF_CORE_READ_USER(), BPF_CORE_READ_USER_STR() and their _INTO()
variations to allow reading CO-RE-relocatable kernel data structures from the
user-space. One of such cases is reading input arguments of syscalls, while
reaping the benefits of CO-RE relocations w.r.t. handling 32/64 bit
conversions and handling missing/new fields in UAPI data structs.

Suggested-by: Gilad Reti <gilad.reti@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201218235614.2284956-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-01-26 19:32:13 -08:00
Luca Boccassi
7c2a94f4f8 README: mention that Debian 11 ships with BTF support 2021-01-22 12:27:35 -08:00
107 changed files with 59895 additions and 87304 deletions

View File

@@ -0,0 +1,30 @@
name: 'build-selftests'
description: 'Build BPF selftests'
inputs:
repo-path:
description: 'where is the source code'
required: true
kernel:
description: 'kernel version or LATEST'
required: true
default: 'LATEST'
vmlinux:
description: 'where is vmlinux file'
required: true
default: '${{ github.workspace }}/vmlinux'
runs:
using: "composite"
steps:
- shell: bash
run: |
echo "::group::Setup Env"
sudo apt-get install -y qemu-kvm zstd binutils-dev elfutils libcap-dev libelf-dev libdw-dev python3-docutils
echo "::endgroup::"
- shell: bash
run: |
export KERNEL=${{ inputs.kernel }}
export REPO_ROOT="${{ github.workspace }}"
export REPO_PATH="${{ inputs.repo-path }}"
export VMLINUX_BTF="${{ inputs.vmlinux }}"
${{ github.action_path }}/build_selftests.sh

View File

@@ -2,15 +2,16 @@
set -euo pipefail
source $(cd $(dirname $0) && pwd)/helpers.sh
THISDIR="$(cd $(dirname $0) && pwd)"
source ${THISDIR}/helpers.sh
travis_fold start prepare_selftests "Building selftests"
LLVM_VER=12
LLVM_VER=15
LIBBPF_PATH="${REPO_ROOT}"
REPO_PATH="travis-ci/vmtest/bpf-next"
PREPARE_SELFTESTS_SCRIPT=${VMTEST_ROOT}/prepare_selftests-${KERNEL}.sh
PREPARE_SELFTESTS_SCRIPT=${THISDIR}/prepare_selftests-${KERNEL}.sh
if [ -f "${PREPARE_SELFTESTS_SCRIPT}" ]; then
(cd "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" && ${PREPARE_SELFTESTS_SCRIPT})
fi
@@ -18,9 +19,10 @@ fi
if [[ "${KERNEL}" = 'LATEST' ]]; then
VMLINUX_H=
else
VMLINUX_H=${VMTEST_ROOT}/vmlinux.h
VMLINUX_H=${THISDIR}/vmlinux.h
fi
cd ${REPO_ROOT}/${REPO_PATH}
make \
CLANG=clang-${LLVM_VER} \
LLC=llc-${LLVM_VER} \
@@ -28,7 +30,8 @@ make \
VMLINUX_BTF="${VMLINUX_BTF}" \
VMLINUX_H=${VMLINUX_H} \
-C "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
-j $((2*$(nproc)))
-j $((4*$(nproc))) > /dev/null
cd -
mkdir ${LIBBPF_PATH}/selftests
cp -R "${REPO_ROOT}/${REPO_PATH}/tools/testing/selftests/bpf" \
${LIBBPF_PATH}/selftests
@@ -36,6 +39,4 @@ cd ${LIBBPF_PATH}
rm selftests/bpf/.gitignore
git add selftests
git add "${VMTEST_ROOT}/configs/blacklist"
travis_fold end prepare_selftests

View File

@@ -0,0 +1,44 @@
# $1 - start or end
# $2 - fold identifier, no spaces
# $3 - fold section description
travis_fold() {
local YELLOW='\033[1;33m'
local NOCOLOR='\033[0m'
if [ -z ${GITHUB_WORKFLOW+x} ]; then
echo travis_fold:$1:$2
if [ ! -z "${3:-}" ]; then
echo -e "${YELLOW}$3${NOCOLOR}"
fi
echo
else
if [ $1 = "start" ]; then
line="::group::$2"
if [ ! -z "${3:-}" ]; then
line="$line - ${YELLOW}$3${NOCOLOR}"
fi
else
line="::endgroup::"
fi
echo -e "$line"
fi
}
__print() {
local TITLE=""
if [[ -n $2 ]]; then
TITLE=" title=$2"
fi
echo "::$1${TITLE}::$3"
}
# $1 - title
# $2 - message
print_error() {
__print error $1 $2
}
# $1 - title
# $2 - message
print_notice() {
__print notice $1 $2
}

File diff suppressed because it is too large Load Diff

16
.github/actions/debian/action.yml vendored Normal file
View File

@@ -0,0 +1,16 @@
name: 'debian'
description: 'Build'
inputs:
target:
description: 'Run target'
required: true
runs:
using: "composite"
steps:
- run: |
source /tmp/ci_setup
bash -x $CI_ROOT/managers/debian.sh SETUP
bash -x $CI_ROOT/managers/debian.sh ${{ inputs.target }}
bash -x $CI_ROOT/managers/debian.sh CLEANUP
shell: bash

23
.github/actions/setup/action.yml vendored Normal file
View File

@@ -0,0 +1,23 @@
name: 'setup'
description: 'setup env, create /tmp/ci_setup'
runs:
using: "composite"
steps:
- id: variables
run: |
export REPO_ROOT=$GITHUB_WORKSPACE
export CI_ROOT=$REPO_ROOT/travis-ci
# this is somewhat ugly, but that is the easiest way to share this code with
# arch specific docker
echo 'echo ::group::Env setup' > /tmp/ci_setup
echo export DEBIAN_FRONTEND=noninteractive >> /tmp/ci_setup
echo sudo apt-get update >> /tmp/ci_setup
echo sudo apt-get install -y aptitude qemu-kvm zstd binutils-dev elfutils libcap-dev libelf-dev libdw-dev libguestfs-tools >> /tmp/ci_setup
echo export PROJECT_NAME='libbpf' >> /tmp/ci_setup
echo export AUTHOR_EMAIL="$(git log -1 --pretty=\"%aE\")" >> /tmp/ci_setup
echo export REPO_ROOT=$GITHUB_WORKSPACE >> /tmp/ci_setup
echo export CI_ROOT=$REPO_ROOT/travis-ci >> /tmp/ci_setup
echo export VMTEST_ROOT=$CI_ROOT/vmtest >> /tmp/ci_setup
echo 'echo ::endgroup::' >> /tmp/ci_setup
shell: bash

87
.github/actions/vmtest/action.yml vendored Normal file
View File

@@ -0,0 +1,87 @@
name: 'vmtest'
description: 'Build + run vmtest'
inputs:
kernel:
description: 'kernel version or LATEST'
required: true
default: 'LATEST'
arch:
description: 'what arch to test'
required: true
default: 'x86_64'
pahole:
description: 'pahole rev or master'
required: true
default: 'master'
runs:
using: "composite"
steps:
# setup envinronment
- name: Setup environment
uses: libbpf/ci/setup-build-env@master
with:
pahole: ${{ inputs.pahole }}
# 1. download CHECKPOINT kernel source
- name: Get checkpoint commit
shell: bash
run: |
cat CHECKPOINT-COMMIT
echo "CHECKPOINT=$(cat CHECKPOINT-COMMIT)" >> $GITHUB_ENV
- name: Get kernel source at checkpoint
uses: libbpf/ci/get-linux-source@master
with:
repo: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
rev: ${{ env.CHECKPOINT }}
dest: '${{ github.workspace }}/.kernel'
- name: Patch kernel source
uses: libbpf/ci/patch-kernel@master
with:
patches-root: '${{ github.workspace }}/travis-ci/diffs'
repo-root: '.kernel'
- name: Prepare to build BPF selftests
shell: bash
run: |
echo "::group::Prepare buidling selftest"
cd .kernel
cp ${{ github.workspace }}/travis-ci/vmtest/configs/config-latest.${{ inputs.arch }} .config
make olddefconfig && make prepare
cd -
echo "::endgroup::"
# 2. if kernel == LATEST, build kernel image from tree
- name: Build kernel image
if: ${{ inputs.kernel == 'LATEST' }}
shell: bash
run: |
echo "::group::Build Kernel Image"
cd .kernel
make -j $((4*$(nproc))) all > /dev/null
cp vmlinux ${{ github.workspace }}
cd -
echo "::endgroup::"
# else, just download prebuilt kernel image
- name: Download prebuilt kernel
if: ${{ inputs.kernel != 'LATEST' }}
uses: libbpf/ci/download-vmlinux@master
with:
kernel: ${{ inputs.kernel }}
arch: ${{ inputs.arch }}
# 3. build selftests
- name: Build BPF selftests
uses: ./.github/actions/build-selftests
with:
repo-path: '.kernel'
kernel: ${{ inputs.kernel }}
# 4. prepare rootfs
- name: prepare rootfs
uses: libbpf/ci/prepare-rootfs@master
with:
kernel: ${{ inputs.kernel }}
project-name: 'libbpf'
arch: ${{ inputs.arch }}
# 5. run selftest in QEMU
- name: Run selftests
uses: libbpf/ci/run-qemu@master
with:
img: '/tmp/root.img'
vmlinuz: 'vmlinuz'
arch: ${{ inputs.arch }}

81
.github/workflows/build.yml vendored Normal file
View File

@@ -0,0 +1,81 @@
name: libbpf-build
on:
pull_request:
push:
schedule:
- cron: '0 18 * * *'
concurrency:
group: ci-build-${{ github.head_ref }}
cancel-in-progress: true
jobs:
debian:
runs-on: ubuntu-latest
name: Debian Build (${{ matrix.name }})
strategy:
fail-fast: false
matrix:
include:
- name: default
target: RUN
- name: ASan+UBSan
target: RUN_ASAN
- name: clang
target: RUN_CLANG
- name: clang ASan+UBSan
target: RUN_CLANG_ASAN
- name: gcc-10
target: RUN_GCC10
- name: gcc-10 ASan+UBSan
target: RUN_GCC10_ASAN
steps:
- uses: actions/checkout@v2
name: Checkout
- uses: ./.github/actions/setup
name: Setup
- uses: ./.github/actions/debian
name: Build
with:
target: ${{ matrix.target }}
ubuntu:
runs-on: ubuntu-latest
name: Ubuntu Focal Build (${{ matrix.arch }})
strategy:
fail-fast: false
matrix:
include:
- arch: aarch64
- arch: ppc64le
- arch: s390x
- arch: x86
steps:
- uses: actions/checkout@v2
name: Checkout
- uses: ./.github/actions/setup
name: Pre-Setup
- run: source /tmp/ci_setup && sudo -E $CI_ROOT/managers/ubuntu.sh
if: matrix.arch == 'x86'
name: Setup
- uses: uraimo/run-on-arch-action@v2.0.5
name: Build in docker
if: matrix.arch != 'x86'
with:
distro:
ubuntu20.04
arch:
${{ matrix.arch }}
setup:
cp /tmp/ci_setup $GITHUB_WORKSPACE
dockerRunArgs: |
--volume "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}"
shell: /bin/bash
install: |
export DEBIAN_FRONTEND=noninteractive
export TZ="America/Los_Angeles"
apt-get update -y
apt-get install -y tzdata build-essential sudo
run: source ${GITHUB_WORKSPACE}/ci_setup && $CI_ROOT/managers/ubuntu.sh

40
.github/workflows/cifuzz.yml vendored Normal file
View File

@@ -0,0 +1,40 @@
---
# https://google.github.io/oss-fuzz/getting-started/continuous-integration/
name: CIFuzz
on:
push:
branches:
- master
pull_request:
branches:
- master
jobs:
Fuzzing:
runs-on: ubuntu-latest
if: github.repository == 'libbpf/libbpf'
strategy:
fail-fast: false
matrix:
sanitizer: [address, undefined, memory]
steps:
- name: Build Fuzzers (${{ matrix.sanitizer }})
id: build
uses: google/oss-fuzz/infra/cifuzz/actions/build_fuzzers@master
with:
oss-fuzz-project-name: 'libbpf'
dry-run: false
allowed-broken-targets-percentage: 0
sanitizer: ${{ matrix.sanitizer }}
- name: Run Fuzzers (${{ matrix.sanitizer }})
uses: google/oss-fuzz/infra/cifuzz/actions/run_fuzzers@master
with:
oss-fuzz-project-name: 'libbpf'
fuzz-seconds: 300
dry-run: false
sanitizer: ${{ matrix.sanitizer }}
- name: Upload Crash
uses: actions/upload-artifact@v1
if: failure() && steps.build.outcome == 'success'
with:
name: ${{ matrix.sanitizer }}-artifacts
path: ./out/artifacts

31
.github/workflows/coverity.yml vendored Normal file
View File

@@ -0,0 +1,31 @@
name: libbpf-ci-coverity
on:
schedule:
- cron: '0 18 * * *'
jobs:
coverity:
runs-on: ubuntu-latest
if: github.repository == 'libbpf/libbpf'
name: Coverity
steps:
- uses: actions/checkout@v2
- uses: ./.github/actions/setup
- name: Run coverity
run: |
echo ::group::Setup CI env
source /tmp/ci_setup
export COVERITY_SCAN_NOTIFICATION_EMAIL="${AUTHOR_EMAIL}"
export COVERITY_SCAN_BRANCH_PATTERN=${GITHUB_REF##refs/*/}
export TRAVIS_BRANCH=${COVERITY_SCAN_BRANCH_PATTERN}
echo ::endgroup::
scripts/coverity.sh
env:
COVERITY_SCAN_TOKEN: ${{ secrets.COVERITY_SCAN_TOKEN }}
COVERITY_SCAN_PROJECT_NAME: libbpf
COVERITY_SCAN_BUILD_COMMAND_PREPEND: 'cd src/'
COVERITY_SCAN_BUILD_COMMAND: 'make'
- name: SCM log
run: cat /home/runner/work/libbpf/libbpf/src/cov-int/scm_log.txt

36
.github/workflows/ondemand.yml vendored Normal file
View File

@@ -0,0 +1,36 @@
name: ondemand
on:
workflow_dispatch:
inputs:
kernel-origin:
description: 'git repo for linux kernel'
default: 'https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git'
required: true
kernel-rev:
description: 'rev/tag/branch for linux kernel'
default: "master"
required: true
pahole-origin:
description: 'git repo for pahole'
default: 'https://git.kernel.org/pub/scm/devel/pahole/pahole.git'
required: true
pahole-rev:
description: 'ref/tag/branch for pahole'
default: "master"
required: true
jobs:
vmtest:
runs-on: ubuntu-latest
name: vmtest with customized pahole/Kernel
steps:
- uses: actions/checkout@v2
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:
kernel: 'LATEST'
kernel-rev: ${{ github.event.inputs.kernel-rev }}
kernel-origin: ${{ github.event.inputs.kernel-origin }}
pahole: ${{ github.event.inputs.pahole-rev }}
pahole-origin: ${{ github.event.inputs.pahole-origin }}

20
.github/workflows/pahole.yml vendored Normal file
View File

@@ -0,0 +1,20 @@
name: pahole-staging
on:
schedule:
- cron: '0 18 * * *'
jobs:
vmtest:
runs-on: ubuntu-latest
name: Kernel LATEST + staging pahole
env:
STAGING: tmp.master
steps:
- uses: actions/checkout@v2
- uses: ./.github/actions/setup
- uses: ./.github/actions/vmtest
with:
kernel: LATEST
pahole: $STAGING

42
.github/workflows/test.yml vendored Normal file
View File

@@ -0,0 +1,42 @@
name: libbpf-ci
on:
pull_request:
push:
schedule:
- cron: '0 18 * * *'
concurrency:
group: ci-test-${{ github.head_ref }}
cancel-in-progress: true
jobs:
vmtest:
runs-on: ${{ matrix.runs_on }}
name: Kernel ${{ matrix.kernel }} on ${{ matrix.runs_on }} + selftests
strategy:
fail-fast: false
matrix:
include:
- kernel: 'LATEST'
runs_on: ubuntu-latest
arch: 'x86_64'
- kernel: '5.5.0'
runs_on: ubuntu-latest
arch: 'x86_64'
- kernel: '4.9.0'
runs_on: ubuntu-latest
arch: 'x86_64'
- kernel: 'LATEST'
runs_on: z15
arch: 's390x'
steps:
- uses: actions/checkout@v2
name: Checkout
- uses: ./.github/actions/setup
name: Setup
- uses: ./.github/actions/vmtest
name: vmtest
with:
kernel: ${{ matrix.kernel }}
arch: ${{ matrix.arch }}

17
.readthedocs.yaml Normal file
View File

@@ -0,0 +1,17 @@
# .readthedocs.yaml
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Build documentation in the docs/ directory with Sphinx
sphinx:
builder: html
configuration: docs/conf.py
# Optionally set the version of Python and requirements required to build your docs
python:
version: 3.7
install:
- requirements: docs/sphinx/requirements.txt

View File

@@ -1,6 +1,6 @@
sudo: required
language: bash
dist: bionic
dist: focal
services:
- docker
@@ -36,7 +36,12 @@ stages:
jobs:
include:
- stage: Builds & Tests
name: Kernel LATEST + selftests
name: Kernel 5.5.0 + selftests
language: bash
env: KERNEL=5.5.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel LATEST + selftests
language: bash
env: KERNEL=LATEST
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
@@ -46,11 +51,6 @@ jobs:
env: KERNEL=4.9.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Kernel 5.5.0 + selftests
language: bash
env: KERNEL=5.5.0
script: $CI_ROOT/vmtest/run_vmtest.sh || travis_terminate 1
- name: Debian Build
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
@@ -75,33 +75,33 @@ jobs:
script: $CI_ROOT/managers/debian.sh RUN_CLANG_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-8)
- name: Debian Build (gcc-10)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC8 || travis_terminate 1
script: $CI_ROOT/managers/debian.sh RUN_GCC10 || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Debian Build (gcc-8 ASan+UBSan)
- name: Debian Build (gcc-10 ASan+UBSan)
language: bash
install: $CI_ROOT/managers/debian.sh SETUP
script: $CI_ROOT/managers/debian.sh RUN_GCC8_ASAN || travis_terminate 1
script: $CI_ROOT/managers/debian.sh RUN_GCC10_ASAN || travis_terminate 1
after_script: $CI_ROOT/managers/debian.sh CLEANUP
- name: Ubuntu Bionic Build
- name: Ubuntu Focal Build
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic Build (arm)
- name: Ubuntu Focal Build (arm)
arch: arm64
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic Build (s390x)
- name: Ubuntu Focal Build (s390x)
arch: s390x
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
- name: Ubuntu Bionic Build (ppc64le)
- name: Ubuntu Focal Build (ppc64le)
arch: ppc64le
language: bash
script: sudo $CI_ROOT/managers/ubuntu.sh || travis_terminate 1
@@ -120,7 +120,7 @@ jobs:
- COVERITY_SCAN_BUILD_COMMAND_PREPEND="cd src/"
- COVERITY_SCAN_BUILD_COMMAND="make"
install:
- sudo echo 'deb-src http://archive.ubuntu.com/ubuntu/ bionic main restricted universe multiverse' >>/etc/apt/sources.list
- sudo echo 'deb-src http://archive.ubuntu.com/ubuntu/ focal main restricted universe multiverse' >>/etc/apt/sources.list
- sudo apt-get update
- sudo apt-get -y build-dep libelf-dev
- sudo apt-get install -y libelf-dev pkg-config

View File

@@ -1 +1 @@
1a3449c19407a28f7019a887cdf0d6ba2444751a
fe68195daf34d5dddacd3f93dd3eafc4beca3a0e

View File

@@ -1 +1 @@
3db1a3fa98808aa90f95ec3e0fa2fc7abf28f5c9
dc37dc617fabfb1c3a16d49f5d8cc20e9e3608ca

View File

@@ -1,3 +1,15 @@
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
BPF/libbpf usage and questions
==============================
@@ -7,6 +19,11 @@ the examples of building BPF applications with libbpf.
[libbpf-tools](https://github.com/iovisor/bcc/tree/master/libbpf-tools) are also
a good source of the real-world libbpf-based tracing tools.
See also ["BPF CO-RE reference guide"](https://nakryiko.com/posts/bpf-core-reference-guide/)
for the coverage of practical aspects of building BPF CO-RE applications and
["BPF CO-RE"](https://nakryiko.com/posts/bpf-portability-and-co-re/) for
general introduction into BPF portability issues and BPF CO-RE origins.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list. You can
subscribe to it [here](http://vger.kernel.org/vger-lists.html#bpf) and search
@@ -20,9 +37,10 @@ should be opened only for dealing with issues pertaining to specific way this
libbpf mirror repo is set up and organized.
Build
[![Build Status](https://travis-ci.com/libbpf/libbpf.svg?branch=master)](https://travis-ci.com/github/libbpf/libbpf)
[![Github Actions Builds & Tests](https://github.com/libbpf/libbpf/actions/workflows/test.yml/badge.svg)](https://github.com/libbpf/libbpf/actions/workflows/test.yml)
[![Total alerts](https://img.shields.io/lgtm/alerts/g/libbpf/libbpf.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/libbpf/libbpf/alerts/)
[![Coverity](https://img.shields.io/coverity/scan/18195.svg)](https://scan.coverity.com/projects/libbpf)
[![OSS-Fuzz Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/libbpf.svg)](https://oss-fuzz-build-logs.storage.googleapis.com/index.html#libbpf)
=====
libelf is an internal dependency of libbpf and thus it is required to link
against and must be installed on the system for applications to work.
@@ -63,7 +81,8 @@ Distributions packaging libbpf from this mirror:
- [Gentoo](https://packages.gentoo.org/packages/dev-libs/libbpf)
- [Debian](https://packages.debian.org/source/sid/libbpf)
- [Arch](https://www.archlinux.org/packages/extra/x86_64/libbpf/)
- [Ubuntu](https://packages.ubuntu.com/source/groovy/libbpf)
- [Ubuntu](https://packages.ubuntu.com/source/impish/libbpf)
- [Alpine](https://pkgs.alpinelinux.org/packages?name=libbpf)
Benefits of packaging from the mirror over packaging from kernel sources:
- Consistent versioning across distributions.
@@ -97,7 +116,9 @@ Some major Linux distributions come with kernel BTF already built in:
- RHEL 8.2+
- OpenSUSE Tumbleweed (in the next release, as of 2020-06-04)
- Arch Linux (from kernel 5.7.1.arch1-1)
- Manjaro (from kernel 5.4 if compiled after 2021-06-18)
- Ubuntu 20.10
- Debian 11 (amd64/arm64)
If your kernel doesn't come with BTF built-in, you'll need to build custom
kernel. You'll need:
@@ -119,31 +140,19 @@ distributions have Clang/LLVM 10+ packaged by default:
- Arch Linux
- Ubuntu 20.10 (LLVM 11)
- Debian 11 (LLVM 11)
- Alpine 3.13+
Otherwise, please make sure to update it on your system.
The following resources are useful to understand what BPF CO-RE is and how to
use it:
- [BPF Portability and CO-RE](https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html)
- [HOWTO: BCC to libbpf conversion](https://facebookmicrosites.github.io/bpf/blog/2020/02/20/bcc-to-libbpf-howto-guide.html)
- [BPF CO-RE reference guide](https://nakryiko.com/posts/bpf-core-reference-guide/)
- [BPF Portability and CO-RE](https://nakryiko.com/posts/bpf-portability-and-co-re/)
- [HOWTO: BCC to libbpf conversion](https://nakryiko.com/posts/bcc-to-libbpf-howto-guide/)
- [libbpf-tools in BCC repo](https://github.com/iovisor/bcc/tree/master/libbpf-tools)
contain lots of real-world tools converted from BCC to BPF CO-RE. Consider
converting some more to both contribute to the BPF community and gain some
more experience with it.
Details
=======
This is a mirror of [bpf-next Linux source
tree](https://kernel.googlesource.com/pub/scm/linux/kernel/git/bpf/bpf-next)'s
`tools/lib/bpf` directory plus its supporting header files.
All the gory details of syncing can be found in `scripts/sync-kernel.sh`
script.
Some header files in this repo (`include/linux/*.h`) are reduced versions of
their counterpart files at
[bpf-next](https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/)'s
`tools/include/linux/*.h` to make compilation successful.
License
=======

2
docs/.gitignore vendored Normal file
View File

@@ -0,0 +1,2 @@
sphinx/build
sphinx/doxygen/build

93
docs/api.rst Normal file
View File

@@ -0,0 +1,93 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
.. _api:
.. toctree:: Table of Contents
LIBBPF API
==========
Error Handling
--------------
When libbpf is used in "libbpf 1.0 mode", API functions can return errors in one of two ways.
You can set "libbpf 1.0" mode with the following line:
.. code-block::
libbpf_set_strict_mode(LIBBPF_STRICT_DIRECT_ERRS | LIBBPF_STRICT_CLEAN_PTRS);
If the function returns an error code directly, it uses 0 to indicate success
and a negative error code to indicate what caused the error. In this case the
error code should be checked directly from the return, you do not need to check
errno.
For example:
.. code-block::
err = some_libbpf_api_with_error_return(...);
if (err < 0) {
/* Handle error accordingly */
}
If the function returns a pointer, it will return NULL to indicate there was
an error. In this case errno should be checked for the error code.
For example:
.. code-block::
ptr = some_libbpf_api_returning_ptr();
if (!ptr) {
/* note no minus sign for EINVAL and E2BIG below */
if (errno == EINVAL) {
/* handle EINVAL error */
} else if (errno == E2BIG) {
/* handle E2BIG error */
}
}
libbpf.h
--------
.. doxygenfile:: libbpf.h
:project: libbpf
:sections: func define public-type enum
bpf.h
-----
.. doxygenfile:: bpf.h
:project: libbpf
:sections: func define public-type enum
btf.h
-----
.. doxygenfile:: btf.h
:project: libbpf
:sections: func define public-type enum
xsk.h
-----
.. doxygenfile:: xsk.h
:project: libbpf
:sections: func define public-type enum
bpf_tracing.h
-------------
.. doxygenfile:: bpf_tracing.h
:project: libbpf
:sections: func define public-type enum
bpf_core_read.h
---------------
.. doxygenfile:: bpf_core_read.h
:project: libbpf
:sections: func define public-type enum
bpf_endian.h
------------
.. doxygenfile:: bpf_endian.h
:project: libbpf
:sections: func define public-type enum

40
docs/conf.py Normal file
View File

@@ -0,0 +1,40 @@
#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
import os
import subprocess
project = "libbpf"
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.doctest',
'sphinx.ext.mathjax',
'sphinx.ext.viewcode',
'sphinx.ext.imgmath',
'sphinx.ext.todo',
'breathe',
]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
read_the_docs_build = os.environ.get('READTHEDOCS', None) == 'True'
if read_the_docs_build:
subprocess.call('cd sphinx ; make clean', shell=True)
subprocess.call('cd sphinx/doxygen ; doxygen', shell=True)
html_theme = 'sphinx_rtd_theme'
breathe_projects = { "libbpf": "./sphinx/doxygen/build/xml/" }
breathe_default_project = "libbpf"
breathe_show_define_initializer = True
breathe_show_enumvalue_initializer = True

22
docs/index.rst Normal file
View File

@@ -0,0 +1,22 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
libbpf
======
.. toctree::
:maxdepth: 1
libbpf_naming_convention
libbpf_build
This is documentation for libbpf, a userspace library for loading and
interacting with bpf programs.
For API documentation see the `versioned API documentation site <https://libbpf.readthedocs.io/en/latest/api.html>`_.
All general BPF questions, including kernel functionality, libbpf APIs and
their application, should be sent to bpf@vger.kernel.org mailing list.
You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the
mailing list search its `archive <https://lore.kernel.org/bpf/>`_.
Please search the archive before asking new questions. It very well might
be that this was already addressed or answered before.

37
docs/libbpf_build.rst Normal file
View File

@@ -0,0 +1,37 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
Building libbpf
===============
libelf and zlib are internal dependencies of libbpf and thus are required to link
against and must be installed on the system for applications to work.
pkg-config is used by default to find libelf, and the program called
can be overridden with PKG_CONFIG.
If using pkg-config at build time is not desired, it can be disabled by
setting NO_PKG_CONFIG=1 when calling make.
To build both static libbpf.a and shared libbpf.so:
.. code-block:: bash
$ cd src
$ make
To build only static libbpf.a library in directory build/ and install them
together with libbpf headers in a staging directory root/:
.. code-block:: bash
$ cd src
$ mkdir build root
$ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install
To build both static libbpf.a and shared libbpf.so against a custom libelf
dependency installed in /build/root/ and install them together with libbpf
headers in a build directory /build/root/:
.. code-block:: bash
$ cd src
$ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make

View File

@@ -1,7 +1,7 @@
.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
libbpf API naming convention
============================
API naming convention
=====================
libbpf API provides access to a few logically separated groups of
functions and types. Every group has its own naming convention
@@ -10,14 +10,14 @@ new function or type is added to keep libbpf API clean and consistent.
All types and functions provided by libbpf API should have one of the
following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
``perf_buffer_``.
``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
System call wrappers
--------------------
System call wrappers are simple wrappers for commands supported by
sys_bpf system call. These wrappers should go to ``bpf.h`` header file
and map one-on-one to corresponding commands.
and map one to one to corresponding commands.
For example ``bpf_map_lookup_elem`` wraps ``BPF_MAP_LOOKUP_ELEM``
command of sys_bpf, ``bpf_prog_attach`` wraps ``BPF_PROG_ATTACH``, etc.
@@ -49,10 +49,6 @@ object, ``bpf_object``, double underscore and ``open`` that defines the
purpose of the function to open ELF file and create ``bpf_object`` from
it.
Another example: ``bpf_program__load`` is named for corresponding
object, ``bpf_program``, that is separated from other part of the name
by double underscore.
All objects and corresponding functions other than BTF related should go
to ``libbpf.h``. BTF types and functions should go to ``btf.h``.
@@ -72,12 +68,8 @@ of both low-level ring access functions and high-level configuration
functions. These can be mixed and matched. Note that these functions
are not reentrant for performance reasons.
Please take a look at Documentation/networking/af_xdp.rst in the Linux
kernel source tree on how to use XDP sockets and for some common
mistakes in case you do not get any traffic up to user space.
libbpf ABI
==========
ABI
---
libbpf can be both linked statically or used as DSO. To avoid possible
conflicts with other libraries an application is linked with, all
@@ -116,7 +108,8 @@ This bump in ABI version is at most once per kernel development cycle.
For example, if current state of ``libbpf.map`` is:
.. code-block::
.. code-block:: none
LIBBPF_0.0.1 {
global:
bpf_func_a;
@@ -128,7 +121,8 @@ For example, if current state of ``libbpf.map`` is:
, and a new symbol ``bpf_func_c`` is being introduced, then
``libbpf.map`` should be changed like this:
.. code-block::
.. code-block:: none
LIBBPF_0.0.1 {
global:
bpf_func_a;
@@ -148,7 +142,7 @@ Format of version script and ways to handle ABI changes, including
incompatible ones, described in details in [1].
Stand-alone build
=================
-------------------
Under https://github.com/libbpf/libbpf there is a (semi-)automated
mirror of the mainline's version of libbpf for a stand-alone build.
@@ -156,13 +150,53 @@ mirror of the mainline's version of libbpf for a stand-alone build.
However, all changes to libbpf's code base must be upstreamed through
the mainline kernel tree.
API documentation convention
============================
The libbpf API is documented via comments above definitions in
header files. These comments can be rendered by doxygen and sphinx
for well organized html output. This section describes the
convention in which these comments should be formated.
Here is an example from btf.h:
.. code-block:: c
/**
* @brief **btf__new()** creates a new instance of a BTF object from the raw
* bytes of an ELF's BTF section
* @param data raw bytes
* @param size number of bytes passed in `data`
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
The comment must start with a block comment of the form '/\*\*'.
The documentation always starts with a @brief directive. This line is a short
description about this API. It starts with the name of the API, denoted in bold
like so: **api_name**. Please include an open and close parenthesis if this is a
function. Follow with the short description of the API. A longer form description
can be added below the last directive, at the bottom of the comment.
Parameters are denoted with the @param directive, there should be one for each
parameter. If this is a function with a non-void return, use the @return directive
to document it.
License
=======
-------------------
libbpf is dual-licensed under LGPL 2.1 and BSD 2-Clause.
Links
=====
-------------------
[1] https://www.akkadia.org/drepper/dsohowto.pdf
(Chapter 3. Maintaining APIs and ABIs).

9
docs/sphinx/Makefile Normal file
View File

@@ -0,0 +1,9 @@
SPHINXBUILD ?= sphinx-build
SOURCEDIR = ../src
BUILDDIR = build
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)"
%:
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)"

View File

@@ -0,0 +1,277 @@
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = "libbpf"
PROJECT_NUMBER =
PROJECT_BRIEF =
PROJECT_LOGO =
OUTPUT_DIRECTORY = ./build
CREATE_SUBDIRS = NO
ALLOW_UNICODE_NAMES = NO
OUTPUT_LANGUAGE = English
OUTPUT_TEXT_DIRECTION = None
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ALWAYS_DETAILED_SEC = NO
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = YES
STRIP_FROM_PATH =
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
JAVADOC_BANNER = NO
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
PYTHON_DOCSTRING = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 4
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = YES
OPTIMIZE_OUTPUT_JAVA = NO
OPTIMIZE_FOR_FORTRAN = NO
OPTIMIZE_OUTPUT_VHDL = NO
OPTIMIZE_OUTPUT_SLICE = NO
EXTENSION_MAPPING =
MARKDOWN_SUPPORT = YES
TOC_INCLUDE_HEADINGS = 5
AUTOLINK_SUPPORT = YES
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
SIP_SUPPORT = NO
IDL_PROPERTY_SUPPORT = YES
DISTRIBUTE_GROUP_DOC = NO
GROUP_NESTED_COMPOUNDS = NO
SUBGROUPING = YES
INLINE_GROUPED_CLASSES = NO
INLINE_SIMPLE_STRUCTS = NO
TYPEDEF_HIDES_STRUCT = NO
LOOKUP_CACHE_SIZE = 0
NUM_PROC_THREADS = 1
EXTRACT_ALL = NO
EXTRACT_PRIVATE = NO
EXTRACT_PRIV_VIRTUAL = NO
EXTRACT_PACKAGE = NO
EXTRACT_STATIC = NO
EXTRACT_LOCAL_CLASSES = YES
EXTRACT_LOCAL_METHODS = NO
EXTRACT_ANON_NSPACES = NO
RESOLVE_UNNAMED_PARAMS = YES
HIDE_UNDOC_MEMBERS = NO
HIDE_UNDOC_CLASSES = NO
HIDE_FRIEND_COMPOUNDS = NO
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = YES
HIDE_SCOPE_NAMES = NO
HIDE_COMPOUND_REFERENCE= NO
SHOW_INCLUDE_FILES = YES
SHOW_GROUPED_MEMB_INC = NO
FORCE_LOCAL_INCLUDES = NO
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_MEMBERS_CTORS_1ST = NO
SORT_GROUP_NAMES = NO
SORT_BY_SCOPE_NAME = NO
STRICT_PROTO_MATCHING = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = YES
GENERATE_BUGLIST = YES
GENERATE_DEPRECATEDLIST= YES
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_FILES = YES
SHOW_NAMESPACES = YES
FILE_VERSION_FILTER =
LAYOUT_FILE =
CITE_BIB_FILES =
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_AS_ERROR = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
INPUT = ../../../src
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.c \
*.h
RECURSIVE = NO
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS = ___*
EXAMPLE_PATH =
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE = YES
SOURCE_BROWSER = NO
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = NO
REFERENCES_RELATION = NO
REFERENCES_LINK_SOURCE = YES
SOURCE_TOOLTIPS = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
ALPHABETICAL_INDEX = YES
IGNORE_PREFIX =
GENERATE_HTML = NO
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_HEADER =
HTML_FOOTER =
HTML_STYLESHEET =
HTML_EXTRA_STYLESHEET =
HTML_EXTRA_FILES =
HTML_COLORSTYLE_HUE = 220
HTML_COLORSTYLE_SAT = 100
HTML_COLORSTYLE_GAMMA = 80
HTML_TIMESTAMP = NO
HTML_DYNAMIC_MENUS = YES
HTML_DYNAMIC_SECTIONS = NO
HTML_INDEX_NUM_ENTRIES = 100
GENERATE_DOCSET = NO
DOCSET_FEEDNAME = "Doxygen generated docs"
DOCSET_BUNDLE_ID = org.doxygen.Project
DOCSET_PUBLISHER_ID = org.doxygen.Publisher
DOCSET_PUBLISHER_NAME = Publisher
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
CHM_INDEX_ENCODING =
BINARY_TOC = NO
TOC_EXPAND = NO
GENERATE_QHP = NO
QCH_FILE =
QHP_NAMESPACE = org.doxygen.Project
QHP_VIRTUAL_FOLDER = doc
QHP_CUST_FILTER_NAME =
QHP_CUST_FILTER_ATTRS =
QHP_SECT_FILTER_ATTRS =
QHG_LOCATION =
GENERATE_ECLIPSEHELP = NO
ECLIPSE_DOC_ID = org.doxygen.Project
DISABLE_INDEX = NO
GENERATE_TREEVIEW = NO
ENUM_VALUES_PER_LINE = 4
TREEVIEW_WIDTH = 250
EXT_LINKS_IN_WINDOW = NO
HTML_FORMULA_FORMAT = png
FORMULA_FONTSIZE = 10
FORMULA_TRANSPARENT = YES
FORMULA_MACROFILE =
USE_MATHJAX = NO
MATHJAX_FORMAT = HTML-CSS
MATHJAX_RELPATH = https://cdn.jsdelivr.net/npm/mathjax@2
MATHJAX_EXTENSIONS =
MATHJAX_CODEFILE =
SEARCHENGINE = YES
SERVER_BASED_SEARCH = NO
EXTERNAL_SEARCH = NO
SEARCHENGINE_URL =
SEARCHDATA_FILE = searchdata.xml
EXTERNAL_SEARCH_ID =
EXTRA_SEARCH_MAPPINGS =
GENERATE_LATEX = NO
LATEX_OUTPUT = latex
LATEX_CMD_NAME =
MAKEINDEX_CMD_NAME = makeindex
LATEX_MAKEINDEX_CMD = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4
EXTRA_PACKAGES =
LATEX_HEADER =
LATEX_FOOTER =
LATEX_EXTRA_STYLESHEET =
LATEX_EXTRA_FILES =
PDF_HYPERLINKS = YES
USE_PDFLATEX = YES
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
LATEX_SOURCE_CODE = NO
LATEX_BIB_STYLE = plain
LATEX_TIMESTAMP = NO
LATEX_EMOJI_DIRECTORY =
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
RTF_SOURCE_CODE = NO
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_SUBDIR =
MAN_LINKS = NO
GENERATE_XML = YES
XML_OUTPUT = xml
XML_PROGRAMLISTING = YES
XML_NS_MEMB_FILE_SCOPE = NO
GENERATE_DOCBOOK = NO
DOCBOOK_OUTPUT = docbook
DOCBOOK_PROGRAMLISTING = NO
GENERATE_AUTOGEN_DEF = NO
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = YES
SEARCH_INCLUDES = YES
INCLUDE_PATH =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = NO
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
EXTERNAL_PAGES = YES
CLASS_DIAGRAMS = YES
DIA_PATH =
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = NO
DOT_NUM_THREADS = 0
DOT_FONTNAME = Helvetica
DOT_FONTSIZE = 10
DOT_FONTPATH =
CLASS_GRAPH = YES
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = NO
UML_LIMIT_NUM_FIELDS = 10
DOT_UML_DETAILS = NO
DOT_WRAP_THRESHOLD = 17
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
INTERACTIVE_SVG = NO
DOT_PATH =
DOTFILE_DIRS =
MSCFILE_DIRS =
DIAFILE_DIRS =
PLANTUML_JAR_PATH =
PLANTUML_CFG_FILE =
PLANTUML_INCLUDE_PATH =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES

View File

@@ -0,0 +1 @@
breathe

23
fuzz/bpf-object-fuzzer.c Normal file
View File

@@ -0,0 +1,23 @@
#include "libbpf.h"
static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
{
return 0;
}
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
struct bpf_object *obj = NULL;
DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts);
int err;
libbpf_set_print(libbpf_print_fn);
opts.object_name = "fuzz-object";
obj = bpf_object__open_mem(data, size, &opts);
err = libbpf_get_error(obj);
if (err)
return 0;
bpf_object__close(obj);
return 0;
}

Binary file not shown.

View File

@@ -13,6 +13,14 @@
.off = OFF, \
.imm = IMM })
#define BPF_ALU32_IMM(OP, DST, IMM) \
((struct bpf_insn) { \
.code = BPF_ALU | BPF_OP(OP) | BPF_K, \
.dst_reg = DST, \
.src_reg = 0, \
.off = 0, \
.imm = IMM })
#define BPF_ALU64_IMM(OP, DST, IMM) \
((struct bpf_insn) { \
.code = BPF_ALU64 | BPF_OP(OP) | BPF_K, \

View File

@@ -7,6 +7,8 @@
#include <stddef.h>
#include <stdint.h>
#include <linux/stddef.h>
#include <asm/types.h>
#include <asm/posix_types.h>

File diff suppressed because it is too large Load Diff

View File

@@ -43,7 +43,7 @@ struct btf_type {
* "size" tells the size of the type it is describing.
*
* "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
* FUNC, FUNC_PROTO and VAR.
* FUNC, FUNC_PROTO, VAR, DECL_TAG and TYPE_TAG.
* "type" is a type_id referring to another type.
*/
union {
@@ -52,28 +52,34 @@ struct btf_type {
};
};
#define BTF_INFO_KIND(info) (((info) >> 24) & 0x0f)
#define BTF_INFO_KIND(info) (((info) >> 24) & 0x1f)
#define BTF_INFO_VLEN(info) ((info) & 0xffff)
#define BTF_INFO_KFLAG(info) ((info) >> 31)
#define BTF_KIND_UNKN 0 /* Unknown */
#define BTF_KIND_INT 1 /* Integer */
#define BTF_KIND_PTR 2 /* Pointer */
#define BTF_KIND_ARRAY 3 /* Array */
#define BTF_KIND_STRUCT 4 /* Struct */
#define BTF_KIND_UNION 5 /* Union */
#define BTF_KIND_ENUM 6 /* Enumeration */
#define BTF_KIND_FWD 7 /* Forward */
#define BTF_KIND_TYPEDEF 8 /* Typedef */
#define BTF_KIND_VOLATILE 9 /* Volatile */
#define BTF_KIND_CONST 10 /* Const */
#define BTF_KIND_RESTRICT 11 /* Restrict */
#define BTF_KIND_FUNC 12 /* Function */
#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
#define BTF_KIND_VAR 14 /* Variable */
#define BTF_KIND_DATASEC 15 /* Section */
#define BTF_KIND_MAX BTF_KIND_DATASEC
#define NR_BTF_KINDS (BTF_KIND_MAX + 1)
enum {
BTF_KIND_UNKN = 0, /* Unknown */
BTF_KIND_INT = 1, /* Integer */
BTF_KIND_PTR = 2, /* Pointer */
BTF_KIND_ARRAY = 3, /* Array */
BTF_KIND_STRUCT = 4, /* Struct */
BTF_KIND_UNION = 5, /* Union */
BTF_KIND_ENUM = 6, /* Enumeration */
BTF_KIND_FWD = 7, /* Forward */
BTF_KIND_TYPEDEF = 8, /* Typedef */
BTF_KIND_VOLATILE = 9, /* Volatile */
BTF_KIND_CONST = 10, /* Const */
BTF_KIND_RESTRICT = 11, /* Restrict */
BTF_KIND_FUNC = 12, /* Function */
BTF_KIND_FUNC_PROTO = 13, /* Function Proto */
BTF_KIND_VAR = 14, /* Variable */
BTF_KIND_DATASEC = 15, /* Section */
BTF_KIND_FLOAT = 16, /* Floating point */
BTF_KIND_DECL_TAG = 17, /* Decl Tag */
BTF_KIND_TYPE_TAG = 18, /* Type Tag */
NR_BTF_KINDS,
BTF_KIND_MAX = NR_BTF_KINDS - 1,
};
/* For some specific BTF_KIND, "struct btf_type" is immediately
* followed by extra data.
@@ -169,4 +175,15 @@ struct btf_var_secinfo {
__u32 size;
};
/* BTF_KIND_DECL_TAG is followed by a single "struct btf_decl_tag" to describe
* additional information related to the tag applied location.
* If component_idx == -1, the tag is applied to a struct, union,
* variable or function. Otherwise, it is applied to a struct/union
* member or a func argument, and component_idx indicates which member
* or argument (0 ... vlen-1).
*/
struct btf_decl_tag {
__s32 component_idx;
};
#endif /* _UAPI__LINUX_BTF_H__ */

View File

@@ -7,24 +7,23 @@
/* This struct should be in sync with struct rtnl_link_stats64 */
struct rtnl_link_stats {
__u32 rx_packets; /* total packets received */
__u32 tx_packets; /* total packets transmitted */
__u32 rx_bytes; /* total bytes received */
__u32 tx_bytes; /* total bytes transmitted */
__u32 rx_errors; /* bad packets received */
__u32 tx_errors; /* packet transmit problems */
__u32 rx_dropped; /* no space in linux buffers */
__u32 tx_dropped; /* no space available in linux */
__u32 multicast; /* multicast packets received */
__u32 rx_packets;
__u32 tx_packets;
__u32 rx_bytes;
__u32 tx_bytes;
__u32 rx_errors;
__u32 tx_errors;
__u32 rx_dropped;
__u32 tx_dropped;
__u32 multicast;
__u32 collisions;
/* detailed rx_errors: */
__u32 rx_length_errors;
__u32 rx_over_errors; /* receiver ring buff overflow */
__u32 rx_crc_errors; /* recved pkt with crc error */
__u32 rx_frame_errors; /* recv'd frame alignment error */
__u32 rx_fifo_errors; /* recv'r fifo overrun */
__u32 rx_missed_errors; /* receiver missed packet */
__u32 rx_over_errors;
__u32 rx_crc_errors;
__u32 rx_frame_errors;
__u32 rx_fifo_errors;
__u32 rx_missed_errors;
/* detailed tx_errors */
__u32 tx_aborted_errors;
@@ -37,29 +36,201 @@ struct rtnl_link_stats {
__u32 rx_compressed;
__u32 tx_compressed;
__u32 rx_nohandler; /* dropped, no handler found */
__u32 rx_nohandler;
};
/* The main device statistics structure */
/**
* struct rtnl_link_stats64 - The main device statistics structure.
*
* @rx_packets: Number of good packets received by the interface.
* For hardware interfaces counts all good packets received from the device
* by the host, including packets which host had to drop at various stages
* of processing (even in the driver).
*
* @tx_packets: Number of packets successfully transmitted.
* For hardware interfaces counts packets which host was able to successfully
* hand over to the device, which does not necessarily mean that packets
* had been successfully transmitted out of the device, only that device
* acknowledged it copied them out of host memory.
*
* @rx_bytes: Number of good received bytes, corresponding to @rx_packets.
*
* For IEEE 802.3 devices should count the length of Ethernet Frames
* excluding the FCS.
*
* @tx_bytes: Number of good transmitted bytes, corresponding to @tx_packets.
*
* For IEEE 802.3 devices should count the length of Ethernet Frames
* excluding the FCS.
*
* @rx_errors: Total number of bad packets received on this network device.
* This counter must include events counted by @rx_length_errors,
* @rx_crc_errors, @rx_frame_errors and other errors not otherwise
* counted.
*
* @tx_errors: Total number of transmit problems.
* This counter must include events counter by @tx_aborted_errors,
* @tx_carrier_errors, @tx_fifo_errors, @tx_heartbeat_errors,
* @tx_window_errors and other errors not otherwise counted.
*
* @rx_dropped: Number of packets received but not processed,
* e.g. due to lack of resources or unsupported protocol.
* For hardware interfaces this counter may include packets discarded
* due to L2 address filtering but should not include packets dropped
* by the device due to buffer exhaustion which are counted separately in
* @rx_missed_errors (since procfs folds those two counters together).
*
* @tx_dropped: Number of packets dropped on their way to transmission,
* e.g. due to lack of resources.
*
* @multicast: Multicast packets received.
* For hardware interfaces this statistic is commonly calculated
* at the device level (unlike @rx_packets) and therefore may include
* packets which did not reach the host.
*
* For IEEE 802.3 devices this counter may be equivalent to:
*
* - 30.3.1.1.21 aMulticastFramesReceivedOK
*
* @collisions: Number of collisions during packet transmissions.
*
* @rx_length_errors: Number of packets dropped due to invalid length.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter should be equivalent to a sum
* of the following attributes:
*
* - 30.3.1.1.23 aInRangeLengthErrors
* - 30.3.1.1.24 aOutOfRangeLengthField
* - 30.3.1.1.25 aFrameTooLongErrors
*
* @rx_over_errors: Receiver FIFO overflow event counter.
*
* Historically the count of overflow events. Such events may be
* reported in the receive descriptors or via interrupts, and may
* not correspond one-to-one with dropped packets.
*
* The recommended interpretation for high speed interfaces is -
* number of packets dropped because they did not fit into buffers
* provided by the host, e.g. packets larger than MTU or next buffer
* in the ring was not available for a scatter transfer.
*
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* This statistics was historically used interchangeably with
* @rx_fifo_errors.
*
* This statistic corresponds to hardware events and is not commonly used
* on software devices.
*
* @rx_crc_errors: Number of packets received with a CRC error.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.6 aFrameCheckSequenceErrors
*
* @rx_frame_errors: Receiver frame alignment errors.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter should be equivalent to:
*
* - 30.3.1.1.7 aAlignmentErrors
*
* @rx_fifo_errors: Receiver FIFO error counter.
*
* Historically the count of overflow events. Those events may be
* reported in the receive descriptors or via interrupts, and may
* not correspond one-to-one with dropped packets.
*
* This statistics was used interchangeably with @rx_over_errors.
* Not recommended for use in drivers for high speed interfaces.
*
* This statistic is used on software devices, e.g. to count software
* packet queue overflow (can) or sequencing errors (GRE).
*
* @rx_missed_errors: Count of packets missed by the host.
* Folded into the "drop" counter in `/proc/net/dev`.
*
* Counts number of packets dropped by the device due to lack
* of buffer space. This usually indicates that the host interface
* is slower than the network interface, or host is not keeping up
* with the receive packet rate.
*
* This statistic corresponds to hardware events and is not used
* on software devices.
*
* @tx_aborted_errors:
* Part of aggregate "carrier" errors in `/proc/net/dev`.
* For IEEE 802.3 devices capable of half-duplex operation this counter
* must be equivalent to:
*
* - 30.3.1.1.11 aFramesAbortedDueToXSColls
*
* High speed interfaces may use this counter as a general device
* discard counter.
*
* @tx_carrier_errors: Number of frame transmission errors due to loss
* of carrier during transmission.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.13 aCarrierSenseErrors
*
* @tx_fifo_errors: Number of frame transmission errors due to device
* FIFO underrun / underflow. This condition occurs when the device
* begins transmission of a frame but is unable to deliver the
* entire frame to the transmitter in time for transmission.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* @tx_heartbeat_errors: Number of Heartbeat / SQE Test errors for
* old half-duplex Ethernet.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices possibly equivalent to:
*
* - 30.3.2.1.4 aSQETestErrors
*
* @tx_window_errors: Number of frame transmission errors due
* to late collisions (for Ethernet - after the first 64B of transmission).
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.10 aLateCollisions
*
* @rx_compressed: Number of correctly received compressed packets.
* This counters is only meaningful for interfaces which support
* packet compression (e.g. CSLIP, PPP).
*
* @tx_compressed: Number of transmitted compressed packets.
* This counters is only meaningful for interfaces which support
* packet compression (e.g. CSLIP, PPP).
*
* @rx_nohandler: Number of packets received on the interface
* but dropped by the networking stack because the device is
* not designated to receive packets (e.g. backup link in a bond).
*/
struct rtnl_link_stats64 {
__u64 rx_packets; /* total packets received */
__u64 tx_packets; /* total packets transmitted */
__u64 rx_bytes; /* total bytes received */
__u64 tx_bytes; /* total bytes transmitted */
__u64 rx_errors; /* bad packets received */
__u64 tx_errors; /* packet transmit problems */
__u64 rx_dropped; /* no space in linux buffers */
__u64 tx_dropped; /* no space available in linux */
__u64 multicast; /* multicast packets received */
__u64 rx_packets;
__u64 tx_packets;
__u64 rx_bytes;
__u64 tx_bytes;
__u64 rx_errors;
__u64 tx_errors;
__u64 rx_dropped;
__u64 tx_dropped;
__u64 multicast;
__u64 collisions;
/* detailed rx_errors: */
__u64 rx_length_errors;
__u64 rx_over_errors; /* receiver ring buff overflow */
__u64 rx_crc_errors; /* recved pkt with crc error */
__u64 rx_frame_errors; /* recv'd frame alignment error */
__u64 rx_fifo_errors; /* recv'r fifo overrun */
__u64 rx_missed_errors; /* receiver missed packet */
__u64 rx_over_errors;
__u64 rx_crc_errors;
__u64 rx_frame_errors;
__u64 rx_fifo_errors;
__u64 rx_missed_errors;
/* detailed tx_errors */
__u64 tx_aborted_errors;
@@ -71,8 +242,7 @@ struct rtnl_link_stats64 {
/* for cslip etc */
__u64 rx_compressed;
__u64 tx_compressed;
__u64 rx_nohandler; /* dropped, no handler found */
__u64 rx_nohandler;
};
/* The struct should be in sync with struct ifmap */
@@ -170,12 +340,30 @@ enum {
IFLA_PROP_LIST,
IFLA_ALT_IFNAME, /* Alternative ifname */
IFLA_PERM_ADDRESS,
IFLA_PROTO_DOWN_REASON,
/* device (sysfs) name as parent, used instead
* of IFLA_LINK where there's no parent netdev
*/
IFLA_PARENT_DEV_NAME,
IFLA_PARENT_DEV_BUS_NAME,
IFLA_GRO_MAX_SIZE,
__IFLA_MAX
};
#define IFLA_MAX (__IFLA_MAX - 1)
enum {
IFLA_PROTO_DOWN_REASON_UNSPEC,
IFLA_PROTO_DOWN_REASON_MASK, /* u32, mask for reason bits */
IFLA_PROTO_DOWN_REASON_VALUE, /* u32, reason bit value */
__IFLA_PROTO_DOWN_REASON_CNT,
IFLA_PROTO_DOWN_REASON_MAX = __IFLA_PROTO_DOWN_REASON_CNT - 1
};
/* backwards compatibility for userspace */
#ifndef __KERNEL__
#define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
@@ -230,6 +418,7 @@ enum {
IFLA_INET6_ICMP6STATS, /* statistics (icmpv6) */
IFLA_INET6_TOKEN, /* device token */
IFLA_INET6_ADDR_GEN_MODE, /* implicit address generator mode */
IFLA_INET6_RA_MTU, /* mtu carried in the RA message */
__IFLA_INET6_MAX
};
@@ -292,6 +481,7 @@ enum {
IFLA_BR_MCAST_MLD_VERSION,
IFLA_BR_VLAN_STATS_PER_PORT,
IFLA_BR_MULTI_BOOLOPT,
IFLA_BR_MCAST_QUERIER_STATE,
__IFLA_BR_MAX,
};
@@ -345,6 +535,8 @@ enum {
IFLA_BRPORT_BACKUP_PORT,
IFLA_BRPORT_MRP_RING_OPEN,
IFLA_BRPORT_MRP_IN_OPEN,
IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT,
IFLA_BRPORT_MCAST_EHT_HOSTS_CNT,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
@@ -432,6 +624,7 @@ enum macvlan_macaddr_mode {
};
#define MACVLAN_FLAG_NOPROMISC 1
#define MACVLAN_FLAG_NODST 2 /* skip dst macvlan if matching src macvlan */
/* VRF section */
enum {
@@ -596,6 +789,18 @@ enum ifla_geneve_df {
GENEVE_DF_MAX = __GENEVE_DF_END - 1,
};
/* Bareudp section */
enum {
IFLA_BAREUDP_UNSPEC,
IFLA_BAREUDP_PORT,
IFLA_BAREUDP_ETHERTYPE,
IFLA_BAREUDP_SRCPORT_MIN,
IFLA_BAREUDP_MULTIPROTO_MODE,
__IFLA_BAREUDP_MAX
};
#define IFLA_BAREUDP_MAX (__IFLA_BAREUDP_MAX - 1)
/* PPP section */
enum {
IFLA_PPP_UNSPEC,
@@ -653,6 +858,8 @@ enum {
IFLA_BOND_AD_ACTOR_SYSTEM,
IFLA_BOND_TLB_DYNAMIC_LB,
IFLA_BOND_PEER_NOTIF_DELAY,
IFLA_BOND_AD_LACP_ACTIVE,
IFLA_BOND_MISSED_MAX,
__IFLA_BOND_MAX,
};
@@ -897,7 +1104,14 @@ enum {
#define IFLA_IPOIB_MAX (__IFLA_IPOIB_MAX - 1)
/* HSR section */
/* HSR/PRP section, both uses same interface */
/* Different redundancy protocols for hsr device */
enum {
HSR_PROTOCOL_HSR,
HSR_PROTOCOL_PRP,
HSR_PROTOCOL_MAX,
};
enum {
IFLA_HSR_UNSPEC,
@@ -907,6 +1121,9 @@ enum {
IFLA_HSR_SUPERVISION_ADDR, /* Supervision frame multicast addr */
IFLA_HSR_SEQ_NR,
IFLA_HSR_VERSION, /* HSR version */
IFLA_HSR_PROTOCOL, /* Indicate different protocol than
* HSR. For example PRP.
*/
__IFLA_HSR_MAX,
};
@@ -1031,6 +1248,8 @@ enum {
#define RMNET_FLAGS_INGRESS_MAP_COMMANDS (1U << 1)
#define RMNET_FLAGS_INGRESS_MAP_CKSUMV4 (1U << 2)
#define RMNET_FLAGS_EGRESS_MAP_CKSUMV4 (1U << 3)
#define RMNET_FLAGS_INGRESS_MAP_CKSUMV5 (1U << 4)
#define RMNET_FLAGS_EGRESS_MAP_CKSUMV5 (1U << 5)
enum {
IFLA_RMNET_UNSPEC,
@@ -1046,4 +1265,14 @@ struct ifla_rmnet_flags {
__u32 mask;
};
/* MCTP section */
enum {
IFLA_MCTP_UNSPEC,
IFLA_MCTP_NET,
__IFLA_MCTP_MAX,
};
#define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
#endif /* _UAPI_LINUX_IF_LINK_H */

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,612 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef __LINUX_PKT_CLS_H
#define __LINUX_PKT_CLS_H
#include <linux/types.h>
#include <linux/pkt_sched.h>
#define TC_COOKIE_MAX_SIZE 16
/* Action attributes */
enum {
TCA_ACT_UNSPEC,
TCA_ACT_KIND,
TCA_ACT_OPTIONS,
TCA_ACT_INDEX,
TCA_ACT_STATS,
TCA_ACT_PAD,
TCA_ACT_COOKIE,
__TCA_ACT_MAX
};
#define TCA_ACT_MAX __TCA_ACT_MAX
#define TCA_OLD_COMPAT (TCA_ACT_MAX+1)
#define TCA_ACT_MAX_PRIO 32
#define TCA_ACT_BIND 1
#define TCA_ACT_NOBIND 0
#define TCA_ACT_UNBIND 1
#define TCA_ACT_NOUNBIND 0
#define TCA_ACT_REPLACE 1
#define TCA_ACT_NOREPLACE 0
#define TC_ACT_UNSPEC (-1)
#define TC_ACT_OK 0
#define TC_ACT_RECLASSIFY 1
#define TC_ACT_SHOT 2
#define TC_ACT_PIPE 3
#define TC_ACT_STOLEN 4
#define TC_ACT_QUEUED 5
#define TC_ACT_REPEAT 6
#define TC_ACT_REDIRECT 7
#define TC_ACT_TRAP 8 /* For hw path, this means "trap to cpu"
* and don't further process the frame
* in hardware. For sw path, this is
* equivalent of TC_ACT_STOLEN - drop
* the skb and act like everything
* is alright.
*/
#define TC_ACT_VALUE_MAX TC_ACT_TRAP
/* There is a special kind of actions called "extended actions",
* which need a value parameter. These have a local opcode located in
* the highest nibble, starting from 1. The rest of the bits
* are used to carry the value. These two parts together make
* a combined opcode.
*/
#define __TC_ACT_EXT_SHIFT 28
#define __TC_ACT_EXT(local) ((local) << __TC_ACT_EXT_SHIFT)
#define TC_ACT_EXT_VAL_MASK ((1 << __TC_ACT_EXT_SHIFT) - 1)
#define TC_ACT_EXT_OPCODE(combined) ((combined) & (~TC_ACT_EXT_VAL_MASK))
#define TC_ACT_EXT_CMP(combined, opcode) (TC_ACT_EXT_OPCODE(combined) == opcode)
#define TC_ACT_JUMP __TC_ACT_EXT(1)
#define TC_ACT_GOTO_CHAIN __TC_ACT_EXT(2)
#define TC_ACT_EXT_OPCODE_MAX TC_ACT_GOTO_CHAIN
/* Action type identifiers*/
enum {
TCA_ID_UNSPEC=0,
TCA_ID_POLICE=1,
/* other actions go here */
__TCA_ID_MAX=255
};
#define TCA_ID_MAX __TCA_ID_MAX
struct tc_police {
__u32 index;
int action;
#define TC_POLICE_UNSPEC TC_ACT_UNSPEC
#define TC_POLICE_OK TC_ACT_OK
#define TC_POLICE_RECLASSIFY TC_ACT_RECLASSIFY
#define TC_POLICE_SHOT TC_ACT_SHOT
#define TC_POLICE_PIPE TC_ACT_PIPE
__u32 limit;
__u32 burst;
__u32 mtu;
struct tc_ratespec rate;
struct tc_ratespec peakrate;
int refcnt;
int bindcnt;
__u32 capab;
};
struct tcf_t {
__u64 install;
__u64 lastuse;
__u64 expires;
__u64 firstuse;
};
struct tc_cnt {
int refcnt;
int bindcnt;
};
#define tc_gen \
__u32 index; \
__u32 capab; \
int action; \
int refcnt; \
int bindcnt
enum {
TCA_POLICE_UNSPEC,
TCA_POLICE_TBF,
TCA_POLICE_RATE,
TCA_POLICE_PEAKRATE,
TCA_POLICE_AVRATE,
TCA_POLICE_RESULT,
TCA_POLICE_TM,
TCA_POLICE_PAD,
__TCA_POLICE_MAX
#define TCA_POLICE_RESULT TCA_POLICE_RESULT
};
#define TCA_POLICE_MAX (__TCA_POLICE_MAX - 1)
/* tca flags definitions */
#define TCA_CLS_FLAGS_SKIP_HW (1 << 0) /* don't offload filter to HW */
#define TCA_CLS_FLAGS_SKIP_SW (1 << 1) /* don't use filter in SW */
#define TCA_CLS_FLAGS_IN_HW (1 << 2) /* filter is offloaded to HW */
#define TCA_CLS_FLAGS_NOT_IN_HW (1 << 3) /* filter isn't offloaded to HW */
#define TCA_CLS_FLAGS_VERBOSE (1 << 4) /* verbose logging */
/* U32 filters */
#define TC_U32_HTID(h) ((h)&0xFFF00000)
#define TC_U32_USERHTID(h) (TC_U32_HTID(h)>>20)
#define TC_U32_HASH(h) (((h)>>12)&0xFF)
#define TC_U32_NODE(h) ((h)&0xFFF)
#define TC_U32_KEY(h) ((h)&0xFFFFF)
#define TC_U32_UNSPEC 0
#define TC_U32_ROOT (0xFFF00000)
enum {
TCA_U32_UNSPEC,
TCA_U32_CLASSID,
TCA_U32_HASH,
TCA_U32_LINK,
TCA_U32_DIVISOR,
TCA_U32_SEL,
TCA_U32_POLICE,
TCA_U32_ACT,
TCA_U32_INDEV,
TCA_U32_PCNT,
TCA_U32_MARK,
TCA_U32_FLAGS,
TCA_U32_PAD,
__TCA_U32_MAX
};
#define TCA_U32_MAX (__TCA_U32_MAX - 1)
struct tc_u32_key {
__be32 mask;
__be32 val;
int off;
int offmask;
};
struct tc_u32_sel {
unsigned char flags;
unsigned char offshift;
unsigned char nkeys;
__be16 offmask;
__u16 off;
short offoff;
short hoff;
__be32 hmask;
struct tc_u32_key keys[0];
};
struct tc_u32_mark {
__u32 val;
__u32 mask;
__u32 success;
};
struct tc_u32_pcnt {
__u64 rcnt;
__u64 rhit;
__u64 kcnts[0];
};
/* Flags */
#define TC_U32_TERMINAL 1
#define TC_U32_OFFSET 2
#define TC_U32_VAROFFSET 4
#define TC_U32_EAT 8
#define TC_U32_MAXDEPTH 8
/* RSVP filter */
enum {
TCA_RSVP_UNSPEC,
TCA_RSVP_CLASSID,
TCA_RSVP_DST,
TCA_RSVP_SRC,
TCA_RSVP_PINFO,
TCA_RSVP_POLICE,
TCA_RSVP_ACT,
__TCA_RSVP_MAX
};
#define TCA_RSVP_MAX (__TCA_RSVP_MAX - 1 )
struct tc_rsvp_gpi {
__u32 key;
__u32 mask;
int offset;
};
struct tc_rsvp_pinfo {
struct tc_rsvp_gpi dpi;
struct tc_rsvp_gpi spi;
__u8 protocol;
__u8 tunnelid;
__u8 tunnelhdr;
__u8 pad;
};
/* ROUTE filter */
enum {
TCA_ROUTE4_UNSPEC,
TCA_ROUTE4_CLASSID,
TCA_ROUTE4_TO,
TCA_ROUTE4_FROM,
TCA_ROUTE4_IIF,
TCA_ROUTE4_POLICE,
TCA_ROUTE4_ACT,
__TCA_ROUTE4_MAX
};
#define TCA_ROUTE4_MAX (__TCA_ROUTE4_MAX - 1)
/* FW filter */
enum {
TCA_FW_UNSPEC,
TCA_FW_CLASSID,
TCA_FW_POLICE,
TCA_FW_INDEV,
TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */
TCA_FW_MASK,
__TCA_FW_MAX
};
#define TCA_FW_MAX (__TCA_FW_MAX - 1)
/* TC index filter */
enum {
TCA_TCINDEX_UNSPEC,
TCA_TCINDEX_HASH,
TCA_TCINDEX_MASK,
TCA_TCINDEX_SHIFT,
TCA_TCINDEX_FALL_THROUGH,
TCA_TCINDEX_CLASSID,
TCA_TCINDEX_POLICE,
TCA_TCINDEX_ACT,
__TCA_TCINDEX_MAX
};
#define TCA_TCINDEX_MAX (__TCA_TCINDEX_MAX - 1)
/* Flow filter */
enum {
FLOW_KEY_SRC,
FLOW_KEY_DST,
FLOW_KEY_PROTO,
FLOW_KEY_PROTO_SRC,
FLOW_KEY_PROTO_DST,
FLOW_KEY_IIF,
FLOW_KEY_PRIORITY,
FLOW_KEY_MARK,
FLOW_KEY_NFCT,
FLOW_KEY_NFCT_SRC,
FLOW_KEY_NFCT_DST,
FLOW_KEY_NFCT_PROTO_SRC,
FLOW_KEY_NFCT_PROTO_DST,
FLOW_KEY_RTCLASSID,
FLOW_KEY_SKUID,
FLOW_KEY_SKGID,
FLOW_KEY_VLAN_TAG,
FLOW_KEY_RXHASH,
__FLOW_KEY_MAX,
};
#define FLOW_KEY_MAX (__FLOW_KEY_MAX - 1)
enum {
FLOW_MODE_MAP,
FLOW_MODE_HASH,
};
enum {
TCA_FLOW_UNSPEC,
TCA_FLOW_KEYS,
TCA_FLOW_MODE,
TCA_FLOW_BASECLASS,
TCA_FLOW_RSHIFT,
TCA_FLOW_ADDEND,
TCA_FLOW_MASK,
TCA_FLOW_XOR,
TCA_FLOW_DIVISOR,
TCA_FLOW_ACT,
TCA_FLOW_POLICE,
TCA_FLOW_EMATCHES,
TCA_FLOW_PERTURB,
__TCA_FLOW_MAX
};
#define TCA_FLOW_MAX (__TCA_FLOW_MAX - 1)
/* Basic filter */
enum {
TCA_BASIC_UNSPEC,
TCA_BASIC_CLASSID,
TCA_BASIC_EMATCHES,
TCA_BASIC_ACT,
TCA_BASIC_POLICE,
__TCA_BASIC_MAX
};
#define TCA_BASIC_MAX (__TCA_BASIC_MAX - 1)
/* Cgroup classifier */
enum {
TCA_CGROUP_UNSPEC,
TCA_CGROUP_ACT,
TCA_CGROUP_POLICE,
TCA_CGROUP_EMATCHES,
__TCA_CGROUP_MAX,
};
#define TCA_CGROUP_MAX (__TCA_CGROUP_MAX - 1)
/* BPF classifier */
#define TCA_BPF_FLAG_ACT_DIRECT (1 << 0)
enum {
TCA_BPF_UNSPEC,
TCA_BPF_ACT,
TCA_BPF_POLICE,
TCA_BPF_CLASSID,
TCA_BPF_OPS_LEN,
TCA_BPF_OPS,
TCA_BPF_FD,
TCA_BPF_NAME,
TCA_BPF_FLAGS,
TCA_BPF_FLAGS_GEN,
TCA_BPF_TAG,
TCA_BPF_ID,
__TCA_BPF_MAX,
};
#define TCA_BPF_MAX (__TCA_BPF_MAX - 1)
/* Flower classifier */
enum {
TCA_FLOWER_UNSPEC,
TCA_FLOWER_CLASSID,
TCA_FLOWER_INDEV,
TCA_FLOWER_ACT,
TCA_FLOWER_KEY_ETH_DST, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_DST_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_SRC, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_SRC_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_IP_PROTO, /* u8 */
TCA_FLOWER_KEY_IPV4_SRC, /* be32 */
TCA_FLOWER_KEY_IPV4_SRC_MASK, /* be32 */
TCA_FLOWER_KEY_IPV4_DST, /* be32 */
TCA_FLOWER_KEY_IPV4_DST_MASK, /* be32 */
TCA_FLOWER_KEY_IPV6_SRC, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_SRC_MASK, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_DST, /* struct in6_addr */
TCA_FLOWER_KEY_IPV6_DST_MASK, /* struct in6_addr */
TCA_FLOWER_KEY_TCP_SRC, /* be16 */
TCA_FLOWER_KEY_TCP_DST, /* be16 */
TCA_FLOWER_KEY_UDP_SRC, /* be16 */
TCA_FLOWER_KEY_UDP_DST, /* be16 */
TCA_FLOWER_FLAGS,
TCA_FLOWER_KEY_VLAN_ID, /* be16 */
TCA_FLOWER_KEY_VLAN_PRIO, /* u8 */
TCA_FLOWER_KEY_VLAN_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_ENC_KEY_ID, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_SRC, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_SRC_MASK,/* be32 */
TCA_FLOWER_KEY_ENC_IPV4_DST, /* be32 */
TCA_FLOWER_KEY_ENC_IPV4_DST_MASK,/* be32 */
TCA_FLOWER_KEY_ENC_IPV6_SRC, /* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_SRC_MASK,/* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_DST, /* struct in6_addr */
TCA_FLOWER_KEY_ENC_IPV6_DST_MASK,/* struct in6_addr */
TCA_FLOWER_KEY_TCP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_TCP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_UDP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_UDP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_SRC_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_DST_MASK, /* be16 */
TCA_FLOWER_KEY_SCTP_SRC, /* be16 */
TCA_FLOWER_KEY_SCTP_DST, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_SRC_PORT, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_SRC_PORT_MASK, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT, /* be16 */
TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK, /* be16 */
TCA_FLOWER_KEY_FLAGS, /* be32 */
TCA_FLOWER_KEY_FLAGS_MASK, /* be32 */
TCA_FLOWER_KEY_ICMPV4_CODE, /* u8 */
TCA_FLOWER_KEY_ICMPV4_CODE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV4_TYPE, /* u8 */
TCA_FLOWER_KEY_ICMPV4_TYPE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV6_CODE, /* u8 */
TCA_FLOWER_KEY_ICMPV6_CODE_MASK,/* u8 */
TCA_FLOWER_KEY_ICMPV6_TYPE, /* u8 */
TCA_FLOWER_KEY_ICMPV6_TYPE_MASK,/* u8 */
TCA_FLOWER_KEY_ARP_SIP, /* be32 */
TCA_FLOWER_KEY_ARP_SIP_MASK, /* be32 */
TCA_FLOWER_KEY_ARP_TIP, /* be32 */
TCA_FLOWER_KEY_ARP_TIP_MASK, /* be32 */
TCA_FLOWER_KEY_ARP_OP, /* u8 */
TCA_FLOWER_KEY_ARP_OP_MASK, /* u8 */
TCA_FLOWER_KEY_ARP_SHA, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_SHA_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_THA, /* ETH_ALEN */
TCA_FLOWER_KEY_ARP_THA_MASK, /* ETH_ALEN */
TCA_FLOWER_KEY_MPLS_TTL, /* u8 - 8 bits */
TCA_FLOWER_KEY_MPLS_BOS, /* u8 - 1 bit */
TCA_FLOWER_KEY_MPLS_TC, /* u8 - 3 bits */
TCA_FLOWER_KEY_MPLS_LABEL, /* be32 - 20 bits */
TCA_FLOWER_KEY_TCP_FLAGS, /* be16 */
TCA_FLOWER_KEY_TCP_FLAGS_MASK, /* be16 */
TCA_FLOWER_KEY_IP_TOS, /* u8 */
TCA_FLOWER_KEY_IP_TOS_MASK, /* u8 */
TCA_FLOWER_KEY_IP_TTL, /* u8 */
TCA_FLOWER_KEY_IP_TTL_MASK, /* u8 */
TCA_FLOWER_KEY_CVLAN_ID, /* be16 */
TCA_FLOWER_KEY_CVLAN_PRIO, /* u8 */
TCA_FLOWER_KEY_CVLAN_ETH_TYPE, /* be16 */
TCA_FLOWER_KEY_ENC_IP_TOS, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TOS_MASK, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TTL, /* u8 */
TCA_FLOWER_KEY_ENC_IP_TTL_MASK, /* u8 */
TCA_FLOWER_KEY_ENC_OPTS,
TCA_FLOWER_KEY_ENC_OPTS_MASK,
TCA_FLOWER_IN_HW_COUNT,
__TCA_FLOWER_MAX,
};
#define TCA_FLOWER_MAX (__TCA_FLOWER_MAX - 1)
enum {
TCA_FLOWER_KEY_ENC_OPTS_UNSPEC,
TCA_FLOWER_KEY_ENC_OPTS_GENEVE, /* Nested
* TCA_FLOWER_KEY_ENC_OPT_GENEVE_
* attributes
*/
__TCA_FLOWER_KEY_ENC_OPTS_MAX,
};
#define TCA_FLOWER_KEY_ENC_OPTS_MAX (__TCA_FLOWER_KEY_ENC_OPTS_MAX - 1)
enum {
TCA_FLOWER_KEY_ENC_OPT_GENEVE_UNSPEC,
TCA_FLOWER_KEY_ENC_OPT_GENEVE_CLASS, /* u16 */
TCA_FLOWER_KEY_ENC_OPT_GENEVE_TYPE, /* u8 */
TCA_FLOWER_KEY_ENC_OPT_GENEVE_DATA, /* 4 to 128 bytes */
__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX,
};
#define TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX \
(__TCA_FLOWER_KEY_ENC_OPT_GENEVE_MAX - 1)
enum {
TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0),
TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),
};
/* Match-all classifier */
enum {
TCA_MATCHALL_UNSPEC,
TCA_MATCHALL_CLASSID,
TCA_MATCHALL_ACT,
TCA_MATCHALL_FLAGS,
__TCA_MATCHALL_MAX,
};
#define TCA_MATCHALL_MAX (__TCA_MATCHALL_MAX - 1)
/* Extended Matches */
struct tcf_ematch_tree_hdr {
__u16 nmatches;
__u16 progid;
};
enum {
TCA_EMATCH_TREE_UNSPEC,
TCA_EMATCH_TREE_HDR,
TCA_EMATCH_TREE_LIST,
__TCA_EMATCH_TREE_MAX
};
#define TCA_EMATCH_TREE_MAX (__TCA_EMATCH_TREE_MAX - 1)
struct tcf_ematch_hdr {
__u16 matchid;
__u16 kind;
__u16 flags;
__u16 pad; /* currently unused */
};
/* 0 1
* 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
* +-----------------------+-+-+---+
* | Unused |S|I| R |
* +-----------------------+-+-+---+
*
* R(2) ::= relation to next ematch
* where: 0 0 END (last ematch)
* 0 1 AND
* 1 0 OR
* 1 1 Unused (invalid)
* I(1) ::= invert result
* S(1) ::= simple payload
*/
#define TCF_EM_REL_END 0
#define TCF_EM_REL_AND (1<<0)
#define TCF_EM_REL_OR (1<<1)
#define TCF_EM_INVERT (1<<2)
#define TCF_EM_SIMPLE (1<<3)
#define TCF_EM_REL_MASK 3
#define TCF_EM_REL_VALID(v) (((v) & TCF_EM_REL_MASK) != TCF_EM_REL_MASK)
enum {
TCF_LAYER_LINK,
TCF_LAYER_NETWORK,
TCF_LAYER_TRANSPORT,
__TCF_LAYER_MAX
};
#define TCF_LAYER_MAX (__TCF_LAYER_MAX - 1)
/* Ematch type assignments
* 1..32767 Reserved for ematches inside kernel tree
* 32768..65535 Free to use, not reliable
*/
#define TCF_EM_CONTAINER 0
#define TCF_EM_CMP 1
#define TCF_EM_NBYTE 2
#define TCF_EM_U32 3
#define TCF_EM_META 4
#define TCF_EM_TEXT 5
#define TCF_EM_VLAN 6
#define TCF_EM_CANID 7
#define TCF_EM_IPSET 8
#define TCF_EM_IPT 9
#define TCF_EM_MAX 9
enum {
TCF_EM_PROG_TC
};
enum {
TCF_EM_OPND_EQ,
TCF_EM_OPND_GT,
TCF_EM_OPND_LT
};
#endif

File diff suppressed because it is too large Load Diff

58
scripts/build-fuzzers.sh Executable file
View File

@@ -0,0 +1,58 @@
#!/bin/bash
set -eux
SANITIZER=${SANITIZER:-address}
flags="-O1 -fno-omit-frame-pointer -g -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION -fsanitize=$SANITIZER -fsanitize=fuzzer-no-link"
export CC=${CC:-clang}
export CFLAGS=${CFLAGS:-$flags}
export CXX=${CXX:-clang++}
export CXXFLAGS=${CXXFLAGS:-$flags}
cd "$(dirname -- "$0")/.."
export OUT=${OUT:-"$(pwd)/out"}
mkdir -p "$OUT"
export LIB_FUZZING_ENGINE=${LIB_FUZZING_ENGINE:--fsanitize=fuzzer}
# Ideally libbelf should be built using release tarballs available
# at https://sourceware.org/elfutils/ftp/. Unfortunately sometimes they
# fail to compile (for example, elfutils-0.185 fails to compile with LDFLAGS enabled
# due to https://bugs.gentoo.org/794601) so let's just point the script to
# commits referring to versions of libelf that actually can be built
rm -rf elfutils
git clone git://sourceware.org/git/elfutils.git
(
cd elfutils
git checkout 983e86fd89e8bf02f2d27ba5dce5bf078af4ceda
git log --oneline -1
# ASan isn't compatible with -Wl,--no-undefined: https://github.com/google/sanitizers/issues/380
find -name Makefile.am | xargs sed -i 's/,--no-undefined//'
# ASan isn't compatible with -Wl,-z,defs either:
# https://clang.llvm.org/docs/AddressSanitizer.html#usage
sed -i 's/^\(ZDEFS_LDFLAGS=\).*/\1/' configure.ac
autoreconf -i -f
if ! ./configure --enable-maintainer-mode --disable-debuginfod --disable-libdebuginfod \
CC="$CC" CFLAGS="-Wno-error $CFLAGS" CXX="$CXX" CXXFLAGS="-Wno-error $CXXFLAGS" LDFLAGS="$CFLAGS"; then
cat config.log
exit 1
fi
make -C config -j$(nproc) V=1
make -C lib -j$(nproc) V=1
make -C libelf -j$(nproc) V=1
)
make -C src BUILD_STATIC_ONLY=y V=1 clean
make -C src -j$(nproc) CFLAGS="-I$(pwd)/elfutils/libelf $CFLAGS" BUILD_STATIC_ONLY=y V=1
$CC $CFLAGS -Isrc -Iinclude -Iinclude/uapi -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -c fuzz/bpf-object-fuzzer.c -o bpf-object-fuzzer.o
$CXX $CXXFLAGS $LIB_FUZZING_ENGINE bpf-object-fuzzer.o src/libbpf.a "$(pwd)/elfutils/libelf/libelf.a" -l:libz.a -o "$OUT/bpf-object-fuzzer"
cp fuzz/bpf-object-fuzzer_seed_corpus.zip "$OUT"

View File

@@ -17,7 +17,7 @@ BPF_BRANCH=${3-""}
BASELINE_COMMIT=${BPF_NEXT_BASELINE:-$(cat ${LIBBPF_REPO}/CHECKPOINT-COMMIT)}
BPF_BASELINE_COMMIT=${BPF_BASELINE:-$(cat ${LIBBPF_REPO}/BPF-CHECKPOINT-COMMIT)}
if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ] || [ -z "${BPF_BRANCH}" ]; then
if [ -z "${LIBBPF_REPO}" ] || [ -z "${LINUX_REPO}" ]; then
echo "Error: libbpf or linux repos are not specified"
usage
fi
@@ -45,11 +45,15 @@ PATH_MAP=( \
[tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h \
[tools/include/uapi/linux/if_xdp.h]=include/uapi/linux/if_xdp.h \
[tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h \
[tools/include/uapi/linux/pkt_cls.h]=include/uapi/linux/pkt_cls.h \
[tools/include/uapi/linux/pkt_sched.h]=include/uapi/linux/pkt_sched.h \
[include/uapi/linux/perf_event.h]=include/uapi/linux/perf_event.h \
[Documentation/bpf/libbpf]=docs \
)
LIBBPF_PATHS="${!PATH_MAP[@]} :^tools/lib/bpf/Makefile :^tools/lib/bpf/Build :^tools/lib/bpf/.gitignore :^tools/include/tools/libc_compat.h"
LIBBPF_VIEW_PATHS="${PATH_MAP[@]}"
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$'
LIBBPF_VIEW_EXCLUDE_REGEX='^src/(Makefile|Build|test_libbpf\.c|bpf_helper_defs\.h|\.gitignore)$|^docs/(\.gitignore|api\.rst|conf\.py)$|^docs/sphinx/.*'
LINUX_VIEW_EXCLUDE_REGEX='^include/tools/libc_compat.h$'
LIBBPF_TREE_FILTER="mkdir -p __libbpf/include/uapi/linux __libbpf/include/tools && "$'\\\n'
@@ -71,7 +75,7 @@ commit_desc()
}
# Create commit single-line signature, which consists of:
# - full commit hash
# - full commit subject
# - author date in ISO8601 format
# - full commit body with newlines replaced with vertical bars (|)
# - shortstat appended at the end
@@ -260,18 +264,21 @@ cd_to ${LIBBPF_REPO}
git checkout -b ${LIBBPF_SYNC_TAG}
for patch in $(ls -1 ${TMP_DIR}/patches | tail -n +2); do
if ! git am --3way --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then
read -p "Applying ${TMP_DIR}/patches/${patch} failed, please resolve manually and press <return> to proceed..."
if ! git am -3 --committer-date-is-author-date "${TMP_DIR}/patches/${patch}"; then
if ! patch -p1 --merge < "${TMP_DIR}/patches/${patch}"; then
read -p "Applying ${TMP_DIR}/patches/${patch} failed, please resolve manually and press <return> to proceed..."
fi
git am --continue
fi
done
# Generate bpf_helper_defs.h and commit, if anything changed
# restore Linux tip to use bpf_helpers_doc.py
# restore Linux tip to use bpf_doc.py
cd_to ${LINUX_REPO}
git checkout ${TIP_TAG}
# re-generate bpf_helper_defs.h
cd_to ${LIBBPF_REPO}
"${LINUX_ABS_DIR}/scripts/bpf_helpers_doc.py" --header \
"${LINUX_ABS_DIR}/scripts/bpf_doc.py" --header \
--file include/uapi/linux/bpf.h > src/bpf_helper_defs.h
# if anything changed, commit it
helpers_changes=$(git status --porcelain src/bpf_helper_defs.h | wc -l)

View File

@@ -5,7 +5,7 @@ ifeq ($(V),1)
msg =
else
Q = @
msg = @printf ' %-8s %s%s\n' "$(1)" "$(notdir $(2))" "$(if $(3), $(3))";
msg = @printf ' %-8s %s%s\n' "$(1)" "$(2)" "$(if $(3), $(3))";
endif
LIBBPF_VERSION := $(shell \
@@ -20,7 +20,7 @@ ALL_CFLAGS := $(INCLUDES)
SHARED_CFLAGS += -fPIC -fvisibility=hidden -DSHARED
CFLAGS ?= -g -O2 -Werror -Wall
CFLAGS ?= -g -O2 -Werror -Wall -std=gnu89
ALL_CFLAGS += $(CFLAGS) -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64
ALL_LDFLAGS += $(LDFLAGS)
ifdef NO_PKG_CONFIG
@@ -36,7 +36,8 @@ SHARED_OBJDIR := $(OBJDIR)/sharedobjs
STATIC_OBJDIR := $(OBJDIR)/staticobjs
OBJS := bpf.o btf.o libbpf.o libbpf_errno.o netlink.o \
nlattr.o str_error.o libbpf_probes.o bpf_prog_linfo.o xsk.o \
btf_dump.o hashmap.o ringbuf.o
btf_dump.o hashmap.o ringbuf.o strset.o linker.o gen_loader.o \
relo_core.o
SHARED_OBJS := $(addprefix $(SHARED_OBJDIR)/,$(OBJS))
STATIC_OBJS := $(addprefix $(STATIC_OBJDIR)/,$(OBJS))
@@ -48,9 +49,9 @@ ifndef BUILD_STATIC_ONLY
VERSION_SCRIPT := libbpf.map
endif
HEADERS := bpf.h libbpf.h btf.h xsk.h libbpf_util.h \
HEADERS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \
bpf_helpers.h bpf_helper_defs.h bpf_tracing.h \
bpf_endian.h bpf_core_read.h libbpf_common.h
bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h
UAPI_HEADERS := $(addprefix $(TOPDIR)/include/uapi/linux/,\
bpf.h bpf_common.h btf.h)
@@ -60,7 +61,7 @@ INSTALL = install
DESTDIR ?=
ifeq ($(shell uname -m),x86_64)
ifeq ($(filter-out %64 %64be %64eb %64le %64el s390x, $(shell uname -m)),)
LIBSUBDIR := lib64
else
LIBSUBDIR := lib
@@ -121,7 +122,7 @@ define do_install
$(Q)if [ ! -d '$(DESTDIR)$2' ]; then \
$(INSTALL) -d -m 755 '$(DESTDIR)$2'; \
fi;
$(Q)$(INSTALL) $1 $(if $3,-m $3,) '$(DESTDIR)$2'
$(Q)$(INSTALL) $(if $3,-m $3,) $1 '$(DESTDIR)$2'
endef
# Preserve symlinks at installation.
@@ -130,7 +131,7 @@ define do_s_install
$(Q)if [ ! -d '$(DESTDIR)$2' ]; then \
$(INSTALL) -d -m 755 '$(DESTDIR)$2'; \
fi;
$(Q)cp -fpR $1 '$(DESTDIR)$2'
$(Q)cp -fR $1 '$(DESTDIR)$2'
endef
install: all install_headers install_pkgconfig

694
src/bpf.c

File diff suppressed because it is too large Load Diff

264
src/bpf.h
View File

@@ -29,11 +29,38 @@
#include <stdint.h>
#include "libbpf_common.h"
#include "libbpf_legacy.h"
#ifdef __cplusplus
extern "C" {
#endif
int libbpf_set_memlock_rlim(size_t memlock_bytes);
struct bpf_map_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
__u32 btf_fd;
__u32 btf_key_type_id;
__u32 btf_value_type_id;
__u32 btf_vmlinux_value_type_id;
__u32 inner_map_fd;
__u32 map_flags;
__u64 map_extra;
__u32 numa_node;
__u32 map_ifindex;
};
#define bpf_map_create_opts__last_field map_ifindex
LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
const char *map_name,
__u32 key_size,
__u32 value_size,
__u32 max_entries,
const struct bpf_map_create_opts *opts);
struct bpf_create_map_attr {
const char *name;
enum bpf_map_type map_type;
@@ -52,25 +79,95 @@ struct bpf_create_map_attr {
};
};
LIBBPF_API int
bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_xattr(const struct bpf_create_map_attr *create_attr);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_node(enum bpf_map_type map_type, const char *name,
int key_size, int value_size,
int max_entries, __u32 map_flags, int node);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_name(enum bpf_map_type map_type, const char *name,
int key_size, int value_size,
int max_entries, __u32 map_flags);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map(enum bpf_map_type map_type, int key_size,
int value_size, int max_entries, __u32 map_flags);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_in_map_node(enum bpf_map_type map_type,
const char *name, int key_size,
int inner_map_fd, int max_entries,
__u32 map_flags, int node);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_map_create() instead")
LIBBPF_API int bpf_create_map_in_map(enum bpf_map_type map_type,
const char *name, int key_size,
int inner_map_fd, int max_entries,
__u32 map_flags);
struct bpf_prog_load_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
/* libbpf can retry BPF_PROG_LOAD command if bpf() syscall returns
* -EAGAIN. This field determines how many attempts libbpf has to
* make. If not specified, libbpf will use default value of 5.
*/
int attempts;
enum bpf_attach_type expected_attach_type;
__u32 prog_btf_fd;
__u32 prog_flags;
__u32 prog_ifindex;
__u32 kern_version;
__u32 attach_btf_id;
__u32 attach_prog_fd;
__u32 attach_btf_obj_fd;
const int *fd_array;
/* .BTF.ext func info data */
const void *func_info;
__u32 func_info_cnt;
__u32 func_info_rec_size;
/* .BTF.ext line info data */
const void *line_info;
__u32 line_info_cnt;
__u32 line_info_rec_size;
/* verifier log options */
__u32 log_level;
__u32 log_size;
char *log_buf;
};
#define bpf_prog_load_opts__last_field log_buf
LIBBPF_API int bpf_prog_load(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts);
/* this "specialization" should go away in libbpf 1.0 */
LIBBPF_API int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
const char *prog_name, const char *license,
const struct bpf_insn *insns, size_t insn_cnt,
const struct bpf_prog_load_opts *opts);
/* This is an elaborate way to not conflict with deprecated bpf_prog_load()
* API, defined in libbpf.h. Once we hit libbpf 1.0, all this will be gone.
* With this approach, if someone is calling bpf_prog_load() with
* 4 arguments, they will use the deprecated API, which keeps backwards
* compatibility (both source code and binary). If bpf_prog_load() is called
* with 6 arguments, though, it gets redirected to __bpf_prog_load.
* So looking forward to libbpf 1.0 when this hack will be gone and
* __bpf_prog_load() will be called just bpf_prog_load().
*/
#ifndef bpf_prog_load
#define bpf_prog_load(...) ___libbpf_overload(___bpf_prog_load, __VA_ARGS__)
#define ___bpf_prog_load4(file, type, pobj, prog_fd) \
bpf_prog_load_deprecated(file, type, pobj, prog_fd)
#define ___bpf_prog_load6(prog_type, prog_name, license, insns, insn_cnt, opts) \
bpf_prog_load(prog_type, prog_name, license, insns, insn_cnt, opts)
#endif /* bpf_prog_load */
struct bpf_load_program_attr {
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
@@ -100,15 +197,18 @@ struct bpf_load_program_attr {
/* Flags to direct loading requirements */
#define MAPS_RELAX_COMPAT 0x01
/* Recommend log buffer size */
/* Recommended log buffer size */
#define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
LIBBPF_API int
bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
char *log_buf, size_t log_buf_sz);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_load_program(enum bpf_prog_type type,
const struct bpf_insn *insns, size_t insns_cnt,
const char *license, __u32 kern_version,
char *log_buf, size_t log_buf_sz);
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,
const struct bpf_insn *insns,
size_t insns_cnt, __u32 prog_flags,
@@ -116,6 +216,23 @@ LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,
char *log_buf, size_t log_buf_sz,
int log_level);
struct bpf_btf_load_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
/* kernel log options */
char *log_buf;
__u32 log_level;
__u32 log_size;
};
#define bpf_btf_load_opts__last_field log_size
LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
const struct bpf_btf_load_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_btf_load() instead")
LIBBPF_API int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf,
__u32 log_buf_size, bool do_log);
LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,
__u64 flags);
@@ -124,6 +241,8 @@ LIBBPF_API int bpf_map_lookup_elem_flags(int fd, const void *key, void *value,
__u64 flags);
LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key,
void *value);
LIBBPF_API int bpf_map_lookup_and_delete_elem_flags(int fd, const void *key,
void *value, __u64 flags);
LIBBPF_API int bpf_map_delete_elem(int fd, const void *key);
LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key);
LIBBPF_API int bpf_map_freeze(int fd);
@@ -135,17 +254,128 @@ struct bpf_map_batch_opts {
};
#define bpf_map_batch_opts__last_field flags
LIBBPF_API int bpf_map_delete_batch(int fd, void *keys,
/**
* @brief **bpf_map_delete_batch()** allows for batch deletion of multiple
* elements in a BPF map.
*
* @param fd BPF map file descriptor
* @param keys pointer to an array of *count* keys
* @param count input and output parameter; on input **count** represents the
* number of elements in the map to delete in batch;
* on output if a non-EFAULT error is returned, **count** represents the number of deleted
* elements if the output **count** value is not equal to the input **count** value
* If EFAULT is returned, **count** should not be trusted to be correct.
* @param opts options for configuring the way the batch deletion works
* @return 0, on success; negative error code, otherwise (errno is also set to
* the error code)
*/
LIBBPF_API int bpf_map_delete_batch(int fd, const void *keys,
__u32 *count,
const struct bpf_map_batch_opts *opts);
/**
* @brief **bpf_map_lookup_batch()** allows for batch lookup of BPF map elements.
*
* The parameter *in_batch* is the address of the first element in the batch to read.
* *out_batch* is an output parameter that should be passed as *in_batch* to subsequent
* calls to **bpf_map_lookup_batch()**. NULL can be passed for *in_batch* to indicate
* that the batched lookup starts from the beginning of the map.
*
* The *keys* and *values* are output parameters which must point to memory large enough to
* hold *count* items based on the key and value size of the map *map_fd*. The *keys*
* buffer must be of *key_size* * *count*. The *values* buffer must be of
* *value_size* * *count*.
*
* @param fd BPF map file descriptor
* @param in_batch address of the first element in batch to read, can pass NULL to
* indicate that the batched lookup starts from the beginning of the map.
* @param out_batch output parameter that should be passed to next call as *in_batch*
* @param keys pointer to an array large enough for *count* keys
* @param values pointer to an array large enough for *count* values
* @param count input and output parameter; on input it's the number of elements
* in the map to read in batch; on output it's the number of elements that were
* successfully read.
* If a non-EFAULT error is returned, count will be set as the number of elements
* that were read before the error occurred.
* If EFAULT is returned, **count** should not be trusted to be correct.
* @param opts options for configuring the way the batch lookup works
* @return 0, on success; negative error code, otherwise (errno is also set to
* the error code)
*/
LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch,
void *keys, void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
/**
* @brief **bpf_map_lookup_and_delete_batch()** allows for batch lookup and deletion
* of BPF map elements where each element is deleted after being retrieved.
*
* @param fd BPF map file descriptor
* @param in_batch address of the first element in batch to read, can pass NULL to
* get address of the first element in *out_batch*
* @param out_batch output parameter that should be passed to next call as *in_batch*
* @param keys pointer to an array of *count* keys
* @param values pointer to an array large enough for *count* values
* @param count input and output parameter; on input it's the number of elements
* in the map to read and delete in batch; on output it represents the number of
* elements that were successfully read and deleted
* If a non-**EFAULT** error code is returned and if the output **count** value
* is not equal to the input **count** value, up to **count** elements may
* have been deleted.
* if **EFAULT** is returned up to *count* elements may have been deleted without
* being returned via the *keys* and *values* output parameters.
* @param opts options for configuring the way the batch lookup and delete works
* @return 0, on success; negative error code, otherwise (errno is also set to
* the error code)
*/
LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
void *out_batch, void *keys,
void *values, __u32 *count,
const struct bpf_map_batch_opts *opts);
LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values,
/**
* @brief **bpf_map_update_batch()** updates multiple elements in a map
* by specifying keys and their corresponding values.
*
* The *keys* and *values* parameters must point to memory large enough
* to hold *count* items based on the key and value size of the map.
*
* The *opts* parameter can be used to control how *bpf_map_update_batch()*
* should handle keys that either do or do not already exist in the map.
* In particular the *flags* parameter of *bpf_map_batch_opts* can be
* one of the following:
*
* Note that *count* is an input and output parameter, where on output it
* represents how many elements were successfully updated. Also note that if
* **EFAULT** then *count* should not be trusted to be correct.
*
* **BPF_ANY**
* Create new elements or update existing.
*
* **BPF_NOEXIST**
* Create new elements only if they do not exist.
*
* **BPF_EXIST**
* Update existing elements.
*
* **BPF_F_LOCK**
* Update spin_lock-ed map elements. This must be
* specified if the map value contains a spinlock.
*
* @param fd BPF map file descriptor
* @param keys pointer to an array of *count* keys
* @param values pointer to an array of *count* values
* @param count input and output parameter; on input it's the number of elements
* in the map to update in batch; on output if a non-EFAULT error is returned,
* **count** represents the number of updated elements if the output **count**
* value is not equal to the input **count** value.
* If EFAULT is returned, **count** should not be trusted to be correct.
* @param opts options for configuring the way the batch update works
* @return 0, on success; negative error code, otherwise (errno is also set to
* the error code)
*/
LIBBPF_API int bpf_map_update_batch(int fd, const void *keys, const void *values,
__u32 *count,
const struct bpf_map_batch_opts *opts);
@@ -161,6 +391,10 @@ struct bpf_prog_attach_opts {
LIBBPF_API int bpf_prog_attach(int prog_fd, int attachable_fd,
enum bpf_attach_type type, unsigned int flags);
LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_prog_attach_opts() instead")
LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
enum bpf_attach_type type,
const struct bpf_prog_attach_opts *opts);
@@ -175,8 +409,14 @@ struct bpf_link_create_opts {
union bpf_iter_link_info *iter_info;
__u32 iter_info_len;
__u32 target_btf_id;
union {
struct {
__u64 bpf_cookie;
} perf_event;
};
size_t :0;
};
#define bpf_link_create_opts__last_field target_btf_id
#define bpf_link_create_opts__last_field perf_event
LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type,
@@ -213,12 +453,14 @@ struct bpf_prog_test_run_attr {
* out: length of cxt_out */
};
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr);
/*
* bpf_prog_test_run does not check that data_out is large enough. Consider
* using bpf_prog_test_run_xattr instead.
* using bpf_prog_test_run_opts instead.
*/
LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
__u32 size, void *data_out, __u32 *size_out,
__u32 *retval, __u32 *duration);
@@ -235,8 +477,6 @@ LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
__u32 query_flags, __u32 *attach_flags,
__u32 *prog_ids, __u32 *prog_cnt);
LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
LIBBPF_API int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf,
__u32 log_buf_size, bool do_log);
LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
__u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
__u64 *probe_offset, __u64 *probe_addr);

View File

@@ -40,7 +40,7 @@ enum bpf_enum_value_kind {
#define __CORE_RELO(src, field, info) \
__builtin_preserve_field_info((src)->field, BPF_FIELD_##info)
#if __BYTE_ORDER == __LITTLE_ENDIAN
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define __CORE_BITFIELD_PROBE_READ(dst, src, fld) \
bpf_probe_read_kernel( \
(void *)dst, \
@@ -88,11 +88,19 @@ enum bpf_enum_value_kind {
const void *p = (const void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
unsigned long long val; \
\
/* This is a so-called barrier_var() operation that makes specified \
* variable "a black box" for optimizing compiler. \
* It forces compiler to perform BYTE_OFFSET relocation on p and use \
* its calculated value in the switch below, instead of applying \
* the same relocation 4 times for each individual memory load. \
*/ \
asm volatile("" : "=r"(p) : "0"(p)); \
\
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; \
case 2: val = *(const unsigned short *)p; \
case 4: val = *(const unsigned int *)p; \
case 8: val = *(const unsigned long long *)p; \
case 1: val = *(const unsigned char *)p; break; \
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
} \
val <<= __CORE_RELO(s, field, LSHIFT_U64); \
if (__CORE_RELO(s, field, SIGNED)) \
@@ -195,17 +203,22 @@ enum bpf_enum_value_kind {
* (local) BTF, used to record relocation.
*/
#define bpf_core_read(dst, sz, src) \
bpf_probe_read_kernel(dst, sz, \
(const void *)__builtin_preserve_access_index(src))
bpf_probe_read_kernel(dst, sz, (const void *)__builtin_preserve_access_index(src))
/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */
#define bpf_core_read_user(dst, sz, src) \
bpf_probe_read_user(dst, sz, (const void *)__builtin_preserve_access_index(src))
/*
* bpf_core_read_str() is a thin wrapper around bpf_probe_read_str()
* additionally emitting BPF CO-RE field relocation for specified source
* argument.
*/
#define bpf_core_read_str(dst, sz, src) \
bpf_probe_read_kernel_str(dst, sz, \
(const void *)__builtin_preserve_access_index(src))
bpf_probe_read_kernel_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
/* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use. */
#define bpf_core_read_user_str(dst, sz, src) \
bpf_probe_read_user_str(dst, sz, (const void *)__builtin_preserve_access_index(src))
#define ___concat(a, b) a ## b
#define ___apply(fn, n) ___concat(fn, n)
@@ -264,30 +277,29 @@ enum bpf_enum_value_kind {
read_fn((void *)(dst), sizeof(*(dst)), &((src_type)(src))->accessor)
/* "recursively" read a sequence of inner pointers using local __t var */
#define ___rd_first(src, a) ___read(bpf_core_read, &__t, ___type(src), src, a);
#define ___rd_last(...) \
___read(bpf_core_read, &__t, \
___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
#define ___rd_p1(...) const void *__t; ___rd_first(__VA_ARGS__)
#define ___rd_p2(...) ___rd_p1(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p3(...) ___rd_p2(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p4(...) ___rd_p3(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p5(...) ___rd_p4(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p6(...) ___rd_p5(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p7(...) ___rd_p6(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p8(...) ___rd_p7(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___rd_p9(...) ___rd_p8(___nolast(__VA_ARGS__)) ___rd_last(__VA_ARGS__)
#define ___read_ptrs(src, ...) \
___apply(___rd_p, ___narg(__VA_ARGS__))(src, __VA_ARGS__)
#define ___rd_first(fn, src, a) ___read(fn, &__t, ___type(src), src, a);
#define ___rd_last(fn, ...) \
___read(fn, &__t, ___type(___nolast(__VA_ARGS__)), __t, ___last(__VA_ARGS__));
#define ___rd_p1(fn, ...) const void *__t; ___rd_first(fn, __VA_ARGS__)
#define ___rd_p2(fn, ...) ___rd_p1(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p3(fn, ...) ___rd_p2(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p4(fn, ...) ___rd_p3(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p5(fn, ...) ___rd_p4(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p6(fn, ...) ___rd_p5(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p7(fn, ...) ___rd_p6(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p8(fn, ...) ___rd_p7(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___rd_p9(fn, ...) ___rd_p8(fn, ___nolast(__VA_ARGS__)) ___rd_last(fn, __VA_ARGS__)
#define ___read_ptrs(fn, src, ...) \
___apply(___rd_p, ___narg(__VA_ARGS__))(fn, src, __VA_ARGS__)
#define ___core_read0(fn, dst, src, a) \
#define ___core_read0(fn, fn_ptr, dst, src, a) \
___read(fn, dst, ___type(src), src, a);
#define ___core_readN(fn, dst, src, ...) \
___read_ptrs(src, ___nolast(__VA_ARGS__)) \
#define ___core_readN(fn, fn_ptr, dst, src, ...) \
___read_ptrs(fn_ptr, src, ___nolast(__VA_ARGS__)) \
___read(fn, dst, ___type(src, ___nolast(__VA_ARGS__)), __t, \
___last(__VA_ARGS__));
#define ___core_read(fn, dst, src, a, ...) \
___apply(___core_read, ___empty(__VA_ARGS__))(fn, dst, \
#define ___core_read(fn, fn_ptr, dst, src, a, ...) \
___apply(___core_read, ___empty(__VA_ARGS__))(fn, fn_ptr, dst, \
src, a, ##__VA_ARGS__)
/*
@@ -295,20 +307,73 @@ enum bpf_enum_value_kind {
* BPF_CORE_READ(), in which final field is read into user-provided storage.
* See BPF_CORE_READ() below for more details on general usage.
*/
#define BPF_CORE_READ_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read, dst, (src), a, ##__VA_ARGS__) \
})
#define BPF_CORE_READ_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read, bpf_core_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Variant of BPF_CORE_READ_INTO() for reading from user-space memory.
*
* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use.
*/
#define BPF_CORE_READ_USER_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_user, bpf_core_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_INTO() */
#define BPF_PROBE_READ_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read, bpf_probe_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_USER_INTO().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_user, bpf_probe_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ_STR_INTO() does same "pointer chasing" as
* BPF_CORE_READ() for intermediate pointers, but then executes (and returns
* corresponding error code) bpf_core_read_str() for final string read.
*/
#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) \
({ \
___core_read(bpf_core_read_str, dst, (src), a, ##__VA_ARGS__)\
})
#define BPF_CORE_READ_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_str, bpf_core_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Variant of BPF_CORE_READ_STR_INTO() for reading from user-space memory.
*
* NOTE: see comments for BPF_CORE_READ_USER() about the proper types use.
*/
#define BPF_CORE_READ_USER_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_core_read_user_str, bpf_core_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
#define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_str, bpf_probe_read, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* Non-CO-RE variant of BPF_CORE_READ_USER_STR_INTO().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER_STR_INTO(dst, src, a, ...) ({ \
___core_read(bpf_probe_read_user_str, bpf_probe_read_user, \
dst, (src), a, ##__VA_ARGS__) \
})
/*
* BPF_CORE_READ() is used to simplify BPF CO-RE relocatable read, especially
@@ -334,12 +399,46 @@ enum bpf_enum_value_kind {
* N.B. Only up to 9 "field accessors" are supported, which should be more
* than enough for any practical purpose.
*/
#define BPF_CORE_READ(src, a, ...) \
({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
#define BPF_CORE_READ(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
/*
* Variant of BPF_CORE_READ() for reading from user-space memory.
*
* NOTE: all the source types involved are still *kernel types* and need to
* exist in kernel (or kernel module) BTF, otherwise CO-RE relocation will
* fail. Custom user types are not relocatable with CO-RE.
* The typical situation in which BPF_CORE_READ_USER() might be used is to
* read kernel UAPI types from the user-space memory passed in as a syscall
* input argument.
*/
#define BPF_CORE_READ_USER(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_CORE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
/* Non-CO-RE variant of BPF_CORE_READ() */
#define BPF_PROBE_READ(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_PROBE_READ_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
/*
* Non-CO-RE variant of BPF_CORE_READ_USER().
*
* As no CO-RE relocations are emitted, source types can be arbitrary and are
* not restricted to kernel types only.
*/
#define BPF_PROBE_READ_USER(src, a, ...) ({ \
___type((src), a, ##__VA_ARGS__) __r; \
BPF_PROBE_READ_USER_INTO(&__r, (src), a, ##__VA_ARGS__); \
__r; \
})
#endif

72
src/bpf_gen_internal.h Normal file
View File

@@ -0,0 +1,72 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __BPF_GEN_INTERNAL_H
#define __BPF_GEN_INTERNAL_H
#include "bpf.h"
struct ksym_relo_desc {
const char *name;
int kind;
int insn_idx;
bool is_weak;
bool is_typeless;
};
struct ksym_desc {
const char *name;
int ref;
int kind;
union {
/* used for kfunc */
int off;
/* used for typeless ksym */
bool typeless;
};
int insn;
};
struct bpf_gen {
struct gen_loader_opts *opts;
void *data_start;
void *data_cur;
void *insn_start;
void *insn_cur;
ssize_t cleanup_label;
__u32 nr_progs;
__u32 nr_maps;
int log_level;
int error;
struct ksym_relo_desc *relos;
int relo_cnt;
struct bpf_core_relo *core_relos;
int core_relo_cnt;
char attach_target[128];
int attach_kind;
struct ksym_desc *ksyms;
__u32 nr_ksyms;
int fd_array;
int nr_fd_array;
};
void bpf_gen__init(struct bpf_gen *gen, int log_level, int nr_progs, int nr_maps);
int bpf_gen__finish(struct bpf_gen *gen, int nr_progs, int nr_maps);
void bpf_gen__free(struct bpf_gen *gen);
void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);
void bpf_gen__map_create(struct bpf_gen *gen,
enum bpf_map_type map_type, const char *map_name,
__u32 key_size, __u32 value_size, __u32 max_entries,
struct bpf_map_create_opts *map_attr, int map_idx);
void bpf_gen__prog_load(struct bpf_gen *gen,
enum bpf_prog_type prog_type, const char *prog_name,
const char *license, struct bpf_insn *insns, size_t insn_cnt,
struct bpf_prog_load_opts *load_attr, int prog_idx);
void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,
bool is_typeless, int kind, int insn_idx);
void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo);
void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int key, int inner_map_idx);
#endif

View File

@@ -1,4 +1,4 @@
/* This is auto-generated file. See bpf_helpers_doc.py for details. */
/* This is auto-generated file. See bpf_doc.py for details. */
/* Forward declarations of BPF structs */
struct bpf_fib_lookup;
@@ -27,6 +27,7 @@ struct tcp_sock;
struct tcp_timewait_sock;
struct tcp_request_sock;
struct udp6_sock;
struct unix_sock;
struct task_struct;
struct __sk_buff;
struct sk_msg_md;
@@ -36,6 +37,7 @@ struct btf_ptr;
struct inode;
struct socket;
struct file;
struct bpf_timer;
/*
* bpf_map_lookup_elem
@@ -189,7 +191,7 @@ static __u32 (*bpf_get_prandom_u32)(void) = (void *) 7;
* bpf_get_smp_processor_id
*
* Get the SMP (symmetric multiprocessing) processor id. Note that
* all programs run with preemption disabled, which means that the
* all programs run with migration disabled, which means that the
* SMP processor id is stable during all the execution of the
* program.
*
@@ -312,7 +314,7 @@ static long (*bpf_l4_csum_replace)(struct __sk_buff *skb, __u32 offset, __u64 fr
* if the maximum number of tail calls has been reached for this
* chain of programs. This limit is defined in the kernel by the
* macro **MAX_TAIL_CALL_CNT** (not accessible to user space),
* which is currently set to 32.
* which is currently set to 33.
*
* Returns
* 0 on success, or a negative error in case of failure.
@@ -350,6 +352,7 @@ static long (*bpf_clone_redirect)(struct __sk_buff *skb, __u32 ifindex, __u64 fl
/*
* bpf_get_current_pid_tgid
*
* Get the current pid and tgid.
*
* Returns
* A 64-bit integer containing the current tgid and pid, and
@@ -362,6 +365,7 @@ static __u64 (*bpf_get_current_pid_tgid)(void) = (void *) 14;
/*
* bpf_get_current_uid_gid
*
* Get the current uid and gid.
*
* Returns
* A 64-bit integer containing the current GID and UID, and
@@ -917,6 +921,7 @@ static __u32 (*bpf_get_hash_recalc)(struct __sk_buff *skb) = (void *) 34;
/*
* bpf_get_current_task
*
* Get the current task.
*
* Returns
* A pointer to the current task struct.
@@ -1055,6 +1060,8 @@ static __s64 (*bpf_csum_update)(struct __sk_buff *skb, __wsum csum) = (void *) 4
* recalculation the next time the kernel tries to access this
* hash or when the **bpf_get_hash_recalc**\ () helper is called.
*
* Returns
* void.
*/
static void (*bpf_set_hash_invalid)(struct __sk_buff *skb) = (void *) 41;
@@ -1146,14 +1153,15 @@ static long (*bpf_probe_read_str)(void *dst, __u32 size, const void *unsafe_ptr)
* identifier that can be assumed unique.
*
* Returns
* A 8-byte long non-decreasing number on success, or 0 if the
* socket field is missing inside *skb*.
* A 8-byte long unique number on success, or 0 if the socket
* field is missing inside *skb*.
*/
static __u64 (*bpf_get_socket_cookie)(void *ctx) = (void *) 46;
/*
* bpf_get_socket_uid
*
* Get the owner UID of the socked associated to *skb*.
*
* Returns
* The owner UID of the socket associated to *skb*. If the socket
@@ -1249,6 +1257,10 @@ static long (*bpf_setsockopt)(void *bpf_socket, int level, int optname, void *op
* Use with ENCAP_L3/L4 flags to further specify the tunnel
* type; *len* is the length of the inner MAC header.
*
* * **BPF_F_ADJ_ROOM_ENCAP_L2_ETH**:
* Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the
* L2 type as Ethernet.
*
* A call to this helper is susceptible to change the underlying
* packet buffer. Therefore, at load time, all checks on pointers
* previously done by the verifier are invalidated and must be
@@ -1273,8 +1285,12 @@ static long (*bpf_skb_adjust_room)(struct __sk_buff *skb, __s32 len_diff, __u32
* The lower two bits of *flags* are used as the return code if
* the map lookup fails. This is so that the return value can be
* one of the XDP program return codes up to **XDP_TX**, as chosen
* by the caller. Any higher bits in the *flags* argument must be
* unset.
* by the caller. The higher bits of *flags* can be set to
* BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS as defined below.
*
* With BPF_F_BROADCAST the packet will be broadcasted to all the
* interfaces in the map, with BPF_F_EXCLUDE_INGRESS the ingress
* interface will be excluded when do broadcasting.
*
* See also **bpf_redirect**\ (), which only supports redirecting
* to an ifindex, but doesn't require a map to do so.
@@ -1799,6 +1815,9 @@ static long (*bpf_skb_load_bytes_relative)(const void *skb, __u32 offset, void *
* * 0 on success (packet is forwarded, nexthop neighbor exists)
* * > 0 one of **BPF_FIB_LKUP_RET_** codes explaining why the
* packet is not forwarded or needs assist from full stack
*
* If lookup fails with BPF_FIB_LKUP_RET_FRAG_NEEDED, then the MTU
* was exceeded and output params->mtu_result contains the MTU.
*/
static long (*bpf_fib_lookup)(void *ctx, struct bpf_fib_lookup *params, int plen, __u32 flags) = (void *) 69;
@@ -2050,6 +2069,8 @@ static __u64 (*bpf_skb_cgroup_id)(struct __sk_buff *skb) = (void *) 79;
/*
* bpf_get_current_cgroup_id
*
* Get the current cgroup id based on the cgroup within which
* the current task is running.
*
* Returns
* A 64-bit integer containing the current cgroup id based
@@ -2071,7 +2092,7 @@ static __u64 (*bpf_get_current_cgroup_id)(void) = (void *) 80;
* running simultaneously.
*
* A user should care about the synchronization by himself.
* For example, by using the **BPF_STX_XADD** instruction to alter
* For example, by using the **BPF_ATOMIC** instructions to alter
* the shared data.
*
* Returns
@@ -2083,7 +2104,7 @@ static void *(*bpf_get_local_storage)(void *map, __u64 flags) = (void *) 81;
* bpf_sk_select_reuseport
*
* Select a **SO_REUSEPORT** socket from a
* **BPF_MAP_TYPE_REUSEPORT_ARRAY** *map*.
* **BPF_MAP_TYPE_REUSEPORT_SOCKARRAY** *map*.
* It checks the selected socket is matching the incoming
* request in the socket buffer.
*
@@ -2748,10 +2769,10 @@ static long (*bpf_probe_read_kernel)(void *dst, __u32 size, const void *unsafe_p
* string length is larger than *size*, just *size*-1 bytes are
* copied and the last byte is set to NUL.
*
* On success, the length of the copied string is returned. This
* makes this helper useful in tracing programs for reading
* strings, and more importantly to get its length at runtime. See
* the following snippet:
* On success, returns the number of bytes that were written,
* including the terminal NUL. This makes this helper useful in
* tracing programs for reading strings, and more importantly to
* get its length at runtime. See the following snippet:
*
* ::
*
@@ -2780,7 +2801,7 @@ static long (*bpf_probe_read_kernel)(void *dst, __u32 size, const void *unsafe_p
* one can quickly iterate at the right offset of the memory area.
*
* Returns
* On success, the strictly positive length of the string,
* On success, the strictly positive length of the output string,
* including the trailing NUL character. On error, a negative
* value.
*/
@@ -3000,7 +3021,7 @@ static __u64 (*bpf_ktime_get_boot_ns)(void) = (void *) 125;
* arguments. The *data* are a **u64** array and corresponding format string
* values are stored in the array. For strings and pointers where pointees
* are accessed, only the pointer values are stored in the *data* array.
* The *data_len* is the size of *data* in bytes.
* The *data_len* is the size of *data* in bytes - must be a multiple of 8.
*
* Formats **%s**, **%p{i,I}{4,6}** requires to read kernel memory.
* Reading kernel memory may fail due to either invalid address or
@@ -3085,6 +3106,13 @@ static __u64 (*bpf_sk_ancestor_cgroup_id)(void *sk, int ancestor_level) = (void
* of new data availability is sent.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* If **0** is specified in *flags*, an adaptive notification
* of new data availability is sent.
*
* An adaptive notification is a notification sent whenever the user-space
* process has caught up and consumed all available payloads. In case the user-space
* process is still processing a previous payload, then no notification is needed
* as it will process the newly added payload automatically.
*
* Returns
* 0 on success, or a negative error in case of failure.
@@ -3095,6 +3123,7 @@ static long (*bpf_ringbuf_output)(void *ringbuf, void *data, __u64 size, __u64 f
* bpf_ringbuf_reserve
*
* Reserve *size* bytes of payload in a ring buffer *ringbuf*.
* *flags* must be 0.
*
* Returns
* Valid pointer with *size* bytes of memory available; NULL,
@@ -3110,6 +3139,10 @@ static void *(*bpf_ringbuf_reserve)(void *ringbuf, __u64 size, __u64 flags) = (v
* of new data availability is sent.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* If **0** is specified in *flags*, an adaptive notification
* of new data availability is sent.
*
* See 'bpf_ringbuf_output()' for the definition of adaptive notification.
*
* Returns
* Nothing. Always succeeds.
@@ -3124,6 +3157,10 @@ static void (*bpf_ringbuf_submit)(void *data, __u64 flags) = (void *) 132;
* of new data availability is sent.
* If **BPF_RB_FORCE_WAKEUP** is specified in *flags*, notification
* of new data availability is sent unconditionally.
* If **0** is specified in *flags*, an adaptive notification
* of new data availability is sent.
*
* See 'bpf_ringbuf_output()' for the definition of adaptive notification.
*
* Returns
* Nothing. Always succeeds.
@@ -3731,4 +3768,531 @@ static long (*bpf_ima_inode_hash)(struct inode *inode, void *dst, __u32 size) =
*/
static struct socket *(*bpf_sock_from_file)(struct file *file) = (void *) 162;
/*
* bpf_check_mtu
*
* Check packet size against exceeding MTU of net device (based
* on *ifindex*). This helper will likely be used in combination
* with helpers that adjust/change the packet size.
*
* The argument *len_diff* can be used for querying with a planned
* size change. This allows to check MTU prior to changing packet
* ctx. Providing an *len_diff* adjustment that is larger than the
* actual packet size (resulting in negative packet size) will in
* principle not exceed the MTU, why it is not considered a
* failure. Other BPF-helpers are needed for performing the
* planned size change, why the responsability for catch a negative
* packet size belong in those helpers.
*
* Specifying *ifindex* zero means the MTU check is performed
* against the current net device. This is practical if this isn't
* used prior to redirect.
*
* On input *mtu_len* must be a valid pointer, else verifier will
* reject BPF program. If the value *mtu_len* is initialized to
* zero then the ctx packet size is use. When value *mtu_len* is
* provided as input this specify the L3 length that the MTU check
* is done against. Remember XDP and TC length operate at L2, but
* this value is L3 as this correlate to MTU and IP-header tot_len
* values which are L3 (similar behavior as bpf_fib_lookup).
*
* The Linux kernel route table can configure MTUs on a more
* specific per route level, which is not provided by this helper.
* For route level MTU checks use the **bpf_fib_lookup**\ ()
* helper.
*
* *ctx* is either **struct xdp_md** for XDP programs or
* **struct sk_buff** for tc cls_act programs.
*
* The *flags* argument can be a combination of one or more of the
* following values:
*
* **BPF_MTU_CHK_SEGS**
* This flag will only works for *ctx* **struct sk_buff**.
* If packet context contains extra packet segment buffers
* (often knows as GSO skb), then MTU check is harder to
* check at this point, because in transmit path it is
* possible for the skb packet to get re-segmented
* (depending on net device features). This could still be
* a MTU violation, so this flag enables performing MTU
* check against segments, with a different violation
* return code to tell it apart. Check cannot use len_diff.
*
* On return *mtu_len* pointer contains the MTU value of the net
* device. Remember the net device configured MTU is the L3 size,
* which is returned here and XDP and TC length operate at L2.
* Helper take this into account for you, but remember when using
* MTU value in your BPF-code.
*
*
* Returns
* * 0 on success, and populate MTU value in *mtu_len* pointer.
*
* * < 0 if any input argument is invalid (*mtu_len* not updated)
*
* MTU violations return positive values, but also populate MTU
* value in *mtu_len* pointer, as this can be needed for
* implementing PMTU handing:
*
* * **BPF_MTU_CHK_RET_FRAG_NEEDED**
* * **BPF_MTU_CHK_RET_SEGS_TOOBIG**
*/
static long (*bpf_check_mtu)(void *ctx, __u32 ifindex, __u32 *mtu_len, __s32 len_diff, __u64 flags) = (void *) 163;
/*
* bpf_for_each_map_elem
*
* For each element in **map**, call **callback_fn** function with
* **map**, **callback_ctx** and other map-specific parameters.
* The **callback_fn** should be a static function and
* the **callback_ctx** should be a pointer to the stack.
* The **flags** is used to control certain aspects of the helper.
* Currently, the **flags** must be 0.
*
* The following are a list of supported map types and their
* respective expected callback signatures:
*
* BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH,
* BPF_MAP_TYPE_LRU_HASH, BPF_MAP_TYPE_LRU_PERCPU_HASH,
* BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY
*
* long (\*callback_fn)(struct bpf_map \*map, const void \*key, void \*value, void \*ctx);
*
* For per_cpu maps, the map_value is the value on the cpu where the
* bpf_prog is running.
*
* If **callback_fn** return 0, the helper will continue to the next
* element. If return value is 1, the helper will skip the rest of
* elements and return. Other return values are not used now.
*
*
* Returns
* The number of traversed map elements for success, **-EINVAL** for
* invalid **flags**.
*/
static long (*bpf_for_each_map_elem)(void *map, void *callback_fn, void *callback_ctx, __u64 flags) = (void *) 164;
/*
* bpf_snprintf
*
* Outputs a string into the **str** buffer of size **str_size**
* based on a format string stored in a read-only map pointed by
* **fmt**.
*
* Each format specifier in **fmt** corresponds to one u64 element
* in the **data** array. For strings and pointers where pointees
* are accessed, only the pointer values are stored in the *data*
* array. The *data_len* is the size of *data* in bytes - must be
* a multiple of 8.
*
* Formats **%s** and **%p{i,I}{4,6}** require to read kernel
* memory. Reading kernel memory may fail due to either invalid
* address or valid address but requiring a major memory fault. If
* reading kernel memory fails, the string for **%s** will be an
* empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
* Not returning error to bpf program is consistent with what
* **bpf_trace_printk**\ () does for now.
*
*
* Returns
* The strictly positive length of the formatted string, including
* the trailing zero character. If the return value is greater than
* **str_size**, **str** contains a truncated string, guaranteed to
* be zero-terminated except when **str_size** is 0.
*
* Or **-EBUSY** if the per-CPU memory copy buffer is busy.
*/
static long (*bpf_snprintf)(char *str, __u32 str_size, const char *fmt, __u64 *data, __u32 data_len) = (void *) 165;
/*
* bpf_sys_bpf
*
* Execute bpf syscall with given arguments.
*
* Returns
* A syscall result.
*/
static long (*bpf_sys_bpf)(__u32 cmd, void *attr, __u32 attr_size) = (void *) 166;
/*
* bpf_btf_find_by_name_kind
*
* Find BTF type with given name and kind in vmlinux BTF or in module's BTFs.
*
* Returns
* Returns btf_id and btf_obj_fd in lower and upper 32 bits.
*/
static long (*bpf_btf_find_by_name_kind)(char *name, int name_sz, __u32 kind, int flags) = (void *) 167;
/*
* bpf_sys_close
*
* Execute close syscall for given FD.
*
* Returns
* A syscall result.
*/
static long (*bpf_sys_close)(__u32 fd) = (void *) 168;
/*
* bpf_timer_init
*
* Initialize the timer.
* First 4 bits of *flags* specify clockid.
* Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed.
* All other bits of *flags* are reserved.
* The verifier will reject the program if *timer* is not from
* the same *map*.
*
* Returns
* 0 on success.
* **-EBUSY** if *timer* is already initialized.
* **-EINVAL** if invalid *flags* are passed.
* **-EPERM** if *timer* is in a map that doesn't have any user references.
* The user space should either hold a file descriptor to a map with timers
* or pin such map in bpffs. When map is unpinned or file descriptor is
* closed all timers in the map will be cancelled and freed.
*/
static long (*bpf_timer_init)(struct bpf_timer *timer, void *map, __u64 flags) = (void *) 169;
/*
* bpf_timer_set_callback
*
* Configure the timer to call *callback_fn* static function.
*
* Returns
* 0 on success.
* **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
* **-EPERM** if *timer* is in a map that doesn't have any user references.
* The user space should either hold a file descriptor to a map with timers
* or pin such map in bpffs. When map is unpinned or file descriptor is
* closed all timers in the map will be cancelled and freed.
*/
static long (*bpf_timer_set_callback)(struct bpf_timer *timer, void *callback_fn) = (void *) 170;
/*
* bpf_timer_start
*
* Set timer expiration N nanoseconds from the current time. The
* configured callback will be invoked in soft irq context on some cpu
* and will not repeat unless another bpf_timer_start() is made.
* In such case the next invocation can migrate to a different cpu.
* Since struct bpf_timer is a field inside map element the map
* owns the timer. The bpf_timer_set_callback() will increment refcnt
* of BPF program to make sure that callback_fn code stays valid.
* When user space reference to a map reaches zero all timers
* in a map are cancelled and corresponding program's refcnts are
* decremented. This is done to make sure that Ctrl-C of a user
* process doesn't leave any timers running. If map is pinned in
* bpffs the callback_fn can re-arm itself indefinitely.
* bpf_map_update/delete_elem() helpers and user space sys_bpf commands
* cancel and free the timer in the given map element.
* The map can contain timers that invoke callback_fn-s from different
* programs. The same callback_fn can serve different timers from
* different maps if key/value layout matches across maps.
* Every bpf_timer_set_callback() can have different callback_fn.
*
*
* Returns
* 0 on success.
* **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier
* or invalid *flags* are passed.
*/
static long (*bpf_timer_start)(struct bpf_timer *timer, __u64 nsecs, __u64 flags) = (void *) 171;
/*
* bpf_timer_cancel
*
* Cancel the timer and wait for callback_fn to finish if it was running.
*
* Returns
* 0 if the timer was not active.
* 1 if the timer was active.
* **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
* **-EDEADLK** if callback_fn tried to call bpf_timer_cancel() on its
* own timer which would have led to a deadlock otherwise.
*/
static long (*bpf_timer_cancel)(struct bpf_timer *timer) = (void *) 172;
/*
* bpf_get_func_ip
*
* Get address of the traced function (for tracing and kprobe programs).
*
* Returns
* Address of the traced function.
*/
static __u64 (*bpf_get_func_ip)(void *ctx) = (void *) 173;
/*
* bpf_get_attach_cookie
*
* Get bpf_cookie value provided (optionally) during the program
* attachment. It might be different for each individual
* attachment, even if BPF program itself is the same.
* Expects BPF program context *ctx* as a first argument.
*
* Supported for the following program types:
* - kprobe/uprobe;
* - tracepoint;
* - perf_event.
*
* Returns
* Value specified by user at BPF link creation/attachment time
* or 0, if it was not specified.
*/
static __u64 (*bpf_get_attach_cookie)(void *ctx) = (void *) 174;
/*
* bpf_task_pt_regs
*
* Get the struct pt_regs associated with **task**.
*
* Returns
* A pointer to struct pt_regs.
*/
static long (*bpf_task_pt_regs)(struct task_struct *task) = (void *) 175;
/*
* bpf_get_branch_snapshot
*
* Get branch trace from hardware engines like Intel LBR. The
* hardware engine is stopped shortly after the helper is
* called. Therefore, the user need to filter branch entries
* based on the actual use case. To capture branch trace
* before the trigger point of the BPF program, the helper
* should be called at the beginning of the BPF program.
*
* The data is stored as struct perf_branch_entry into output
* buffer *entries*. *size* is the size of *entries* in bytes.
* *flags* is reserved for now and must be zero.
*
*
* Returns
* On success, number of bytes written to *buf*. On error, a
* negative value.
*
* **-EINVAL** if *flags* is not zero.
*
* **-ENOENT** if architecture does not support branch records.
*/
static long (*bpf_get_branch_snapshot)(void *entries, __u32 size, __u64 flags) = (void *) 176;
/*
* bpf_trace_vprintk
*
* Behaves like **bpf_trace_printk**\ () helper, but takes an array of u64
* to format and can handle more format args as a result.
*
* Arguments are to be used as in **bpf_seq_printf**\ () helper.
*
* Returns
* The number of bytes written to the buffer, or a negative error
* in case of failure.
*/
static long (*bpf_trace_vprintk)(const char *fmt, __u32 fmt_size, const void *data, __u32 data_len) = (void *) 177;
/*
* bpf_skc_to_unix_sock
*
* Dynamically cast a *sk* pointer to a *unix_sock* pointer.
*
* Returns
* *sk* if casting is valid, or **NULL** otherwise.
*/
static struct unix_sock *(*bpf_skc_to_unix_sock)(void *sk) = (void *) 178;
/*
* bpf_kallsyms_lookup_name
*
* Get the address of a kernel symbol, returned in *res*. *res* is
* set to 0 if the symbol is not found.
*
* Returns
* On success, zero. On error, a negative value.
*
* **-EINVAL** if *flags* is not zero.
*
* **-EINVAL** if string *name* is not the same size as *name_sz*.
*
* **-ENOENT** if symbol is not found.
*
* **-EPERM** if caller does not have permission to obtain kernel address.
*/
static long (*bpf_kallsyms_lookup_name)(const char *name, int name_sz, int flags, __u64 *res) = (void *) 179;
/*
* bpf_find_vma
*
* Find vma of *task* that contains *addr*, call *callback_fn*
* function with *task*, *vma*, and *callback_ctx*.
* The *callback_fn* should be a static function and
* the *callback_ctx* should be a pointer to the stack.
* The *flags* is used to control certain aspects of the helper.
* Currently, the *flags* must be 0.
*
* The expected callback signature is
*
* long (\*callback_fn)(struct task_struct \*task, struct vm_area_struct \*vma, void \*callback_ctx);
*
*
* Returns
* 0 on success.
* **-ENOENT** if *task->mm* is NULL, or no vma contains *addr*.
* **-EBUSY** if failed to try lock mmap_lock.
* **-EINVAL** for invalid **flags**.
*/
static long (*bpf_find_vma)(struct task_struct *task, __u64 addr, void *callback_fn, void *callback_ctx, __u64 flags) = (void *) 180;
/*
* bpf_loop
*
* For **nr_loops**, call **callback_fn** function
* with **callback_ctx** as the context parameter.
* The **callback_fn** should be a static function and
* the **callback_ctx** should be a pointer to the stack.
* The **flags** is used to control certain aspects of the helper.
* Currently, the **flags** must be 0. Currently, nr_loops is
* limited to 1 << 23 (~8 million) loops.
*
* long (\*callback_fn)(u32 index, void \*ctx);
*
* where **index** is the current index in the loop. The index
* is zero-indexed.
*
* If **callback_fn** returns 0, the helper will continue to the next
* loop. If return value is 1, the helper will skip the rest of
* the loops and return. Other return values are not used now,
* and will be rejected by the verifier.
*
*
* Returns
* The number of loops performed, **-EINVAL** for invalid **flags**,
* **-E2BIG** if **nr_loops** exceeds the maximum number of loops.
*/
static long (*bpf_loop)(__u32 nr_loops, void *callback_fn, void *callback_ctx, __u64 flags) = (void *) 181;
/*
* bpf_strncmp
*
* Do strncmp() between **s1** and **s2**. **s1** doesn't need
* to be null-terminated and **s1_sz** is the maximum storage
* size of **s1**. **s2** must be a read-only string.
*
* Returns
* An integer less than, equal to, or greater than zero
* if the first **s1_sz** bytes of **s1** is found to be
* less than, to match, or be greater than **s2**.
*/
static long (*bpf_strncmp)(const char *s1, __u32 s1_sz, const char *s2) = (void *) 182;
/*
* bpf_get_func_arg
*
* Get **n**-th argument (zero based) of the traced function (for tracing programs)
* returned in **value**.
*
*
* Returns
* 0 on success.
* **-EINVAL** if n >= arguments count of traced function.
*/
static long (*bpf_get_func_arg)(void *ctx, __u32 n, __u64 *value) = (void *) 183;
/*
* bpf_get_func_ret
*
* Get return value of the traced function (for tracing programs)
* in **value**.
*
*
* Returns
* 0 on success.
* **-EOPNOTSUPP** for tracing programs other than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN.
*/
static long (*bpf_get_func_ret)(void *ctx, __u64 *value) = (void *) 184;
/*
* bpf_get_func_arg_cnt
*
* Get number of arguments of the traced function (for tracing programs).
*
*
* Returns
* The number of arguments of the traced function.
*/
static long (*bpf_get_func_arg_cnt)(void *ctx) = (void *) 185;
/*
* bpf_get_retval
*
* Get the syscall's return value that will be returned to userspace.
*
* This helper is currently supported by cgroup programs only.
*
* Returns
* The syscall's return value.
*/
static int (*bpf_get_retval)(void) = (void *) 186;
/*
* bpf_set_retval
*
* Set the syscall's return value that will be returned to userspace.
*
* This helper is currently supported by cgroup programs only.
*
* Returns
* 0 on success, or a negative error in case of failure.
*/
static int (*bpf_set_retval)(int retval) = (void *) 187;
/*
* bpf_xdp_get_buff_len
*
* Get the total size of a given xdp buff (linear and paged area)
*
* Returns
* The total size of a given xdp buffer.
*/
static __u64 (*bpf_xdp_get_buff_len)(struct xdp_md *xdp_md) = (void *) 188;
/*
* bpf_xdp_load_bytes
*
* This helper is provided as an easy way to load data from a
* xdp buffer. It can be used to load *len* bytes from *offset* from
* the frame associated to *xdp_md*, into the buffer pointed by
* *buf*.
*
* Returns
* 0 on success, or a negative error in case of failure.
*/
static long (*bpf_xdp_load_bytes)(struct xdp_md *xdp_md, __u32 offset, void *buf, __u32 len) = (void *) 189;
/*
* bpf_xdp_store_bytes
*
* Store *len* bytes from buffer *buf* into the frame
* associated to *xdp_md*, at *offset*.
*
* Returns
* 0 on success, or a negative error in case of failure.
*/
static long (*bpf_xdp_store_bytes)(struct xdp_md *xdp_md, __u32 offset, void *buf, __u32 len) = (void *) 190;
/*
* bpf_copy_from_user_task
*
* Read *size* bytes from user space address *user_ptr* in *tsk*'s
* address space, and stores the data in *dst*. *flags* is not
* used yet and is provided for future extensibility. This helper
* can only be used by sleepable programs.
*
* Returns
* 0 on success, or a negative error in case of failure. On error
* *dst* buffer is zeroed out.
*/
static long (*bpf_copy_from_user_task)(void *dst, __u32 size, const void *user_ptr, struct task_struct *tsk, __u64 flags) = (void *) 191;

View File

@@ -14,24 +14,24 @@
#define __type(name, val) typeof(val) *name
#define __array(name, val) typeof(val) *name[]
/* Helper macro to print out debug messages */
#define bpf_printk(fmt, ...) \
({ \
char ____fmt[] = fmt; \
bpf_trace_printk(____fmt, sizeof(____fmt), \
##__VA_ARGS__); \
})
/*
* Helper macro to place programs, maps, license in
* different sections in elf_bpf file. Section names
* are interpreted by elf_bpf loader
* are interpreted by libbpf depending on the context (BPF programs, BPF maps,
* extern variables, etc).
* To allow use of SEC() with externs (e.g., for extern .maps declarations),
* make sure __attribute__((unused)) doesn't trigger compilation warning.
*/
#define SEC(NAME) __attribute__((section(NAME), used))
#define SEC(name) \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wignored-attributes\"") \
__attribute__((section(name), used)) \
_Pragma("GCC diagnostic pop") \
/* Avoid 'linux/stddef.h' definition of '__always_inline'. */
#undef __always_inline
#define __always_inline inline __attribute__((always_inline))
#ifndef __always_inline
#define __always_inline __attribute__((always_inline))
#endif
#ifndef __noinline
#define __noinline __attribute__((noinline))
#endif
@@ -40,7 +40,29 @@
#endif
/*
* Helper macro to manipulate data structures
* Use __hidden attribute to mark a non-static BPF subprogram effectively
* static for BPF verifier's verification algorithm purposes, allowing more
* extensive and permissive BPF verification process, taking into account
* subprogram's caller context.
*/
#define __hidden __attribute__((visibility("hidden")))
/* When utilizing vmlinux.h with BPF CO-RE, user BPF programs can't include
* any system-level headers (such as stddef.h, linux/version.h, etc), and
* commonly-used macros like NULL and KERNEL_VERSION aren't available through
* vmlinux.h. This just adds unnecessary hurdles and forces users to re-define
* them on their own. So as a convenience, provide such definitions here.
*/
#ifndef NULL
#define NULL ((void *)0)
#endif
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
#endif
/*
* Helper macros to manipulate data structures
*/
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((unsigned long)&((TYPE *)0)->MEMBER)
@@ -111,7 +133,7 @@ struct bpf_map_def {
unsigned int value_size;
unsigned int max_entries;
unsigned int map_flags;
};
} __attribute__((deprecated("use BTF-defined maps in .maps section")));
enum libbpf_pin_type {
LIBBPF_PIN_NONE,
@@ -128,4 +150,113 @@ enum libbpf_tristate {
#define __kconfig __attribute__((section(".kconfig")))
#define __ksym __attribute__((section(".ksyms")))
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
#endif
#ifndef ___bpf_apply
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#endif
#ifndef ___bpf_nth
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#endif
#ifndef ___bpf_narg
#define ___bpf_narg(...) \
___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#endif
#define ___bpf_fill0(arr, p, x) do {} while (0)
#define ___bpf_fill1(arr, p, x) arr[p] = x
#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
#define ___bpf_fill(arr, args...) \
___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
/*
* BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
* in a structure.
*/
#define BPF_SEQ_PRINTF(seq, fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
})
/*
* BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of
* an array of u64.
*/
#define BPF_SNPRINTF(out, out_size, fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_snprintf(out, out_size, ___fmt, \
___param, sizeof(___param)); \
})
#ifdef BPF_NO_GLOBAL_DATA
#define BPF_PRINTK_FMT_MOD
#else
#define BPF_PRINTK_FMT_MOD static const
#endif
#define __bpf_printk(fmt, ...) \
({ \
BPF_PRINTK_FMT_MOD char ____fmt[] = fmt; \
bpf_trace_printk(____fmt, sizeof(____fmt), \
##__VA_ARGS__); \
})
/*
* __bpf_vprintk wraps the bpf_trace_vprintk helper with variadic arguments
* instead of an array of u64.
*/
#define __bpf_vprintk(fmt, args...) \
({ \
static const char ___fmt[] = fmt; \
unsigned long long ___param[___bpf_narg(args)]; \
\
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
___bpf_fill(___param, args); \
_Pragma("GCC diagnostic pop") \
\
bpf_trace_vprintk(___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
})
/* Use __bpf_printk when bpf_printk call has 3 or fewer fmt args
* Otherwise use __bpf_vprintk
*/
#define ___bpf_pick_printk(...) \
___bpf_nth(_, ##__VA_ARGS__, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, \
__bpf_vprintk, __bpf_vprintk, __bpf_vprintk, __bpf_vprintk, \
__bpf_vprintk, __bpf_vprintk, __bpf_printk /*3*/, __bpf_printk /*2*/,\
__bpf_printk /*1*/, __bpf_printk /*0*/)
/* Helper macro to print out debug messages */
#define bpf_printk(fmt, args...) ___bpf_pick_printk(args)(fmt, ##args)
#endif

View File

@@ -106,7 +106,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
nr_linfo = info->nr_line_info;
if (!nr_linfo)
return NULL;
return errno = EINVAL, NULL;
/*
* The min size that bpf_prog_linfo has to access for
@@ -114,11 +114,11 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
*/
if (info->line_info_rec_size <
offsetof(struct bpf_line_info, file_name_off))
return NULL;
return errno = EINVAL, NULL;
prog_linfo = calloc(1, sizeof(*prog_linfo));
if (!prog_linfo)
return NULL;
return errno = ENOMEM, NULL;
/* Copy xlated line_info */
prog_linfo->nr_linfo = nr_linfo;
@@ -174,7 +174,7 @@ struct bpf_prog_linfo *bpf_prog_linfo__new(const struct bpf_prog_info *info)
err_free:
bpf_prog_linfo__free(prog_linfo);
return NULL;
return errno = EINVAL, NULL;
}
const struct bpf_line_info *
@@ -186,11 +186,11 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,
const __u64 *jited_linfo;
if (func_idx >= prog_linfo->nr_jited_func)
return NULL;
return errno = ENOENT, NULL;
nr_linfo = prog_linfo->nr_jited_linfo_per_func[func_idx];
if (nr_skip >= nr_linfo)
return NULL;
return errno = ENOENT, NULL;
start = prog_linfo->jited_linfo_func_idx[func_idx] + nr_skip;
jited_rec_size = prog_linfo->jited_rec_size;
@@ -198,7 +198,7 @@ bpf_prog_linfo__lfind_addr_func(const struct bpf_prog_linfo *prog_linfo,
(start * jited_rec_size);
jited_linfo = raw_jited_linfo;
if (addr < *jited_linfo)
return NULL;
return errno = ENOENT, NULL;
nr_linfo -= nr_skip;
rec_size = prog_linfo->rec_size;
@@ -225,13 +225,13 @@ bpf_prog_linfo__lfind(const struct bpf_prog_linfo *prog_linfo,
nr_linfo = prog_linfo->nr_linfo;
if (nr_skip >= nr_linfo)
return NULL;
return errno = ENOENT, NULL;
rec_size = prog_linfo->rec_size;
raw_linfo = prog_linfo->raw_linfo + (nr_skip * rec_size);
linfo = raw_linfo;
if (insn_off < linfo->insn_off)
return NULL;
return errno = ENOENT, NULL;
nr_linfo -= nr_skip;
for (i = 0; i < nr_linfo; i++) {

View File

@@ -24,300 +24,365 @@
#elif defined(__TARGET_ARCH_sparc)
#define bpf_target_sparc
#define bpf_target_defined
#elif defined(__TARGET_ARCH_riscv)
#define bpf_target_riscv
#define bpf_target_defined
#else
#undef bpf_target_defined
#endif
/* Fall back to what the compiler says */
#ifndef bpf_target_defined
#if defined(__x86_64__)
#define bpf_target_x86
#define bpf_target_defined
#elif defined(__s390__)
#define bpf_target_s390
#define bpf_target_defined
#elif defined(__arm__)
#define bpf_target_arm
#define bpf_target_defined
#elif defined(__aarch64__)
#define bpf_target_arm64
#define bpf_target_defined
#elif defined(__mips__)
#define bpf_target_mips
#define bpf_target_defined
#elif defined(__powerpc__)
#define bpf_target_powerpc
#define bpf_target_defined
#elif defined(__sparc__)
#define bpf_target_sparc
#define bpf_target_defined
#elif defined(__riscv) && __riscv_xlen == 64
#define bpf_target_riscv
#define bpf_target_defined
#endif /* no compiler target */
#endif
#ifndef __BPF_TARGET_MISSING
#define __BPF_TARGET_MISSING "GCC error \"Must specify a BPF target arch via __TARGET_ARCH_xxx\""
#endif
#if defined(bpf_target_x86)
#if defined(__KERNEL__) || defined(__VMLINUX_H__)
#define PT_REGS_PARM1(x) ((x)->di)
#define PT_REGS_PARM2(x) ((x)->si)
#define PT_REGS_PARM3(x) ((x)->dx)
#define PT_REGS_PARM4(x) ((x)->cx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->sp)
#define PT_REGS_FP(x) ((x)->bp)
#define PT_REGS_RC(x) ((x)->ax)
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->ip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), di)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), si)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), dx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), cx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), bp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), ax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), ip)
#define __PT_PARM1_REG di
#define __PT_PARM2_REG si
#define __PT_PARM3_REG dx
#define __PT_PARM4_REG cx
#define __PT_PARM5_REG r8
#define __PT_RET_REG sp
#define __PT_FP_REG bp
#define __PT_RC_REG ax
#define __PT_SP_REG sp
#define __PT_IP_REG ip
/* syscall uses r10 for PARM4 */
#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10)
#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(x, r10)
#else
#ifdef __i386__
#define __PT_PARM1_REG eax
#define __PT_PARM2_REG edx
#define __PT_PARM3_REG ecx
/* i386 kernel is built with -mregparm=3 */
#define PT_REGS_PARM1(x) ((x)->eax)
#define PT_REGS_PARM2(x) ((x)->edx)
#define PT_REGS_PARM3(x) ((x)->ecx)
#define PT_REGS_PARM4(x) 0
#define PT_REGS_PARM5(x) 0
#define PT_REGS_RET(x) ((x)->esp)
#define PT_REGS_FP(x) ((x)->ebp)
#define PT_REGS_RC(x) ((x)->eax)
#define PT_REGS_SP(x) ((x)->esp)
#define PT_REGS_IP(x) ((x)->eip)
#define __PT_PARM4_REG __unsupported__
#define __PT_PARM5_REG __unsupported__
#define __PT_RET_REG esp
#define __PT_FP_REG ebp
#define __PT_RC_REG eax
#define __PT_SP_REG esp
#define __PT_IP_REG eip
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), edx)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), ecx)
#define PT_REGS_PARM4_CORE(x) 0
#define PT_REGS_PARM5_CORE(x) 0
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), ebp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), eax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), esp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), eip)
#else /* __i386__ */
#else
#define __PT_PARM1_REG rdi
#define __PT_PARM2_REG rsi
#define __PT_PARM3_REG rdx
#define __PT_PARM4_REG rcx
#define __PT_PARM5_REG r8
#define __PT_RET_REG rsp
#define __PT_FP_REG rbp
#define __PT_RC_REG rax
#define __PT_SP_REG rsp
#define __PT_IP_REG rip
/* syscall uses r10 for PARM4 */
#define PT_REGS_PARM4_SYSCALL(x) ((x)->r10)
#define PT_REGS_PARM4_CORE_SYSCALL(x) BPF_CORE_READ(x, r10)
#define PT_REGS_PARM1(x) ((x)->rdi)
#define PT_REGS_PARM2(x) ((x)->rsi)
#define PT_REGS_PARM3(x) ((x)->rdx)
#define PT_REGS_PARM4(x) ((x)->rcx)
#define PT_REGS_PARM5(x) ((x)->r8)
#define PT_REGS_RET(x) ((x)->rsp)
#define PT_REGS_FP(x) ((x)->rbp)
#define PT_REGS_RC(x) ((x)->rax)
#define PT_REGS_SP(x) ((x)->rsp)
#define PT_REGS_IP(x) ((x)->rip)
#endif /* __i386__ */
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), rdi)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), rsi)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), rdx)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), rcx)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), r8)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), rbp)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), rax)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), rsp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), rip)
#endif
#endif
#endif /* __KERNEL__ || __VMLINUX_H__ */
#elif defined(bpf_target_s390)
/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_S390 const volatile user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_PARM2(x) (((PT_REGS_S390 *)(x))->gprs[3])
#define PT_REGS_PARM3(x) (((PT_REGS_S390 *)(x))->gprs[4])
#define PT_REGS_PARM4(x) (((PT_REGS_S390 *)(x))->gprs[5])
#define PT_REGS_PARM5(x) (((PT_REGS_S390 *)(x))->gprs[6])
#define PT_REGS_RET(x) (((PT_REGS_S390 *)(x))->gprs[14])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_S390 *)(x))->gprs[11])
#define PT_REGS_RC(x) (((PT_REGS_S390 *)(x))->gprs[2])
#define PT_REGS_SP(x) (((PT_REGS_S390 *)(x))->gprs[15])
#define PT_REGS_IP(x) (((PT_REGS_S390 *)(x))->psw.addr)
struct pt_regs___s390 {
unsigned long orig_gpr2;
};
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[3])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr)
/* s390 provides user_pt_regs instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const user_pt_regs *)(x))
#define __PT_PARM1_REG gprs[2]
#define __PT_PARM2_REG gprs[3]
#define __PT_PARM3_REG gprs[4]
#define __PT_PARM4_REG gprs[5]
#define __PT_PARM5_REG gprs[6]
#define __PT_RET_REG grps[14]
#define __PT_FP_REG gprs[11] /* Works only with CONFIG_FRAME_POINTER */
#define __PT_RC_REG gprs[2]
#define __PT_SP_REG gprs[15]
#define __PT_IP_REG psw.addr
#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma("GCC error \"use PT_REGS_PARM1_CORE_SYSCALL() instead\""); 0l; })
#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___s390 *)(x), orig_gpr2)
#elif defined(bpf_target_arm)
#define PT_REGS_PARM1(x) ((x)->uregs[0])
#define PT_REGS_PARM2(x) ((x)->uregs[1])
#define PT_REGS_PARM3(x) ((x)->uregs[2])
#define PT_REGS_PARM4(x) ((x)->uregs[3])
#define PT_REGS_PARM5(x) ((x)->uregs[4])
#define PT_REGS_RET(x) ((x)->uregs[14])
#define PT_REGS_FP(x) ((x)->uregs[11]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->uregs[0])
#define PT_REGS_SP(x) ((x)->uregs[13])
#define PT_REGS_IP(x) ((x)->uregs[12])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), uregs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), uregs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), uregs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), uregs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), uregs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), uregs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), uregs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), uregs[13])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), uregs[12])
#define __PT_PARM1_REG uregs[0]
#define __PT_PARM2_REG uregs[1]
#define __PT_PARM3_REG uregs[2]
#define __PT_PARM4_REG uregs[3]
#define __PT_PARM5_REG uregs[4]
#define __PT_RET_REG uregs[14]
#define __PT_FP_REG uregs[11] /* Works only with CONFIG_FRAME_POINTER */
#define __PT_RC_REG uregs[0]
#define __PT_SP_REG uregs[13]
#define __PT_IP_REG uregs[12]
#elif defined(bpf_target_arm64)
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
struct pt_regs;
#define PT_REGS_ARM64 const volatile struct user_pt_regs
#define PT_REGS_PARM1(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_PARM2(x) (((PT_REGS_ARM64 *)(x))->regs[1])
#define PT_REGS_PARM3(x) (((PT_REGS_ARM64 *)(x))->regs[2])
#define PT_REGS_PARM4(x) (((PT_REGS_ARM64 *)(x))->regs[3])
#define PT_REGS_PARM5(x) (((PT_REGS_ARM64 *)(x))->regs[4])
#define PT_REGS_RET(x) (((PT_REGS_ARM64 *)(x))->regs[30])
/* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_FP(x) (((PT_REGS_ARM64 *)(x))->regs[29])
#define PT_REGS_RC(x) (((PT_REGS_ARM64 *)(x))->regs[0])
#define PT_REGS_SP(x) (((PT_REGS_ARM64 *)(x))->sp)
#define PT_REGS_IP(x) (((PT_REGS_ARM64 *)(x))->pc)
struct pt_regs___arm64 {
unsigned long orig_x0;
};
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[30])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[29])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), regs[0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_ARM64 *)(x), pc)
/* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */
#define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x))
#define __PT_PARM1_REG regs[0]
#define __PT_PARM2_REG regs[1]
#define __PT_PARM3_REG regs[2]
#define __PT_PARM4_REG regs[3]
#define __PT_PARM5_REG regs[4]
#define __PT_RET_REG regs[30]
#define __PT_FP_REG regs[29] /* Works only with CONFIG_FRAME_POINTER */
#define __PT_RC_REG regs[0]
#define __PT_SP_REG sp
#define __PT_IP_REG pc
#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma("GCC error \"use PT_REGS_PARM1_CORE_SYSCALL() instead\""); 0l; })
#define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___arm64 *)(x), orig_x0)
#elif defined(bpf_target_mips)
#define PT_REGS_PARM1(x) ((x)->regs[4])
#define PT_REGS_PARM2(x) ((x)->regs[5])
#define PT_REGS_PARM3(x) ((x)->regs[6])
#define PT_REGS_PARM4(x) ((x)->regs[7])
#define PT_REGS_PARM5(x) ((x)->regs[8])
#define PT_REGS_RET(x) ((x)->regs[31])
#define PT_REGS_FP(x) ((x)->regs[30]) /* Works only with CONFIG_FRAME_POINTER */
#define PT_REGS_RC(x) ((x)->regs[2])
#define PT_REGS_SP(x) ((x)->regs[29])
#define PT_REGS_IP(x) ((x)->cp0_epc)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), regs[4])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), regs[5])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), regs[6])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), regs[7])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), regs[8])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), regs[31])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((x), regs[30])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), regs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), regs[29])
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), cp0_epc)
#define __PT_PARM1_REG regs[4]
#define __PT_PARM2_REG regs[5]
#define __PT_PARM3_REG regs[6]
#define __PT_PARM4_REG regs[7]
#define __PT_PARM5_REG regs[8]
#define __PT_RET_REG regs[31]
#define __PT_FP_REG regs[30] /* Works only with CONFIG_FRAME_POINTER */
#define __PT_RC_REG regs[2]
#define __PT_SP_REG regs[29]
#define __PT_IP_REG cp0_epc
#elif defined(bpf_target_powerpc)
#define PT_REGS_PARM1(x) ((x)->gpr[3])
#define PT_REGS_PARM2(x) ((x)->gpr[4])
#define PT_REGS_PARM3(x) ((x)->gpr[5])
#define PT_REGS_PARM4(x) ((x)->gpr[6])
#define PT_REGS_PARM5(x) ((x)->gpr[7])
#define PT_REGS_RC(x) ((x)->gpr[3])
#define PT_REGS_SP(x) ((x)->sp)
#define PT_REGS_IP(x) ((x)->nip)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), gpr[4])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), gpr[5])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), gpr[6])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), gpr[7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), gpr[3])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), sp)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), nip)
#define __PT_PARM1_REG gpr[3]
#define __PT_PARM2_REG gpr[4]
#define __PT_PARM3_REG gpr[5]
#define __PT_PARM4_REG gpr[6]
#define __PT_PARM5_REG gpr[7]
#define __PT_RET_REG regs[31]
#define __PT_FP_REG __unsupported__
#define __PT_RC_REG gpr[3]
#define __PT_SP_REG sp
#define __PT_IP_REG nip
/* powerpc does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx
#elif defined(bpf_target_sparc)
#define PT_REGS_PARM1(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_PARM2(x) ((x)->u_regs[UREG_I1])
#define PT_REGS_PARM3(x) ((x)->u_regs[UREG_I2])
#define PT_REGS_PARM4(x) ((x)->u_regs[UREG_I3])
#define PT_REGS_PARM5(x) ((x)->u_regs[UREG_I4])
#define PT_REGS_RET(x) ((x)->u_regs[UREG_I7])
#define PT_REGS_RC(x) ((x)->u_regs[UREG_I0])
#define PT_REGS_SP(x) ((x)->u_regs[UREG_FP])
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I1])
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I2])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I3])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I4])
#define PT_REGS_RET_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I7])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((x), u_regs[UREG_I0])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((x), u_regs[UREG_FP])
#define __PT_PARM1_REG u_regs[UREG_I0]
#define __PT_PARM2_REG u_regs[UREG_I1]
#define __PT_PARM3_REG u_regs[UREG_I2]
#define __PT_PARM4_REG u_regs[UREG_I3]
#define __PT_PARM5_REG u_regs[UREG_I4]
#define __PT_RET_REG u_regs[UREG_I7]
#define __PT_FP_REG __unsupported__
#define __PT_RC_REG u_regs[UREG_I0]
#define __PT_SP_REG u_regs[UREG_FP]
/* Should this also be a bpf_target check for the sparc case? */
#if defined(__arch64__)
#define PT_REGS_IP(x) ((x)->tpc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), tpc)
#define __PT_IP_REG tpc
#else
#define PT_REGS_IP(x) ((x)->pc)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ((x), pc)
#define __PT_IP_REG pc
#endif
#elif defined(bpf_target_riscv)
#define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x))
#define __PT_PARM1_REG a0
#define __PT_PARM2_REG a1
#define __PT_PARM3_REG a2
#define __PT_PARM4_REG a3
#define __PT_PARM5_REG a4
#define __PT_RET_REG ra
#define __PT_FP_REG s0
#define __PT_RC_REG a5
#define __PT_SP_REG sp
#define __PT_IP_REG pc
/* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ctx
#endif
#if defined(bpf_target_defined)
struct pt_regs;
/* allow some architecutres to override `struct pt_regs` */
#ifndef __PT_REGS_CAST
#define __PT_REGS_CAST(x) (x)
#endif
#define PT_REGS_PARM1(x) (__PT_REGS_CAST(x)->__PT_PARM1_REG)
#define PT_REGS_PARM2(x) (__PT_REGS_CAST(x)->__PT_PARM2_REG)
#define PT_REGS_PARM3(x) (__PT_REGS_CAST(x)->__PT_PARM3_REG)
#define PT_REGS_PARM4(x) (__PT_REGS_CAST(x)->__PT_PARM4_REG)
#define PT_REGS_PARM5(x) (__PT_REGS_CAST(x)->__PT_PARM5_REG)
#define PT_REGS_RET(x) (__PT_REGS_CAST(x)->__PT_RET_REG)
#define PT_REGS_FP(x) (__PT_REGS_CAST(x)->__PT_FP_REG)
#define PT_REGS_RC(x) (__PT_REGS_CAST(x)->__PT_RC_REG)
#define PT_REGS_SP(x) (__PT_REGS_CAST(x)->__PT_SP_REG)
#define PT_REGS_IP(x) (__PT_REGS_CAST(x)->__PT_IP_REG)
#define PT_REGS_PARM1_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM1_REG)
#define PT_REGS_PARM2_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM2_REG)
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM3_REG)
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM4_REG)
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_PARM5_REG)
#define PT_REGS_RET_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RET_REG)
#define PT_REGS_FP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_FP_REG)
#define PT_REGS_RC_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_RC_REG)
#define PT_REGS_SP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_SP_REG)
#define PT_REGS_IP_CORE(x) BPF_CORE_READ(__PT_REGS_CAST(x), __PT_IP_REG)
#if defined(bpf_target_powerpc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#elif defined(bpf_target_sparc)
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); })
#define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP
#else
#define BPF_KPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)PT_REGS_RET(ctx)); })
#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) \
({ bpf_probe_read_kernel(&(ip), sizeof(ip), \
(void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
({ bpf_probe_read_kernel(&(ip), sizeof(ip), (void *)(PT_REGS_FP(ctx) + sizeof(ip))); })
#endif
#define ___bpf_concat(a, b) a ## b
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#define ___bpf_narg(...) \
___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define ___bpf_empty(...) \
___bpf_nth(_, ##__VA_ARGS__, N, N, N, N, N, N, N, N, N, N, 0)
#ifndef PT_REGS_PARM1_SYSCALL
#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1(x)
#endif
#define PT_REGS_PARM2_SYSCALL(x) PT_REGS_PARM2(x)
#define PT_REGS_PARM3_SYSCALL(x) PT_REGS_PARM3(x)
#ifndef PT_REGS_PARM4_SYSCALL
#define PT_REGS_PARM4_SYSCALL(x) PT_REGS_PARM4(x)
#endif
#define PT_REGS_PARM5_SYSCALL(x) PT_REGS_PARM5(x)
#define ___bpf_ctx_cast0() ctx
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8]
#ifndef PT_REGS_PARM1_CORE_SYSCALL
#define PT_REGS_PARM1_CORE_SYSCALL(x) PT_REGS_PARM1_CORE(x)
#endif
#define PT_REGS_PARM2_CORE_SYSCALL(x) PT_REGS_PARM2_CORE(x)
#define PT_REGS_PARM3_CORE_SYSCALL(x) PT_REGS_PARM3_CORE(x)
#ifndef PT_REGS_PARM4_CORE_SYSCALL
#define PT_REGS_PARM4_CORE_SYSCALL(x) PT_REGS_PARM4_CORE(x)
#endif
#define PT_REGS_PARM5_CORE_SYSCALL(x) PT_REGS_PARM5_CORE(x)
#else /* defined(bpf_target_defined) */
#define PT_REGS_PARM1(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RET(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_FP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RC(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_SP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_IP(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM1_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RET_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_FP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_RC_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_SP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_IP_CORE(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define BPF_KRETPROBE_READ_RET_IP(ip, ctx) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM1_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM2_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM3_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM4_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#define PT_REGS_PARM5_CORE_SYSCALL(x) ({ _Pragma(__BPF_TARGET_MISSING); 0l; })
#endif /* defined(bpf_target_defined) */
/*
* When invoked from a syscall handler kprobe, returns a pointer to a
* struct pt_regs containing syscall arguments and suitable for passing to
* PT_REGS_PARMn_SYSCALL() and PT_REGS_PARMn_CORE_SYSCALL().
*/
#ifndef PT_REGS_SYSCALL_REGS
/* By default, assume that the arch selects ARCH_HAS_SYSCALL_WRAPPER. */
#define PT_REGS_SYSCALL_REGS(ctx) ((struct pt_regs *)PT_REGS_PARM1(ctx))
#endif
#ifndef ___bpf_concat
#define ___bpf_concat(a, b) a ## b
#endif
#ifndef ___bpf_apply
#define ___bpf_apply(fn, n) ___bpf_concat(fn, n)
#endif
#ifndef ___bpf_nth
#define ___bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _a, _b, _c, N, ...) N
#endif
#ifndef ___bpf_narg
#define ___bpf_narg(...) ___bpf_nth(_, ##__VA_ARGS__, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#endif
#define ___bpf_ctx_cast0() ctx
#define ___bpf_ctx_cast1(x) ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...) ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...) ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...) ___bpf_ctx_cast3(args), (void *)ctx[3]
#define ___bpf_ctx_cast5(x, args...) ___bpf_ctx_cast4(args), (void *)ctx[4]
#define ___bpf_ctx_cast6(x, args...) ___bpf_ctx_cast5(args), (void *)ctx[5]
#define ___bpf_ctx_cast7(x, args...) ___bpf_ctx_cast6(args), (void *)ctx[6]
#define ___bpf_ctx_cast8(x, args...) ___bpf_ctx_cast7(args), (void *)ctx[7]
#define ___bpf_ctx_cast9(x, args...) ___bpf_ctx_cast8(args), (void *)ctx[8]
#define ___bpf_ctx_cast10(x, args...) ___bpf_ctx_cast9(args), (void *)ctx[9]
#define ___bpf_ctx_cast11(x, args...) ___bpf_ctx_cast10(args), (void *)ctx[10]
#define ___bpf_ctx_cast12(x, args...) ___bpf_ctx_cast11(args), (void *)ctx[11]
#define ___bpf_ctx_cast(args...) \
___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
#define ___bpf_ctx_cast(args...) ___bpf_apply(___bpf_ctx_cast, ___bpf_narg(args))(args)
/*
* BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and
@@ -350,19 +415,13 @@ ____##name(unsigned long long *ctx, ##args)
struct pt_regs;
#define ___bpf_kprobe_args0() ctx
#define ___bpf_kprobe_args1(x) \
___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) \
___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) \
___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) \
___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) \
___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args(args...) \
___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)
#define ___bpf_kprobe_args0() ctx
#define ___bpf_kprobe_args1(x) ___bpf_kprobe_args0(), (void *)PT_REGS_PARM1(ctx)
#define ___bpf_kprobe_args2(x, args...) ___bpf_kprobe_args1(args), (void *)PT_REGS_PARM2(ctx)
#define ___bpf_kprobe_args3(x, args...) ___bpf_kprobe_args2(args), (void *)PT_REGS_PARM3(ctx)
#define ___bpf_kprobe_args4(x, args...) ___bpf_kprobe_args3(args), (void *)PT_REGS_PARM4(ctx)
#define ___bpf_kprobe_args5(x, args...) ___bpf_kprobe_args4(args), (void *)PT_REGS_PARM5(ctx)
#define ___bpf_kprobe_args(args...) ___bpf_apply(___bpf_kprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KPROBE serves the same purpose for kprobes as BPF_PROG for
@@ -388,11 +447,9 @@ typeof(name(0)) name(struct pt_regs *ctx) \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
#define ___bpf_kretprobe_args0() ctx
#define ___bpf_kretprobe_args1(x) \
___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args(args...) \
___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)
#define ___bpf_kretprobe_args0() ctx
#define ___bpf_kretprobe_args1(x) ___bpf_kretprobe_args0(), (void *)PT_REGS_RC(ctx)
#define ___bpf_kretprobe_args(args...) ___bpf_apply(___bpf_kretprobe_args, ___bpf_narg(args))(args)
/*
* BPF_KRETPROBE is similar to BPF_KPROBE, except, it only provides optional
@@ -413,20 +470,39 @@ typeof(name(0)) name(struct pt_regs *ctx) \
} \
static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
#define ___bpf_syscall_args0() ctx
#define ___bpf_syscall_args1(x) ___bpf_syscall_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs)
#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs)
#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs)
#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs)
#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs)
#define ___bpf_syscall_args(args...) ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args)
/*
* BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
* in a structure.
* BPF_KPROBE_SYSCALL is a variant of BPF_KPROBE, which is intended for
* tracing syscall functions, like __x64_sys_close. It hides the underlying
* platform-specific low-level way of getting syscall input arguments from
* struct pt_regs, and provides a familiar typed and named function arguments
* syntax and semantics of accessing syscall input parameters.
*
* Original struct pt_regs* context is preserved as 'ctx' argument. This might
* be necessary when using BPF helpers like bpf_perf_event_output().
*
* This macro relies on BPF CO-RE support.
*/
#define BPF_SEQ_PRINTF(seq, fmt, args...) \
({ \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
static const char ___fmt[] = fmt; \
unsigned long long ___param[] = { args }; \
_Pragma("GCC diagnostic pop") \
int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt), \
___param, sizeof(___param)); \
___ret; \
})
#define BPF_KPROBE_SYSCALL(name, args...) \
name(struct pt_regs *ctx); \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args); \
typeof(name(0)) name(struct pt_regs *ctx) \
{ \
struct pt_regs *regs = PT_REGS_SYSCALL_REGS(ctx); \
_Pragma("GCC diagnostic push") \
_Pragma("GCC diagnostic ignored \"-Wint-conversion\"") \
return ____##name(___bpf_syscall_args(args)); \
_Pragma("GCC diagnostic pop") \
} \
static __attribute__((always_inline)) typeof(name(0)) \
____##name(struct pt_regs *ctx, ##args)
#endif

1697
src/btf.c

File diff suppressed because it is too large Load Diff

278
src/btf.h
View File

@@ -1,5 +1,6 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2018 Facebook */
/*! \file */
#ifndef __LIBBPF_BTF_H
#define __LIBBPF_BTF_H
@@ -30,11 +31,80 @@ enum btf_endianness {
BTF_BIG_ENDIAN = 1,
};
/**
* @brief **btf__free()** frees all data of a BTF object
* @param btf BTF object to free
*/
LIBBPF_API void btf__free(struct btf *btf);
/**
* @brief **btf__new()** creates a new instance of a BTF object from the raw
* bytes of an ELF's BTF section
* @param data raw bytes
* @param size number of bytes passed in `data`
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
/**
* @brief **btf__new_split()** create a new instance of a BTF object from the
* provided raw data bytes. It takes another BTF instance, **base_btf**, which
* serves as a base BTF, which is extended by types in a newly created BTF
* instance
* @param data raw bytes
* @param size length of raw bytes
* @param base_btf the base BTF object
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* If *base_btf* is NULL, `btf__new_split()` is equivalent to `btf__new()` and
* creates non-split BTF.
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf);
/**
* @brief **btf__new_empty()** creates an empty BTF object. Use
* `btf__add_*()` to populate such BTF object.
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_empty(void);
/**
* @brief **btf__new_empty_split()** creates an unpopulated BTF object from an
* ELF BTF section except with a base BTF on top of which split BTF should be
* based
* @return new BTF object instance which has to be eventually freed with
* **btf__free()**
*
* If *base_btf* is NULL, `btf__new_empty_split()` is equivalent to
* `btf__new_empty()` and creates non-split BTF.
*
* On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
* error code from such a pointer `libbpf_get_error()` should be used. If
* `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
* returned on error instead. In both cases thread-local `errno` variable is
* always set to error code as well.
*/
LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
@@ -44,13 +114,27 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b
LIBBPF_API struct btf *btf__parse_raw(const char *path);
LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id);
LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead")
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "intended for internal libbpf use only")
LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead")
LIBBPF_API int btf__load(struct btf *btf);
LIBBPF_API int btf__load_into_kernel(struct btf *btf);
LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
const char *type_name);
LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
const char *type_name, __u32 kind);
LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__type_cnt() instead; note that btf__get_nr_types() == btf__type_cnt() - 1")
LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
LIBBPF_API __u32 btf__type_cnt(const struct btf *btf);
LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);
LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
__u32 id);
@@ -63,19 +147,18 @@ LIBBPF_API int btf__resolve_type(const struct btf *btf, __u32 type_id);
LIBBPF_API int btf__align_of(const struct btf *btf, __u32 id);
LIBBPF_API int btf__fd(const struct btf *btf);
LIBBPF_API void btf__set_fd(struct btf *btf, int fd);
LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const void *btf__raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);
LIBBPF_DEPRECATED_SINCE(0, 7, "this API is not necessary when BTF-defined maps are used")
LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
__u32 expected_key_size,
__u32 expected_value_size,
__u32 *key_type_id, __u32 *value_type_id);
LIBBPF_API struct btf_ext *btf_ext__new(__u8 *data, __u32 size);
LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);
LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext,
__u32 *size);
LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
int btf_ext__reloc_func_info(const struct btf *btf,
const struct btf_ext *btf_ext,
@@ -86,15 +169,40 @@ int btf_ext__reloc_line_info(const struct btf *btf,
const struct btf_ext *btf_ext,
const char *sec_name, __u32 insns_cnt,
void **line_info, __u32 *cnt);
LIBBPF_API __u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API struct btf *libbpf_find_kernel_btf(void);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info is deprecated; write custom func_info parsing to fetch rec_size")
__u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info is deprecated; write custom line_info parsing to fetch rec_size")
__u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);
LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
LIBBPF_API int btf__add_type(struct btf *btf, const struct btf *src_btf,
const struct btf_type *src_type);
/**
* @brief **btf__add_btf()** appends all the BTF types from *src_btf* into *btf*
* @param btf BTF object which all the BTF types and strings are added to
* @param src_btf BTF object which all BTF types and referenced strings are copied from
* @return BTF type ID of the first appended BTF type, or negative error code
*
* **btf__add_btf()** can be used to simply and efficiently append the entire
* contents of one BTF object to another one. All the BTF type data is copied
* over, all referenced type IDs are adjusted by adding a necessary ID offset.
* Only strings referenced from BTF types are copied over and deduplicated, so
* if there were some unused strings in *src_btf*, those won't be copied over,
* which is consistent with the general string deduplication semantics of BTF
* writing APIs.
*
* If any error is encountered during this process, the contents of *btf* is
* left intact, which means that **btf__add_btf()** follows the transactional
* semantics and the operation as a whole is all-or-nothing.
*
* *src_btf* has to be non-split BTF, as of now copying types from split BTF
* is not supported and will result in -ENOTSUP error code returned.
*/
LIBBPF_API int btf__add_btf(struct btf *btf, const struct btf *src_btf);
LIBBPF_API int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding);
LIBBPF_API int btf__add_float(struct btf *btf, const char *name, size_t byte_sz);
LIBBPF_API int btf__add_ptr(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_array(struct btf *btf,
int index_type_id, int elem_type_id, __u32 nr_elems);
@@ -119,6 +227,7 @@ LIBBPF_API int btf__add_typedef(struct btf *btf, const char *name, int ref_type_
LIBBPF_API int btf__add_volatile(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_const(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_restrict(struct btf *btf, int ref_type_id);
LIBBPF_API int btf__add_type_tag(struct btf *btf, const char *value, int ref_type_id);
/* func and func_proto construction APIs */
LIBBPF_API int btf__add_func(struct btf *btf, const char *name,
@@ -132,26 +241,91 @@ LIBBPF_API int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz
LIBBPF_API int btf__add_datasec_var_info(struct btf *btf, int var_type_id,
__u32 offset, __u32 byte_sz);
struct btf_dedup_opts {
unsigned int dedup_table_size;
bool dont_resolve_fwds;
};
/* tag construction API */
LIBBPF_API int btf__add_decl_tag(struct btf *btf, const char *value, int ref_type_id,
int component_idx);
LIBBPF_API int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
const struct btf_dedup_opts *opts);
struct btf_dedup_opts {
size_t sz;
/* optional .BTF.ext info to dedup along the main BTF info */
struct btf_ext *btf_ext;
/* force hash collisions (used for testing) */
bool force_collisions;
size_t :0;
};
#define btf_dedup_opts__last_field force_collisions
LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
LIBBPF_API int btf__dedup_v0_6_0(struct btf *btf, const struct btf_dedup_opts *opts);
LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__dedup() instead")
LIBBPF_API int btf__dedup_deprecated(struct btf *btf, struct btf_ext *btf_ext, const void *opts);
#define btf__dedup(...) ___libbpf_overload(___btf_dedup, __VA_ARGS__)
#define ___btf_dedup3(btf, btf_ext, opts) btf__dedup_deprecated(btf, btf_ext, opts)
#define ___btf_dedup2(btf, opts) btf__dedup(btf, opts)
struct btf_dump;
struct btf_dump_opts {
void *ctx;
union {
size_t sz;
void *ctx; /* DEPRECATED: will be gone in v1.0 */
};
};
typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);
LIBBPF_API struct btf_dump *btf_dump__new(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn);
btf_dump_printf_fn_t printf_fn,
void *ctx,
const struct btf_dump_opts *opts);
LIBBPF_API struct btf_dump *btf_dump__new_v0_6_0(const struct btf *btf,
btf_dump_printf_fn_t printf_fn,
void *ctx,
const struct btf_dump_opts *opts);
LIBBPF_API struct btf_dump *btf_dump__new_deprecated(const struct btf *btf,
const struct btf_ext *btf_ext,
const struct btf_dump_opts *opts,
btf_dump_printf_fn_t printf_fn);
/* Choose either btf_dump__new() or btf_dump__new_deprecated() based on the
* type of 4th argument. If it's btf_dump's print callback, use deprecated
* API; otherwise, choose the new btf_dump__new(). ___libbpf_override()
* doesn't work here because both variants have 4 input arguments.
*
* (void *) casts are necessary to avoid compilation warnings about type
* mismatches, because even though __builtin_choose_expr() only ever evaluates
* one side the other side still has to satisfy type constraints (this is
* compiler implementation limitation which might be lifted eventually,
* according to the documentation). So passing struct btf_ext in place of
* btf_dump_printf_fn_t would be generating compilation warning. Casting to
* void * avoids this issue.
*
* Also, two type compatibility checks for a function and function pointer are
* required because passing function reference into btf_dump__new() as
* btf_dump__new(..., my_callback, ...) and as btf_dump__new(...,
* &my_callback, ...) (not explicit ampersand in the latter case) actually
* differs as far as __builtin_types_compatible_p() is concerned. Thus two
* checks are combined to detect callback argument.
*
* The rest works just like in case of ___libbpf_override() usage with symbol
* versioning.
*
* C++ compilers don't support __builtin_types_compatible_p(), so at least
* don't screw up compilation for them and let C++ users pick btf_dump__new
* vs btf_dump__new_deprecated explicitly.
*/
#ifndef __cplusplus
#define btf_dump__new(a1, a2, a3, a4) __builtin_choose_expr( \
__builtin_types_compatible_p(typeof(a4), btf_dump_printf_fn_t) || \
__builtin_types_compatible_p(typeof(a4), void(void *, const char *, va_list)), \
btf_dump__new_deprecated((void *)a1, (void *)a2, (void *)a3, (void *)a4), \
btf_dump__new((void *)a1, (void *)a2, (void *)a3, (void *)a4))
#endif
LIBBPF_API void btf_dump__free(struct btf_dump *d);
LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id);
@@ -173,6 +347,7 @@ struct btf_dump_emit_type_decl_opts {
int indent_level;
/* strip all the const/volatile/restrict mods */
bool strip_mods;
size_t :0;
};
#define btf_dump_emit_type_decl_opts__last_field strip_mods
@@ -180,9 +355,48 @@ LIBBPF_API int
btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
const struct btf_dump_emit_type_decl_opts *opts);
struct btf_dump_type_data_opts {
/* size of this struct, for forward/backward compatibility */
size_t sz;
const char *indent_str;
int indent_level;
/* below match "show" flags for bpf_show_snprintf() */
bool compact; /* no newlines/indentation */
bool skip_names; /* skip member/type names */
bool emit_zeroes; /* show 0-valued fields */
size_t :0;
};
#define btf_dump_type_data_opts__last_field emit_zeroes
LIBBPF_API int
btf_dump__dump_type_data(struct btf_dump *d, __u32 id,
const void *data, size_t data_sz,
const struct btf_dump_type_data_opts *opts);
/*
* A set of helpers for easier BTF types handling
* A set of helpers for easier BTF types handling.
*
* The inline functions below rely on constants from the kernel headers which
* may not be available for applications including this header file. To avoid
* compilation errors, we define all the constants here that were added after
* the initial introduction of the BTF_KIND* constants.
*/
#ifndef BTF_KIND_FUNC
#define BTF_KIND_FUNC 12 /* Function */
#define BTF_KIND_FUNC_PROTO 13 /* Function Proto */
#endif
#ifndef BTF_KIND_VAR
#define BTF_KIND_VAR 14 /* Variable */
#define BTF_KIND_DATASEC 15 /* Section */
#endif
#ifndef BTF_KIND_FLOAT
#define BTF_KIND_FLOAT 16 /* Floating point */
#endif
/* The kernel header switched to enums, so these two were never #defined */
#define BTF_KIND_DECL_TAG 17 /* Decl Tag */
#define BTF_KIND_TYPE_TAG 18 /* Type Tag */
static inline __u16 btf_kind(const struct btf_type *t)
{
return BTF_INFO_KIND(t->info);
@@ -271,7 +485,8 @@ static inline bool btf_is_mod(const struct btf_type *t)
return kind == BTF_KIND_VOLATILE ||
kind == BTF_KIND_CONST ||
kind == BTF_KIND_RESTRICT;
kind == BTF_KIND_RESTRICT ||
kind == BTF_KIND_TYPE_TAG;
}
static inline bool btf_is_func(const struct btf_type *t)
@@ -294,6 +509,21 @@ static inline bool btf_is_datasec(const struct btf_type *t)
return btf_kind(t) == BTF_KIND_DATASEC;
}
static inline bool btf_is_float(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_FLOAT;
}
static inline bool btf_is_decl_tag(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_DECL_TAG;
}
static inline bool btf_is_type_tag(const struct btf_type *t)
{
return btf_kind(t) == BTF_KIND_TYPE_TAG;
}
static inline __u8 btf_int_encoding(const struct btf_type *t)
{
return BTF_INT_ENCODING(*(__u32 *)(t + 1));
@@ -362,6 +592,12 @@ btf_var_secinfos(const struct btf_type *t)
return (struct btf_var_secinfo *)(t + 1);
}
struct btf_decl_tag;
static inline struct btf_decl_tag *btf_decl_tag(const struct btf_type *t)
{
return (struct btf_decl_tag *)(t + 1);
}
#ifdef __cplusplus
} /* extern "C" */
#endif

File diff suppressed because it is too large Load Diff

1112
src/gen_loader.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -75,7 +75,7 @@ void hashmap__clear(struct hashmap *map)
void hashmap__free(struct hashmap *map)
{
if (!map)
if (IS_ERR_OR_NULL(map))
return;
hashmap__clear(map);
@@ -238,4 +238,3 @@ bool hashmap__delete(struct hashmap *map, const void *key,
return true;
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -247,6 +247,7 @@ LIBBPF_0.0.8 {
bpf_link_create;
bpf_link_update;
bpf_map__set_initial_value;
bpf_prog_attach_opts;
bpf_program__attach_cgroup;
bpf_program__attach_lsm;
bpf_program__is_lsm;
@@ -350,3 +351,91 @@ LIBBPF_0.3.0 {
xsk_setup_xdp_prog;
xsk_socket__update_xskmap;
} LIBBPF_0.2.0;
LIBBPF_0.4.0 {
global:
btf__add_float;
btf__add_type;
bpf_linker__add_file;
bpf_linker__finalize;
bpf_linker__free;
bpf_linker__new;
bpf_map__inner_map;
bpf_object__set_kversion;
bpf_tc_attach;
bpf_tc_detach;
bpf_tc_hook_create;
bpf_tc_hook_destroy;
bpf_tc_query;
} LIBBPF_0.3.0;
LIBBPF_0.5.0 {
global:
bpf_map__initial_value;
bpf_map__pin_path;
bpf_map_lookup_and_delete_elem_flags;
bpf_program__attach_kprobe_opts;
bpf_program__attach_perf_event_opts;
bpf_program__attach_tracepoint_opts;
bpf_program__attach_uprobe_opts;
bpf_object__gen_loader;
btf__load_from_kernel_by_id;
btf__load_from_kernel_by_id_split;
btf__load_into_kernel;
btf__load_module_btf;
btf__load_vmlinux_btf;
btf_dump__dump_type_data;
libbpf_set_strict_mode;
} LIBBPF_0.4.0;
LIBBPF_0.6.0 {
global:
bpf_map__map_extra;
bpf_map__set_map_extra;
bpf_map_create;
bpf_object__next_map;
bpf_object__next_program;
bpf_object__prev_map;
bpf_object__prev_program;
bpf_prog_load_deprecated;
bpf_prog_load;
bpf_program__flags;
bpf_program__insn_cnt;
bpf_program__insns;
bpf_program__set_flags;
btf__add_btf;
btf__add_decl_tag;
btf__add_type_tag;
btf__dedup;
btf__dedup_deprecated;
btf__raw_data;
btf__type_cnt;
btf_dump__new;
btf_dump__new_deprecated;
libbpf_major_version;
libbpf_minor_version;
libbpf_version_string;
perf_buffer__new;
perf_buffer__new_deprecated;
perf_buffer__new_raw;
perf_buffer__new_raw_deprecated;
} LIBBPF_0.5.0;
LIBBPF_0.7.0 {
global:
bpf_btf_load;
bpf_program__expected_attach_type;
bpf_program__log_buf;
bpf_program__log_level;
bpf_program__set_log_buf;
bpf_program__set_log_level;
bpf_program__type;
bpf_xdp_attach;
bpf_xdp_detach;
bpf_xdp_query;
bpf_xdp_query_id;
libbpf_probe_bpf_helper;
libbpf_probe_bpf_map_type;
libbpf_probe_bpf_prog_type;
libbpf_set_memlock_rlim_max;
};

View File

@@ -10,6 +10,7 @@
#define __LIBBPF_LIBBPF_COMMON_H
#include <string.h>
#include "libbpf_version.h"
#ifndef LIBBPF_API
#define LIBBPF_API __attribute__((visibility("default")))
@@ -17,6 +18,46 @@
#define LIBBPF_DEPRECATED(msg) __attribute__((deprecated(msg)))
/* Mark a symbol as deprecated when libbpf version is >= {major}.{minor} */
#define LIBBPF_DEPRECATED_SINCE(major, minor, msg) \
__LIBBPF_MARK_DEPRECATED_ ## major ## _ ## minor \
(LIBBPF_DEPRECATED("libbpf v" # major "." # minor "+: " msg))
#define __LIBBPF_CURRENT_VERSION_GEQ(major, minor) \
(LIBBPF_MAJOR_VERSION > (major) || \
(LIBBPF_MAJOR_VERSION == (major) && LIBBPF_MINOR_VERSION >= (minor)))
/* Add checks for other versions below when planning deprecation of API symbols
* with the LIBBPF_DEPRECATED_SINCE macro.
*/
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 6)
#define __LIBBPF_MARK_DEPRECATED_0_6(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_6(X)
#endif
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7)
#define __LIBBPF_MARK_DEPRECATED_0_7(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_7(X)
#endif
#if __LIBBPF_CURRENT_VERSION_GEQ(0, 8)
#define __LIBBPF_MARK_DEPRECATED_0_8(X) X
#else
#define __LIBBPF_MARK_DEPRECATED_0_8(X)
#endif
/* This set of internal macros allows to do "function overloading" based on
* number of arguments provided by used in backwards-compatible way during the
* transition to libbpf 1.0
* It's ugly but necessary evil that will be cleaned up when we get to 1.0.
* See bpf_prog_load() overload for example.
*/
#define ___libbpf_cat(A, B) A ## B
#define ___libbpf_select(NAME, NUM) ___libbpf_cat(NAME, NUM)
#define ___libbpf_nth(_1, _2, _3, _4, _5, _6, N, ...) N
#define ___libbpf_cnt(...) ___libbpf_nth(__VA_ARGS__, 6, 5, 4, 3, 2, 1)
#define ___libbpf_overload(NAME, ...) ___libbpf_select(NAME, ___libbpf_cnt(__VA_ARGS__))(__VA_ARGS__)
/* Helper macro to declare and initialize libbpf options struct
*
* This dance with uninitialized declaration, followed by memset to zero,
@@ -30,7 +71,7 @@
* including any extra padding, it with memset() and then assigns initial
* values provided by users in struct initializer-syntax as varargs.
*/
#define DECLARE_LIBBPF_OPTS(TYPE, NAME, ...) \
#define LIBBPF_OPTS(TYPE, NAME, ...) \
struct TYPE NAME = ({ \
memset(&NAME, 0, sizeof(struct TYPE)); \
(struct TYPE) { \

View File

@@ -12,6 +12,7 @@
#include <string.h>
#include "libbpf.h"
#include "libbpf_internal.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
@@ -39,7 +40,7 @@ static const char *libbpf_strerror_table[NR_ERRNO] = {
int libbpf_strerror(int err, char *buf, size_t size)
{
if (!buf || !size)
return -1;
return libbpf_err(-EINVAL);
err = err > 0 ? err : -err;
@@ -48,7 +49,7 @@ int libbpf_strerror(int err, char *buf, size_t size)
ret = strerror_r(err, buf, size);
buf[size - 1] = '\0';
return ret;
return libbpf_err_errno(ret);
}
if (err < __LIBBPF_ERRNO__END) {
@@ -62,5 +63,5 @@ int libbpf_strerror(int err, char *buf, size_t size)
snprintf(buf, size, "Unknown libbpf error %d", err);
buf[size - 1] = '\0';
return -1;
return libbpf_err(-ENOENT);
}

View File

@@ -11,6 +11,12 @@
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
#include <linux/err.h>
#include <fcntl.h>
#include <unistd.h>
#include "libbpf_legacy.h"
#include "relo_core.h"
/* make sure libbpf doesn't use kernel-only integer typedefs */
#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
@@ -19,6 +25,38 @@
#pragma GCC poison reallocarray
#include "libbpf.h"
#include "btf.h"
#ifndef EM_BPF
#define EM_BPF 247
#endif
#ifndef R_BPF_64_64
#define R_BPF_64_64 1
#endif
#ifndef R_BPF_64_ABS64
#define R_BPF_64_ABS64 2
#endif
#ifndef R_BPF_64_ABS32
#define R_BPF_64_ABS32 3
#endif
#ifndef R_BPF_64_32
#define R_BPF_64_32 10
#endif
#ifndef SHT_LLVM_ADDRSIG
#define SHT_LLVM_ADDRSIG 0x6FFF4C03
#endif
/* if libelf is old and doesn't support mmap(), fall back to read() */
#ifndef ELF_C_READ_MMAP
#define ELF_C_READ_MMAP ELF_C_READ
#endif
/* Older libelf all end up in this expression, for both 32 and 64 bit */
#ifndef ELF64_ST_VISIBILITY
#define ELF64_ST_VISIBILITY(o) ((o) & 0x03)
#endif
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
@@ -31,6 +69,12 @@
#define BTF_MEMBER_ENC(name, type, bits_offset) (name), (type), (bits_offset)
#define BTF_PARAM_ENC(name, type) (name), (type)
#define BTF_VAR_SECINFO_ENC(type, offset, size) (type), (offset), (size)
#define BTF_TYPE_FLOAT_ENC(name, sz) \
BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_FLOAT, 0, 0), sz)
#define BTF_TYPE_DECL_TAG_ENC(value, type, component_idx) \
BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_DECL_TAG, 0, 0), type), (component_idx)
#define BTF_TYPE_TYPE_TAG_ENC(value, type) \
BTF_TYPE_ENC(value, BTF_INFO_ENC(BTF_KIND_TYPE_TAG, 0, 0), type)
#ifndef likely
#define likely(x) __builtin_expect(!!(x), 1)
@@ -48,21 +92,44 @@
# define offsetofend(TYPE, FIELD) \
(offsetof(TYPE, FIELD) + sizeof(((TYPE *)0)->FIELD))
#endif
#ifndef __alias
#define __alias(symbol) __attribute__((alias(#symbol)))
#endif
/* Check whether a string `str` has prefix `pfx`, regardless if `pfx` is
* a string literal known at compilation time or char * pointer known only at
* runtime.
*/
#define str_has_pfx(str, pfx) \
(strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0)
/* Symbol versioning is different between static and shared library.
* Properly versioned symbols are needed for shared library, but
* only the symbol of the new version is needed for static library.
* Starting with GNU C 10, use symver attribute instead of .symver assembler
* directive, which works better with GCC LTO builds.
*/
#ifdef SHARED
# define COMPAT_VERSION(internal_name, api_name, version) \
#if defined(SHARED) && defined(__GNUC__) && __GNUC__ >= 10
#define DEFAULT_VERSION(internal_name, api_name, version) \
__attribute__((symver(#api_name "@@" #version)))
#define COMPAT_VERSION(internal_name, api_name, version) \
__attribute__((symver(#api_name "@" #version)))
#elif defined(SHARED)
#define COMPAT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@" #version);
# define DEFAULT_VERSION(internal_name, api_name, version) \
#define DEFAULT_VERSION(internal_name, api_name, version) \
asm(".symver " #internal_name "," #api_name "@@" #version);
#else
# define COMPAT_VERSION(internal_name, api_name, version)
# define DEFAULT_VERSION(internal_name, api_name, version) \
#else /* !SHARED */
#define COMPAT_VERSION(internal_name, api_name, version)
#define DEFAULT_VERSION(internal_name, api_name, version) \
extern typeof(internal_name) api_name \
__attribute__((alias(#internal_name)));
#endif
extern void libbpf_print(enum libbpf_print_level level,
@@ -105,9 +172,92 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
return realloc(ptr, total);
}
void *btf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz,
size_t cur_cnt, size_t max_cnt, size_t add_cnt);
int btf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
/* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst
* is zero-terminated string no matter what (unless sz == 0, in which case
* it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs
* in what is returned. Given this is internal helper, it's trivial to extend
* this, when necessary. Use this instead of strncpy inside libbpf source code.
*/
static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz)
{
size_t i;
if (sz == 0)
return;
sz--;
for (i = 0; i < sz && src[i]; i++)
dst[i] = src[i];
dst[i] = '\0';
}
__u32 get_kernel_version(void);
struct btf;
struct btf_type;
struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id);
const char *btf_kind_str(const struct btf_type *t);
const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
static inline enum btf_func_linkage btf_func_linkage(const struct btf_type *t)
{
return (enum btf_func_linkage)(int)btf_vlen(t);
}
static inline __u32 btf_type_info(int kind, int vlen, int kflag)
{
return (kflag << 31) | (kind << 24) | vlen;
}
enum map_def_parts {
MAP_DEF_MAP_TYPE = 0x001,
MAP_DEF_KEY_TYPE = 0x002,
MAP_DEF_KEY_SIZE = 0x004,
MAP_DEF_VALUE_TYPE = 0x008,
MAP_DEF_VALUE_SIZE = 0x010,
MAP_DEF_MAX_ENTRIES = 0x020,
MAP_DEF_MAP_FLAGS = 0x040,
MAP_DEF_NUMA_NODE = 0x080,
MAP_DEF_PINNING = 0x100,
MAP_DEF_INNER_MAP = 0x200,
MAP_DEF_MAP_EXTRA = 0x400,
MAP_DEF_ALL = 0x7ff, /* combination of all above */
};
struct btf_map_def {
enum map_def_parts parts;
__u32 map_type;
__u32 key_type_id;
__u32 key_size;
__u32 value_type_id;
__u32 value_size;
__u32 max_entries;
__u32 map_flags;
__u32 numa_node;
__u32 pinning;
__u64 map_extra;
};
int parse_btf_map_def(const char *map_name, struct btf *btf,
const struct btf_type *def_t, bool strict,
struct btf_map_def *map_def, struct btf_map_def *inner_def);
void *libbpf_add_mem(void **data, size_t *cap_cnt, size_t elem_sz,
size_t cur_cnt, size_t max_cnt, size_t add_cnt);
int libbpf_ensure_mem(void **data, size_t *cap_cnt, size_t elem_sz, size_t need_cnt);
static inline bool libbpf_is_mem_zeroed(const char *p, ssize_t len)
{
while (len > 0) {
if (*p)
return false;
p++;
len--;
}
return true;
}
static inline bool libbpf_validate_opts(const char *opts,
size_t opts_sz, size_t user_sz,
@@ -117,16 +267,9 @@ static inline bool libbpf_validate_opts(const char *opts,
pr_warn("%s size (%zu) is too small\n", type_name, user_sz);
return false;
}
if (user_sz > opts_sz) {
size_t i;
for (i = opts_sz; i < user_sz; i++) {
if (opts[i]) {
pr_warn("%s has non-zero extra bytes\n",
type_name);
return false;
}
}
if (!libbpf_is_mem_zeroed(opts + opts_sz, (ssize_t)user_sz - opts_sz)) {
pr_warn("%s has non-zero extra bytes\n", type_name);
return false;
}
return true;
}
@@ -146,46 +289,62 @@ static inline bool libbpf_validate_opts(const char *opts,
(opts)->field = value; \
} while (0)
#define OPTS_ZEROED(opts, last_nonzero_field) \
({ \
ssize_t __off = offsetofend(typeof(*(opts)), last_nonzero_field); \
!(opts) || libbpf_is_mem_zeroed((const void *)opts + __off, \
(opts)->sz - __off); \
})
enum kern_feature_id {
/* v4.14: kernel support for program & map names. */
FEAT_PROG_NAME,
/* v5.2: kernel support for global data sections. */
FEAT_GLOBAL_DATA,
/* BTF support */
FEAT_BTF,
/* BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO support */
FEAT_BTF_FUNC,
/* BTF_KIND_VAR and BTF_KIND_DATASEC support */
FEAT_BTF_DATASEC,
/* BTF_FUNC_GLOBAL is supported */
FEAT_BTF_GLOBAL_FUNC,
/* BPF_F_MMAPABLE is supported for arrays */
FEAT_ARRAY_MMAP,
/* kernel support for expected_attach_type in BPF_PROG_LOAD */
FEAT_EXP_ATTACH_TYPE,
/* bpf_probe_read_{kernel,user}[_str] helpers */
FEAT_PROBE_READ_KERN,
/* BPF_PROG_BIND_MAP is supported */
FEAT_PROG_BIND_MAP,
/* Kernel support for module BTFs */
FEAT_MODULE_BTF,
/* BTF_KIND_FLOAT support */
FEAT_BTF_FLOAT,
/* BPF perf link support */
FEAT_PERF_LINK,
/* BTF_KIND_DECL_TAG support */
FEAT_BTF_DECL_TAG,
/* BTF_KIND_TYPE_TAG support */
FEAT_BTF_TYPE_TAG,
/* memcg-based accounting for BPF maps and progs */
FEAT_MEMCG_ACCOUNT,
__FEAT_CNT,
};
int probe_memcg_account(void);
bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
int bump_rlimit_memlock(void);
int parse_cpu_mask_str(const char *s, bool **mask, int *mask_sz);
int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz);
int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
const char *str_sec, size_t str_len);
int btf_load_into_kernel(struct btf *btf, char *log_buf, size_t log_sz, __u32 log_level);
struct bpf_prog_load_params {
enum bpf_prog_type prog_type;
enum bpf_attach_type expected_attach_type;
const char *name;
const struct bpf_insn *insns;
size_t insn_cnt;
const char *license;
__u32 kern_version;
__u32 attach_prog_fd;
__u32 attach_btf_obj_fd;
__u32 attach_btf_id;
__u32 prog_ifindex;
__u32 prog_btf_fd;
__u32 prog_flags;
__u32 func_info_rec_size;
const void *func_info;
__u32 func_info_cnt;
__u32 line_info_rec_size;
const void *line_info;
__u32 line_info_cnt;
__u32 log_level;
char *log_buf;
size_t log_buf_sz;
};
int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
int bpf_object__section_size(const struct bpf_object *obj, const char *name,
__u32 *size);
int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
__u32 *off);
struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
const char **prefix, int *kind);
struct btf_ext_info {
/*
@@ -278,75 +437,96 @@ struct bpf_line_info_min {
__u32 line_col;
};
/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value
* has to be adjusted by relocations.
*/
enum bpf_core_relo_kind {
BPF_FIELD_BYTE_OFFSET = 0, /* field byte offset */
BPF_FIELD_BYTE_SIZE = 1, /* field size in bytes */
BPF_FIELD_EXISTS = 2, /* field existence in target kernel */
BPF_FIELD_SIGNED = 3, /* field signedness (0 - unsigned, 1 - signed) */
BPF_FIELD_LSHIFT_U64 = 4, /* bitfield-specific left bitshift */
BPF_FIELD_RSHIFT_U64 = 5, /* bitfield-specific right bitshift */
BPF_TYPE_ID_LOCAL = 6, /* type ID in local BPF object */
BPF_TYPE_ID_TARGET = 7, /* type ID in target kernel */
BPF_TYPE_EXISTS = 8, /* type existence in target kernel */
BPF_TYPE_SIZE = 9, /* type size in bytes */
BPF_ENUMVAL_EXISTS = 10, /* enum value existence in target kernel */
BPF_ENUMVAL_VALUE = 11, /* enum value integer value */
};
/* The minimum bpf_core_relo checked by the loader
*
* CO-RE relocation captures the following data:
* - insn_off - instruction offset (in bytes) within a BPF program that needs
* its insn->imm field to be relocated with actual field info;
* - type_id - BTF type ID of the "root" (containing) entity of a relocatable
* type or field;
* - access_str_off - offset into corresponding .BTF string section. String
* interpretation depends on specific relocation kind:
* - for field-based relocations, string encodes an accessed field using
* a sequence of field and array indices, separated by colon (:). It's
* conceptually very close to LLVM's getelementptr ([0]) instruction's
* arguments for identifying offset to a field.
* - for type-based relocations, strings is expected to be just "0";
* - for enum value-based relocations, string contains an index of enum
* value within its enum type;
*
* Example to provide a better feel.
*
* struct sample {
* int a;
* struct {
* int b[10];
* };
* };
*
* struct sample *s = ...;
* int x = &s->a; // encoded as "0:0" (a is field #0)
* int y = &s->b[5]; // encoded as "0:1:0:5" (anon struct is field #1,
* // b is field #0 inside anon struct, accessing elem #5)
* int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
*
* type_id for all relocs in this example will capture BTF type id of
* `struct sample`.
*
* Such relocation is emitted when using __builtin_preserve_access_index()
* Clang built-in, passing expression that captures field address, e.g.:
*
* bpf_probe_read(&dst, sizeof(dst),
* __builtin_preserve_access_index(&src->a.b.c));
*
* In this case Clang will emit field relocation recording necessary data to
* be able to find offset of embedded `a.b.c` field within `src` struct.
*
* [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx);
typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx);
int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx);
int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx);
int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx);
int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void *ctx);
__s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
__u32 kind);
extern enum libbpf_strict_mode libbpf_mode;
/* handle direct returned errors */
static inline int libbpf_err(int ret)
{
if (ret < 0)
errno = -ret;
return ret;
}
/* handle errno-based (e.g., syscall or libc) errors according to libbpf's
* strict mode settings
*/
struct bpf_core_relo {
__u32 insn_off;
__u32 type_id;
__u32 access_str_off;
enum bpf_core_relo_kind kind;
};
static inline int libbpf_err_errno(int ret)
{
if (libbpf_mode & LIBBPF_STRICT_DIRECT_ERRS)
/* errno is already assumed to be set on error */
return ret < 0 ? -errno : ret;
/* legacy: on error return -1 directly and don't touch errno */
return ret;
}
/* handle error for pointer-returning APIs, err is assumed to be < 0 always */
static inline void *libbpf_err_ptr(int err)
{
/* set errno on error, this doesn't break anything */
errno = -err;
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return NULL;
/* legacy: encode err as ptr */
return ERR_PTR(err);
}
/* handle pointer-returning APIs' error handling */
static inline void *libbpf_ptr(void *ret)
{
/* set errno on error, this doesn't break anything */
if (IS_ERR(ret))
errno = -PTR_ERR(ret);
if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
return IS_ERR(ret) ? NULL : ret;
/* legacy: pass-through original pointer */
return ret;
}
static inline bool str_is_empty(const char *s)
{
return !s || !s[0];
}
static inline bool is_ldimm64_insn(struct bpf_insn *insn)
{
return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
/* if fd is stdin, stdout, or stderr, dup to a fd greater than 2
* Takes ownership of the fd passed in, and closes it if calling
* fcntl(fd, F_DUPFD_CLOEXEC, 3).
*/
static inline int ensure_good_fd(int fd)
{
int old_fd = fd, saved_errno;
if (fd < 0)
return fd;
if (fd < 3) {
fd = fcntl(fd, F_DUPFD_CLOEXEC, 3);
saved_errno = errno;
close(old_fd);
if (fd < 0) {
pr_warn("failed to dup FD %d to FD > 2: %d\n", old_fd, -saved_errno);
errno = saved_errno;
}
}
return fd;
}
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */

110
src/libbpf_legacy.h Normal file
View File

@@ -0,0 +1,110 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/*
* Libbpf legacy APIs (either discouraged or deprecated, as mentioned in [0])
*
* [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY
*
* Copyright (C) 2021 Facebook
*/
#ifndef __LIBBPF_LEGACY_BPF_H
#define __LIBBPF_LEGACY_BPF_H
#include <linux/bpf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include "libbpf_common.h"
#ifdef __cplusplus
extern "C" {
#endif
enum libbpf_strict_mode {
/* Turn on all supported strict features of libbpf to simulate libbpf
* v1.0 behavior.
* This will be the default behavior in libbpf v1.0.
*/
LIBBPF_STRICT_ALL = 0xffffffff,
/*
* Disable any libbpf 1.0 behaviors. This is the default before libbpf
* v1.0. It won't be supported anymore in v1.0, please update your
* code so that it handles LIBBPF_STRICT_ALL mode before libbpf v1.0.
*/
LIBBPF_STRICT_NONE = 0x00,
/*
* Return NULL pointers on error, not ERR_PTR(err).
* Additionally, libbpf also always sets errno to corresponding Exx
* (positive) error code.
*/
LIBBPF_STRICT_CLEAN_PTRS = 0x01,
/*
* Return actual error codes from low-level APIs directly, not just -1.
* Additionally, libbpf also always sets errno to corresponding Exx
* (positive) error code.
*/
LIBBPF_STRICT_DIRECT_ERRS = 0x02,
/*
* Enforce strict BPF program section (SEC()) names.
* E.g., while prefiously SEC("xdp_whatever") or SEC("perf_event_blah") were
* allowed, with LIBBPF_STRICT_SEC_PREFIX this will become
* unrecognized by libbpf and would have to be just SEC("xdp") and
* SEC("xdp") and SEC("perf_event").
*
* Note, in this mode the program pin path will be based on the
* function name instead of section name.
*/
LIBBPF_STRICT_SEC_NAME = 0x04,
/*
* Disable the global 'bpf_objects_list'. Maintaining this list adds
* a race condition to bpf_object__open() and bpf_object__close().
* Clients can maintain it on their own if it is valuable for them.
*/
LIBBPF_STRICT_NO_OBJECT_LIST = 0x08,
/*
* Automatically bump RLIMIT_MEMLOCK using setrlimit() before the
* first BPF program or map creation operation. This is done only if
* kernel is too old to support memcg-based memory accounting for BPF
* subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY,
* but it can be overriden with libbpf_set_memlock_rlim_max() API.
* Note that libbpf_set_memlock_rlim_max() needs to be called before
* the very first bpf_prog_load(), bpf_map_create() or bpf_object__load()
* operation.
*/
LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK = 0x10,
/*
* Error out on any SEC("maps") map definition, which are deprecated
* in favor of BTF-defined map definitions in SEC(".maps").
*/
LIBBPF_STRICT_MAP_DEFINITIONS = 0x20,
__LIBBPF_STRICT_LAST,
};
LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
#define DECLARE_LIBBPF_OPTS LIBBPF_OPTS
/* "Discouraged" APIs which don't follow consistent libbpf naming patterns.
* They are normally a trivial aliases or wrappers for proper APIs and are
* left to minimize unnecessary disruption for users of libbpf. But they
* shouldn't be used going forward.
*/
struct bpf_program;
struct bpf_map;
struct btf;
struct btf_ext;
LIBBPF_API enum bpf_prog_type bpf_program__get_type(const struct bpf_program *prog);
LIBBPF_API enum bpf_attach_type bpf_program__get_expected_attach_type(const struct bpf_program *prog);
LIBBPF_API const char *bpf_map__get_pin_path(const struct bpf_map *map);
LIBBPF_API const void *btf__get_raw_data(const struct btf *btf, __u32 *size);
LIBBPF_API const void *btf_ext__get_raw_data(const struct btf_ext *btf_ext, __u32 *size);
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif /* __LIBBPF_LEGACY_BPF_H */

View File

@@ -33,7 +33,7 @@ static int get_vendor_id(int ifindex)
snprintf(path, sizeof(path), "/sys/class/net/%s/device/vendor", ifname);
fd = open(path, O_RDONLY);
fd = open(path, O_RDONLY | O_CLOEXEC);
if (fd < 0)
return -1;
@@ -48,38 +48,65 @@ static int get_vendor_id(int ifindex)
return strtol(buf, NULL, 0);
}
static int get_kernel_version(void)
static int probe_prog_load(enum bpf_prog_type prog_type,
const struct bpf_insn *insns, size_t insns_cnt,
char *log_buf, size_t log_buf_sz,
__u32 ifindex)
{
int version, subversion, patchlevel;
struct utsname utsn;
/* Return 0 on failure, and attempt to probe with empty kversion */
if (uname(&utsn))
return 0;
if (sscanf(utsn.release, "%d.%d.%d",
&version, &subversion, &patchlevel) != 3)
return 0;
return (version << 16) + (subversion << 8) + patchlevel;
}
static void
probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
size_t insns_cnt, char *buf, size_t buf_len, __u32 ifindex)
{
struct bpf_load_program_attr xattr = {};
int fd;
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.log_buf = log_buf,
.log_size = log_buf_sz,
.log_level = log_buf ? 1 : 0,
.prog_ifindex = ifindex,
);
int fd, err, exp_err = 0;
const char *exp_msg = NULL;
char buf[4096];
switch (prog_type) {
case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
xattr.expected_attach_type = BPF_CGROUP_INET4_CONNECT;
opts.expected_attach_type = BPF_CGROUP_INET4_CONNECT;
break;
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
opts.expected_attach_type = BPF_CGROUP_GETSOCKOPT;
break;
case BPF_PROG_TYPE_SK_LOOKUP:
xattr.expected_attach_type = BPF_SK_LOOKUP;
opts.expected_attach_type = BPF_SK_LOOKUP;
break;
case BPF_PROG_TYPE_KPROBE:
xattr.kern_version = get_kernel_version();
opts.kern_version = get_kernel_version();
break;
case BPF_PROG_TYPE_LIRC_MODE2:
opts.expected_attach_type = BPF_LIRC_MODE2;
break;
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_LSM:
opts.log_buf = buf;
opts.log_size = sizeof(buf);
opts.log_level = 1;
if (prog_type == BPF_PROG_TYPE_TRACING)
opts.expected_attach_type = BPF_TRACE_FENTRY;
else
opts.expected_attach_type = BPF_MODIFY_RETURN;
opts.attach_btf_id = 1;
exp_err = -EINVAL;
exp_msg = "attach_btf_id 1 is not a function";
break;
case BPF_PROG_TYPE_EXT:
opts.log_buf = buf;
opts.log_size = sizeof(buf);
opts.log_level = 1;
opts.attach_btf_id = 1;
exp_err = -EINVAL;
exp_msg = "Cannot replace kernel functions";
break;
case BPF_PROG_TYPE_SYSCALL:
opts.prog_flags = BPF_F_SLEEPABLE;
break;
case BPF_PROG_TYPE_STRUCT_OPS:
exp_err = -524; /* -ENOTSUPP */
break;
case BPF_PROG_TYPE_UNSPEC:
case BPF_PROG_TYPE_SOCKET_FILTER:
@@ -100,28 +127,42 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
case BPF_PROG_TYPE_RAW_TRACEPOINT:
case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE:
case BPF_PROG_TYPE_LWT_SEG6LOCAL:
case BPF_PROG_TYPE_LIRC_MODE2:
case BPF_PROG_TYPE_SK_REUSEPORT:
case BPF_PROG_TYPE_FLOW_DISSECTOR:
case BPF_PROG_TYPE_CGROUP_SYSCTL:
case BPF_PROG_TYPE_CGROUP_SOCKOPT:
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_STRUCT_OPS:
case BPF_PROG_TYPE_EXT:
case BPF_PROG_TYPE_LSM:
default:
break;
default:
return -EOPNOTSUPP;
}
xattr.prog_type = prog_type;
xattr.insns = insns;
xattr.insns_cnt = insns_cnt;
xattr.license = "GPL";
xattr.prog_ifindex = ifindex;
fd = bpf_load_program_xattr(&xattr, buf, buf_len);
fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, &opts);
err = -errno;
if (fd >= 0)
close(fd);
if (exp_err) {
if (fd >= 0 || err != exp_err)
return 0;
if (exp_msg && !strstr(buf, exp_msg))
return 0;
return 1;
}
return fd >= 0 ? 1 : 0;
}
int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts)
{
struct bpf_insn insns[] = {
BPF_MOV64_IMM(BPF_REG_0, 0),
BPF_EXIT_INSN()
};
const size_t insn_cnt = ARRAY_SIZE(insns);
int ret;
if (opts)
return libbpf_err(-EINVAL);
ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0, 0);
return libbpf_err(ret);
}
bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
@@ -131,12 +172,16 @@ bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
BPF_EXIT_INSN()
};
/* prefer libbpf_probe_bpf_prog_type() unless offload is requested */
if (ifindex == 0)
return libbpf_probe_bpf_prog_type(prog_type, NULL) == 1;
if (ifindex && prog_type == BPF_PROG_TYPE_SCHED_CLS)
/* nfp returns -EINVAL on exit(0) with TC offload */
insns[0].imm = 2;
errno = 0;
probe_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
return errno != EINVAL && errno != EOPNOTSUPP;
}
@@ -164,7 +209,7 @@ int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
memcpy(raw_btf + hdr.hdr_len, raw_types, hdr.type_len);
memcpy(raw_btf + hdr.hdr_len + hdr.type_len, str_sec, hdr.str_len);
btf_fd = bpf_load_btf(raw_btf, btf_len, NULL, 0, false);
btf_fd = bpf_btf_load(raw_btf, btf_len, NULL);
free(raw_btf);
return btf_fd;
@@ -197,17 +242,18 @@ static int load_local_storage_btf(void)
strs, sizeof(strs));
}
bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex)
{
int key_size, value_size, max_entries, map_flags;
LIBBPF_OPTS(bpf_map_create_opts, opts);
int key_size, value_size, max_entries;
__u32 btf_key_type_id = 0, btf_value_type_id = 0;
struct bpf_create_map_attr attr = {};
int fd = -1, btf_fd = -1, fd_inner;
int fd = -1, btf_fd = -1, fd_inner = -1, exp_err = 0, err;
opts.map_ifindex = ifindex;
key_size = sizeof(__u32);
value_size = sizeof(__u32);
max_entries = 1;
map_flags = 0;
switch (map_type) {
case BPF_MAP_TYPE_STACK_TRACE:
@@ -216,7 +262,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
case BPF_MAP_TYPE_LPM_TRIE:
key_size = sizeof(__u64);
value_size = sizeof(__u64);
map_flags = BPF_F_NO_PREALLOC;
opts.map_flags = BPF_F_NO_PREALLOC;
break;
case BPF_MAP_TYPE_CGROUP_STORAGE:
case BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE:
@@ -235,17 +281,25 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
btf_value_type_id = 3;
value_size = 8;
max_entries = 0;
map_flags = BPF_F_NO_PREALLOC;
opts.map_flags = BPF_F_NO_PREALLOC;
btf_fd = load_local_storage_btf();
if (btf_fd < 0)
return false;
return btf_fd;
break;
case BPF_MAP_TYPE_RINGBUF:
key_size = 0;
value_size = 0;
max_entries = 4096;
break;
case BPF_MAP_TYPE_UNSPEC:
case BPF_MAP_TYPE_STRUCT_OPS:
/* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
opts.btf_vmlinux_value_type_id = 1;
exp_err = -524; /* -ENOTSUPP */
break;
case BPF_MAP_TYPE_BLOOM_FILTER:
key_size = 0;
max_entries = 1;
break;
case BPF_MAP_TYPE_HASH:
case BPF_MAP_TYPE_ARRAY:
case BPF_MAP_TYPE_PROG_ARRAY:
@@ -264,9 +318,10 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
case BPF_MAP_TYPE_XSKMAP:
case BPF_MAP_TYPE_SOCKHASH:
case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
case BPF_MAP_TYPE_STRUCT_OPS:
default:
break;
case BPF_MAP_TYPE_UNSPEC:
default:
return -EOPNOTSUPP;
}
if (map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS ||
@@ -275,37 +330,102 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
* map-in-map for offload
*/
if (ifindex)
return false;
goto cleanup;
fd_inner = bpf_create_map(BPF_MAP_TYPE_HASH,
sizeof(__u32), sizeof(__u32), 1, 0);
fd_inner = bpf_map_create(BPF_MAP_TYPE_HASH, NULL,
sizeof(__u32), sizeof(__u32), 1, NULL);
if (fd_inner < 0)
return false;
fd = bpf_create_map_in_map(map_type, NULL, sizeof(__u32),
fd_inner, 1, 0);
close(fd_inner);
} else {
/* Note: No other restriction on map type probes for offload */
attr.map_type = map_type;
attr.key_size = key_size;
attr.value_size = value_size;
attr.max_entries = max_entries;
attr.map_flags = map_flags;
attr.map_ifindex = ifindex;
if (btf_fd >= 0) {
attr.btf_fd = btf_fd;
attr.btf_key_type_id = btf_key_type_id;
attr.btf_value_type_id = btf_value_type_id;
}
goto cleanup;
fd = bpf_create_map_xattr(&attr);
opts.inner_map_fd = fd_inner;
}
if (btf_fd >= 0) {
opts.btf_fd = btf_fd;
opts.btf_key_type_id = btf_key_type_id;
opts.btf_value_type_id = btf_value_type_id;
}
fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts);
err = -errno;
cleanup:
if (fd >= 0)
close(fd);
if (fd_inner >= 0)
close(fd_inner);
if (btf_fd >= 0)
close(btf_fd);
return fd >= 0;
if (exp_err)
return fd < 0 && err == exp_err ? 1 : 0;
else
return fd >= 0 ? 1 : 0;
}
int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts)
{
int ret;
if (opts)
return libbpf_err(-EINVAL);
ret = probe_map_create(map_type, 0);
return libbpf_err(ret);
}
bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
{
return probe_map_create(map_type, ifindex) == 1;
}
int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helper_id,
const void *opts)
{
struct bpf_insn insns[] = {
BPF_EMIT_CALL((__u32)helper_id),
BPF_EXIT_INSN(),
};
const size_t insn_cnt = ARRAY_SIZE(insns);
char buf[4096];
int ret;
if (opts)
return libbpf_err(-EINVAL);
/* we can't successfully load all prog types to check for BPF helper
* support, so bail out with -EOPNOTSUPP error
*/
switch (prog_type) {
case BPF_PROG_TYPE_TRACING:
case BPF_PROG_TYPE_EXT:
case BPF_PROG_TYPE_LSM:
case BPF_PROG_TYPE_STRUCT_OPS:
return -EOPNOTSUPP;
default:
break;
}
buf[0] = '\0';
ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf), 0);
if (ret < 0)
return libbpf_err(ret);
/* If BPF verifier doesn't recognize BPF helper ID (enum bpf_func_id)
* at all, it will emit something like "invalid func unknown#181".
* If BPF verifier recognizes BPF helper but it's not supported for
* given BPF program type, it will emit "unknown func bpf_sys_bpf#166".
* In both cases, provided combination of BPF program type and BPF
* helper is not supported by the kernel.
* In all other cases, probe_prog_load() above will either succeed (e.g.,
* because BPF helper happens to accept no input arguments or it
* accepts one input argument and initial PTR_TO_CTX is fine for
* that), or we'll get some more specific BPF verifier error about
* some unsatisfied conditions.
*/
if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func ")))
return 0;
return 1; /* assume supported */
}
bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
@@ -318,8 +438,7 @@ bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
char buf[4096] = {};
bool res;
probe_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf),
ifindex);
probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), ifindex);
res = !grep(buf, "invalid func ") && !grep(buf, "unknown func ");
if (ifindex) {
@@ -351,8 +470,8 @@ bool bpf_probe_large_insn_limit(__u32 ifindex)
insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
errno = 0;
probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
probe_prog_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
ifindex);
return errno != E2BIG && errno != EINVAL;
}

View File

@@ -1,47 +0,0 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2019 Facebook */
#ifndef __LIBBPF_LIBBPF_UTIL_H
#define __LIBBPF_LIBBPF_UTIL_H
#include <stdbool.h>
#ifdef __cplusplus
extern "C" {
#endif
/* Use these barrier functions instead of smp_[rw]mb() when they are
* used in a libbpf header file. That way they can be built into the
* application that uses libbpf.
*/
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_rmb() asm volatile("" : : : "memory")
# define libbpf_smp_wmb() asm volatile("" : : : "memory")
# define libbpf_smp_mb() \
asm volatile("lock; addl $0,-4(%%rsp)" : : : "memory", "cc")
/* Hinders stores to be observed before older loads. */
# define libbpf_smp_rwmb() asm volatile("" : : : "memory")
#elif defined(__aarch64__)
# define libbpf_smp_rmb() asm volatile("dmb ishld" : : : "memory")
# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_rwmb() libbpf_smp_mb()
#elif defined(__arm__)
/* These are only valid for armv7 and above */
# define libbpf_smp_rmb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_wmb() asm volatile("dmb ishst" : : : "memory")
# define libbpf_smp_mb() asm volatile("dmb ish" : : : "memory")
# define libbpf_smp_rwmb() libbpf_smp_mb()
#else
/* Architecture missing native barrier functions. */
# define libbpf_smp_rmb() __sync_synchronize()
# define libbpf_smp_wmb() __sync_synchronize()
# define libbpf_smp_mb() __sync_synchronize()
# define libbpf_smp_rwmb() __sync_synchronize()
#endif
#ifdef __cplusplus
} /* extern "C" */
#endif
#endif

9
src/libbpf_version.h Normal file
View File

@@ -0,0 +1,9 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (C) 2021 Facebook */
#ifndef __LIBBPF_VERSION_H
#define __LIBBPF_VERSION_H
#define LIBBPF_MAJOR_VERSION 0
#define LIBBPF_MINOR_VERSION 7
#endif /* __LIBBPF_VERSION_H */

2903
src/linker.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -4,7 +4,10 @@
#include <stdlib.h>
#include <memory.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/pkt_cls.h>
#include <linux/rtnetlink.h>
#include <sys/socket.h>
#include <errno.h>
@@ -40,7 +43,7 @@ static int libbpf_netlink_open(__u32 *nl_pid)
memset(&sa, 0, sizeof(sa));
sa.nl_family = AF_NETLINK;
sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
sock = socket(AF_NETLINK, SOCK_RAW | SOCK_CLOEXEC, NETLINK_ROUTE);
if (sock < 0)
return -errno;
@@ -73,9 +76,20 @@ cleanup:
return ret;
}
static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
__dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
void *cookie)
static void libbpf_netlink_close(int sock)
{
close(sock);
}
enum {
NL_CONT,
NL_NEXT,
NL_DONE,
};
static int libbpf_netlink_recv(int sock, __u32 nl_pid, int seq,
__dump_nlmsg_t _fn, libbpf_dump_nlmsg_t fn,
void *cookie)
{
bool multipart = true;
struct nlmsgerr *err;
@@ -84,6 +98,7 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
int len, ret;
while (multipart) {
start:
multipart = false;
len = recv(sock, buf, sizeof(buf), 0);
if (len < 0) {
@@ -121,8 +136,16 @@ static int bpf_netlink_recv(int sock, __u32 nl_pid, int seq,
}
if (_fn) {
ret = _fn(nh, fn, cookie);
if (ret)
switch (ret) {
case NL_CONT:
break;
case NL_NEXT:
goto start;
case NL_DONE:
return 0;
default:
return ret;
}
}
}
}
@@ -131,95 +154,114 @@ done:
return ret;
}
static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd,
__u32 flags)
static int libbpf_netlink_send_recv(struct libbpf_nla_req *req,
__dump_nlmsg_t parse_msg,
libbpf_dump_nlmsg_t parse_attr,
void *cookie)
{
int sock, seq = 0, ret;
struct nlattr *nla, *nla_xdp;
struct {
struct nlmsghdr nh;
struct ifinfomsg ifinfo;
char attrbuf[64];
} req;
__u32 nl_pid = 0;
int sock, ret;
sock = libbpf_netlink_open(&nl_pid);
if (sock < 0)
return sock;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_SETLINK;
req.nh.nlmsg_pid = 0;
req.nh.nlmsg_seq = ++seq;
req.ifinfo.ifi_family = AF_UNSPEC;
req.ifinfo.ifi_index = ifindex;
req->nh.nlmsg_pid = 0;
req->nh.nlmsg_seq = time(NULL);
/* started nested attribute for XDP */
nla = (struct nlattr *)(((char *)&req)
+ NLMSG_ALIGN(req.nh.nlmsg_len));
nla->nla_type = NLA_F_NESTED | IFLA_XDP;
nla->nla_len = NLA_HDRLEN;
/* add XDP fd */
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FD;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(int);
memcpy((char *)nla_xdp + NLA_HDRLEN, &fd, sizeof(fd));
nla->nla_len += nla_xdp->nla_len;
/* if user passed in any flags, add those too */
if (flags) {
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_FLAGS;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(flags);
memcpy((char *)nla_xdp + NLA_HDRLEN, &flags, sizeof(flags));
nla->nla_len += nla_xdp->nla_len;
}
if (flags & XDP_FLAGS_REPLACE) {
nla_xdp = (struct nlattr *)((char *)nla + nla->nla_len);
nla_xdp->nla_type = IFLA_XDP_EXPECTED_FD;
nla_xdp->nla_len = NLA_HDRLEN + sizeof(old_fd);
memcpy((char *)nla_xdp + NLA_HDRLEN, &old_fd, sizeof(old_fd));
nla->nla_len += nla_xdp->nla_len;
}
req.nh.nlmsg_len += NLA_ALIGN(nla->nla_len);
if (send(sock, &req, req.nh.nlmsg_len, 0) < 0) {
if (send(sock, req, req->nh.nlmsg_len, 0) < 0) {
ret = -errno;
goto cleanup;
goto out;
}
ret = bpf_netlink_recv(sock, nl_pid, seq, NULL, NULL, NULL);
cleanup:
close(sock);
ret = libbpf_netlink_recv(sock, nl_pid, req->nh.nlmsg_seq,
parse_msg, parse_attr, cookie);
out:
libbpf_netlink_close(sock);
return ret;
}
static int __bpf_set_link_xdp_fd_replace(int ifindex, int fd, int old_fd,
__u32 flags)
{
struct nlattr *nla;
int ret;
struct libbpf_nla_req req;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_SETLINK;
req.ifinfo.ifi_family = AF_UNSPEC;
req.ifinfo.ifi_index = ifindex;
nla = nlattr_begin_nested(&req, IFLA_XDP);
if (!nla)
return -EMSGSIZE;
ret = nlattr_add(&req, IFLA_XDP_FD, &fd, sizeof(fd));
if (ret < 0)
return ret;
if (flags) {
ret = nlattr_add(&req, IFLA_XDP_FLAGS, &flags, sizeof(flags));
if (ret < 0)
return ret;
}
if (flags & XDP_FLAGS_REPLACE) {
ret = nlattr_add(&req, IFLA_XDP_EXPECTED_FD, &old_fd,
sizeof(old_fd));
if (ret < 0)
return ret;
}
nlattr_end_nested(&req, nla);
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
int bpf_xdp_attach(int ifindex, int prog_fd, __u32 flags, const struct bpf_xdp_attach_opts *opts)
{
int old_prog_fd, err;
if (!OPTS_VALID(opts, bpf_xdp_attach_opts))
return libbpf_err(-EINVAL);
old_prog_fd = OPTS_GET(opts, old_prog_fd, 0);
if (old_prog_fd)
flags |= XDP_FLAGS_REPLACE;
else
old_prog_fd = -1;
err = __bpf_set_link_xdp_fd_replace(ifindex, prog_fd, old_prog_fd, flags);
return libbpf_err(err);
}
int bpf_xdp_detach(int ifindex, __u32 flags, const struct bpf_xdp_attach_opts *opts)
{
return bpf_xdp_attach(ifindex, -1, flags, opts);
}
int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
const struct bpf_xdp_set_link_opts *opts)
{
int old_fd = -1;
int old_fd = -1, ret;
if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
return -EINVAL;
return libbpf_err(-EINVAL);
if (OPTS_HAS(opts, old_fd)) {
old_fd = OPTS_GET(opts, old_fd, -1);
flags |= XDP_FLAGS_REPLACE;
}
return __bpf_set_link_xdp_fd_replace(ifindex, fd,
old_fd,
flags);
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags);
return libbpf_err(ret);
}
int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
{
return __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
int ret;
ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
return libbpf_err(ret);
}
static int __dump_link_nlmsg(struct nlmsghdr *nlh,
@@ -231,6 +273,7 @@ static int __dump_link_nlmsg(struct nlmsghdr *nlh,
len = nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*ifi));
attr = (struct nlattr *) ((void *) ifi + NLMSG_ALIGN(sizeof(*ifi)));
if (libbpf_nla_parse(tb, IFLA_MAX, attr, len, NULL) != 0)
return -LIBBPF_ERRNO__NLPARSE;
@@ -282,91 +325,485 @@ static int get_xdp_info(void *cookie, void *msg, struct nlattr **tb)
return 0;
}
static int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie);
int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags)
int bpf_xdp_query(int ifindex, int xdp_flags, struct bpf_xdp_query_opts *opts)
{
struct libbpf_nla_req req = {
.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
.nh.nlmsg_type = RTM_GETLINK,
.nh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.ifinfo.ifi_family = AF_PACKET,
};
struct xdp_id_md xdp_id = {};
int sock, ret;
__u32 nl_pid = 0;
__u32 mask;
int err;
if (flags & ~XDP_FLAGS_MASK || !info_size)
return -EINVAL;
if (!OPTS_VALID(opts, bpf_xdp_query_opts))
return libbpf_err(-EINVAL);
if (xdp_flags & ~XDP_FLAGS_MASK)
return libbpf_err(-EINVAL);
/* Check whether the single {HW,DRV,SKB} mode is set */
flags &= (XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE);
mask = flags - 1;
if (flags && flags & mask)
return -EINVAL;
sock = libbpf_netlink_open(&nl_pid);
if (sock < 0)
return sock;
xdp_flags &= XDP_FLAGS_SKB_MODE | XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE;
if (xdp_flags & (xdp_flags - 1))
return libbpf_err(-EINVAL);
xdp_id.ifindex = ifindex;
xdp_id.flags = flags;
xdp_id.flags = xdp_flags;
ret = libbpf_nl_get_link(sock, nl_pid, get_xdp_info, &xdp_id);
if (!ret) {
size_t sz = min(info_size, sizeof(xdp_id.info));
err = libbpf_netlink_send_recv(&req, __dump_link_nlmsg,
get_xdp_info, &xdp_id);
if (err)
return libbpf_err(err);
memcpy(info, &xdp_id.info, sz);
memset((void *) info + sz, 0, info_size - sz);
}
close(sock);
return ret;
}
static __u32 get_xdp_id(struct xdp_link_info *info, __u32 flags)
{
flags &= XDP_FLAGS_MODES;
if (info->attach_mode != XDP_ATTACHED_MULTI && !flags)
return info->prog_id;
if (flags & XDP_FLAGS_DRV_MODE)
return info->drv_prog_id;
if (flags & XDP_FLAGS_HW_MODE)
return info->hw_prog_id;
if (flags & XDP_FLAGS_SKB_MODE)
return info->skb_prog_id;
OPTS_SET(opts, prog_id, xdp_id.info.prog_id);
OPTS_SET(opts, drv_prog_id, xdp_id.info.drv_prog_id);
OPTS_SET(opts, hw_prog_id, xdp_id.info.hw_prog_id);
OPTS_SET(opts, skb_prog_id, xdp_id.info.skb_prog_id);
OPTS_SET(opts, attach_mode, xdp_id.info.attach_mode);
return 0;
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
size_t info_size, __u32 flags)
{
struct xdp_link_info info;
LIBBPF_OPTS(bpf_xdp_query_opts, opts);
size_t sz;
int err;
if (!info_size)
return libbpf_err(-EINVAL);
err = bpf_xdp_query(ifindex, flags, &opts);
if (err)
return libbpf_err(err);
/* struct xdp_link_info field layout matches struct bpf_xdp_query_opts
* layout after sz field
*/
sz = min(info_size, offsetofend(struct xdp_link_info, attach_mode));
memcpy(info, &opts.prog_id, sz);
memset((void *)info + sz, 0, info_size - sz);
return 0;
}
int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
{
LIBBPF_OPTS(bpf_xdp_query_opts, opts);
int ret;
ret = bpf_get_link_xdp_info(ifindex, &info, sizeof(info), flags);
if (!ret)
*prog_id = get_xdp_id(&info, flags);
ret = bpf_xdp_query(ifindex, flags, &opts);
if (ret)
return libbpf_err(ret);
flags &= XDP_FLAGS_MODES;
if (opts.attach_mode != XDP_ATTACHED_MULTI && !flags)
*prog_id = opts.prog_id;
else if (flags & XDP_FLAGS_DRV_MODE)
*prog_id = opts.drv_prog_id;
else if (flags & XDP_FLAGS_HW_MODE)
*prog_id = opts.hw_prog_id;
else if (flags & XDP_FLAGS_SKB_MODE)
*prog_id = opts.skb_prog_id;
else
*prog_id = 0;
return 0;
}
int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
{
return bpf_xdp_query_id(ifindex, flags, prog_id);
}
typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);
static int clsact_config(struct libbpf_nla_req *req)
{
req->tc.tcm_parent = TC_H_CLSACT;
req->tc.tcm_handle = TC_H_MAKE(TC_H_CLSACT, 0);
return nlattr_add(req, TCA_KIND, "clsact", sizeof("clsact"));
}
static int attach_point_to_config(struct bpf_tc_hook *hook,
qdisc_config_t *config)
{
switch (OPTS_GET(hook, attach_point, 0)) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
case BPF_TC_INGRESS | BPF_TC_EGRESS:
if (OPTS_GET(hook, parent, 0))
return -EINVAL;
*config = &clsact_config;
return 0;
case BPF_TC_CUSTOM:
return -EOPNOTSUPP;
default:
return -EINVAL;
}
}
static int tc_get_tcm_parent(enum bpf_tc_attach_point attach_point,
__u32 *parent)
{
switch (attach_point) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
if (*parent)
return -EINVAL;
*parent = TC_H_MAKE(TC_H_CLSACT,
attach_point == BPF_TC_INGRESS ?
TC_H_MIN_INGRESS : TC_H_MIN_EGRESS);
break;
case BPF_TC_CUSTOM:
if (!*parent)
return -EINVAL;
break;
default:
return -EINVAL;
}
return 0;
}
static int tc_qdisc_modify(struct bpf_tc_hook *hook, int cmd, int flags)
{
qdisc_config_t config;
int ret;
struct libbpf_nla_req req;
ret = attach_point_to_config(hook, &config);
if (ret < 0)
return ret;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags;
req.nh.nlmsg_type = cmd;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = OPTS_GET(hook, ifindex, 0);
ret = config(&req);
if (ret < 0)
return ret;
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
static int tc_qdisc_create_excl(struct bpf_tc_hook *hook)
{
return tc_qdisc_modify(hook, RTM_NEWQDISC, NLM_F_CREATE | NLM_F_EXCL);
}
static int tc_qdisc_delete(struct bpf_tc_hook *hook)
{
return tc_qdisc_modify(hook, RTM_DELQDISC, 0);
}
int bpf_tc_hook_create(struct bpf_tc_hook *hook)
{
int ret;
if (!hook || !OPTS_VALID(hook, bpf_tc_hook) ||
OPTS_GET(hook, ifindex, 0) <= 0)
return libbpf_err(-EINVAL);
ret = tc_qdisc_create_excl(hook);
return libbpf_err(ret);
}
static int __bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts,
const bool flush);
int bpf_tc_hook_destroy(struct bpf_tc_hook *hook)
{
if (!hook || !OPTS_VALID(hook, bpf_tc_hook) ||
OPTS_GET(hook, ifindex, 0) <= 0)
return libbpf_err(-EINVAL);
switch (OPTS_GET(hook, attach_point, 0)) {
case BPF_TC_INGRESS:
case BPF_TC_EGRESS:
return libbpf_err(__bpf_tc_detach(hook, NULL, true));
case BPF_TC_INGRESS | BPF_TC_EGRESS:
return libbpf_err(tc_qdisc_delete(hook));
case BPF_TC_CUSTOM:
return libbpf_err(-EOPNOTSUPP);
default:
return libbpf_err(-EINVAL);
}
}
struct bpf_cb_ctx {
struct bpf_tc_opts *opts;
bool processed;
};
static int __get_tc_info(void *cookie, struct tcmsg *tc, struct nlattr **tb,
bool unicast)
{
struct nlattr *tbb[TCA_BPF_MAX + 1];
struct bpf_cb_ctx *info = cookie;
if (!info || !info->opts)
return -EINVAL;
if (unicast && info->processed)
return -EINVAL;
if (!tb[TCA_OPTIONS])
return NL_CONT;
libbpf_nla_parse_nested(tbb, TCA_BPF_MAX, tb[TCA_OPTIONS], NULL);
if (!tbb[TCA_BPF_ID])
return -EINVAL;
OPTS_SET(info->opts, prog_id, libbpf_nla_getattr_u32(tbb[TCA_BPF_ID]));
OPTS_SET(info->opts, handle, tc->tcm_handle);
OPTS_SET(info->opts, priority, TC_H_MAJ(tc->tcm_info) >> 16);
info->processed = true;
return unicast ? NL_NEXT : NL_DONE;
}
static int get_tc_info(struct nlmsghdr *nh, libbpf_dump_nlmsg_t fn,
void *cookie)
{
struct tcmsg *tc = NLMSG_DATA(nh);
struct nlattr *tb[TCA_MAX + 1];
libbpf_nla_parse(tb, TCA_MAX,
(struct nlattr *)((void *)tc + NLMSG_ALIGN(sizeof(*tc))),
NLMSG_PAYLOAD(nh, sizeof(*tc)), NULL);
if (!tb[TCA_KIND])
return NL_CONT;
return __get_tc_info(cookie, tc, tb, nh->nlmsg_flags & NLM_F_ECHO);
}
static int tc_add_fd_and_name(struct libbpf_nla_req *req, int fd)
{
struct bpf_prog_info info = {};
__u32 info_len = sizeof(info);
char name[256];
int len, ret;
ret = bpf_obj_get_info_by_fd(fd, &info, &info_len);
if (ret < 0)
return ret;
ret = nlattr_add(req, TCA_BPF_FD, &fd, sizeof(fd));
if (ret < 0)
return ret;
len = snprintf(name, sizeof(name), "%s:[%u]", info.name, info.id);
if (len < 0)
return -errno;
if (len >= sizeof(name))
return -ENAMETOOLONG;
return nlattr_add(req, TCA_BPF_NAME, name, len + 1);
}
int bpf_tc_attach(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
{
__u32 protocol, bpf_flags, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct bpf_cb_ctx info = {};
struct libbpf_nla_req req;
struct nlattr *nla;
if (!hook || !opts ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return libbpf_err(-EINVAL);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || !prog_fd || prog_id)
return libbpf_err(-EINVAL);
if (priority > UINT16_MAX)
return libbpf_err(-EINVAL);
if (flags & ~BPF_TC_F_REPLACE)
return libbpf_err(-EINVAL);
flags = (flags & BPF_TC_F_REPLACE) ? NLM_F_REPLACE : NLM_F_EXCL;
protocol = ETH_P_ALL;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE |
NLM_F_ECHO | flags;
req.nh.nlmsg_type = RTM_NEWTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return libbpf_err(ret);
req.tc.tcm_parent = parent;
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return libbpf_err(ret);
nla = nlattr_begin_nested(&req, TCA_OPTIONS);
if (!nla)
return libbpf_err(-EMSGSIZE);
ret = tc_add_fd_and_name(&req, prog_fd);
if (ret < 0)
return libbpf_err(ret);
bpf_flags = TCA_BPF_FLAG_ACT_DIRECT;
ret = nlattr_add(&req, TCA_BPF_FLAGS, &bpf_flags, sizeof(bpf_flags));
if (ret < 0)
return libbpf_err(ret);
nlattr_end_nested(&req, nla);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info);
if (ret < 0)
return libbpf_err(ret);
if (!info.processed)
return libbpf_err(-ENOENT);
return ret;
}
int libbpf_nl_get_link(int sock, unsigned int nl_pid,
libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
static int __bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts,
const bool flush)
{
struct {
struct nlmsghdr nlh;
struct ifinfomsg ifm;
} req = {
.nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)),
.nlh.nlmsg_type = RTM_GETLINK,
.nlh.nlmsg_flags = NLM_F_DUMP | NLM_F_REQUEST,
.ifm.ifi_family = AF_PACKET,
};
int seq = time(NULL);
__u32 protocol = 0, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct libbpf_nla_req req;
req.nlh.nlmsg_seq = seq;
if (send(sock, &req, req.nlh.nlmsg_len, 0) < 0)
return -errno;
if (!hook ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return -EINVAL;
return bpf_netlink_recv(sock, nl_pid, seq, __dump_link_nlmsg,
dump_link_nlmsg, cookie);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || flags || prog_fd || prog_id)
return -EINVAL;
if (priority > UINT16_MAX)
return -EINVAL;
if (!flush) {
if (!handle || !priority)
return -EINVAL;
protocol = ETH_P_ALL;
} else {
if (handle || priority)
return -EINVAL;
}
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
req.nh.nlmsg_type = RTM_DELTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
if (!flush) {
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
}
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return ret;
req.tc.tcm_parent = parent;
if (!flush) {
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return ret;
}
return libbpf_netlink_send_recv(&req, NULL, NULL, NULL);
}
int bpf_tc_detach(const struct bpf_tc_hook *hook,
const struct bpf_tc_opts *opts)
{
int ret;
if (!opts)
return libbpf_err(-EINVAL);
ret = __bpf_tc_detach(hook, opts, false);
return libbpf_err(ret);
}
int bpf_tc_query(const struct bpf_tc_hook *hook, struct bpf_tc_opts *opts)
{
__u32 protocol, handle, priority, parent, prog_id, flags;
int ret, ifindex, attach_point, prog_fd;
struct bpf_cb_ctx info = {};
struct libbpf_nla_req req;
if (!hook || !opts ||
!OPTS_VALID(hook, bpf_tc_hook) ||
!OPTS_VALID(opts, bpf_tc_opts))
return libbpf_err(-EINVAL);
ifindex = OPTS_GET(hook, ifindex, 0);
parent = OPTS_GET(hook, parent, 0);
attach_point = OPTS_GET(hook, attach_point, 0);
handle = OPTS_GET(opts, handle, 0);
priority = OPTS_GET(opts, priority, 0);
prog_fd = OPTS_GET(opts, prog_fd, 0);
prog_id = OPTS_GET(opts, prog_id, 0);
flags = OPTS_GET(opts, flags, 0);
if (ifindex <= 0 || flags || prog_fd || prog_id ||
!handle || !priority)
return libbpf_err(-EINVAL);
if (priority > UINT16_MAX)
return libbpf_err(-EINVAL);
protocol = ETH_P_ALL;
memset(&req, 0, sizeof(req));
req.nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct tcmsg));
req.nh.nlmsg_flags = NLM_F_REQUEST;
req.nh.nlmsg_type = RTM_GETTFILTER;
req.tc.tcm_family = AF_UNSPEC;
req.tc.tcm_ifindex = ifindex;
req.tc.tcm_handle = handle;
req.tc.tcm_info = TC_H_MAKE(priority << 16, htons(protocol));
ret = tc_get_tcm_parent(attach_point, &parent);
if (ret < 0)
return libbpf_err(ret);
req.tc.tcm_parent = parent;
ret = nlattr_add(&req, TCA_KIND, "bpf", sizeof("bpf"));
if (ret < 0)
return libbpf_err(ret);
info.opts = opts;
ret = libbpf_netlink_send_recv(&req, get_tc_info, NULL, &info);
if (ret < 0)
return libbpf_err(ret);
if (!info.processed)
return libbpf_err(-ENOENT);
return ret;
}

View File

@@ -27,7 +27,7 @@ static struct nlattr *nla_next(const struct nlattr *nla, int *remaining)
int totlen = NLA_ALIGN(nla->nla_len);
*remaining -= totlen;
return (struct nlattr *) ((char *) nla + totlen);
return (struct nlattr *)((void *)nla + totlen);
}
static int nla_ok(const struct nlattr *nla, int remaining)

View File

@@ -10,7 +10,11 @@
#define __LIBBPF_NLATTR_H
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
/* avoid multiple definition of netlink features */
#define __LINUX_NETLINK_H
@@ -49,6 +53,15 @@ struct libbpf_nla_policy {
uint16_t maxlen;
};
struct libbpf_nla_req {
struct nlmsghdr nh;
union {
struct ifinfomsg ifinfo;
struct tcmsg tc;
};
char buf[128];
};
/**
* @ingroup attr
* Iterate over a stream of attributes
@@ -68,7 +81,7 @@ struct libbpf_nla_policy {
*/
static inline void *libbpf_nla_data(const struct nlattr *nla)
{
return (char *) nla + NLA_HDRLEN;
return (void *)nla + NLA_HDRLEN;
}
static inline uint8_t libbpf_nla_getattr_u8(const struct nlattr *nla)
@@ -103,4 +116,49 @@ int libbpf_nla_parse_nested(struct nlattr *tb[], int maxtype,
int libbpf_nla_dump_errormsg(struct nlmsghdr *nlh);
static inline struct nlattr *nla_data(struct nlattr *nla)
{
return (struct nlattr *)((void *)nla + NLA_HDRLEN);
}
static inline struct nlattr *req_tail(struct libbpf_nla_req *req)
{
return (struct nlattr *)((void *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
}
static inline int nlattr_add(struct libbpf_nla_req *req, int type,
const void *data, int len)
{
struct nlattr *nla;
if (NLMSG_ALIGN(req->nh.nlmsg_len) + NLA_ALIGN(NLA_HDRLEN + len) > sizeof(*req))
return -EMSGSIZE;
if (!!data != !!len)
return -EINVAL;
nla = req_tail(req);
nla->nla_type = type;
nla->nla_len = NLA_HDRLEN + len;
if (data)
memcpy(nla_data(nla), data, len);
req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + NLA_ALIGN(nla->nla_len);
return 0;
}
static inline struct nlattr *nlattr_begin_nested(struct libbpf_nla_req *req, int type)
{
struct nlattr *tail;
tail = req_tail(req);
if (nlattr_add(req, type | NLA_F_NESTED, NULL, 0))
return NULL;
return tail;
}
static inline void nlattr_end_nested(struct libbpf_nla_req *req,
struct nlattr *tail)
{
tail->nla_len = (void *)req_tail(req) - (void *)tail;
}
#endif /* __LIBBPF_NLATTR_H */

1332
src/relo_core.c Normal file

File diff suppressed because it is too large Load Diff

57
src/relo_core.h Normal file
View File

@@ -0,0 +1,57 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2019 Facebook */
#ifndef __RELO_CORE_H
#define __RELO_CORE_H
#include <linux/bpf.h>
struct bpf_core_cand {
const struct btf *btf;
__u32 id;
};
/* dynamically sized list of type IDs and its associated struct btf */
struct bpf_core_cand_list {
struct bpf_core_cand *cands;
int len;
};
#define BPF_CORE_SPEC_MAX_LEN 64
/* represents BPF CO-RE field or array element accessor */
struct bpf_core_accessor {
__u32 type_id; /* struct/union type or array element type */
__u32 idx; /* field index or array index */
const char *name; /* field name or NULL for array accessor */
};
struct bpf_core_spec {
const struct btf *btf;
/* high-level spec: named fields and array indices only */
struct bpf_core_accessor spec[BPF_CORE_SPEC_MAX_LEN];
/* original unresolved (no skip_mods_or_typedefs) root type ID */
__u32 root_type_id;
/* CO-RE relocation kind */
enum bpf_core_relo_kind relo_kind;
/* high-level spec length */
int len;
/* raw, low-level spec: 1-to-1 with accessor spec string */
int raw_spec[BPF_CORE_SPEC_MAX_LEN];
/* raw spec length */
int raw_len;
/* field bit offset represented by spec */
__u32 bit_offset;
};
int bpf_core_apply_relo_insn(const char *prog_name,
struct bpf_insn *insn, int insn_idx,
const struct bpf_core_relo *relo, int relo_idx,
const struct btf *local_btf,
struct bpf_core_cand_list *cands,
struct bpf_core_spec *specs_scratch);
int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
const struct btf *targ_btf, __u32 targ_id);
size_t bpf_core_essential_name_len(const char *name);
#endif

View File

@@ -69,23 +69,23 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
err = -errno;
pr_warn("ringbuf: failed to get map info for fd=%d: %d\n",
map_fd, err);
return err;
return libbpf_err(err);
}
if (info.type != BPF_MAP_TYPE_RINGBUF) {
pr_warn("ringbuf: map fd=%d is not BPF_MAP_TYPE_RINGBUF\n",
map_fd);
return -EINVAL;
return libbpf_err(-EINVAL);
}
tmp = libbpf_reallocarray(rb->rings, rb->ring_cnt + 1, sizeof(*rb->rings));
if (!tmp)
return -ENOMEM;
return libbpf_err(-ENOMEM);
rb->rings = tmp;
tmp = libbpf_reallocarray(rb->events, rb->ring_cnt + 1, sizeof(*rb->events));
if (!tmp)
return -ENOMEM;
return libbpf_err(-ENOMEM);
rb->events = tmp;
r = &rb->rings[rb->ring_cnt];
@@ -103,7 +103,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
err = -errno;
pr_warn("ringbuf: failed to mmap consumer page for map fd=%d: %d\n",
map_fd, err);
return err;
return libbpf_err(err);
}
r->consumer_pos = tmp;
@@ -118,7 +118,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to mmap data pages for map fd=%d: %d\n",
map_fd, err);
return err;
return libbpf_err(err);
}
r->producer_pos = tmp;
r->data = tmp + rb->page_size;
@@ -133,7 +133,7 @@ int ring_buffer__add(struct ring_buffer *rb, int map_fd,
ringbuf_unmap_ring(rb, r);
pr_warn("ringbuf: failed to epoll add map fd=%d: %d\n",
map_fd, err);
return err;
return libbpf_err(err);
}
rb->ring_cnt++;
@@ -165,11 +165,11 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
int err;
if (!OPTS_VALID(opts, ring_buffer_opts))
return NULL;
return errno = EINVAL, NULL;
rb = calloc(1, sizeof(*rb));
if (!rb)
return NULL;
return errno = ENOMEM, NULL;
rb->page_size = getpagesize();
@@ -188,7 +188,7 @@ ring_buffer__new(int map_fd, ring_buffer_sample_fn sample_cb, void *ctx,
err_out:
ring_buffer__free(rb);
return NULL;
return errno = -err, NULL;
}
static inline int roundup_len(__u32 len)
@@ -202,9 +202,11 @@ static inline int roundup_len(__u32 len)
return (len + 7) / 8 * 8;
}
static int ringbuf_process_ring(struct ring* r)
static int64_t ringbuf_process_ring(struct ring* r)
{
int *len_ptr, len, err, cnt = 0;
int *len_ptr, len, err;
/* 64-bit to avoid overflow in case of extreme application behavior */
int64_t cnt = 0;
unsigned long cons_pos, prod_pos;
bool got_new_data;
void *sample;
@@ -227,7 +229,7 @@ static int ringbuf_process_ring(struct ring* r)
if ((len & BPF_RINGBUF_DISCARD_BIT) == 0) {
sample = (void *)len_ptr + BPF_RINGBUF_HDR_SZ;
err = r->sample_cb(r->ctx, sample, len);
if (err) {
if (err < 0) {
/* update consumer pos and bail out */
smp_store_release(r->consumer_pos,
cons_pos);
@@ -244,43 +246,53 @@ done:
}
/* Consume available ring buffer(s) data without event polling.
* Returns number of records consumed across all registered ring buffers, or
* negative number if any of the callbacks return error.
* Returns number of records consumed across all registered ring buffers (or
* INT_MAX, whichever is less), or negative number if any of the callbacks
* return error.
*/
int ring_buffer__consume(struct ring_buffer *rb)
{
int i, err, res = 0;
int64_t err, res = 0;
int i;
for (i = 0; i < rb->ring_cnt; i++) {
struct ring *ring = &rb->rings[i];
err = ringbuf_process_ring(ring);
if (err < 0)
return err;
return libbpf_err(err);
res += err;
}
if (res > INT_MAX)
return INT_MAX;
return res;
}
/* Poll for available data and consume records, if any are available.
* Returns number of records consumed, or negative number, if any of the
* registered callbacks returned error.
* Returns number of records consumed (or INT_MAX, whichever is less), or
* negative number, if any of the registered callbacks returned error.
*/
int ring_buffer__poll(struct ring_buffer *rb, int timeout_ms)
{
int i, cnt, err, res = 0;
int i, cnt;
int64_t err, res = 0;
cnt = epoll_wait(rb->epoll_fd, rb->events, rb->ring_cnt, timeout_ms);
if (cnt < 0)
return libbpf_err(-errno);
for (i = 0; i < cnt; i++) {
__u32 ring_id = rb->events[i].data.fd;
struct ring *ring = &rb->rings[ring_id];
err = ringbuf_process_ring(ring);
if (err < 0)
return err;
return libbpf_err(err);
res += err;
}
return cnt < 0 ? -errno : res;
if (res > INT_MAX)
return INT_MAX;
return res;
}
/* Get an fd that can be used to sleep until data is available in the ring(s) */

200
src/skel_internal.h Normal file
View File

@@ -0,0 +1,200 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __SKEL_INTERNAL_H
#define __SKEL_INTERNAL_H
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#ifndef __NR_bpf
# if defined(__mips__) && defined(_ABIO32)
# define __NR_bpf 4355
# elif defined(__mips__) && defined(_ABIN32)
# define __NR_bpf 6319
# elif defined(__mips__) && defined(_ABI64)
# define __NR_bpf 5315
# endif
#endif
/* This file is a base header for auto-generated *.lskel.h files.
* Its contents will change and may become part of auto-generation in the future.
*
* The layout of bpf_[map|prog]_desc and bpf_loader_ctx is feature dependent
* and will change from one version of libbpf to another and features
* requested during loader program generation.
*/
struct bpf_map_desc {
union {
/* input for the loader prog */
struct {
__aligned_u64 initial_value;
__u32 max_entries;
};
/* output of the loader prog */
struct {
int map_fd;
};
};
};
struct bpf_prog_desc {
int prog_fd;
};
struct bpf_loader_ctx {
size_t sz;
__u32 log_level;
__u32 log_size;
__u64 log_buf;
};
struct bpf_load_and_run_opts {
struct bpf_loader_ctx *ctx;
const void *data;
const void *insns;
__u32 data_sz;
__u32 insns_sz;
const char *errstr;
};
static inline int skel_sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
unsigned int size)
{
return syscall(__NR_bpf, cmd, attr, size);
}
static inline int skel_closenz(int fd)
{
if (fd > 0)
return close(fd);
return -EINVAL;
}
#ifndef offsetofend
#define offsetofend(TYPE, MEMBER) \
(offsetof(TYPE, MEMBER) + sizeof((((TYPE *)0)->MEMBER)))
#endif
static inline int skel_map_create(enum bpf_map_type map_type,
const char *map_name,
__u32 key_size,
__u32 value_size,
__u32 max_entries)
{
const size_t attr_sz = offsetofend(union bpf_attr, map_extra);
union bpf_attr attr;
memset(&attr, 0, attr_sz);
attr.map_type = map_type;
strncpy(attr.map_name, map_name, sizeof(attr.map_name));
attr.key_size = key_size;
attr.value_size = value_size;
attr.max_entries = max_entries;
return skel_sys_bpf(BPF_MAP_CREATE, &attr, attr_sz);
}
static inline int skel_map_update_elem(int fd, const void *key,
const void *value, __u64 flags)
{
const size_t attr_sz = offsetofend(union bpf_attr, flags);
union bpf_attr attr;
memset(&attr, 0, attr_sz);
attr.map_fd = fd;
attr.key = (long) key;
attr.value = (long) value;
attr.flags = flags;
return skel_sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, attr_sz);
}
static inline int skel_raw_tracepoint_open(const char *name, int prog_fd)
{
const size_t attr_sz = offsetofend(union bpf_attr, raw_tracepoint.prog_fd);
union bpf_attr attr;
memset(&attr, 0, attr_sz);
attr.raw_tracepoint.name = (long) name;
attr.raw_tracepoint.prog_fd = prog_fd;
return skel_sys_bpf(BPF_RAW_TRACEPOINT_OPEN, &attr, attr_sz);
}
static inline int skel_link_create(int prog_fd, int target_fd,
enum bpf_attach_type attach_type)
{
const size_t attr_sz = offsetofend(union bpf_attr, link_create.iter_info_len);
union bpf_attr attr;
memset(&attr, 0, attr_sz);
attr.link_create.prog_fd = prog_fd;
attr.link_create.target_fd = target_fd;
attr.link_create.attach_type = attach_type;
return skel_sys_bpf(BPF_LINK_CREATE, &attr, attr_sz);
}
static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
{
int map_fd = -1, prog_fd = -1, key = 0, err;
union bpf_attr attr;
map_fd = skel_map_create(BPF_MAP_TYPE_ARRAY, "__loader.map", 4, opts->data_sz, 1);
if (map_fd < 0) {
opts->errstr = "failed to create loader map";
err = -errno;
goto out;
}
err = skel_map_update_elem(map_fd, &key, opts->data, 0);
if (err < 0) {
opts->errstr = "failed to update loader map";
err = -errno;
goto out;
}
memset(&attr, 0, sizeof(attr));
attr.prog_type = BPF_PROG_TYPE_SYSCALL;
attr.insns = (long) opts->insns;
attr.insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
attr.license = (long) "Dual BSD/GPL";
memcpy(attr.prog_name, "__loader.prog", sizeof("__loader.prog"));
attr.fd_array = (long) &map_fd;
attr.log_level = opts->ctx->log_level;
attr.log_size = opts->ctx->log_size;
attr.log_buf = opts->ctx->log_buf;
attr.prog_flags = BPF_F_SLEEPABLE;
prog_fd = skel_sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
if (prog_fd < 0) {
opts->errstr = "failed to load loader prog";
err = -errno;
goto out;
}
memset(&attr, 0, sizeof(attr));
attr.test.prog_fd = prog_fd;
attr.test.ctx_in = (long) opts->ctx;
attr.test.ctx_size_in = opts->ctx->sz;
err = skel_sys_bpf(BPF_PROG_RUN, &attr, sizeof(attr));
if (err < 0 || (int)attr.test.retval < 0) {
opts->errstr = "failed to execute loader prog";
if (err < 0) {
err = -errno;
} else {
err = (int)attr.test.retval;
errno = -err;
}
goto out;
}
err = 0;
out:
if (map_fd >= 0)
close(map_fd);
if (prog_fd >= 0)
close(prog_fd);
return err;
}
#endif

177
src/strset.c Normal file
View File

@@ -0,0 +1,177 @@
// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
/* Copyright (c) 2021 Facebook */
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <linux/err.h>
#include "hashmap.h"
#include "libbpf_internal.h"
#include "strset.h"
struct strset {
void *strs_data;
size_t strs_data_len;
size_t strs_data_cap;
size_t strs_data_max_len;
/* lookup index for each unique string in strings set */
struct hashmap *strs_hash;
};
static size_t strset_hash_fn(const void *key, void *ctx)
{
const struct strset *s = ctx;
const char *str = s->strs_data + (long)key;
return str_hash(str);
}
static bool strset_equal_fn(const void *key1, const void *key2, void *ctx)
{
const struct strset *s = ctx;
const char *str1 = s->strs_data + (long)key1;
const char *str2 = s->strs_data + (long)key2;
return strcmp(str1, str2) == 0;
}
struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz)
{
struct strset *set = calloc(1, sizeof(*set));
struct hashmap *hash;
int err = -ENOMEM;
if (!set)
return ERR_PTR(-ENOMEM);
hash = hashmap__new(strset_hash_fn, strset_equal_fn, set);
if (IS_ERR(hash))
goto err_out;
set->strs_data_max_len = max_data_sz;
set->strs_hash = hash;
if (init_data) {
long off;
set->strs_data = malloc(init_data_sz);
if (!set->strs_data)
goto err_out;
memcpy(set->strs_data, init_data, init_data_sz);
set->strs_data_len = init_data_sz;
set->strs_data_cap = init_data_sz;
for (off = 0; off < set->strs_data_len; off += strlen(set->strs_data + off) + 1) {
/* hashmap__add() returns EEXIST if string with the same
* content already is in the hash map
*/
err = hashmap__add(hash, (void *)off, (void *)off);
if (err == -EEXIST)
continue; /* duplicate */
if (err)
goto err_out;
}
}
return set;
err_out:
strset__free(set);
return ERR_PTR(err);
}
void strset__free(struct strset *set)
{
if (IS_ERR_OR_NULL(set))
return;
hashmap__free(set->strs_hash);
free(set->strs_data);
free(set);
}
size_t strset__data_size(const struct strset *set)
{
return set->strs_data_len;
}
const char *strset__data(const struct strset *set)
{
return set->strs_data;
}
static void *strset_add_str_mem(struct strset *set, size_t add_sz)
{
return libbpf_add_mem(&set->strs_data, &set->strs_data_cap, 1,
set->strs_data_len, set->strs_data_max_len, add_sz);
}
/* Find string offset that corresponds to a given string *s*.
* Returns:
* - >0 offset into string data, if string is found;
* - -ENOENT, if string is not in the string data;
* - <0, on any other error.
*/
int strset__find_str(struct strset *set, const char *s)
{
long old_off, new_off, len;
void *p;
/* see strset__add_str() for why we do this */
len = strlen(s) + 1;
p = strset_add_str_mem(set, len);
if (!p)
return -ENOMEM;
new_off = set->strs_data_len;
memcpy(p, s, len);
if (hashmap__find(set->strs_hash, (void *)new_off, (void **)&old_off))
return old_off;
return -ENOENT;
}
/* Add a string s to the string data. If the string already exists, return its
* offset within string data.
* Returns:
* - > 0 offset into string data, on success;
* - < 0, on error.
*/
int strset__add_str(struct strset *set, const char *s)
{
long old_off, new_off, len;
void *p;
int err;
/* Hashmap keys are always offsets within set->strs_data, so to even
* look up some string from the "outside", we need to first append it
* at the end, so that it can be addressed with an offset. Luckily,
* until set->strs_data_len is incremented, that string is just a piece
* of garbage for the rest of the code, so no harm, no foul. On the
* other hand, if the string is unique, it's already appended and
* ready to be used, only a simple set->strs_data_len increment away.
*/
len = strlen(s) + 1;
p = strset_add_str_mem(set, len);
if (!p)
return -ENOMEM;
new_off = set->strs_data_len;
memcpy(p, s, len);
/* Now attempt to add the string, but only if the string with the same
* contents doesn't exist already (HASHMAP_ADD strategy). If such
* string exists, we'll get its offset in old_off (that's old_key).
*/
err = hashmap__insert(set->strs_hash, (void *)new_off, (void *)new_off,
HASHMAP_ADD, (const void **)&old_off, NULL);
if (err == -EEXIST)
return old_off; /* duplicated string, return existing offset */
if (err)
return err;
set->strs_data_len += len; /* new unique string, adjust data length */
return new_off;
}

21
src/strset.h Normal file
View File

@@ -0,0 +1,21 @@
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* Copyright (c) 2021 Facebook */
#ifndef __LIBBPF_STRSET_H
#define __LIBBPF_STRSET_H
#include <stdbool.h>
#include <stddef.h>
struct strset;
struct strset *strset__new(size_t max_data_sz, const char *init_data, size_t init_data_sz);
void strset__free(struct strset *set);
const char *strset__data(const struct strset *set);
size_t strset__data_size(const struct strset *set);
int strset__find_str(struct strset *set, const char *s);
int strset__add_str(struct strset *set, const char *s);
#endif /* __LIBBPF_STRSET_H */

423
src/xsk.c
View File

@@ -28,12 +28,18 @@
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <linux/if_link.h>
#include "bpf.h"
#include "libbpf.h"
#include "libbpf_internal.h"
#include "xsk.h"
/* entire xsk.h and xsk.c is going away in libbpf 1.0, so ignore all internal
* uses of deprecated APIs
*/
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#ifndef SOL_XDP
#define SOL_XDP 283
#endif
@@ -46,6 +52,11 @@
#define PF_XDP AF_XDP
#endif
enum xsk_prog {
XSK_PROG_FALLBACK,
XSK_PROG_REDIRECT_FLAGS,
};
struct xsk_umem {
struct xsk_ring_prod *fill_save;
struct xsk_ring_cons *comp_save;
@@ -54,6 +65,8 @@ struct xsk_umem {
int fd;
int refcount;
struct list_head ctx_list;
bool rx_ring_setup_done;
bool tx_ring_setup_done;
};
struct xsk_ctx {
@@ -65,8 +78,10 @@ struct xsk_ctx {
int ifindex;
struct list_head list;
int prog_fd;
int link_fd;
int xsks_map_fd;
char ifname[IFNAMSIZ];
bool has_bpf_link;
};
struct xsk_socket {
@@ -271,6 +286,7 @@ out_mmap:
return err;
}
DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4)
int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
__u64 size, struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
@@ -289,7 +305,7 @@ int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area,
if (!umem)
return -ENOMEM;
umem->fd = socket(AF_XDP, SOCK_RAW, 0);
umem->fd = socket(AF_XDP, SOCK_RAW | SOCK_CLOEXEC, 0);
if (umem->fd < 0) {
err = -errno;
goto out_umem_alloc;
@@ -335,6 +351,7 @@ struct xsk_umem_config_v1 {
__u32 frame_headroom;
};
COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2)
int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area,
__u64 size, struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
@@ -348,17 +365,49 @@ int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area,
return xsk_umem__create_v0_0_4(umem_ptr, umem_area, size, fill, comp,
&config);
}
COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2)
DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4)
static enum xsk_prog get_xsk_prog(void)
{
enum xsk_prog detected = XSK_PROG_FALLBACK;
__u32 size_out, retval, duration;
char data_in = 0, data_out;
struct bpf_insn insns[] = {
BPF_LD_MAP_FD(BPF_REG_1, 0),
BPF_MOV64_IMM(BPF_REG_2, 0),
BPF_MOV64_IMM(BPF_REG_3, XDP_PASS),
BPF_EMIT_CALL(BPF_FUNC_redirect_map),
BPF_EXIT_INSN(),
};
int prog_fd, map_fd, ret, insn_cnt = ARRAY_SIZE(insns);
map_fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, NULL, sizeof(int), sizeof(int), 1, NULL);
if (map_fd < 0)
return detected;
insns[0].imm = map_fd;
prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL);
if (prog_fd < 0) {
close(map_fd);
return detected;
}
ret = bpf_prog_test_run(prog_fd, 0, &data_in, 1, &data_out, &size_out, &retval, &duration);
if (!ret && retval == XDP_PASS)
detected = XSK_PROG_REDIRECT_FLAGS;
close(prog_fd);
close(map_fd);
return detected;
}
static int xsk_load_xdp_prog(struct xsk_socket *xsk)
{
static const int log_buf_size = 16 * 1024;
struct xsk_ctx *ctx = xsk->ctx;
char log_buf[log_buf_size];
int err, prog_fd;
int prog_fd;
/* This is the C-program:
/* This is the fallback C-program:
* SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
* {
* int ret, index = ctx->rx_queue_index;
@@ -414,24 +463,76 @@ static int xsk_load_xdp_prog(struct xsk_socket *xsk)
/* The jumps are to this instruction */
BPF_EXIT_INSN(),
};
size_t insns_cnt = sizeof(prog) / sizeof(struct bpf_insn);
prog_fd = bpf_load_program(BPF_PROG_TYPE_XDP, prog, insns_cnt,
"LGPL-2.1 or BSD-2-Clause", 0, log_buf,
log_buf_size);
/* This is the post-5.3 kernel C-program:
* SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
* {
* return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
* }
*/
struct bpf_insn prog_redirect_flags[] = {
/* r2 = *(u32 *)(r1 + 16) */
BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, 16),
/* r1 = xskmap[] */
BPF_LD_MAP_FD(BPF_REG_1, ctx->xsks_map_fd),
/* r3 = XDP_PASS */
BPF_MOV64_IMM(BPF_REG_3, 2),
/* call bpf_redirect_map */
BPF_EMIT_CALL(BPF_FUNC_redirect_map),
BPF_EXIT_INSN(),
};
size_t insns_cnt[] = {sizeof(prog) / sizeof(struct bpf_insn),
sizeof(prog_redirect_flags) / sizeof(struct bpf_insn),
};
struct bpf_insn *progs[] = {prog, prog_redirect_flags};
enum xsk_prog option = get_xsk_prog();
LIBBPF_OPTS(bpf_prog_load_opts, opts,
.log_buf = log_buf,
.log_size = log_buf_size,
);
prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "LGPL-2.1 or BSD-2-Clause",
progs[option], insns_cnt[option], &opts);
if (prog_fd < 0) {
pr_warn("BPF log buffer:\n%s", log_buf);
return prog_fd;
}
err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, prog_fd,
xsk->config.xdp_flags);
ctx->prog_fd = prog_fd;
return 0;
}
static int xsk_create_bpf_link(struct xsk_socket *xsk)
{
DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
struct xsk_ctx *ctx = xsk->ctx;
__u32 prog_id = 0;
int link_fd;
int err;
err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
if (err) {
close(prog_fd);
pr_warn("getting XDP prog id failed\n");
return err;
}
ctx->prog_fd = prog_fd;
/* if there's a netlink-based XDP prog loaded on interface, bail out
* and ask user to do the removal by himself
*/
if (prog_id) {
pr_warn("Netlink-based XDP prog detected, please unload it in order to launch AF_XDP prog\n");
return -EINVAL;
}
opts.flags = xsk->config.xdp_flags & ~(XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_REPLACE);
link_fd = bpf_link_create(ctx->prog_fd, ctx->ifindex, BPF_XDP, &opts);
if (link_fd < 0) {
pr_warn("bpf_link_create failed: %s\n", strerror(errno));
return link_fd;
}
ctx->link_fd = link_fd;
return 0;
}
@@ -442,13 +543,12 @@ static int xsk_get_max_queues(struct xsk_socket *xsk)
struct ifreq ifr = {};
int fd, err, ret;
fd = socket(AF_INET, SOCK_DGRAM, 0);
fd = socket(AF_LOCAL, SOCK_DGRAM | SOCK_CLOEXEC, 0);
if (fd < 0)
return -errno;
ifr.ifr_data = (void *)&channels;
memcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ - 1);
ifr.ifr_name[IFNAMSIZ - 1] = '\0';
libbpf_strlcpy(ifr.ifr_name, ctx->ifname, IFNAMSIZ);
err = ioctl(fd, SIOCETHTOOL, &ifr);
if (err && errno != EOPNOTSUPP) {
ret = -errno;
@@ -483,8 +583,8 @@ static int xsk_create_bpf_maps(struct xsk_socket *xsk)
if (max_queues < 0)
return max_queues;
fd = bpf_create_map_name(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, 0);
fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, "xsks_map",
sizeof(int), sizeof(int), max_queues, NULL);
if (fd < 0)
return fd;
@@ -535,21 +635,21 @@ static int xsk_lookup_bpf_maps(struct xsk_socket *xsk)
if (fd < 0)
continue;
memset(&map_info, 0, map_len);
err = bpf_obj_get_info_by_fd(fd, &map_info, &map_len);
if (err) {
close(fd);
continue;
}
if (!strcmp(map_info.name, "xsks_map")) {
if (!strncmp(map_info.name, "xsks_map", sizeof(map_info.name))) {
ctx->xsks_map_fd = fd;
continue;
break;
}
close(fd);
}
err = 0;
if (ctx->xsks_map_fd == -1)
err = -ENOENT;
@@ -566,6 +666,90 @@ static int xsk_set_bpf_maps(struct xsk_socket *xsk)
&xsk->fd, 0);
}
static int xsk_link_lookup(int ifindex, __u32 *prog_id, int *link_fd)
{
struct bpf_link_info link_info;
__u32 link_len;
__u32 id = 0;
int err;
int fd;
while (true) {
err = bpf_link_get_next_id(id, &id);
if (err) {
if (errno == ENOENT) {
err = 0;
break;
}
pr_warn("can't get next link: %s\n", strerror(errno));
break;
}
fd = bpf_link_get_fd_by_id(id);
if (fd < 0) {
if (errno == ENOENT)
continue;
pr_warn("can't get link by id (%u): %s\n", id, strerror(errno));
err = -errno;
break;
}
link_len = sizeof(struct bpf_link_info);
memset(&link_info, 0, link_len);
err = bpf_obj_get_info_by_fd(fd, &link_info, &link_len);
if (err) {
pr_warn("can't get link info: %s\n", strerror(errno));
close(fd);
break;
}
if (link_info.type == BPF_LINK_TYPE_XDP) {
if (link_info.xdp.ifindex == ifindex) {
*link_fd = fd;
if (prog_id)
*prog_id = link_info.prog_id;
break;
}
}
close(fd);
}
return err;
}
static bool xsk_probe_bpf_link(void)
{
LIBBPF_OPTS(bpf_link_create_opts, opts, .flags = XDP_FLAGS_SKB_MODE);
struct bpf_insn insns[2] = {
BPF_MOV64_IMM(BPF_REG_0, XDP_PASS),
BPF_EXIT_INSN()
};
int prog_fd, link_fd = -1, insn_cnt = ARRAY_SIZE(insns);
int ifindex_lo = 1;
bool ret = false;
int err;
err = xsk_link_lookup(ifindex_lo, NULL, &link_fd);
if (err)
return ret;
if (link_fd >= 0)
return true;
prog_fd = bpf_prog_load(BPF_PROG_TYPE_XDP, NULL, "GPL", insns, insn_cnt, NULL);
if (prog_fd < 0)
return ret;
link_fd = bpf_link_create(prog_fd, ifindex_lo, BPF_XDP, &opts);
close(prog_fd);
if (link_fd >= 0) {
ret = true;
close(link_fd);
}
return ret;
}
static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
{
char ifname[IFNAMSIZ];
@@ -583,69 +767,112 @@ static int xsk_create_xsk_struct(int ifindex, struct xsk_socket *xsk)
}
ctx->ifindex = ifindex;
memcpy(ctx->ifname, ifname, IFNAMSIZ -1);
ctx->ifname[IFNAMSIZ - 1] = 0;
libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
xsk->ctx = ctx;
xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
return 0;
}
static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp,
int *xsks_map_fd)
static int xsk_init_xdp_res(struct xsk_socket *xsk,
int *xsks_map_fd)
{
struct xsk_ctx *ctx = xsk->ctx;
int err;
err = xsk_create_bpf_maps(xsk);
if (err)
return err;
err = xsk_load_xdp_prog(xsk);
if (err)
goto err_load_xdp_prog;
if (ctx->has_bpf_link)
err = xsk_create_bpf_link(xsk);
else
err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, ctx->prog_fd,
xsk->config.xdp_flags);
if (err)
goto err_attach_xdp_prog;
if (!xsk->rx)
return err;
err = xsk_set_bpf_maps(xsk);
if (err)
goto err_set_bpf_maps;
return err;
err_set_bpf_maps:
if (ctx->has_bpf_link)
close(ctx->link_fd);
else
bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
err_attach_xdp_prog:
close(ctx->prog_fd);
err_load_xdp_prog:
xsk_delete_bpf_maps(xsk);
return err;
}
static int xsk_lookup_xdp_res(struct xsk_socket *xsk, int *xsks_map_fd, int prog_id)
{
struct xsk_ctx *ctx = xsk->ctx;
int err;
ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
if (ctx->prog_fd < 0) {
err = -errno;
goto err_prog_fd;
}
err = xsk_lookup_bpf_maps(xsk);
if (err)
goto err_lookup_maps;
if (!xsk->rx)
return err;
err = xsk_set_bpf_maps(xsk);
if (err)
goto err_set_maps;
return err;
err_set_maps:
close(ctx->xsks_map_fd);
err_lookup_maps:
close(ctx->prog_fd);
err_prog_fd:
if (ctx->has_bpf_link)
close(ctx->link_fd);
return err;
}
static int __xsk_setup_xdp_prog(struct xsk_socket *_xdp, int *xsks_map_fd)
{
struct xsk_socket *xsk = _xdp;
struct xsk_ctx *ctx = xsk->ctx;
__u32 prog_id = 0;
int err;
err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id,
xsk->config.xdp_flags);
if (ctx->has_bpf_link)
err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd);
else
err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags);
if (err)
return err;
if (!prog_id) {
err = xsk_create_bpf_maps(xsk);
if (err)
return err;
err = !prog_id ? xsk_init_xdp_res(xsk, xsks_map_fd) :
xsk_lookup_xdp_res(xsk, xsks_map_fd, prog_id);
err = xsk_load_xdp_prog(xsk);
if (err) {
goto err_load_xdp_prog;
}
} else {
ctx->prog_fd = bpf_prog_get_fd_by_id(prog_id);
if (ctx->prog_fd < 0)
return -errno;
err = xsk_lookup_bpf_maps(xsk);
if (err) {
close(ctx->prog_fd);
return err;
}
}
if (xsk->rx) {
err = xsk_set_bpf_maps(xsk);
if (err) {
if (!prog_id) {
goto err_set_bpf_maps;
} else {
close(ctx->prog_fd);
return err;
}
}
}
if (xsks_map_fd)
if (!err && xsks_map_fd)
*xsks_map_fd = ctx->xsks_map_fd;
return 0;
err_set_bpf_maps:
close(ctx->prog_fd);
bpf_set_link_xdp_fd(ctx->ifindex, -1, 0);
err_load_xdp_prog:
xsk_delete_bpf_maps(xsk);
return err;
}
@@ -667,26 +894,30 @@ static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex,
return NULL;
}
static void xsk_put_ctx(struct xsk_ctx *ctx)
static void xsk_put_ctx(struct xsk_ctx *ctx, bool unmap)
{
struct xsk_umem *umem = ctx->umem;
struct xdp_mmap_offsets off;
int err;
if (--ctx->refcount == 0) {
err = xsk_get_mmap_offsets(umem->fd, &off);
if (!err) {
munmap(ctx->fill->ring - off.fr.desc,
off.fr.desc + umem->config.fill_size *
sizeof(__u64));
munmap(ctx->comp->ring - off.cr.desc,
off.cr.desc + umem->config.comp_size *
sizeof(__u64));
}
if (--ctx->refcount)
return;
list_del(&ctx->list);
free(ctx);
}
if (!unmap)
goto out_free;
err = xsk_get_mmap_offsets(umem->fd, &off);
if (err)
goto out_free;
munmap(ctx->fill->ring - off.fr.desc, off.fr.desc + umem->config.fill_size *
sizeof(__u64));
munmap(ctx->comp->ring - off.cr.desc, off.cr.desc + umem->config.comp_size *
sizeof(__u64));
out_free:
list_del(&ctx->list);
free(ctx);
}
static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk,
@@ -718,11 +949,8 @@ static struct xsk_ctx *xsk_create_ctx(struct xsk_socket *xsk,
ctx->refcount = 1;
ctx->umem = umem;
ctx->queue_id = queue_id;
memcpy(ctx->ifname, ifname, IFNAMSIZ - 1);
ctx->ifname[IFNAMSIZ - 1] = '\0';
libbpf_strlcpy(ctx->ifname, ifname, IFNAMSIZ);
umem->fill_save = NULL;
umem->comp_save = NULL;
ctx->fill = fill;
ctx->comp = comp;
list_add(&ctx->list, &umem->ctx_list);
@@ -772,6 +1000,7 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
struct xsk_ring_cons *comp,
const struct xsk_socket_config *usr_config)
{
bool unmap, rx_setup_done = false, tx_setup_done = false;
void *rx_map = NULL, *tx_map = NULL;
struct sockaddr_xdp sxdp = {};
struct xdp_mmap_offsets off;
@@ -782,6 +1011,8 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
if (!umem || !xsk_ptr || !(rx || tx))
return -EFAULT;
unmap = umem->fill_save != fill;
xsk = calloc(1, sizeof(*xsk));
if (!xsk)
return -ENOMEM;
@@ -798,13 +1029,15 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
}
if (umem->refcount++ > 0) {
xsk->fd = socket(AF_XDP, SOCK_RAW, 0);
xsk->fd = socket(AF_XDP, SOCK_RAW | SOCK_CLOEXEC, 0);
if (xsk->fd < 0) {
err = -errno;
goto out_xsk_alloc;
}
} else {
xsk->fd = umem->fd;
rx_setup_done = umem->rx_ring_setup_done;
tx_setup_done = umem->tx_ring_setup_done;
}
ctx = xsk_get_ctx(umem, ifindex, queue_id);
@@ -822,8 +1055,9 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
}
}
xsk->ctx = ctx;
xsk->ctx->has_bpf_link = xsk_probe_bpf_link();
if (rx) {
if (rx && !rx_setup_done) {
err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING,
&xsk->config.rx_size,
sizeof(xsk->config.rx_size));
@@ -831,8 +1065,10 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
err = -errno;
goto out_put_ctx;
}
if (xsk->fd == umem->fd)
umem->rx_ring_setup_done = true;
}
if (tx) {
if (tx && !tx_setup_done) {
err = setsockopt(xsk->fd, SOL_XDP, XDP_TX_RING,
&xsk->config.tx_size,
sizeof(xsk->config.tx_size));
@@ -840,6 +1076,8 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
err = -errno;
goto out_put_ctx;
}
if (xsk->fd == umem->fd)
umem->tx_ring_setup_done = true;
}
err = xsk_get_mmap_offsets(xsk->fd, &off);
@@ -918,6 +1156,8 @@ int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
}
*xsk_ptr = xsk;
umem->fill_save = NULL;
umem->comp_save = NULL;
return 0;
out_mmap_tx:
@@ -929,7 +1169,7 @@ out_mmap_rx:
munmap(rx_map, off.rx.desc +
xsk->config.rx_size * sizeof(struct xdp_desc));
out_put_ctx:
xsk_put_ctx(ctx);
xsk_put_ctx(ctx, unmap);
out_socket:
if (--umem->refcount)
close(xsk->fd);
@@ -943,6 +1183,9 @@ int xsk_socket__create(struct xsk_socket **xsk_ptr, const char *ifname,
struct xsk_ring_cons *rx, struct xsk_ring_prod *tx,
const struct xsk_socket_config *usr_config)
{
if (!umem)
return -EFAULT;
return xsk_socket__create_shared(xsk_ptr, ifname, queue_id, umem,
rx, tx, umem->fill_save,
umem->comp_save, usr_config);
@@ -978,6 +1221,8 @@ void xsk_socket__delete(struct xsk_socket *xsk)
if (ctx->prog_fd != -1) {
xsk_delete_bpf_maps(xsk);
close(ctx->prog_fd);
if (ctx->has_bpf_link)
close(ctx->link_fd);
}
err = xsk_get_mmap_offsets(xsk->fd, &off);
@@ -992,7 +1237,7 @@ void xsk_socket__delete(struct xsk_socket *xsk)
}
}
xsk_put_ctx(ctx);
xsk_put_ctx(ctx, true);
umem->refcount--;
/* Do not close an fd that also has an associated umem connected

177
src/xsk.h
View File

@@ -3,7 +3,8 @@
/*
* AF_XDP user-space access library.
*
* Copyright(c) 2018 - 2019 Intel Corporation.
* Copyright (c) 2018 - 2019 Intel Corporation.
* Copyright (c) 2019 Facebook
*
* Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
*/
@@ -13,15 +14,86 @@
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
#include <linux/if_xdp.h>
#include "libbpf.h"
#include "libbpf_util.h"
#ifdef __cplusplus
extern "C" {
#endif
/* This whole API has been deprecated and moved to libxdp that can be found at
* https://github.com/xdp-project/xdp-tools. The APIs are exactly the same so
* it should just be linking with libxdp instead of libbpf for this set of
* functionality. If not, please submit a bug report on the aforementioned page.
*/
/* Load-Acquire Store-Release barriers used by the XDP socket
* library. The following macros should *NOT* be considered part of
* the xsk.h API, and is subject to change anytime.
*
* LIBRARY INTERNAL
*/
#define __XSK_READ_ONCE(x) (*(volatile typeof(x) *)&x)
#define __XSK_WRITE_ONCE(x, v) (*(volatile typeof(x) *)&x) = (v)
#if defined(__i386__) || defined(__x86_64__)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile("" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile("" : : : "memory"); \
___p1; \
})
#elif defined(__aarch64__)
# define libbpf_smp_store_release(p, v) \
asm volatile ("stlr %w1, %0" : "=Q" (*p) : "r" (v) : "memory")
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1; \
asm volatile ("ldar %w0, %1" \
: "=r" (___p1) : "Q" (*p) : "memory"); \
___p1; \
})
#elif defined(__riscv)
# define libbpf_smp_store_release(p, v) \
do { \
asm volatile ("fence rw,w" : : : "memory"); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
# define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
asm volatile ("fence r,rw" : : : "memory"); \
___p1; \
})
#endif
#ifndef libbpf_smp_store_release
#define libbpf_smp_store_release(p, v) \
do { \
__sync_synchronize(); \
__XSK_WRITE_ONCE(*p, v); \
} while (0)
#endif
#ifndef libbpf_smp_load_acquire
#define libbpf_smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = __XSK_READ_ONCE(*p); \
__sync_synchronize(); \
___p1; \
})
#endif
/* LIBRARY INTERNAL -- END */
/* Do not access these members directly. Use the functions below. */
#define DEFINE_XSK_RING(name) \
struct name { \
@@ -96,7 +168,8 @@ static inline __u32 xsk_prod_nb_free(struct xsk_ring_prod *r, __u32 nb)
* this function. Without this optimization it whould have been
* free_entries = r->cached_prod - r->cached_cons + r->size.
*/
r->cached_cons = *r->consumer + r->size;
r->cached_cons = libbpf_smp_load_acquire(r->consumer);
r->cached_cons += r->size;
return r->cached_cons - r->cached_prod;
}
@@ -106,7 +179,7 @@ static inline __u32 xsk_cons_nb_avail(struct xsk_ring_cons *r, __u32 nb)
__u32 entries = r->cached_prod - r->cached_cons;
if (entries == 0) {
r->cached_prod = *r->producer;
r->cached_prod = libbpf_smp_load_acquire(r->producer);
entries = r->cached_prod - r->cached_cons;
}
@@ -129,9 +202,7 @@ static inline void xsk_ring_prod__submit(struct xsk_ring_prod *prod, __u32 nb)
/* Make sure everything has been written to the ring before indicating
* this to the kernel by writing the producer pointer.
*/
libbpf_smp_wmb();
*prod->producer += nb;
libbpf_smp_store_release(prod->producer, *prod->producer + nb);
}
static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __u32 *idx)
@@ -139,11 +210,6 @@ static inline __u32 xsk_ring_cons__peek(struct xsk_ring_cons *cons, __u32 nb, __
__u32 entries = xsk_cons_nb_avail(cons, nb);
if (entries > 0) {
/* Make sure we do not speculatively read the data before
* we have received the packet buffers from the ring.
*/
libbpf_smp_rmb();
*idx = cons->cached_cons;
cons->cached_cons += entries;
}
@@ -161,9 +227,8 @@ static inline void xsk_ring_cons__release(struct xsk_ring_cons *cons, __u32 nb)
/* Make sure data has been read before indicating we are done
* with the entries by updating the consumer pointer.
*/
libbpf_smp_rwmb();
libbpf_smp_store_release(cons->consumer, *cons->consumer + nb);
*cons->consumer += nb;
}
static inline void *xsk_umem__get_data(void *umem_area, __u64 addr)
@@ -186,8 +251,10 @@ static inline __u64 xsk_umem__add_offset_to_addr(__u64 addr)
return xsk_umem__extract_addr(addr) + xsk_umem__extract_offset(addr);
}
LIBBPF_API int xsk_umem__fd(const struct xsk_umem *umem);
LIBBPF_API int xsk_socket__fd(const struct xsk_socket *xsk);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__fd(const struct xsk_umem *umem);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__fd(const struct xsk_socket *xsk);
#define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
#define XSK_RING_PROD__DEFAULT_NUM_DESCS 2048
@@ -204,10 +271,10 @@ struct xsk_umem_config {
__u32 flags;
};
LIBBPF_API int xsk_setup_xdp_prog(int ifindex,
int *xsks_map_fd);
LIBBPF_API int xsk_socket__update_xskmap(struct xsk_socket *xsk,
int xsks_map_fd);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
/* Flags for the libbpf_flags field. */
#define XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD (1 << 0)
@@ -221,40 +288,46 @@ struct xsk_socket_config {
};
/* Set config to NULL to get the default configuration. */
LIBBPF_API int xsk_umem__create(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_umem__create_v0_0_2(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_umem__create_v0_0_4(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API int xsk_socket__create(struct xsk_socket **xsk,
const char *ifname, __u32 queue_id,
struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
const struct xsk_socket_config *config);
LIBBPF_API int
xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
__u32 queue_id, struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_socket_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create_v0_0_2(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__create_v0_0_4(struct xsk_umem **umem,
void *umem_area, __u64 size,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_umem_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__create(struct xsk_socket **xsk,
const char *ifname, __u32 queue_id,
struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
const struct xsk_socket_config *config);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
const char *ifname,
__u32 queue_id, struct xsk_umem *umem,
struct xsk_ring_cons *rx,
struct xsk_ring_prod *tx,
struct xsk_ring_prod *fill,
struct xsk_ring_cons *comp,
const struct xsk_socket_config *config);
/* Returns 0 for success and -EBUSY if the umem is still in use. */
LIBBPF_API int xsk_umem__delete(struct xsk_umem *umem);
LIBBPF_API void xsk_socket__delete(struct xsk_socket *xsk);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
int xsk_umem__delete(struct xsk_umem *umem);
LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
void xsk_socket__delete(struct xsk_socket *xsk);
#ifdef __cplusplus
} /* extern "C" */

View File

@@ -0,0 +1,35 @@
From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCH bpf-next] selftests/bpf: Fix OOB write in test_verifier
Date: Tue, 14 Dec 2021 07:18:00 +0530 [thread overview]
Message-ID: <20211214014800.78762-1-memxor@gmail.com> (raw)
The commit referenced below added fixup_map_timer support (to create a
BPF map containing timers), but failed to increase the size of the
map_fds array, leading to out of bounds write. Fix this by changing
MAX_NR_MAPS to 22.
Fixes: e60e6962c503 ("selftests/bpf: Add tests for restricted helpers")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
tools/testing/selftests/bpf/test_verifier.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index ad5d30bafd93..33e2ecb3bef9 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -54,7 +54,7 @@
#define MAX_INSNS BPF_MAXINSNS
#define MAX_TEST_INSNS 1000000
#define MAX_FIXUPS 8
-#define MAX_NR_MAPS 21
+#define MAX_NR_MAPS 22
#define MAX_TEST_RUNS 8
#define POINTER_VALUE 0xcafe4all
#define TEST_DATA_LEN 64
--
2.34.1

View File

@@ -6,7 +6,7 @@ CONT_NAME="${CONT_NAME:-libbpf-debian-$DEBIAN_RELEASE}"
ENV_VARS="${ENV_VARS:-}"
DOCKER_RUN="${DOCKER_RUN:-docker run}"
REPO_ROOT="${REPO_ROOT:-$PWD}"
ADDITIONAL_DEPS=(clang pkg-config gcc-8)
ADDITIONAL_DEPS=(clang pkg-config gcc-10)
CFLAGS="-g -O2 -Werror -Wall"
function info() {
@@ -18,10 +18,10 @@ function error() {
}
function docker_exec() {
docker exec $ENV_VARS -it $CONT_NAME "$@"
docker exec $ENV_VARS $CONT_NAME "$@"
}
set -e
set -eu
source "$(dirname $0)/travis_wait.bash"
@@ -31,7 +31,6 @@ for phase in "${PHASES[@]}"; do
info "Setup phase"
info "Using Debian $DEBIAN_RELEASE"
sudo apt-get -y -o Dpkg::Options::="--force-confnew" install docker-ce
docker --version
docker pull debian:$DEBIAN_RELEASE
@@ -39,19 +38,24 @@ for phase in "${PHASES[@]}"; do
$DOCKER_RUN -v $REPO_ROOT:/build:rw \
-w /build --privileged=true --name $CONT_NAME \
-dit --net=host debian:$DEBIAN_RELEASE /bin/bash
echo -e "::group::Build Env Setup"
docker_exec bash -c "echo deb-src http://deb.debian.org/debian $DEBIAN_RELEASE main >>/etc/apt/sources.list"
docker_exec apt-get -y update
docker_exec apt-get -y build-dep libelf-dev
docker_exec apt-get -y install libelf-dev
docker_exec apt-get -y install "${ADDITIONAL_DEPS[@]}"
docker_exec apt-get -y install aptitude
docker_exec aptitude -y build-dep libelf-dev
docker_exec aptitude -y install libelf-dev
docker_exec aptitude -y install "${ADDITIONAL_DEPS[@]}"
echo -e "::endgroup::"
;;
RUN|RUN_CLANG|RUN_GCC8|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC8_ASAN)
RUN|RUN_CLANG|RUN_GCC10|RUN_ASAN|RUN_CLANG_ASAN|RUN_GCC10_ASAN)
CC="cc"
if [[ "$phase" = *"CLANG"* ]]; then
ENV_VARS="-e CC=clang -e CXX=clang++"
CC="clang"
elif [[ "$phase" = *"GCC8"* ]]; then
ENV_VARS="-e CC=gcc-8 -e CXX=g++-8"
CC="gcc-8"
elif [[ "$phase" = *"GCC10"* ]]; then
ENV_VARS="-e CC=gcc-10 -e CXX=g++-10"
CC="gcc-10"
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
else
CFLAGS="${CFLAGS} -Wno-stringop-truncation"
fi
@@ -59,9 +63,9 @@ for phase in "${PHASES[@]}"; do
CFLAGS="${CFLAGS} -fsanitize=address,undefined"
fi
docker_exec mkdir build install
docker_exec ${CC:-cc} --version
docker_exec ${CC} --version
info "build"
docker_exec make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
docker_exec make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
info "ldd build/libbpf.so:"
docker_exec ldd build/libbpf.so
if ! docker_exec ldd build/libbpf.so | grep -q libelf; then
@@ -70,7 +74,8 @@ for phase in "${PHASES[@]}"; do
fi
info "install"
docker_exec make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
docker_exec rm -rf build install
info "link binary"
docker_exec bash -c "CFLAGS=\"${CFLAGS}\" ./travis-ci/managers/test_compile.sh"
;;
CLEANUP)
info "Cleanup phase"

View File

@@ -0,0 +1,14 @@
#!/bin/bash
set -euox pipefail
CFLAGS=${CFLAGS:-}
cat << EOF > main.c
#include <bpf/libbpf.h>
int main() {
return bpf_object__open(0) < 0;
}
EOF
# static linking
${CC:-cc} ${CFLAGS} -o main -I./install/usr/include main.c ./build/libbpf.a -lelf -lz

View File

@@ -1,20 +1,16 @@
#!/bin/bash
set -e
set -x
set -eux
RELEASE="bionic"
echo "deb-src http://archive.ubuntu.com/ubuntu/ $RELEASE main restricted universe multiverse" >>/etc/apt/sources.list
RELEASE="focal"
apt-get update
apt-get -y build-dep libelf-dev
apt-get install -y libelf-dev pkg-config
apt-get install -y pkg-config
source "$(dirname $0)/travis_wait.bash"
cd $REPO_ROOT
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined"
CFLAGS="-g -O2 -Werror -Wall -fsanitize=address,undefined -Wno-stringop-truncation"
mkdir build install
cc --version
make -j$((4*$(nproc))) CFLAGS="${CFLAGS}" -C ./src -B OBJDIR=../build
@@ -24,4 +20,4 @@ if ! ldd build/libbpf.so | grep -q libelf; then
exit 1
fi
make -j$((4*$(nproc))) -C src OBJDIR=../build DESTDIR=../install install
rm -rf build install
CFLAGS=${CFLAGS} $(dirname $0)/test_compile.sh

View File

@@ -100,59 +100,7 @@ rm -rf "$root/var/lib/pacman/sync/"
# We don't need any documentation.
rm -rf "$root/usr/share/{doc,help,man,texinfo}"
chroot "${root}" /bin/busybox --install
"$(dirname "$0")"/mkrootfs_tweak.sh "$root"
cat > "$root/etc/inittab" << "EOF"
::sysinit:/etc/init.d/rcS
::ctrlaltdel:/sbin/reboot
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::restart:/sbin/init
EOF
chmod 644 "$root/etc/inittab"
mkdir -m 755 "$root/etc/init.d" "$root/etc/rcS.d"
cat > "$root/etc/rcS.d/S10-mount" << "EOF"
#!/bin/sh
set -eux
/bin/mount proc /proc -t proc
# Mount devtmpfs if not mounted
if [[ -z $(/bin/mount -l -t devtmpfs) ]]; then
/bin/mount devtmpfs /dev -t devtmpfs
fi
/bin/mount sysfs /sys -t sysfs
/bin/mount bpffs /sys/fs/bpf -t bpf
/bin/mount debugfs /sys/kernel/debug -t debugfs
echo 'Listing currently mounted file systems'
/bin/mount
EOF
chmod 755 "$root/etc/rcS.d/S10-mount"
cat > "$root/etc/rcS.d/S40-network" << "EOF"
#!/bin/sh
set -eux
ip link set lo up
EOF
chmod 755 "$root/etc/rcS.d/S40-network"
cat > "$root/etc/init.d/rcS" << "EOF"
#!/bin/sh
set -eux
for path in /etc/rcS.d/S*; do
[ -x "$path" ] && "$path"
done
EOF
chmod 755 "$root/etc/init.d/rcS"
chmod 755 "$root"
tar -C "$root" -c . | zstd -T0 -19 -o "$NAME"
chmod 644 "$NAME"

View File

@@ -0,0 +1,40 @@
#!/bin/bash
# This script builds a Debian root filesystem image for testing libbpf in a
# virtual machine. Requires debootstrap >= 1.0.95 and zstd.
set -e -u -x -o pipefail
# Check whether we are root now in order to avoid confusing errors later.
if [ "$(id -u)" != 0 ]; then
echo "$0 must run as root" >&2
exit 1
fi
# Create a working directory and schedule its deletion.
root=$(mktemp -d -p "$PWD")
trap 'rm -r "$root"' EXIT
# Install packages.
packages=binutils,busybox,elfutils,iproute2,libcap2,libelf1,strace,zlib1g
debootstrap --include="$packages" --variant=minbase bullseye "$root"
# Remove the init scripts (tests use their own). Also remove various
# unnecessary files in order to save space.
rm -rf \
"$root"/etc/rcS.d \
"$root"/usr/share/{doc,info,locale,man,zoneinfo} \
"$root"/var/cache/apt/archives/* \
"$root"/var/lib/apt/lists/*
# Save some more space by removing coreutils - the tests use busybox. Before
# doing that, delete the buggy postrm script, which uses the rm command.
rm -f "$root/var/lib/dpkg/info/coreutils.postrm"
chroot "$root" dpkg --remove --force-remove-essential coreutils
# Apply common tweaks.
"$(dirname "$0")"/mkrootfs_tweak.sh "$root"
# Save the result.
name="libbpf-vmtest-rootfs-$(date +%Y.%m.%d).tar.zst"
rm -f "$name"
tar -C "$root" -c . | zstd -T0 -19 -o "$name"

View File

@@ -0,0 +1,61 @@
#!/bin/bash
# This script prepares a mounted root filesystem for testing libbpf in a virtual
# machine.
set -e -u -x -o pipefail
root=$1
shift
chroot "${root}" /bin/busybox --install
cat > "$root/etc/inittab" << "EOF"
::sysinit:/etc/init.d/rcS
::ctrlaltdel:/sbin/reboot
::shutdown:/sbin/swapoff -a
::shutdown:/bin/umount -a -r
::restart:/sbin/init
EOF
chmod 644 "$root/etc/inittab"
mkdir -m 755 -p "$root/etc/init.d" "$root/etc/rcS.d"
cat > "$root/etc/rcS.d/S10-mount" << "EOF"
#!/bin/sh
set -eux
/bin/mount proc /proc -t proc
# Mount devtmpfs if not mounted
if [[ -z $(/bin/mount -t devtmpfs) ]]; then
/bin/mount devtmpfs /dev -t devtmpfs
fi
/bin/mount sysfs /sys -t sysfs
/bin/mount bpffs /sys/fs/bpf -t bpf
/bin/mount debugfs /sys/kernel/debug -t debugfs
echo 'Listing currently mounted file systems'
/bin/mount
EOF
chmod 755 "$root/etc/rcS.d/S10-mount"
cat > "$root/etc/rcS.d/S40-network" << "EOF"
#!/bin/sh
set -eux
ip link set lo up
EOF
chmod 755 "$root/etc/rcS.d/S40-network"
cat > "$root/etc/init.d/rcS" << "EOF"
#!/bin/sh
set -eux
for path in /etc/rcS.d/S*; do
[ -x "$path" ] && "$path"
done
EOF
chmod 755 "$root/etc/init.d/rcS"
chmod 755 "$root"

View File

@@ -0,0 +1,74 @@
# IBM Z self-hosted builder
libbpf CI uses an IBM-provided z15 self-hosted builder. There are no IBM Z
builds of GitHub Actions runner, and stable qemu-user has problems with .NET
apps, so the builder runs the x86_64 runner version with qemu-user built from
the master branch.
## Configuring the builder.
### Install prerequisites.
```
$ sudo dnf install docker # RHEL
$ sudo apt install -y docker.io # Ubuntu
```
### Add services.
```
$ sudo cp *.service /etc/systemd/system/
$ sudo systemctl daemon-reload
```
### Create a config file.
```
$ sudo tee /etc/actions-runner-libbpf
repo=<owner>/<name>
access_token=<ghp_***>
```
Access token should have the repo scope, consult
https://docs.github.com/en/rest/reference/actions#create-a-registration-token-for-a-repository
for details.
### Autostart the x86_64 emulation support.
```
$ sudo systemctl enable --now qemu-user-static
```
### Autostart the runner.
```
$ sudo systemctl enable --now actions-runner-libbpf
```
## Rebuilding the image
In order to update the `iiilinuxibmcom/actions-runner-libbpf` image, e.g. to
get the latest OS security fixes, use the following commands:
```
$ sudo docker build \
--pull \
-f actions-runner-libbpf.Dockerfile \
-t iiilinuxibmcom/actions-runner-libbpf \
.
$ sudo systemctl restart actions-runner-libbpf
```
## Removing persistent data
The `actions-runner-libbpf` service stores various temporary data, such as
runner registration information, work directories and logs, in the
`actions-runner-libbpf` volume. In order to remove it and start from scratch,
e.g. when upgrading the runner or switching it to a different repository, use
the following commands:
```
$ sudo systemctl stop actions-runner-libbpf
$ sudo docker rm -f actions-runner-libbpf
$ sudo docker volume rm actions-runner-libbpf
```

View File

@@ -0,0 +1,50 @@
# Self-Hosted IBM Z Github Actions Runner.
# Temporary image: amd64 dependencies.
FROM amd64/ubuntu:20.04 as ld-prefix
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get -y install ca-certificates libicu66 libssl1.1
# Main image.
FROM s390x/ubuntu:20.04
# Packages for libbpf testing that are not installed by .github/actions/setup.
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get -y install \
bc \
bison \
cmake \
cpu-checker \
curl \
flex \
git \
jq \
linux-image-generic \
qemu-system-s390x \
rsync \
software-properties-common \
sudo \
tree
# amd64 dependencies.
COPY --from=ld-prefix / /usr/x86_64-linux-gnu/
RUN ln -fs ../lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /usr/x86_64-linux-gnu/lib64/
RUN ln -fs /etc/resolv.conf /usr/x86_64-linux-gnu/etc/
ENV QEMU_LD_PREFIX=/usr/x86_64-linux-gnu
# amd64 Github Actions Runner.
ARG version=2.285.0
RUN useradd -m actions-runner
RUN echo "actions-runner ALL=(ALL) NOPASSWD: ALL" >>/etc/sudoers
RUN echo "Defaults env_keep += \"DEBIAN_FRONTEND\"" >>/etc/sudoers
RUN usermod -a -G kvm actions-runner
USER actions-runner
ENV USER=actions-runner
WORKDIR /home/actions-runner
RUN curl -L https://github.com/actions/runner/releases/download/v${version}/actions-runner-linux-x64-${version}.tar.gz | tar -xz
VOLUME /home/actions-runner
# Scripts.
COPY fs/ /
ENTRYPOINT ["/usr/bin/entrypoint"]
CMD ["/usr/bin/actions-runner"]

View File

@@ -0,0 +1,24 @@
[Unit]
Description=Self-Hosted IBM Z Github Actions Runner
Wants=qemu-user-static
After=qemu-user-static
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
ExecStart=/usr/bin/docker run \
--device=/dev/kvm \
--env-file=/etc/actions-runner-libbpf \
--init \
--interactive \
--name=actions-runner-libbpf \
--rm \
--volume=actions-runner-libbpf:/home/actions-runner \
iiilinuxibmcom/actions-runner-libbpf
ExecStop=/bin/sh -c "docker exec actions-runner-libbpf kill -INT -- -1"
ExecStop=/bin/sh -c "docker wait actions-runner-libbpf"
ExecStop=/bin/sh -c "docker rm actions-runner-libbpf"
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,40 @@
#!/bin/bash
#
# Ephemeral runner startup script.
#
# Expects the following environment variables:
#
# - repo=<owner>/<name>
# - access_token=<ghp_***>
#
set -e -u
# Check the cached registration token.
token_file=registration-token.json
set +e
expires_at=$(jq --raw-output .expires_at "$token_file" 2>/dev/null)
status=$?
set -e
if [[ $status -ne 0 || $(date +%s) -ge $(date -d "$expires_at" +%s) ]]; then
# Refresh the cached registration token.
curl \
-X POST \
-H "Accept: application/vnd.github.v3+json" \
-H "Authorization: token $access_token" \
"https://api.github.com/repos/$repo/actions/runners/registration-token" \
-o "$token_file"
fi
# (Re-)register the runner.
registration_token=$(jq --raw-output .token "$token_file")
./config.sh remove --token "$registration_token" || true
./config.sh \
--url "https://github.com/$repo" \
--token "$registration_token" \
--labels z15 \
--ephemeral
# Run one job.
./run.sh

View File

@@ -0,0 +1,35 @@
#!/bin/bash
#
# Container entrypoint that waits for all spawned processes.
#
set -e -u
# /dev/kvm has host permissions, fix it.
if [ -e /dev/kvm ]; then
sudo chown root:kvm /dev/kvm
fi
# Create a FIFO and start reading from its read end.
tempdir=$(mktemp -d "/tmp/done.XXXXXXXXXX")
trap 'rm -r "$tempdir"' EXIT
done="$tempdir/pipe"
mkfifo "$done"
cat "$done" & waiter=$!
# Start the workload. Its descendants will inherit the FIFO's write end.
status=0
if [ "$#" -eq 0 ]; then
bash 9>"$done" || status=$?
else
"$@" 9>"$done" || status=$?
fi
# When the workload and all of its descendants exit, the FIFO's write end will
# be closed and `cat "$done"` will exit. Wait until it happens. This is needed
# in order to handle SelfUpdater, which the workload may start in background
# before exiting.
wait "$waiter"
exit "$status"

View File

@@ -0,0 +1,11 @@
[Unit]
Description=Support for transparent execution of non-native binaries with QEMU user emulation
[Service]
Type=oneshot
# The source code for iiilinuxibmcom/qemu-user-static is at https://github.com/iii-i/qemu-user-static/tree/v6.1.0-1
# TODO: replace it with multiarch/qemu-user-static once version >6.1 is available
ExecStart=/usr/bin/docker run --rm --interactive --privileged iiilinuxibmcom/qemu-user-static --reset -p yes
[Install]
WantedBy=multi-user.target

View File

@@ -1,30 +0,0 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
travis_fold start build_pahole "Building pahole"
CWD=$(pwd)
REPO_PATH=$1
PAHOLE_ORIGIN=https://git.kernel.org/pub/scm/devel/pahole/pahole.git
mkdir -p ${REPO_PATH}
cd ${REPO_PATH}
git init
git remote add origin ${PAHOLE_ORIGIN}
git fetch origin
git checkout master
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -D__LIB=lib ..
make -j$((4*$(nproc))) all
sudo make install
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}:/usr/local/lib
ldd $(which pahole)
pahole --version
travis_fold end build_pahole

View File

@@ -1,44 +0,0 @@
#!/bin/bash
set -eu
source $(cd $(dirname $0) && pwd)/helpers.sh
CWD=$(pwd)
LIBBPF_PATH=$(pwd)
REPO_PATH=$1
BPF_NEXT_ORIGIN=https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
LINUX_SHA=$(cat ${LIBBPF_PATH}/CHECKPOINT-COMMIT)
SNAPSHOT_URL=https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/snapshot/bpf-next-${LINUX_SHA}.tar.gz
echo REPO_PATH = ${REPO_PATH}
echo LINUX_SHA = ${LINUX_SHA}
if [ ! -d "${REPO_PATH}" ]; then
echo
travis_fold start pull_kernel_srcs "Fetching kernel sources"
mkdir -p $(dirname "${REPO_PATH}")
cd $(dirname "${REPO_PATH}")
# attempt to fetch desired bpf-next repo snapshot
if wget ${SNAPSHOT_URL} && tar xf bpf-next-${LINUX_SHA}.tar.gz ; then
mv bpf-next-${LINUX_SHA} $(basename ${REPO_PATH})
else
# but fallback to git fetch approach if that fails
mkdir -p $(basename ${REPO_PATH})
cd $(basename ${REPO_PATH})
git init
git remote add bpf-next ${BPF_NEXT_ORIGIN}
# try shallow clone first
git fetch --depth 32 bpf-next
# check if desired SHA exists
if ! git cat-file -e ${LINUX_SHA}^{commit} ; then
# if not, fetch all of bpf-next; slow and painful
git fetch bpf-next
fi
git reset --hard ${LINUX_SHA}
fi
travis_fold end pull_kernel_srcs
fi

View File

@@ -1,8 +0,0 @@
INDEX https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/INDEX
libbpf-vmtest-rootfs-2020.09.27.tar.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/libbpf-vmtest-rootfs-2020.09.27.tar.zst
vmlinux-4.9.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-4.9.0.zst
vmlinux-5.5.0-rc6.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0-rc6.zst
vmlinux-5.5.0.zst https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinux-5.5.0.zst
vmlinuz-5.5.0-rc6 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0-rc6
vmlinuz-5.5.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-5.5.0
vmlinuz-4.9.0 https://libbpf-vmtest.s3-us-west-1.amazonaws.com/x86_64/vmlinuz-4.9.0

View File

@@ -1,5 +1,12 @@
# This file is not used and is there for historic purposes only.
# See WHITELIST-5.5.0 instead.
# PERMANENTLY DISABLED
align # verifier output format changed
atomics # new atomic operations (v5.12+)
atomic_bounds # new atomic operations (v5.12+)
bind_perm # changed semantics of return values (v5.12+)
bpf_cookie # 5.15+
bpf_iter # bpf_iter support is missing
bpf_obj_id # bpf_link support missing for GET_OBJ_INFO, GET_FD_BY_ID, etc
bpf_tcp_ca # STRUCT_OPS is missing
@@ -9,6 +16,7 @@ cg_storage_multi # v5.9+ functionality
cgroup_attach_multi # BPF_F_REPLACE_PROG missing
cgroup_link # LINK_CREATE is missing
cgroup_skb_sk_lookup # bpf_sk_lookup_tcp() helper is missing
check_mtu # missing BPF helper (v5.12+)
cls_redirect # bpf_csum_level() helper is missing
connect_force_port # cgroup/get{peer,sock}name{4,6} support is missing
d_path # v5.10+ feature
@@ -16,29 +24,40 @@ enable_stats # BPF_ENABLE_STATS support is missing
fentry_fexit # bpf_prog_test_tracing missing
fentry_test # bpf_prog_test_tracing missing
fexit_bpf2bpf # freplace is missing
fexit_sleep # relies on bpf_trampoline fix in 5.12+
fexit_test # bpf_prog_test_tracing missing
flow_dissector # bpf_link-based flow dissector is in 5.8+
flow_dissector_reattach
for_each # v5.12+
get_func_ip_test # v5.15+
get_stack_raw_tp # exercising BPF verifier bug causing infinite loop
hash_large_key # v5.11+
ima # v5.11+
kfree_skb # 32-bit pointer arith in test_pkt_access
ksyms # __start_BTF has different name
kfunc_call # v5.13+
link_pinning # bpf_link is missing
linked_vars # v5.13+
load_bytes_relative # new functionality in 5.8
lookup_and_delete # v5.14+
map_init # per-CPU LRU missing
map_ptr # test uses BPF_MAP_TYPE_RINGBUF, added in 5.8
metadata # v5.10+
migrate_reuseport # v5.14+
mmap # 5.5 kernel is too permissive with re-mmaping
modify_return # fmod_ret support is missing
module_attach # module BTF support missing (v5.11+)
netcnt
netns_cookie # v5.15+
ns_current_pid_tgid # bpf_get_ns_current_pid_tgid() helper is missing
pe_preserve_elems # v5.10+
perf_branches # bpf_read_branch_records() helper is missing
perf_link # v5.15+
pkt_access # 32-bit pointer arith in test_pkt_access
probe_read_user_str # kernel bug with garbage bytes at the end
prog_run_xattr # 32-bit pointer arith in test_pkt_access
raw_tp_test_run # v5.10+
recursion # v5.12+
ringbuf # BPF_MAP_TYPE_RINGBUF is supported in 5.8+
# bug in verifier w/ tracking references
@@ -48,17 +67,26 @@ reference_tracking
select_reuseport # UDP support is missing
send_signal # bpf_send_signal_thread() helper is missing
sk_assign # bpf_sk_assign helper missing
skb_helpers # helpers added in 5.8+
sk_lookup # v5.9+
sk_storage_tracing # missing bpf_sk_storage_get() helper
skb_ctx # ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN is missing
skb_helpers # helpers added in 5.8+
snprintf # v5.13+
snprintf_btf # v5.10+
sock_fields # v5.10+
socket_cookie # v5.12+
sockmap_basic # uses new socket fields, 5.8+
sockmap_listen # no listen socket supportin SOCKMAP
sockopt_sk
sk_lookup # v5.9+
skb_ctx # ctx_{size, }_{in, out} in BPF_PROG_TEST_RUN is missing
sockopt_qos_to_cc # v5.15+
stacktrace_build_id # v5.9+
stack_var_off # v5.12+
syscall # v5.14+
task_local_storage # v5.12+
task_pt_regs # v5.15+
tcp_hdr_options # v5.10+, new TCP header options feature in BPF
tcpbpf_user # LINK_CREATE is missing
tc_redirect # v5.14+
test_bpffs # v5.10+, new CONFIG_BPF_PRELOAD=y and CONFIG_BPF_PRELOAD_UMG=y|m
test_bprm_opts # v5.11+
test_global_funcs # kernel doesn't support BTF linkage=global on FUNCs
@@ -67,13 +95,19 @@ test_lsm # no BPF_LSM support
test_overhead # no fmod_ret support
test_profiler # needs verifier logic improvements from v5.10+
test_skb_pkt_end # v5.11+
timer # v5.15+
timer_mim # v5.15+
trace_ext # v5.10+
trace_printk # v5.14+
trampoline_count # v5.12+ have lower allowed limits
udp_limit # no cgroup/sock_release BPF program type (5.9+)
varlen # verifier bug fixed in later kernels
vmlinux # hrtimer_nanosleep() signature changed incompatibly
xdp_adjust_tail # new XDP functionality added in 5.8
xdp_attach # IFLA_XDP_EXPECTED_FD support is missing
xdp_bonding # v5.15+
xdp_bpf2bpf # freplace is missing
xdp_context_test_run # v5.15+
xdp_cpumap_attach # v5.9+
xdp_devmap_attach # new feature in 5.8
xdp_link # v5.9+

View File

@@ -1 +1,5 @@
# TEMPORARY
get_stack_raw_tp # spams with kernel warnings until next bpf -> bpf-next merge
stacktrace_build_id_nmi
stacktrace_build_id
task_fd_query_rawtp

View File

@@ -0,0 +1,56 @@
# TEMPORARY
atomics # attach(add): actual -524 <= expected 0 (trampoline)
bpf_iter_setsockopt # JIT does not support calling kernel function (kfunc)
bloom_filter_map # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?)
bpf_tcp_ca # JIT does not support calling kernel function (kfunc)
bpf_loop # attaches to __x64_sys_nanosleep
bpf_mod_race # BPF trampoline
bpf_nf # JIT does not support calling kernel function
core_read_macros # unknown func bpf_probe_read#4 (overlapping)
d_path # failed to auto-attach program 'prog_stat': -524 (trampoline)
dummy_st_ops # test_run unexpected error: -524 (errno 524) (trampoline)
fentry_fexit # fentry attach failed: -524 (trampoline)
fentry_test # fentry_first_attach unexpected error: -524 (trampoline)
fexit_bpf2bpf # freplace_attach_trace unexpected error: -524 (trampoline)
fexit_sleep # fexit_skel_load fexit skeleton failed (trampoline)
fexit_stress # fexit attach failed prog 0 failed: -524 (trampoline)
fexit_test # fexit_first_attach unexpected error: -524 (trampoline)
get_func_args_test # trampoline
get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (trampoline)
get_stack_raw_tp # user_stack corrupted user stack (no backchain userspace)
kfree_skb # attach fentry unexpected error: -524 (trampoline)
kfunc_call # 'bpf_prog_active': not found in kernel BTF (?)
ksyms_module # test_ksyms_module__open_and_load unexpected error: -9 (?)
ksyms_module_libbpf # JIT does not support calling kernel function (kfunc)
ksyms_module_lskel # test_ksyms_module_lskel__open_and_load unexpected error: -9 (?)
modify_return # modify_return attach failed: -524 (trampoline)
module_attach # skel_attach skeleton attach failed: -524 (trampoline)
netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
probe_user # check_kprobe_res wrong kprobe res from probe read (?)
recursion # skel_attach unexpected error: -524 (trampoline)
ringbuf # skel_load skeleton load failed (?)
sk_assign # Can't read on server: Invalid argument (?)
sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (trampoline)
skc_to_unix_sock # could not attach BPF object unexpected error: -524 (trampoline)
socket_cookie # prog_attach unexpected error: -524 (trampoline)
stacktrace_build_id # compare_map_keys stackid_hmap vs. stackmap err -2 errno 2 (?)
tailcalls # tail_calls are not allowed in non-JITed programs with bpf-to-bpf calls (?)
task_local_storage # failed to auto-attach program 'trace_exit_creds': -524 (trampoline)
test_bpffs # bpffs test failed 255 (iterator)
test_bprm_opts # failed to auto-attach program 'secure_exec': -524 (trampoline)
test_ima # failed to auto-attach program 'ima': -524 (trampoline)
test_local_storage # failed to auto-attach program 'unlink_hook': -524 (trampoline)
test_lsm # failed to find kernel BTF type ID of '__x64_sys_setdomainname': -3 (?)
test_overhead # attach_fentry unexpected error: -524 (trampoline)
test_profiler # unknown func bpf_probe_read_str#45 (overlapping)
timer # failed to auto-attach program 'test1': -524 (trampoline)
timer_mim # failed to auto-attach program 'test1': -524 (trampoline)
trace_ext # failed to auto-attach program 'test_pkt_md_access_new': -524 (trampoline)
trace_printk # trace_printk__load unexpected error: -2 (errno 2) (?)
trace_vprintk # trace_vprintk__open_and_load unexpected error: -9 (?)
trampoline_count # prog 'prog1': failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
verif_stats # trace_vprintk__open_and_load unexpected error: -9 (?)
vmlinux # failed to auto-attach program 'handle__fentry': -524 (trampoline)
xdp_adjust_tail # case-128 err 0 errno 28 retval 1 size 128 expect-size 3520 (?)
xdp_bonding # failed to auto-attach program 'trace_on_entry': -524 (trampoline)
xdp_bpf2bpf # failed to auto-attach program 'trace_on_entry': -524 (trampoline)

File diff suppressed because it is too large Load Diff

View File

@@ -223,6 +223,8 @@ CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_HAVE_ARCH_USERFAULTFD_WP=y
CONFIG_HAVE_ARCH_USERFAULTFD_MINOR=y
CONFIG_MEMBARRIER=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
@@ -236,7 +238,7 @@ CONFIG_BPF_JIT_DEFAULT_ON=y
CONFIG_USERMODE_DRIVER=y
CONFIG_BPF_PRELOAD=y
CONFIG_BPF_PRELOAD_UMD=y
# CONFIG_USERFAULTFD is not set
CONFIG_USERFAULTFD=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_RSEQ=y
# CONFIG_DEBUG_RSEQ is not set
@@ -975,7 +977,7 @@ CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_NETLINK_QUEUE=y
CONFIG_NETFILTER_NETLINK_LOG=y
# CONFIG_NETFILTER_NETLINK_OSF is not set
# CONFIG_NF_CONNTRACK is not set
CONFIG_NF_CONNTRACK=y
# CONFIG_NF_LOG_NETDEV is not set
# CONFIG_NF_TABLES is not set
CONFIG_NETFILTER_XTABLES=y
@@ -1048,6 +1050,7 @@ CONFIG_NETFILTER_XT_MATCH_STATISTIC=y
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=y
# CONFIG_NF_SOCKET_IPV4 is not set
# CONFIG_NF_TPROXY_IPV4 is not set
# CONFIG_NF_DUP_IPV4 is not set
@@ -1089,6 +1092,7 @@ CONFIG_IP6_NF_IPTABLES=y
# CONFIG_IP6_NF_SECURITY is not set
# end of IPv6: Netfilter Configuration
CONFIG_NF_DEFRAG_IPV6=y
CONFIG_BPFILTER=y
CONFIG_BPFILTER_UMH=m
# CONFIG_IP_DCCP is not set
@@ -1486,7 +1490,7 @@ CONFIG_SCSI_MOD=y
# CONFIG_MACINTOSH_DRIVERS is not set
CONFIG_NETDEVICES=y
CONFIG_NET_CORE=y
# CONFIG_BONDING is not set
CONFIG_BONDING=y
# CONFIG_DUMMY is not set
# CONFIG_WIREGUARD is not set
# CONFIG_EQUALIZER is not set
@@ -1791,7 +1795,14 @@ CONFIG_BCMA_POSSIBLE=y
# end of Multifunction device drivers
# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set
CONFIG_RC_CORE=y
# CONFIG_RC_MAP is not set
CONFIG_LIRC=y
CONFIG_BPF_LIRC_MODE2=y
# CONFIG_RC_DECODERS is not set
CONFIG_RC_DEVICES=y
CONFIG_RC_LOOPBACK=y
# CONFIG_IR_SERIAL is not set
# CONFIG_MEDIA_CEC_SUPPORT is not set
# CONFIG_MEDIA_SUPPORT is not set

Some files were not shown because too many files have changed in this diff Show More